Dictionary JSON
The dictionary JSON shape is designed to be simple enough for scripts and notebooks, but stable enough to import/export through the governance API later.
Minimal shape
{
  "profile_name": "infra_incidents",
  "profile_description": "Infrastructure incident terminology",
  "terms": [
    {
      "canonical_value": "kubernetes",
      "slot": "TOOL",
      "aliases": [
        "k8s",
        { "value": "kube", "confidence": 0.95 }
      ]
    }
  ],
  "profile_stop_list": [
    { "value": "tmp", "target": "alias", "reason": "too generic" }
  ],
  "global_stop_list": [
    { "value": "unknown", "target": "both", "reason": "global noise" }
  ]
}

A term defines a canonical value and its semantic slot.

{
  "canonical_value": "postgresql",
  "slot": "DATABASE",
  "aliases": ["pg", "postgres", "psql"]
}

Aliases
Aliases can be strings or objects with metadata.

"k8s"

{ "value": "kube", "confidence": 0.95 }

Stop lists
Stop lists reduce noisy matches. Use them for short values, generic words, or terms that look like aliases but are too ambiguous in your corpus.

{ "value": "tmp", "target": "alias", "reason": "too generic" }

Recommended workflow
- Start with a small dictionary.
- Validate it locally.
- Run extraction on representative documents.
- Add stop-list entries for false positives.
- Import the dictionary into the governance platform when the UI/API workflow is ready.
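
The "validate it locally" step can be approximated in a script or notebook before any platform tooling exists. The Python sketch below is illustrative only and is not the governance platform's validator: `normalize_alias` and `validate_dictionary` are hypothetical helper names. It also makes three assumptions the page does not confirm: that a bare string alias carries no confidence value, that stop-list entries with target "alias" or "both" apply to aliases, and that stop-list matching is case-insensitive.

```python
import json


def normalize_alias(alias):
    """Return an alias as a dict, whether it was written as a string or an object."""
    if isinstance(alias, str):
        # Assumption: a bare string alias has no confidence, so None is stored.
        return {"value": alias, "confidence": None}
    if isinstance(alias, dict) and isinstance(alias.get("value"), str):
        return {"value": alias["value"], "confidence": alias.get("confidence")}
    raise ValueError(f"unsupported alias: {alias!r}")


def validate_dictionary(data):
    """Collect readable problems instead of stopping at the first one."""
    problems = []
    if not isinstance(data.get("profile_name"), str):
        problems.append("profile_name must be a string")

    # Assumption: entries targeting "alias" or "both" apply to aliases,
    # and values are compared case-insensitively.
    stopped = set()
    for key in ("profile_stop_list", "global_stop_list"):
        for entry in data.get(key, []):
            value = entry.get("value")
            if isinstance(value, str) and entry.get("target") in ("alias", "both"):
                stopped.add(value.lower())

    for i, term in enumerate(data.get("terms", [])):
        for field in ("canonical_value", "slot"):
            if not isinstance(term.get(field), str):
                problems.append(f"terms[{i}].{field} must be a string")
        for alias in term.get("aliases", []):
            try:
                value = normalize_alias(alias)["value"]
            except ValueError as exc:
                problems.append(f"terms[{i}]: {exc}")
                continue
            if value.lower() in stopped:
                problems.append(f"terms[{i}]: alias {value!r} is on a stop list")
    return problems


# A small dictionary in the shape shown above, with one deliberately noisy alias.
example = json.loads("""
{
  "profile_name": "infra_incidents",
  "terms": [
    {"canonical_value": "kubernetes", "slot": "TOOL",
     "aliases": ["k8s", {"value": "kube", "confidence": 0.95}, "tmp"]}
  ],
  "profile_stop_list": [
    {"value": "tmp", "target": "alias", "reason": "too generic"}
  ]
}
""")

for problem in validate_dictionary(example):
    print(problem)
```

Returning a list of problems rather than raising on the first one suits the iterative loop above: each run shows every alias that collides with a stop list, so the dictionary or the stop lists can be adjusted in one pass.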