Implementation plan
A phased path from English-first approximate conversion toward a governed, glyph-aware semantic interpreter.
The converter already has a coherent public-symbol boundary and an evidence-first posture. Its main remaining weakness is that meaning depends too heavily on registry rows, lexical descriptors, and sparse fallback concepts.
A glyph semantics layer can deepen interpretation without relaxing public governance.
Do not replace the public-symbol rule. Add an internal kernel that reasons over geometry, composition, embeddings, ontology constraints, and validation traces before rendering public output.
| Failure mode | Mechanism | Upgrade response |
|---|---|---|
| Token registry sparsity | Meanings depend on available rows or seed concepts. | Add visual, structural, and semantic retrieval lanes with fallback provenance. |
| Rendering coupling | Public symbol selection dominates internal interpretation. | Keep public output separate from internal SVG and ontology evidence. |
| Lexical bias | Pipeline starts from English segments before glyph-first reasoning. | Support glyph-in, meaning-out parsing with primitive decomposition. |
| Vector compression | One embedding cannot carry all visual, structural, and semantic signals. | Use multiple vector spaces and late fusion. |
| Observability gap | Scores explain nearest neighbors, not structural reasoning. | Expose primitive ablations, ontology checks, and drift metrics. |
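The "multiple vector spaces and late fusion" response in the table can be illustrated with a minimal sketch. It assumes each glyph carries one small vector per lane and that lanes are compared with cosine similarity; the lane names, the weights in `LANE_WEIGHTS`, and the function names are hypothetical and not part of the existing converter.

```python
import math

# Hypothetical lane weights; real values would need tuning and review.
LANE_WEIGHTS = {"visual": 0.3, "structure": 0.3, "semantic": 0.4}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def late_fusion(query, candidate, weights=LANE_WEIGHTS):
    """Score each lane on its own, then combine the scores with a weighted mean.

    Returns (fused_score, per_lane_scores) so the per-lane evidence stays
    visible instead of being compressed into a single embedding.
    """
    lanes = {}
    for lane in weights:
        # Only lanes present on both sides contribute to the fused score.
        if lane in query and lane in candidate:
            lanes[lane] = cosine(query[lane], candidate[lane])
    total = sum(weights[lane] for lane in lanes)
    fused = sum(weights[lane] * s for lane, s in lanes.items()) / total if total else 0.0
    return fused, lanes
```

Returning the per-lane scores alongside the fused value is a deliberate choice in this sketch: it keeps the visual, structural, and semantic signals separately inspectable, which is the point of fusing late.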
Confirm normalization, grapheme handling, private-use rejection, evidence traces, and public sequence validity.
Define public-token operators, roles, arity, precedence, glosses, and warnings for compact shorthand.
Add records for surface, SVG hash, render profile, primitives, ontology tags, and provenance.
Store separate visual, structure, semantic, and ontology vectors. Fuse late under constraints.
Track phase-lock, drift, human comprehension, regression tests, and review budget.
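Two of the phases above, the boundary audit and the glyph records, can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: normalization is taken to be NFC, private-use rejection is taken to mean refusing any code point in Unicode general category `Co`, and the SVG hash is taken to be SHA-256 of the markup. The names `audit_surface`, `GlyphRecord`, and `make_record` are hypothetical. Grapheme-cluster handling is not shown, because the standard library has no grapheme segmenter and it would need a third-party package.

```python
import hashlib
import unicodedata
from dataclasses import dataclass


def audit_surface(text):
    """NFC-normalize the text and reject private-use code points (category Co)."""
    normalized = unicodedata.normalize("NFC", text)
    for ch in normalized:
        if unicodedata.category(ch) == "Co":
            raise ValueError(f"private-use code point U+{ord(ch):04X} rejected")
    return normalized


@dataclass(frozen=True)
class GlyphRecord:
    """Hypothetical record shape mirroring the fields named in the plan."""
    surface: str
    svg_hash: str
    render_profile: str
    primitives: tuple = ()
    ontology_tags: tuple = ()
    provenance: str = "unknown"


def make_record(surface, svg_markup, render_profile, **extra):
    """Audit the surface form, hash the SVG markup, and build a record."""
    clean = audit_surface(surface)
    digest = hashlib.sha256(svg_markup.encode("utf-8")).hexdigest()
    return GlyphRecord(clean, digest, render_profile, **extra)
```

Freezing the record keeps provenance and hashes from being mutated after creation, which suits an evidence-first pipeline, though the plan itself does not require it.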
Keep the existing converter facade stable, then add one internal interpretation service and one diagnostics object.
Request:
{
  "input": "?=汝⟡→=",
  "mode": "glyph-first",
  "returnEvidence": true
}

Response:
{
  "canonical": "Question(Becomes(Modified(YOU), UNKNOWN))",
  "bestGloss": "What are you becoming?",
  "confidence": 0.68,
  "warnings": [
    "right-side target omitted",
    "state modifier ambiguous"
  ]
}
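One way to keep the facade stable while adding a single interpretation service and a single diagnostics object is to inject the service into the facade. The sketch below is illustrative only: `Converter`, `Diagnostics`, and the injected `interpret` callable are hypothetical names, and the choice to attach `confidence` and `warnings` only when `returnEvidence` is true is an assumption, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Diagnostics:
    """Hypothetical diagnostics object produced by the interpretation service."""
    canonical: str
    best_gloss: str
    confidence: float
    warnings: List[str] = field(default_factory=list)


# The interpretation service is modeled as a plain callable: glyph text in, diagnostics out.
Interpreter = Callable[[str], Diagnostics]


class Converter:
    """Stable facade; the internal interpretation service is injected."""

    def __init__(self, interpret: Interpreter):
        self._interpret = interpret

    def convert(self, request: dict) -> dict:
        if request.get("mode") != "glyph-first":
            raise ValueError("only glyph-first mode is sketched here")
        diag = self._interpret(request["input"])
        response = {"canonical": diag.canonical, "bestGloss": diag.best_gloss}
        # Assumption: evidence fields are attached only on request.
        if request.get("returnEvidence"):
            response["confidence"] = diag.confidence
            response["warnings"] = list(diag.warnings)
        return response
```

With a stub interpreter that returns the values from the sample response, `Converter(stub).convert(request)` reproduces that response shape without the facade knowing how interpretation works.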