Self-Governing Codebases
In modern engineering, documentation is a promise that is rarely kept. Architectural models and wiki pages look pristine on Day One, but the moment developers merge the first pull request, physical code paths shift, signatures change, and the map decays. Here is how we solved document drift by engineering a self-documenting system driven by **Recursive Language Models (RLMs)**.
The Three Pillars of Self-Governance
The Ontology
A strongly-typed, formal model representing the codebase's entities, properties, and relationships. It enforces semantic rules (e.g. which files implement which concepts), serving as the authoritative ruleset of your software architecture.
The RLM
An autonomous, recursive agent loop executing dynamically inside a sandboxed VM. Instead of making blind, one-shot predictions, the RLM writes modular inspection scripts to explore files and auto-correct errors based on compiler feedback.
Combined Synergy
The ontology gives the RLM its structural ground truth, and the RLM acts as the automated curator. Validations check code against active policies, and local AST sync hooks record line movements without spending any LLM tokens.
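To make the ontology pillar concrete, here is a minimal sketch of a typed model that rejects relationships the rules do not allow. Every name in it (Kind, Entity, Relation, Ontology, Validate, the "implements" relation, and the sample IDs) is an assumption made for illustration and is not taken from the actual system.

```go
package main

import (
	"errors"
	"fmt"
)

// Kind names an entity type in this hypothetical ontology.
type Kind string

const (
	KindFile Kind = "File"
	KindTerm Kind = "Term"
)

// Entity is a typed node in the graph.
type Entity struct {
	ID   string
	Kind Kind
}

// Relation is a typed, directed edge between two entity IDs.
type Relation struct {
	From, To string
	Type     string
}

// Ontology holds the known entities and, per relation type,
// the (from, to) kinds that relation is allowed to connect.
type Ontology struct {
	Entities map[string]Entity
	Allowed  map[string][2]Kind
}

var ErrInvalidRelation = errors.New("relation violates ontology rules")

// Validate reports an error if the relation type is unknown, an endpoint
// is missing, or the endpoint kinds do not match the rule for that type.
func (o *Ontology) Validate(r Relation) error {
	rule, ok := o.Allowed[r.Type]
	if !ok {
		return fmt.Errorf("%w: unknown relation type %q", ErrInvalidRelation, r.Type)
	}
	from, okFrom := o.Entities[r.From]
	to, okTo := o.Entities[r.To]
	if !okFrom || !okTo {
		return fmt.Errorf("%w: unknown endpoint in %s -> %s", ErrInvalidRelation, r.From, r.To)
	}
	if from.Kind != rule[0] || to.Kind != rule[1] {
		return fmt.Errorf("%w: %q must link %s to %s", ErrInvalidRelation, r.Type, rule[0], rule[1])
	}
	return nil
}

// sampleOntology builds a tiny illustrative graph: one file, one term.
func sampleOntology() *Ontology {
	return &Ontology{
		Entities: map[string]Entity{
			"file:auth.go":        {ID: "file:auth.go", Kind: KindFile},
			"term:Authentication": {ID: "term:Authentication", Kind: KindTerm},
		},
		Allowed: map[string][2]Kind{
			"implements": {KindFile, KindTerm}, // files implement concepts
		},
	}
}

func main() {
	o := sampleOntology()
	good := Relation{From: "file:auth.go", To: "term:Authentication", Type: "implements"}
	bad := Relation{From: "term:Authentication", To: "file:auth.go", Type: "implements"}
	fmt.Println("good:", o.Validate(good))
	fmt.Println("bad: ", o.Validate(bad))
}
```

Keeping the allowed edge shapes in data rather than in code means a rule such as "only files implement terms" can be checked the same way for every relation type.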
The Power of Recursive Language Models (RLMs)
Traditional codebase ingestion strategies scale poorly. Ingesting an entire repository of raw source files into an LLM window quickly exhausts context budgets, incurs high API costs, and limits structural reasoning.
A Recursive Language Model (RLM) changes this paradigm by running a closed execution loop inside an isolated environment. Instead of predicting everything in a single, blind shot, the RLM harness executes bare Go statements dynamically inside a sandboxed interpreter (Yaegi). It uses system code-analysis helpers (like ListFiles(), ParseFileSymbols(), and FindReferences()) to investigate the filesystem iteratively and construct its database step-by-step.
The orchestrator calls recursive sub-queries using Query() for complex tasks, executes edits, matches schemas, runs validations (DroverFsck()), and terminates cleanly only when the validation returns zero violations.
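The control flow described above can be sketched as a loop that alternates between a model-driven step and a validation pass, stopping only when no violations remain. This is a simplified illustration: the Harness type, its Step and Fsck fields, the Violation record, and the iteration budget are all assumptions. Fsck merely stands in for DroverFsck(), whose real signature is not shown in this article.

```go
package main

import (
	"errors"
	"fmt"
)

// Violation is a hypothetical record of one failed policy check.
type Violation struct {
	Rule, Subject string
}

// Harness bundles the two operations the loop needs.
// Both signatures are assumptions made for this sketch.
type Harness struct {
	// Step asks the model for its next action, given what is still broken,
	// and applies it (for example by running generated code in the sandbox).
	Step func(outstanding []Violation) error
	// Fsck stands in for DroverFsck() and returns the remaining violations.
	Fsck func() []Violation
}

var ErrBudgetExhausted = errors.New("iteration budget exhausted with violations remaining")

// RunLoop repeats step-then-validate until validation is clean or the
// budget runs out. It returns the number of steps that were executed.
func RunLoop(h Harness, maxIters int) (int, error) {
	outstanding := h.Fsck()
	for i := 0; i < maxIters; i++ {
		if len(outstanding) == 0 {
			return i, nil // clean termination: zero violations
		}
		if err := h.Step(outstanding); err != nil {
			return i, fmt.Errorf("step %d: %w", i, err)
		}
		outstanding = h.Fsck()
	}
	if len(outstanding) == 0 {
		return maxIters, nil
	}
	return maxIters, ErrBudgetExhausted
}

// toyHarness simulates a model that fixes exactly one violation per step.
func toyHarness(n int) Harness {
	pending := make([]Violation, n)
	for i := range pending {
		pending[i] = Violation{Rule: "governed_by", Subject: fmt.Sprintf("term-%d", i)}
	}
	return Harness{
		Step: func([]Violation) error { pending = pending[1:]; return nil },
		Fsck: func() []Violation { return pending },
	}
}

func main() {
	steps, err := RunLoop(toyHarness(3), 10)
	fmt.Println("steps:", steps, "err:", err)
}
```

The explicit iteration budget is a design choice of the sketch: it keeps a model that never converges from looping forever, and it complements per-call timeouts.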
🔒 Secure Sandboxing & Execution Safety
Executing code generated dynamically by an AI can introduce host vulnerabilities. To make this production-ready, we implemented a layered execution sandbox:
- Unsafe Package Stripping: Standard packages that allow direct access to disk or host environments (os, os/exec, syscall, and net/* network calls) are completely removed from the Yaegi standard catalog. The AI can only call sandboxed data mutation helpers.
- Deterministic Timeouts: Every VM call and query is capped via strict timeouts to guarantee the execution loop never hangs or blocks indefinitely.
- Delta Ingestion Mode: In Delta Mode, the system runs a local git status --porcelain scanner, loading *only* modified or newly introduced file comparison blocks. This reduces prompt footprints by **99%** (down to 61 KB) and saves massive API costs.
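As an illustration of the Delta Ingestion bullet, the sketch below parses the output of git status --porcelain (the v1 short format: two status characters, a space, then the path) and keeps only the files worth loading. The function name ChangedFiles and the decision to skip deletions are assumptions of this sketch, and it does not handle the quoted paths Git emits for unusual filenames.

```go
package main

import (
	"fmt"
	"strings"
)

// ChangedFiles extracts paths of modified, added, renamed, or untracked
// files from `git status --porcelain` (v1) output. Deleted files are
// skipped because there is nothing left to ingest.
// Limitation: paths that Git quotes (special characters) are returned as-is.
func ChangedFiles(porcelain string) []string {
	var files []string
	for _, line := range strings.Split(porcelain, "\n") {
		line = strings.TrimSuffix(line, "\r")
		if len(line) < 4 {
			continue // blank or malformed line
		}
		// Do not trim leading spaces: " M" and "M " are different states.
		status, path := line[:2], line[3:]
		if strings.Contains(status, "D") {
			continue
		}
		// Renames and copies are reported as "old -> new"; keep the new path.
		if status[0] == 'R' || status[0] == 'C' {
			if i := strings.Index(path, " -> "); i >= 0 {
				path = path[i+len(" -> "):]
			}
		}
		files = append(files, path)
	}
	return files
}

func main() {
	// Sample output; in practice this would come from running git locally.
	sample := " M internal/rlm/loop.go\n" +
		"?? docs/new.md\n" +
		"D  old/removed.go\n" +
		"R  a.go -> b.go\n"
	for _, f := range ChangedFiles(sample) {
		fmt.Println(f)
	}
}
```

Only the returned paths would then be read and placed into the prompt, which is where the reduction in prompt size comes from.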
💰 Zero-Token AST Synchronization
AI calls are expensive. We don't query an LLM when developer code simply shifts around. We engineered a lightweight, local JS syncing utility (sync-ast-lineages.js) and a Git pre-commit hook that parses symbols locally, matches modified signatures, and automatically synchronizes codebase lineages inside graph.jsonl in under 0.2 seconds—costing absolutely nothing.
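The utility described above is written in JavaScript. The following is a Go sketch of the same idea, using the standard go/parser package, to show why no LLM is needed: symbol positions can be read directly from the syntax tree. SymbolLines and Moved are hypothetical names, the sketch only tracks top-level functions in Go files, and it does not read or write graph.jsonl.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// SymbolLines parses Go source and maps each top-level function name
// to the line where its declaration starts. Methods are skipped so that
// same-named methods on different types do not collide in the map.
func SymbolLines(filename, src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	lines := make(map[string]int)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil {
			continue
		}
		lines[fn.Name.Name] = fset.Position(fn.Pos()).Line
	}
	return lines, nil
}

// Moved compares two snapshots and returns, for each symbol present in
// both whose line changed, the pair {old line, new line}.
func Moved(before, after map[string]int) map[string][2]int {
	moved := make(map[string][2]int)
	for name, oldLine := range before {
		if newLine, ok := after[name]; ok && newLine != oldLine {
			moved[name] = [2]int{oldLine, newLine}
		}
	}
	return moved
}

func main() {
	oldSrc := "package p\n\nfunc A() {}\n\nfunc B() {}\n"
	newSrc := "package p\n\nfunc A() {}\n\nvar x = 1\n\nfunc B() {}\n"

	before, err := SymbolLines("p.go", oldSrc)
	if err != nil {
		panic(err)
	}
	after, err := SymbolLines("p.go", newSrc)
	if err != nil {
		panic(err)
	}
	for name, lines := range Moved(before, after) {
		fmt.Printf("%s moved from line %d to line %d\n", name, lines[0], lines[1])
	}
}
```

A pre-commit hook could run a comparison like this on staged files and update the stored line references for any symbol that moved.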
🤖 Continuous Integration Policy Gates
Documentation is only valuable if it is strictly enforced. We built a reusable composite GitHub Action CI check. Every time a developer opens a Pull Request, the validator tests the graph constraints:
- Are all new Term objects governed by a curation role (governed_by)?
- Does the taxonomy hierarchy remain acyclic (broader_term)?
- Do approved plans have the required reviews mapped in the SQLite projection?
If any constraint is violated, the check fails, PR annotations are raised, and merging is blocked—preventing architectural decay before code ever reaches production.
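Two of the constraints listed above lend themselves to a short sketch: every term must name a curation role, and following broader_term links must never loop back. The Term struct, its field names, and CheckTerms are assumptions for illustration. The review check against the SQLite projection is left out because its schema is not described here.

```go
package main

import "fmt"

// Term is a hypothetical shape for a glossary node in the graph.
type Term struct {
	ID          string
	GovernedBy  string // curation role; empty means ungoverned
	BroaderTerm string // parent term ID; empty for a root term
}

// CheckTerms returns one message per violation:
//   - a term with no governed_by role
//   - a term whose broader_term chain revisits a term (a cycle)
//
// References to unknown parent IDs simply end the walk in this sketch.
func CheckTerms(terms []Term) []string {
	byID := make(map[string]Term, len(terms))
	for _, t := range terms {
		byID[t.ID] = t
	}

	var violations []string
	for _, t := range terms {
		if t.GovernedBy == "" {
			violations = append(violations, fmt.Sprintf("%s: missing governed_by", t.ID))
		}
		// Walk up the hierarchy, remembering every term already visited.
		seen := map[string]bool{t.ID: true}
		for cur := t.BroaderTerm; cur != ""; cur = byID[cur].BroaderTerm {
			if seen[cur] {
				violations = append(violations, fmt.Sprintf("%s: broader_term chain contains a cycle", t.ID))
				break
			}
			seen[cur] = true
		}
	}
	return violations
}

func main() {
	terms := []Term{
		{ID: "api", GovernedBy: "architects"},
		{ID: "rest", GovernedBy: "architects", BroaderTerm: "api"},
		{ID: "draft"},                                     // ungoverned
		{ID: "x", GovernedBy: "curators", BroaderTerm: "y"}, // x and y form a cycle
		{ID: "y", GovernedBy: "curators", BroaderTerm: "x"},
	}
	for _, v := range CheckTerms(terms) {
		fmt.Println(v)
	}
}
```

In a CI job, a non-empty result would translate into a non-zero exit code, which is what causes the pull request check to fail.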
Deploy Governed Ingestion Loops
Ready to eliminate codebase drift and enforce architectural policies at scale? Deploy the local visualizer and deep-link your design models directly into VS Code or Cursor.