~35%
Fewer tokens than Rust for AI generation
Architecture Deep Dive
This is the technical architecture document for FORMA. It covers the contract system, MIR intermediate representation, verification toolchain integration, and the roadmap to formal verification for critical systems.
Build Provenance
FORMA's implementation was largely implemented with Claude Code & Opus 4.6, with code review, regression-oriented evaluation, and architectural critique performed using OpenAI GPT-5.3-Codex. This project intentionally uses a multi-model workflow: high-throughput generation paired with independent review and validation.
~35%
Fewer tokens than Rust for AI generation
Zero
Lifetime annotations—memory safety without complexity
@pre/@post
First-class contracts with quantifiers
SMT-ready
MIR architecture designed for formal proofs
The Problem
This is the core engineering challenge FORMA was built to solve. When AI generates thousands of lines of systems code, traditional code review becomes a bottleneck. You need machine-verifiable guarantees.
AI can generate syntactically correct code that passes type checking but violates semantic invariants. You can't review 10,000 lines of generated code line-by-line. You need to specify what the code should do, then verify that it does.
FORMA solution: Contract expressions that capture intent, verification tools that check behavior.
In repository-level translation benchmarks, 94.8% of AI failures targeting Rust are compilation errors—with dependency resolution (unresolved imports, missing methods) accounting for 61.9% of failures. Rust's strict compiler catches errors early, but its complexity—lifetimes, trait bounds, module paths—creates a high barrier for AI-generated code.
FORMA solution: Second-class references (no lifetime syntax), batteries-included stdlib (no dependency maze), JSON error output for AI self-repair.
When AI-generated code fails, the error messages need to be machine-readable for automated correction. Human-readable diagnostics don't help an AI agent self-repair.
FORMA solution: JSON error output, structured diagnostics, grammar export for constrained generation.
Runtime testing catches bugs after the fact. For critical systems—medical, financial, aerospace—you need static guarantees before deployment. Property-based testing helps, but formal verification is the goal.
FORMA solution: Contract system designed for eventual SMT solver integration and formal proofs.
Contract System Architecture
FORMA's contract system is the foundation of its verification story. Contracts capture what code should do in a form that's both human-readable and machine-checkable.
FORMA contracts support a rich expression language designed for specifying program behavior precisely.
# Basic preconditions and postconditions
@pre(n >= 0)
@post(result >= 1)
f factorial(n: Int) -> Int
# Old expressions - capture values at function entry
@post(result == old(balance) + amount)
f deposit(amount: Int) -> Int
# Quantifiers - express properties over collections
@pre(forall i in 0..len(arr)-1: arr[i] <= arr[i+1]) # sorted input
@post(exists i in 0..len(arr): arr[i] == target) # element found
# Multiple preconditions for conditional constraints
@pre(amount > 0)
@pre(balance >= amount)
f withdraw(amount: Int) -> Int!Str
# Membership testing
@pre(status in ["active", "pending", "closed"])
f update_status(status: Str) -> Bool
Common verification patterns are captured as reusable named contracts. This reduces specification boilerplate and establishes a shared vocabulary for common guarantees.
# Built-in patterns (35 total, 6 categories)
# Numeric: @positive @nonnegative @nonzero @even @odd @divisible @bounded @in_range
# Collection: @nonempty @contains @all_positive @all_nonnegative @all_nonzero @valid_index @valid_range
# Set: @subset @superset @disjoint @equals @same_length @permutation
# Sequence: @prefix @suffix @reversed @rotated @unique
# Ordering: @sorted @sorted_desc @strictly_sorted @strictly_sorted_desc @sorted_by @partitioned @stable
# State: @unchanged @pure
# Usage
@nonempty(items)
@sorted(result)
@permutation(items, result)
f verified_sort(items: [Int]) -> [Int]
sort_ints(items)
Current implementation: Runtime contract checking with property-based test generation.
How it works:
forma verify generates random inputs satisfying preconditions--seed) for reproducible CI runsWhy runtime first: Property-based testing provides immediate value while the formal verification infrastructure is built. The contract language is designed to be forward-compatible with SMT solver backends.
# Property-based verification with 1000 test cases
forma verify src/math.forma --examples 1000 --seed 42
# JSON output for CI integration
forma verify src --report --format json
Contracts are first-class AST nodes, not comments or annotations that get stripped. This enables:
forma fmt# Internal AST representation (simplified)
Contract {
kind: Pre | Post | Invariant,
expression: Expr,
pattern_name: Option<String>, // For @sorted, @pure, etc.
span: SourceSpan,
}
MIR Architecture
FORMA's Mid-level Intermediate Representation (MIR) is a control-flow graph designed for both execution and analysis. It's the foundation for future formal verification integration.
Why MIR matters for verification: High-level source code is too complex for direct analysis. MIR normalizes control flow into a form suitable for both interpretation and symbolic execution.
# MIR for a simple function (conceptual representation)
f abs(x: Int) -> Int:
bb0:
_1 = x
switchInt(_1 < 0) -> [true: bb1, false: bb2]
bb1:
_2 = Neg(_1)
goto -> bb3
bb2:
_2 = _1
goto -> bb3
bb3:
return _2
Key properties:
FORMA's compilation pipeline produces a single MIR representation that serves both execution and verification:
┌─────────────────────────────────────────────────────────────────────┐
│ FORMA Source Code │
│ (with @pre/@post contracts) │
└───────────────────────────────┬─────────────────────────────────────┘
│ Parse
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Typed AST │
│ (contracts are first-class AST nodes) │
└───────────────────────────────┬─────────────────────────────────────┘
│ Lower
▼
┌─────────────────────────────────────────────────────────────────────┐
│ MIR (CFG) │
│ (explicit control flow, ownership tracking) │
└───────────────────────────────┬─────────────────────────────────────┘
│ Optimize
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Optimized MIR │
│ (constant fold, copy propagation, dead blocks, peephole) │
└──────────────┬────────────────────────────────────┬─────────────────┘
│ │
▼ ▼
┌──────────────────────┐ ┌──────────────────────┐
│ Interpreter │ │ VCGen (planned) │
│ (forma run/verify) │ │ (SMT export) │
└──────────────────────┘ └──────────────────────┘
Key insight: The same MIR serves runtime execution today and formal verification tomorrow. An optimization pass runs between lowering and execution, eliminating redundant temporaries and simplifying control flow. No new IR needed—just a new backend that generates SMT queries instead of executing code.
Verification Toolchain
FORMA's verification toolchain is designed for integration into AI-assisted development workflows. Every tool produces machine-readable output suitable for automated pipelines.
forma explain — Contract TranslationTranslates contract expressions into human-readable or machine-readable explanations. This bridges the gap between formal specifications and code review.
# Human-readable explanation
$ forma explain math.forma --format human
┌─ factorial(n: Int) -> Int
│ Requires:
│ - n is at least 0
│ Guarantees:
│ - result is at least 1
└─
┌─ binary_search(arr: [Int], target: Int) -> Int?
│ Requires:
│ - [@sorted] for every i in 0..arr.len() - 1,
│ arr[i] is at most arr[i + 1]
└─
# JSON output for tooling integration
$ forma explain math.forma --format json
{
"functions": [{
"name": "factorial",
"signature": "factorial(n: Int) -> Int",
"preconditions": [{
"expression": "n >= 0",
"english": "n is at least 0"
}],
"postconditions": [{
"expression": "result >= 1",
"english": "result is at least 1"
}]
}]
}
Use case: AI generates code with contracts. Human reviews the explain output
to verify that the contracts match intent, without reading implementation details.
forma verify — Property-Based VerificationRuns property-based testing against contract specifications. Generates inputs satisfying preconditions, executes functions, and checks postconditions.
# Report-style verification with deterministic seed
$ forma verify src/ --report --format human --seed 42
FORMA Verification Report
src/math.forma
✓ PASS factorial contracts:2 examples:20/20
✓ PASS binary_search contracts:2 examples:20/20
✗ FAIL quicksort contracts:3 examples:18/20
Summary
Functions: 3
Verified: 2, Failures: 1
Examples passed: 58/60
# CI-friendly JSON report
$ forma verify src --report --format json --examples 100 --seed 42
{
"files": [{"path": "src/math.forma", "functions": [
{"name": "factorial", "status": "PASS",
"contract_count": 2, "examples_run": 100,
"examples_passed": 100, "issues": []},
{"name": "quicksort", "status": "FAIL",
"contract_count": 3, "examples_run": 100,
"examples_passed": 97, "issues": [...]}
]}],
"summary": {
"total_functions": 3, "verified": 2,
"failures": 1, "total_examples": 300,
"examples_passed": 297
}
}
Key features:
forma grammar — Constrained GenerationExports FORMA's grammar in formats suitable for constrained decoding in LLM pipelines.
# EBNF format for parser generators
$ forma grammar --format ebnf > forma.ebnf
# JSON format for LLM constrained decoding
$ forma grammar --format json > forma-grammar.json
# Integration with constrained decoding
# (pseudocode for LLM pipeline)
grammar = load_grammar("forma-grammar.json")
completion = llm.generate(
prompt="Write a sorting function in FORMA",
grammar_constraint=grammar
)
Why this matters: Constrained decoding ensures AI-generated code is always syntactically valid. Combined with contract verification, this creates a closed loop: generate valid syntax, verify semantic correctness.
forma check --error-format json — Machine-Readable DiagnosticsStructured error output enables AI agents to self-correct without parsing human-readable messages.
$ forma check broken.forma --error-format json
{
"success": false,
"errors": [{
"file": "broken.forma",
"line": 15,
"column": 12,
"end_line": 15,
"end_column": 18,
"severity": "error",
"code": "TYPE",
"message": "type mismatch: expected Int, found Str"
}]
}
Fields designed for AI consumption:
line/column: Exact location for targeted fixesmessage: Human-readable description with expected/found typescode: Stable error category (TYPE, PARSE, MODULE) for pattern matchingsuccess: Quick pass/fail check for automationExport grammar, feed to AI. Generation is syntactically valid by construction.
AI writes code with contracts specifying behavior.
Review contracts in English. Understand promises without reading implementation.
Run verification report. Get PASS/FAIL for every contract.
Trust the code because you verified the specification.
Path to Formal Verification
FORMA's architecture was designed from day one with formal verification as the end goal. Here's where we are, what's next, and why this matters for critical systems.
The foundation for formal verification is in place:
forall, exists), old() expressions, membership testing@sorted, @permutation, @pure—reusable specificationsStatus: The specification language and IR infrastructure are in active development. Property-based testing provides immediate value while formal verification is developed.
The path from property testing to formal proofs requires three components:
Transform MIR + contracts into logical formulas. For each function:
# Source
@pre(n >= 0)
@post(result >= 1)
f factorial(n: Int) -> Int
# Generated verification condition (simplified)
∀n: Int.
(n >= 0) → # precondition assumed
(factorial(n) >= 1) # postcondition to prove
Integration with Z3 or CVC5 to discharge verification conditions:
Embed specifications in the type system itself:
# Future syntax concept
type PosInt = Int where self > 0
type SortedList = [Int] where @sorted
f factorial(n: Nat) -> PosInt # Nat = Int where self >= 0
The architecture we've built doesn't just support runtime verification—it's structured to enable formal proofs if and when they're needed. This is a potential direction, not a promise.
The MIR representation and contract system are already compatible with SMT solver integration. If formal verification becomes a priority, the path is clear:
For most use cases, the current property-based verification is sufficient. But for teams that need mathematical guarantees—medical, financial, aerospace—the foundation is ready.
AI code generation is accelerating. The question isn't whether AI will write critical systems code—it's whether that code will be verifiable.
Core Architectural Decisions
The problem: In translation benchmarks, 94.8% of AI failures targeting Rust are compilation errors. While dependency resolution dominates (61.9%), lifetime and ownership errors add a second layer of complexity that AI systems struggle with.
The solution: References can be passed and used locally, but cannot be stored in data structures or returned from functions. This eliminates the need for lifetime annotations while preserving memory safety.
The tradeoff: Some patterns (self-referential structs, arena allocators) require different idioms. For AI-generated code, this is a net win—the common case is dramatically simpler.
The problem: Languages that treat specifications as comments lose them during compilation. This makes verification an afterthought rather than a first-class concern.
The solution: Contracts are parsed into the AST, type-checked against function signatures, and preserved through all compilation stages.
The payoff: Contracts can be extracted for documentation, checked for consistency, translated to verification conditions, and round-tripped through the formatter.
The problem: Multiple backends (interpreter, LLVM) risk semantic drift. High-level AST is too complex for direct analysis.
The solution: A single MIR representation that both backends consume. All semantic analysis (borrow checking, type checking, contract extraction) happens before MIR generation.
The payoff: One IR to analyze, one IR to verify, one IR to execute. Future formal verification operates on the same representation as the production runtime.
The problem: Native compilation is slow to iterate on. Language evolution requires fast feedback.
The solution: forma run uses the interpreter for development velocity.
forma build targets LLVM for production performance.
The payoff: Fast iteration during language development, native performance when it matters. The MIR-based architecture means both paths have identical semantics.
Development Timeline
Phase 1-2
Complete
Core syntax, type system, memory model, modules, async/await, HTTP, sockets, FFI, LSP server.
Phase 3-4
Complete
Pattern matching fixes, import resolution, formatter, REPL, MIR robustness, CLI ergonomics, LLVM pipeline.
Phase 5-6
Complete
FFI safety layer, capability security model, panic hardening, transitive imports, CI coverage expansion.
Phase 7-8
Complete
Full LSP features (formatting, symbols, signature help, references), JSON error contracts,
contract language expansion (old(), quantifiers, named patterns),
forma explain and forma verify --report.
Phase 9-10
Complete
35 named contract patterns across 6 categories, runtime contract helpers,
pattern correctness fixes (@rotated, @stable),
MIR optimization pass (constant folding, copy propagation, dead block elimination, peephole),
public repo hardening (CI supply-chain, action pinning, security docs),
stdlib completeness (higher-order builtins: map/filter/reduce/any/all,
vec_sort, str_to_float, math functions, stdlib modules).
Future
Planned
VCGen implementation, SMT solver backend (Z3/CVC5), proof certificate export, refinement types exploration.
Design Inspirations
FORMA synthesizes ideas from several language traditions, combining them in a way that's optimized for AI code generation and verification.
What we took: Memory safety without garbage collection. Ownership semantics. Algebraic data types and pattern matching. Systems-level performance through LLVM.
What we changed: Eliminated lifetime annotations entirely. Second-class references provide memory safety with dramatically simpler syntax—exactly what AI code generation needs.
What we took: Indentation-based block structure. Clean, readable syntax. Implicit variable declaration.
What we changed: Added static typing and compile-time safety. Kept the visual clarity while enabling the performance and correctness guarantees Python can't provide.
What we took: First-class contract expressions. @pre/@post annotations. The philosophy that specifications belong in the language, not in comments.
What we changed: Designed contracts for AI consumption: machine-readable output, property-based testing as immediate value, SMT integration as the future goal.
What we took: Pipeline operators for data transformation. Functional patterns. Strong type inference.
What we changed: Made pipelines the primary composition mechanism, reducing token count and making data flow explicit and left-to-right readable.
FORMA isn't just a remix of existing languages. The combination creates something new: a systems language with built-in verification that AI can actually generate correctly. Each design choice—from second-class references to pipeline syntax to contract expressions—was made with a single question in mind: does this help AI write correct code?
Competitive Landscape
| Capability | FORMA | Rust | Mojo | Dafny |
|---|---|---|---|---|
| Memory safe | ✓ | ✓ | ✓ | ✓ |
| No lifetime annotations | ✓ | ✗ | ✓ | ✓ |
| Contract expressions (@pre/@post) | ✓ | ✗ | ✗ | ✓ |
| Grammar export for constrained decoding | ✓ | ✗ | ✗ | ✗ |
| JSON diagnostics for self-correction | ✓ | ✓ | Partial | Partial |
| Contract explanation (explain command) | ✓ | ✗ | ✗ | ✗ |
| Automated verification reports | ✓ | ✗ | ✗ | Partial |
| Capability-based sandboxing | ✓ | ✗ | ✗ | ✗ |
| Native compilation | ✓ | ✓ | ✓ | ✗ |
FORMA's unique position: Combines systems-language capability (like Rust) with verification tooling (like Dafny), while being designed from the ground up for AI code generation workflows.
FORMA is not just another programming language. It's infrastructure for a future where AI generates critical systems code and humans verify it through specifications, not line-by-line review. The contract system, MIR architecture, and verification toolchain are the foundation. SMT integration is the next milestone. Formally verified AI-generated code is the destination.