Phase 3: Classifier Engine

The Classifier Engine receives parsed signatures from Phase 2 and determines what changed. It compares old vs new signatures using bucketed classification rules and assigns a severity to each change: breaking, warning, or safe.


Classification flow

For each file's ParseResult, the engine iterates every signature key across both old and new maps. There are three cases:

A. Symbol deleted

Key exists in old but not new. Always classified as BREAKING. The symbol was removed from the public API.

B. Symbol added

Key exists in new but not old. Always classified as SAFE. The API surface expanded.

C. Symbol changed

Key exists in both. The engine first runs a deep-equality check, then passes the pair to the rule engine only if the signatures differ.
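The three cases above can be sketched as a small triage step. This is a minimal illustration, not the engine's actual code: the `Signature` type, the `Outcome` labels, and the `triageKeys` helper are hypothetical names, and the signature maps are assumed to be `Map` objects keyed by signature key.

```typescript
// Hypothetical stand-in for the real signature types, which are not shown here.
type Signature = Record<string, unknown>;

// 'needs_comparison' marks case C: the key exists on both sides.
type Outcome = 'breaking' | 'safe' | 'needs_comparison';

function triageKeys(
  oldSigs: Map<string, Signature>,
  newSigs: Map<string, Signature>,
): Map<string, Outcome> {
  const outcomes = new Map<string, Outcome>();

  // Case A (deleted) or case C (present in both).
  oldSigs.forEach((_sig, key) => {
    outcomes.set(key, newSigs.has(key) ? 'needs_comparison' : 'breaking');
  });

  // Case B (added): keys that only exist in the new map.
  newSigs.forEach((_sig, key) => {
    if (!oldSigs.has(key)) outcomes.set(key, 'safe');
  });

  return outcomes;
}
```

Only the keys marked `needs_comparison` would go on to the equality check and the rule engine.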

Deep equality short-circuit

Before running any rules, the engine performs an isDeepStrictEqual check on the old and new signatures. If they are identical, the symbol is skipped entirely and no rules run. This avoids all rule evaluation for files where only the implementation changed (for example, a rewritten function body) while the API surface stayed the same.

// If signatures are structurally identical, skip all rules
if (isDeepStrictEqual(oldSig, newSig)) continue;

// Otherwise, run through the rule engine
const violations = this.runRules(key, oldSig, newSig, ruleBuckets);
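To illustrate the short-circuit, the sketch below assumes `isDeepStrictEqual` is the function of that name exported by Node's `util` module. The `DemoSignature` shape and the sample values are hypothetical and only stand in for whatever Phase 2 actually produces.

```typescript
import { isDeepStrictEqual } from 'node:util';

// Hypothetical signature shape, for illustration only.
interface DemoSignature {
  name: string;
  params: { name: string; type: string }[];
  returnType: string;
}

const before: DemoSignature = {
  name: 'processPayment',
  params: [{ name: 'amount', type: 'number' }, { name: 'currency', type: 'string' }],
  returnType: 'Promise<void>',
};

// Function body was rewritten, so the parsed signature is a new object with the same content.
const afterBodyEdit: DemoSignature = {
  name: 'processPayment',
  params: [{ name: 'amount', type: 'number' }, { name: 'currency', type: 'string' }],
  returnType: 'Promise<void>',
};

// A parameter was dropped, so the signatures differ structurally.
const afterParamRemoved: DemoSignature = {
  name: 'processPayment',
  params: [{ name: 'amount', type: 'number' }],
  returnType: 'Promise<void>',
};

const skipped = isDeepStrictEqual(before, afterBodyEdit);        // true: no rules run
const needsRules = !isDeepStrictEqual(before, afterParamRemoved); // true: rules run
```

The comparison is structural, so two separately parsed objects with the same content count as identical even though they are different object instances.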

Bucketed rule routing

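The engine pre-computes four rule buckets, one per symbol type, once per file after filtering rules by the file's language. When a symbol change is detected, only the rules in the matching bucket execute. Selecting the bucket is a constant-time prefix check instead of an O(n) filter over all rules for every symbol: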

classifier/engine.ts
// Pre-computed ONCE per file — not per symbol
const allRules = Object.values(rules) as Rule<any>[];
const activeRules = allRules.filter(r =>
  r.languages === 'all' || r.languages.includes(language)
);

const ruleBuckets = {
  function:   activeRules.filter(r => r.target === 'function'),
  interface:  activeRules.filter(r => r.target === 'interface'),
  enum:       activeRules.filter(r => r.target === 'enum'),
  type_alias: activeRules.filter(r => r.target === 'type_alias'),
};

Routing uses the key prefix convention established by the AST Mapper:

// O(1) routing — no iteration through all rules
if (key.startsWith('interface:')) rulesToRun = buckets.interface;
else if (key.startsWith('enum:'))  rulesToRun = buckets.enum;
else if (key.startsWith('type:'))  rulesToRun = buckets.type_alias;
else                               rulesToRun = buckets.function;
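The same prefix routing can be written as a pair of small functions, which makes the fall-through behavior easy to test. This is a sketch: `symbolTypeForKey` and `pickBucket` are hypothetical helper names, and the example keys (such as `interface:User`) assume a `prefix:name` format beyond what the snippet above shows.

```typescript
type SymbolType = 'function' | 'interface' | 'enum' | 'type_alias';

// Maps a signature key to its symbol type using the key prefix.
// Keys without a recognized prefix are treated as functions.
function symbolTypeForKey(key: string): SymbolType {
  if (key.startsWith('interface:')) return 'interface';
  if (key.startsWith('enum:')) return 'enum';
  if (key.startsWith('type:')) return 'type_alias';
  return 'function';
}

// Returns the pre-computed bucket for a key; R is whatever rule type the buckets hold.
function pickBucket<R>(key: string, buckets: Record<SymbolType, R[]>): R[] {
  return buckets[symbolTypeForKey(key)];
}
```

One consequence of the final `else` branch is that any key with an unknown prefix is checked against the function rules, so new symbol kinds need both a prefix and a bucket.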

The rule contract

Every classification rule implements a strict contract defined in classifier/types.ts. The engine executes rules without knowing their internal logic:

classifier/types.ts
interface Rule<T extends AnySignature> {
  id: string;            // e.g., 'R01'
  name: string;          // e.g., 'Parameter Removed'
  description: string;   // For documentation

  // Which languages this rule applies to
  languages: Language[] | 'all';

  // Which symbol type this rule processes
  target: 'function' | 'interface' | 'enum' | 'type_alias';

  // The core logic — receives old and new signatures
  check: (oldSig: T, newSig: T) => RuleResult | RuleResult[] | null;
}

interface RuleResult {
  severity: 'breaking' | 'warning' | 'safe';
  changeType: ChangeType;
  message: string;
}
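As an illustration of the contract, here is a sketch of what a rule object might look like. It is not one of the project's rules: the `Demo*` types are simplified, hypothetical stand-ins for `AnySignature`, `Language`, and `ChangeType`, and the rule id `DEMO01` is made up.

```typescript
// Simplified, hypothetical stand-ins for the real types.
interface DemoFunctionSignature {
  params: { name: string }[];
}

interface DemoRuleResult {
  severity: 'breaking' | 'warning' | 'safe';
  changeType: string; // the real ChangeType union is not shown in this section
  message: string;
}

interface DemoRule {
  id: string;
  name: string;
  description: string;
  languages: string[] | 'all';
  target: 'function' | 'interface' | 'enum' | 'type_alias';
  check: (
    oldSig: DemoFunctionSignature,
    newSig: DemoFunctionSignature,
  ) => DemoRuleResult | DemoRuleResult[] | null;
}

// Reports one breaking result per parameter that exists in old but not in new.
const demoParameterRemovedRule: DemoRule = {
  id: 'DEMO01',
  name: 'Parameter Removed (demo)',
  description: 'Flags parameters that were removed from a function signature.',
  languages: 'all',
  target: 'function',
  check: (oldSig, newSig) => {
    const removed = oldSig.params.filter(
      (p) => !newSig.params.some((q) => q.name === p.name),
    );
    if (removed.length === 0) return null; // rule passed
    return removed.map((p): DemoRuleResult => ({
      severity: 'breaking',
      changeType: 'signature_change',
      message: `Parameter '${p.name}' was removed.`,
    }));
  },
};
```

Because the engine only calls `check` and reads the result, a rule like this can be added or removed without touching the engine itself.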

Rules can return:

  • null: the rule passed, no violation
  • A single RuleResult: one violation found
  • A RuleResult[] array: multiple violations (e.g., R25 per-property)
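Since a rule can return any of these three shapes, a caller will typically normalize them into one list before collecting violations. A minimal sketch, assuming a hypothetical `normalizeResults` helper and a `RuleResult` whose `changeType` is simplified to `string`:

```typescript
// Simplified copy of the RuleResult shape; changeType is a plain string here (assumption).
interface RuleResult {
  severity: 'breaking' | 'warning' | 'safe';
  changeType: string;
  message: string;
}

// Collapses null, a single result, or an array of results into a flat array.
function normalizeResults(result: RuleResult | RuleResult[] | null): RuleResult[] {
  if (result === null) return [];
  return Array.isArray(result) ? result : [result];
}
```

With this in place, the code that gathers violations can treat every rule uniformly and simply concatenate the returned arrays.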

Language filtering

Each rule declares which languages it applies to. R15 (Overload Removed), for example, applies only to TypeScript and Java, since the other supported languages have no function overloads. The engine filters rules by language before bucketing:

// Rule definition
export const overloadRemovedRule: FunctionRule = {
  id: 'R15',
  name: 'Overload Removed',
  languages: ['typescript', 'java'],  // Only TS and Java
  target: 'function',
  // ...
};

// Engine: only loads this rule for .ts and .java files
const activeRules = allRules.filter(r =>
  r.languages === 'all' || r.languages.includes(language)
);

Output: FunctionChange[]

core/types.ts
interface FunctionChange {
  id: string;               // "src/api/payments.ts:processPayment:42"
  name: string;             // "processPayment"
  file: string;             // "src/api/payments.ts"
  lineStart: number;        // 42
  lineEnd: number;          // 42
  language: Language;        // "typescript"
  symbolType: 'function' | 'interface' | 'enum' | 'type_alias';
  severity: Severity;       // "breaking"
  changeType: ChangeType;   // "signature_change"
  breaking: boolean;        // true
  message: string;          // "Parameter 'currency' was removed."
  before: AnySignature | null;
  after: AnySignature | null;
  callers: CallerInfo[];    // Populated by Phase 4
}

Results are sorted by line number (ascending) for deterministic output.
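A sketch of that ordering step, assuming `lineStart` is the field used as the line number and using a hypothetical `sortChangesByLine` helper. A numeric comparator is required because JavaScript's default array sort compares elements as strings.

```typescript
// Returns a new array ordered by ascending lineStart; the input is left untouched.
function sortChangesByLine<T extends { lineStart: number }>(changes: readonly T[]): T[] {
  return changes.slice().sort((a, b) => a.lineStart - b.lineStart);
}
```

On engines that implement ES2019 or later, `Array.prototype.sort` is stable, so changes reported on the same line keep the order in which the rules produced them.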

Next phase

Breaking changes are passed to Phase 4: Call-Site Tracer, which finds every file that imports the broken symbol and traces argument counts.