The Autonomous Factory

A multi-LLM orchestrator that builds production software with zero human intervention. Not a wrapper around ChatGPT — a full development pipeline with role-based delegation, structured communication, TDD enforcement, and cross-project learning.

Claude · Kimi · Gemini · Codex · phi4 · Qwen Coder · Python · DaC Protocol · TDD

# Architecture

pipeline.md
USER: "Build X"
  |
  v
BLUEPRINT CREATION
  |-- Load User DNA -> skip known decisions
  |-- Domain research -> auto-apply best practices
  `-- Generate modular blueprint (CORE + PROJECT)
  |
  v
BLUEPRINT REVIEW (parallel DaC)
  |-- Kimi QC + Gemini Architecture
  |-- Auto-patch via SEARCH/REPLACE
  `-- Target: both >= 95/100
  |
  v
HUMAN GATE (Codex as Owner's Twin)
  `-- Decides using owner's DNA -> ACCEPT/CHANGE/REJECT
  |
  v
WAVE EXECUTION (TDD per task)
  |-- AC -> RED -> GREEN -> REFACTOR
  |-- GATES: Bug Capture + Schema + Security
  `-- Circuit breaker: 2 TRAP -> rollback
  |
  v
CODE QC (Kimi reviews all code)
  |-- Score 0-100, must pass >= 85
  `-- Dead code check, contract alignment
  |
  v
FINAL VERIFICATION (Codex as Human Twin)
  |-- Tries to break the app
  |-- Tests edge cases, bad data, missing links
  `-- Score >= 90% to ship
  |
  v
LEARNING -> Save DNA + domain traps + new rules
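
The score gates and the circuit breaker in the pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the factory's actual code: the thresholds (95, 85, 90, and 2 TRAPs) come from the diagram, while every name (`GATE_MINIMUMS`, `gate_passes`, `CircuitBreaker`) is hypothetical, and a TRAP is assumed to be a failure event recorded while a task runs.

```python
from dataclasses import dataclass

# Minimum scores per stage, copied from the pipeline diagram.
# The dictionary keys are hypothetical stage names.
GATE_MINIMUMS = {
    "blueprint_review": 95,
    "code_qc": 85,
    "final_verification": 90,
}


def gate_passes(stage: str, score: int) -> bool:
    """Return True when a 0-100 score meets the minimum for the stage."""
    return score >= GATE_MINIMUMS[stage]


def blueprint_review_passes(kimi_score: int, gemini_score: int) -> bool:
    """Blueprint review needs both parallel reviewers at or above target."""
    return min(kimi_score, gemini_score) >= GATE_MINIMUMS["blueprint_review"]


@dataclass
class CircuitBreaker:
    """Counts TRAP events and signals a rollback once the limit is hit."""

    max_traps: int = 2  # "2 TRAP -> rollback" in the diagram
    traps: int = 0

    def record_trap(self) -> bool:
        """Record one TRAP; return True when a rollback should happen."""
        self.traps += 1
        return self.traps >= self.max_traps

    def reset(self) -> None:
        """Clear the counter, for example after a rollback (assumed)."""
        self.traps = 0
```

A caller would check `gate_passes("code_qc", score)` before moving to the next stage, and roll back the current wave as soon as `record_trap()` returns `True`.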

org-chart.md
OWNER
  |  Direction, final authority
  v
CLAUDE (CEO)
  |
  |-- KIMI (QC Director)
  |   Bug capture, quality audits
  |   Score-gated: >= 95% to proceed
  |
  |-- GEMINI (Architecture Director)
  |   Structure, security, contracts
  |   Cross-reviewer (different failures)
  |
  |-- CODEX (Human Twin)
  |   Loaded with the owner's DNA:
  |   decision logic, heuristics, redlines.
  |   Decides like the owner would.
  |   Read-only. NEVER writes files.
  |
  |-- PHI4 (Local Assistant)
  |   DaC parsing, routing, summaries
  |
  `-- QWEN CODER (Junior Dev)
      Boilerplate, scaffolding
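
One way to make the org chart enforceable is a small role registry that the orchestrator consults before any file write. The sketch below is an assumption about how that could look. Only the Codex read-only rule and the listed duties come from the chart; the `Agent` type, the `ORG` keys, `check_write`, and write access for the other roles are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Agent:
    name: str
    title: str
    duties: Tuple[str, ...]
    can_write_files: bool = True  # assumed default; the chart only restricts Codex


# Roles and duties as listed in the org chart.
ORG = {
    "claude": Agent("Claude", "CEO", ()),  # the chart lists no duties for the CEO
    "kimi": Agent("Kimi", "QC Director", ("bug capture", "quality audits")),
    "gemini": Agent("Gemini", "Architecture Director",
                    ("structure", "security", "contracts")),
    "codex": Agent("Codex", "Human Twin",
                   ("decide as the owner would",), can_write_files=False),
    "phi4": Agent("phi4", "Local Assistant",
                  ("DaC parsing", "routing", "summaries")),
    "qwen_coder": Agent("Qwen Coder", "Junior Dev",
                        ("boilerplate", "scaffolding")),
}


def check_write(agent_key: str, path: str) -> None:
    """Raise PermissionError if the agent is read-only."""
    agent = ORG[agent_key]
    if not agent.can_write_files:
        raise PermissionError(f"{agent.name} is read-only and cannot write {path}")
```

Calling `check_write("codex", "app/main.py")` raises `PermissionError`, while the same call for `"qwen_coder"` returns without error.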

# Watch It Build

A factory run in real time — from blueprint to production.

factory run
Loading blueprint... "PPF Workshop Monitoring"

Each step runs autonomously. The factory decides, reviews, tests, and ships — no human input between start and finish.

# Three Modes

CREATE

Full lifecycle. Greenfield project from requirements to deployed app. Blueprint, review, TDD build, gate swarm, ship.

Trigger: "Build X", "Create X"

AUDIT

Forensic review of existing code. No code generation. Recon, self-audit, external QC, save learnings, report.

Trigger: "Audit X", "Review X"

UPDATE

Delta blueprint for existing projects. Only change what is needed. Targeted fixes, new features, cleanup.

Trigger: "Update X", "Add Y to X"
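
The trigger phrases above suggest a simple router from the opening verb of a request to a mode. The following is a minimal sketch under that assumption. The `Mode` enum and `detect_mode` are hypothetical names, and the real factory may classify requests differently, for example with an LLM.

```python
import re
from enum import Enum
from typing import Optional


class Mode(Enum):
    CREATE = "CREATE"
    AUDIT = "AUDIT"
    UPDATE = "UPDATE"


# Leading verbs taken from the documented triggers:
# "Build X" / "Create X", "Audit X" / "Review X", "Update X" / "Add Y to X".
_TRIGGERS = [
    (re.compile(r"^\s*(build|create)\b", re.IGNORECASE), Mode.CREATE),
    (re.compile(r"^\s*(audit|review)\b", re.IGNORECASE), Mode.AUDIT),
    (re.compile(r"^\s*(update|add)\b", re.IGNORECASE), Mode.UPDATE),
]


def detect_mode(request: str) -> Optional[Mode]:
    """Return the mode whose trigger verb opens the request, else None."""
    for pattern, mode in _TRIGGERS:
        if pattern.search(request):
            return mode
    return None
```

For example, `detect_mode("Add search to RecipeForge")` returns `Mode.UPDATE`, and a request that starts with none of the listed verbs returns `None` so the caller can ask for clarification.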

# By the Numbers

Projects Built: 23
Tests Passing: 1,700+
Factory Learnings: 172
Cost Per Project: $0
Production Systems: 3
Operating Modes: 3

# Evolution

How the factory got smarter, project by project.

P1–P3 · Foundation
Core Pipeline Established

Blueprint → build → test. Raw SQL + SQLite. First learnings captured. Discovered the orchestrator was rubber-stamping its own work — introduced independent review.

P4–P6 · Maturation
Quality Scores + Multi-LLM Review

Kimi + Gemini scoring introduced. JWT/RBAC solidified. First healthcare project (MedVault). Multi-model review catches failures single models miss.

P7–P9 · Real-time
WebSocket + Search + Geolocation

LivePulse (real-time chat), RecipeForge (full-text search), FleetTracker (GPS geofencing). The factory learned to handle multiple protocol types.

P10–P12 · Scale
DLQ + E2E Testing + First Frontends

Dead letter queues, Playwright E2E, vanilla JS frontends. TeamForge: full project management with WebSocket. Test coverage grew significantly.

P13–P15 · Advanced
React + gRPC + PostgreSQL + Redis + C++

FleetCore (hot state, collision avoidance), WareFlow (full warehouse lifecycle), FleetBridge (gRPC + C++17 robot simulator). Multi-language, multi-protocol.

P16–P18 · Multi-protocol
MQTT + gRPC + WebSocket Combined

GridSense: 4 protocols in one project (REST + gRPC + MQTT + WebSocket). PostgreSQL + asyncpg. Energy metering with billing precision.

P19–P20 · Self-Audit
The Factory Audits Itself

AUDIT mode found dead code and untested modules. UPDATE mode cleaned it all up. The factory improved the factory.

P21 · Production
PPF Monitoring Goes Live

First production IoT SaaS. Real workshops, real ESP32 sensors, real MQTT, real customers. Hardware + software from one prompt.

P22–P24 · Research + Product
Semantic Gravity → io-gita Goes Live

Non-token reasoning engine built from Bhagavad Gita concepts. Hopfield attractor networks, ODE dynamics. Then shipped as a live product at io-gita.com.

# The Factory Updates the Factory

Self-healing — it audits its own code and fixes what it finds.

P19 - Self-Audit

The factory ran its own AUDIT mode on itself. Found dead code, untested modules, and critical issues. The factory that builds software couldn't pass its own quality gate.

Mode: AUDIT

P20 - Self-Update

The factory ran UPDATE mode on itself using P19's findings as input. Cleaned dead code, added new tests. The factory fixed the factory.

Mode: UPDATE

This is the moment it stopped being a tool and started being a system. A factory that can audit itself, find its own bugs, and ship its own fixes — autonomously.

# What It Learned to Never Do Again

R01: Never skip review scores. Checkpoints enforce quality before proceeding.
R02: Never treat "tests pass" as "app works". Run real server verification.
R03: Never claim "fixed" without showing output. CLAIMED != VERIFIED.
R04: Never build services without wiring them. Run a dead code check at every gate.
R05: Never let Claude review its own work. Codex replaces self-review.
R06: Never use float/double for money. Use string-encoded decimals for billing.
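
Rule R06 is easy to illustrate with Python's standard `decimal` module. The sketch below shows why binary floats are unsafe for money and how amounts can travel as strings. The `line_total` helper, the two-decimal precision, and the half-up rounding mode are assumptions for illustration, not the factory's documented billing rules.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)                    # False
decimal_sum_is_exact = (
    Decimal("0.1") + Decimal("0.2") == Decimal("0.3")      # True
)


def line_total(unit_price: str, quantity: str) -> str:
    """Multiply two string-encoded decimals and return a string amount.

    Rounds to two decimal places with ROUND_HALF_UP (an assumed policy).
    """
    total = Decimal(unit_price) * Decimal(quantity)
    return str(total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


# Example: 12.345 kWh at a rate of 0.157 per kWh.
energy_charge = line_total("0.157", "12.345")
```

Because inputs and outputs are strings, the values can pass through JSON or a database text column without ever being converted to a float.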