What it does
Run The Machine is an autonomous AI development system. It includes everything from Ship Your Product (the installer, skills, stack modules, quality gates) plus a layer of automation that turns Claude Code into a development team that runs 24/7 without you sitting at the keyboard.
The additions: agent mode (Claude Code runs with full permissions on a dedicated machine), 5 specialized agent templates (fullstack-dev, designer, content, quality, research), dual-AI planning (Claude + Codex produce independent plans that get synthesized into one), multi-AI code review (Claude + Codex + Copilot review every PR from three perspectives), scheduled overnight task processing, session management daemons that keep everything running, and Telegram/email notifications so you know what happened while you were away.
Why it works
A solo developer's biggest constraint isn't skill — it's time and context-switching. You can only work on one thing at a time. You lose hours switching between projects. You can't ship while you sleep. This system removes those constraints.
You define the work — GitHub Issues, feature descriptions, bug reports — and the system executes it autonomously. Features get built, reviewed by three AIs, tested, and delivered as pull requests while you sleep, commute, or work on something else. Multiple projects run simultaneously without conflicts because each runs in its own isolated session with its own git worktree.
In the morning, you review and merge the PRs. You're the technical lead, not the line developer. Your job is direction and approval, not implementation. The machine handles the rest.
How it works
Phase 1: Run the full installer
The same installer from Ship Your Product, extended with a third phase. It asks if this is a personal machine or an agent machine. For agent mode, the installer does several additional things.
It enables bypass permissions so Claude can execute commands without asking for confirmation on every action. This is what makes autonomous operation possible: the agent can build, test, commit, and create PRs without waiting for you to click "approve" on each step.
It installs 5 agent templates: fullstack-dev (writes code across the full stack), designer (visual design specs, read-only), content (marketing copy, docs, SEO content), quality (testing, code review, security audit), and research (codebase exploration, market research, read-only). Each template defines what the agent can do, what model it uses, and what files it can access.
It configures Telegram bot notifications so you get real-time alerts when tasks complete, sessions crash, or overnight processing finishes. And it installs management daemons: background services that monitor sessions, sync configuration, and deliver output. One script, fully configured.
Phase 2: Configure your projects
Your projects are defined in projects.yaml, a single file that tells the system everything it needs to know about each project you're managing. Each entry includes the project name, local file path, GitHub repository URL, and a priority level (high, medium, or low).
Priority affects how overnight issue processing allocates time across your projects. High-priority projects get processed first and receive more agent time. Low-priority projects get handled after high-priority work is done. This means your most important project always gets attention, even when you have ten projects configured.
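The actual allocation logic isn't documented here, but the idea can be sketched in a few lines of Python. The weights, function name, and minute budget below are illustrative assumptions, not the system's real numbers:

```python
# Illustrative sketch only: the real scheduler's weights are not documented.
PRIORITY_WEIGHT = {"high": 3, "medium": 2, "low": 1}

def allocate_minutes(projects, total_minutes):
    """Split a night's agent time across projects, weighted by priority."""
    total_weight = sum(PRIORITY_WEIGHT[p["priority"]] for p in projects)
    return {
        p["name"]: total_minutes * PRIORITY_WEIGHT[p["priority"]] // total_weight
        for p in projects
    }

projects = [
    {"name": "my-saas", "priority": "high"},
    {"name": "landing-page", "priority": "medium"},
    {"name": "side-project", "priority": "low"},
]
allocate_minutes(projects, 360)  # high gets 180 min, medium 120, low 60
```

Whatever the real weights are, the effect is the same: a high-priority project never starves, no matter how many low-priority projects you add.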
The system manages Claude Code sessions for each project via tmux: persistent, recoverable terminal sessions that survive network disconnects, SSH timeouts, and even machine restarts. If a session crashes, the watchdog daemon detects it within 5 minutes and restarts it automatically. You don't manage tmux directly; the system handles session creation, naming, recovery, and cleanup.
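The detect-and-restart behavior boils down to standard tmux commands. A minimal sketch, assuming tmux is on the PATH (the session name and command below are hypothetical; the real watchdog also handles naming conventions, cleanup, and Telegram alerts):

```python
import subprocess

def session_alive(name: str) -> bool:
    """`tmux has-session` exits 0 when the named session exists."""
    result = subprocess.run(
        ["tmux", "has-session", "-t", name],
        capture_output=True,
    )
    return result.returncode == 0

def ensure_session(name: str, command: str) -> None:
    """Restart a dead session; a real watchdog would also send an alert."""
    if not session_alive(name):
        # -d starts the session detached, -s names it; command runs inside it
        subprocess.run(
            ["tmux", "new-session", "-d", "-s", name, command],
            check=True,
        )

# e.g. ensure_session("agent-my-saas", "claude")  # hypothetical session name
```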
Phase 3: Dual-AI planning with /ax-plan
When you need to build something significant — a new feature, a major refactor, a new product — you run /ax-plan instead of jumping straight into code. This skill runs two AIs independently on the same problem to catch blind spots neither would find alone.
First, Claude enters plan mode. It explores your codebase, reads your existing architecture, understands the data model, and designs a detailed implementation plan. This includes: which files to create or modify, what the data flow looks like, where the tricky parts are, and what order to build things in. You review this plan and approve, edit, or redirect it.
Then Codex (OpenAI's code-focused model) independently analyzes the same codebase and requirements. Codex produces its own plan without seeing Claude's. It might propose a different data model, flag a performance concern Claude missed, or suggest a simpler approach to a component Claude over-engineered.
The system synthesizes both plans into a unified spec. Where they agree, you have high confidence. Where they disagree, you have a decision point that deserves your attention. This dual-AI approach catches architectural issues before you write code, saving hours or days of rework that would happen if you discovered the problem mid-implementation.
Phase 4: Parallel execution with /ax-feature
With a unified plan approved, /ax-feature breaks the work into 2–5 independent workstreams. Each stream runs in its own git worktree — an isolated copy of your repository with its own branch, its own working directory, and its own Claude Code session.
Dedicated subagents work on each stream simultaneously. The fullstack-dev agent handles implementation. The quality agent writes tests. The content agent handles copy and documentation. They don't step on each other because each worktree is completely isolated: no branch collisions, no merge conflicts during development, no context bleed between streams.
When all streams complete, changes are integrated and verified. The system merges the branches, runs the full test suite, and flags any integration issues. What would take one developer a week of sequential work gets done in parallel — multiple features shipping simultaneously instead of one at a time.
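The isolation comes from standard git worktrees, so the split-out step can be sketched with plain git commands. A rough illustration, assuming a repo with at least one commit (the paths, branch naming scheme, and stream names are assumptions, not the system's actual conventions):

```python
import subprocess

def create_stream_worktree(repo_path: str, stream: str) -> str:
    """Give one workstream its own branch and working directory."""
    worktree_path = f"{repo_path}-{stream}"
    # `git worktree add <path> -b <branch>` creates an isolated checkout
    # sharing the same object store as the main repository
    subprocess.run(
        ["git", "worktree", "add", worktree_path, "-b", f"stream/{stream}"],
        cwd=repo_path,
        check=True,
    )
    return worktree_path

# e.g. one worktree per stream, all building in parallel:
# for stream in ["api", "frontend", "tests"]:
#     create_stream_worktree("/path/to/my-saas", stream)
```

Because every worktree shares one object database but has its own branch and index, merging the streams back together at the end is an ordinary git merge.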
Phase 5: Multi-AI code review with /ax-deliver
This is the most thorough review pipeline available to a solo developer. Three AI systems review your code from different perspectives before you approve a merge.
Step 1: Claude self-review. Claude reviews its own work, looking for issues it might have introduced. Then it runs /simplify, a dedicated pass that reduces complexity, removes unnecessary abstractions, and tightens the code. The goal is to make the PR as small and clean as possible before external review.
Step 2: Codex peer review. Codex gets up to 3 rounds of peer review. It's a dialogue: Codex raises concerns, you accept them or push back with reasoning, and Codex reviews again with your feedback incorporated. This back-and-forth catches subtle issues that a single-pass review would miss.
Step 3: GitHub Copilot review. A pull request is created on GitHub, and Copilot reviews it automatically. Copilot often pushes its own fix commits — small improvements, typo corrections, edge case handling. The PR includes a detailed summary of all review rounds so you can see what was flagged and how it was addressed.
Three AI perspectives, each with different training data, different strengths, and different blind spots. By the time you review the PR, the obvious issues are already resolved. Your review focuses on business logic and product decisions, not syntax, security basics, or missing error handling.
Phase 6: Autonomous overnight operation
The overnight daemon (/ax-overnight) is where the system truly becomes autonomous. You go to sleep, and it goes to work.
It reads open GitHub Issues across all your configured projects and prioritizes them based on your projects.yaml priority settings and any labels you've applied. For each issue, it plans the solution (using the dual-AI planning workflow), builds the feature or fix in an isolated worktree, runs the full review pipeline (self-review, simplify, Codex rounds, Copilot), runs tests, and creates a pull request with a detailed description of what was done and why.
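As a rough sketch, the queue-building step might look like the following. The per-priority issue budgets mirror the sample projects.yaml on this page; everything else (function name, data shapes) is an illustrative assumption:

```python
# Hypothetical outline of the overnight queue builder.
ISSUES_PER_NIGHT = {"high": 3, "medium": 2, "low": 1}
PRIORITY_ORDER = ["high", "medium", "low"]

def plan_night(projects, open_issues):
    """Return (project, issue) pairs in the order they'd be processed."""
    queue = []
    ordered = sorted(projects,
                     key=lambda p: PRIORITY_ORDER.index(p["priority"]))
    for project in ordered:
        budget = ISSUES_PER_NIGHT[project["priority"]]
        # take at most `budget` issues from this project's backlog
        for issue in open_issues.get(project["name"], [])[:budget]:
            queue.append((project["name"], issue))
    return queue
```

High-priority projects land at the front of the queue and get the larger budget, which is why they always finish even on short nights.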
In the morning, you open GitHub and find completed PRs waiting for your review. Each one has been through the same quality pipeline as if you'd built it yourself during the day. You review the changes, merge what looks good, and leave comments on anything that needs adjustment. The system picks up your comments in the next cycle and addresses them.
Priority-based scheduling ensures high-priority projects get processed first. If you have three projects and limited overnight hours, the system focuses on what matters most. Low-priority maintenance tasks get handled after your core product work is done.
Phase 7: Session management and monitoring
Five launchd daemons run in the background, keeping the entire system healthy and responsive. You don't interact with them directly; they just work.
Session Watchdog (every 5 minutes): Monitors all active Claude Code sessions. If a session crashes, hangs, or becomes unresponsive, the watchdog restarts it automatically and sends a Telegram alert so you know what happened. No manual babysitting required.
Config Sync (every 10 minutes): Pulls the latest skill and stack module updates from git. When new patterns are added or constraints are updated, your agents get them automatically without manual intervention.
Health Report (daily): A summary of system health covering which sessions are running, how many tasks were completed, and any errors or warnings that need attention. Delivered to Telegram so you see it first thing in the morning.
Schedule Sync (daily): Reconciles your cowork schedules and task queues, ensuring overnight processing happens at the right times and doesn't conflict with your active work sessions.
Output Watcher (every 5 minutes): Monitors output directories and delivers completed files (generated content, design specs, research reports) via notification so you can review them immediately. Use /ax-monitor from within Claude Code to check the status of everything at any time.
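The heart of the Output Watcher, noticing which files appeared since the last poll, fits in a few lines. A sketch only; the real daemon also delivers each file as a notification:

```python
import os

def new_outputs(directory: str, seen: set) -> list:
    """Return filenames not present on a previous poll; updates `seen`."""
    current = set(os.listdir(directory))
    fresh = sorted(current - seen)
    seen.update(current)
    return fresh

# A daemon would call this every 5 minutes with the same `seen` set,
# then notify you about each filename in the returned list.
```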
Phase 8: The agent templates
Five specialized agents, each with defined capabilities and access levels. They're spawned automatically by orchestration skills like /ax-feature and /ax-deliver, but you can also invoke them manually for specific tasks.
fullstack-dev: The primary implementation agent. Writes code across the full stack (frontend components, API routes, database queries, integrations). Uses the Opus model for maximum capability. Has write access to the codebase and can commit, push, and create PRs. This is the workhorse that builds your features.
designer: Produces visual design specs, UX improvement recommendations, component layouts, and color system suggestions. Read-only: it doesn't write code. It outputs specifications that the fullstack-dev agent implements. Useful when you need design direction before building.
content: Writes marketing copy, product descriptions, documentation, blog posts, SEO content, email sequences, and landing page text. Has write access to content files but not application code. The content it produces is ready to ship, not placeholder text.
quality: Handles testing, code review, and security auditing. Writes test files, reviews PRs, runs security scans, and reports findings. Can push fix commits for issues it finds. This is the agent that catches problems before they reach production.
research: Explores codebases, researches libraries, analyzes competitor implementations, and produces research reports. Read-only: it doesn't modify anything. Use it when you need to understand an unfamiliar codebase, evaluate a library, or gather information before making a technical decision.
Sample projects.yaml
# projects.yaml — Define your managed projects
projects:
  - name: my-saas
    path: ~/my-saas
    repo: myuser/my-saas
    priority: high
  - name: landing-page
    path: ~/landing-page
    repo: myuser/landing-page
    priority: medium
  - name: side-project
    path: ~/side-project
    repo: myuser/side-project
    priority: low

# Priority affects overnight issue processing:
#   high   = processes up to 3 issues per night
#   medium = processes up to 2 issues per night
#   low    = processes 1 issue per night
AI prompt to get started
I just installed Run The Machine. Help me: 1) Set up projects.yaml for my projects, 2) Configure Telegram notifications, 3) Run /ax-plan on my first feature to test the dual-AI planning workflow, 4) Set up the overnight daemon to process GitHub Issues while I sleep.
Mistakes to avoid
- ✕ Running agent mode on your personal machine — use a dedicated machine so agents don't compete for your CPU, memory, and disk. Agent sessions are resource-intensive
- ✕ Not setting up Telegram notifications — without them, you won't know when tasks complete, sessions crash, or overnight processing finishes
- ✕ Trying to manage agents manually instead of using projects.yaml — the system handles scheduling, priority, session recovery, and cleanup. Let it
- ✕ Skipping the dual-AI planning step — /ax-plan catches architectural issues before you write code, saving hours of rework you'd discover mid-build
- ✕ Not reviewing overnight PRs carefully — autonomous doesn't mean unsupervised. Review every PR before merging. You're the technical lead, not the rubber stamp
- ✕ Running too many projects at once initially — start with 1–2 projects, learn the workflow, see how overnight processing behaves, then scale up