Forge is designed for focused, well-scoped software projects — the kind of thing one to three engineers could build in a sprint. Some examples of what works well:
| | Project type | Examples |
|---|---|---|
| ✓ | CRUD apps & internal tools | Trackers, dashboards, booking systems, admin panels |
| ✓ | Marketing & portfolio sites | Static or dynamic, responsive, with CMS |
| ✓ | Lightweight SaaS MVPs | Auth, database, basic billing — under ~5k lines |
| ✓ | API services | REST APIs, webhooks, integrations with standard services |
| ✓ | Data tools | CSV importers, report generators, simple pipelines |
| ✓ | Extending an existing codebase | Adding a feature, fixing bugs, refactoring a module |
If you're not sure whether your project fits, just submit it. The Intake Agent will assess it and tell you honestly.
Every other tool in this space lets you start building anything, then leaves you to discover the limits after you've spent money. We think that's wrong. Here's what Forge is not designed for right now:
| | Project type | Details |
|---|---|---|
| ✕ | Complex SaaS platforms | Think Strava, Notion, Figma — multi-tenant, real-time, millions of users |
| ✕ | Native mobile apps | iOS or Android — web apps with mobile-responsive UI are supported |
| ✕ | Real-time collaboration | Shared editing, live cursors, operational transforms |
| ✕ | Regulated industry systems | HIPAA-compliant healthcare, PCI-DSS fintech — these need specialist oversight |
| ✕ | Deep enterprise integrations | Salesforce, SAP, Oracle — complex proprietary APIs with thin documentation |
| ✕ | AI/ML model training | Custom model training, fine-tuning pipelines, GPU workloads |
Before any build starts, an Intake Agent reads your project description and evaluates three things:
- Feasibility — Is this within Forge's current capability range?
- Clarity — Is there enough detail to build from, or do we need to ask a clarifying question?
- Scope — Is the scope realistic for an AI-built project, or is it likely to spiral into something much larger?
The Intake Agent can respond in three ways:
- Accepted — Build queued, you'll be notified when it starts.
- Needs clarification — One specific question sent to you before proceeding. Usually answered in 30 seconds.
- Declined — Honest explanation of why this project is outside Forge's current scope, and what to try instead.
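For the curious, the three intake outcomes behave like a simple discriminated union. Here's a minimal sketch in TypeScript; the type, field names, and messages are hypothetical illustrations, not Forge's actual API:

```typescript
// Hypothetical shapes for the three intake outcomes; illustrative only.
type IntakeVerdict =
  | { status: "accepted"; queuePosition: number }
  | { status: "needs_clarification"; question: string }
  | { status: "declined"; reason: string; alternatives: string[] };

// Turns a verdict into the message the user sees.
function describeVerdict(v: IntakeVerdict): string {
  switch (v.status) {
    case "accepted":
      return `Build queued (position ${v.queuePosition}); you'll be notified when it starts.`;
    case "needs_clarification":
      return `One question before we proceed: ${v.question}`;
    case "declined":
      return `Outside current scope: ${v.reason}. Consider instead: ${v.alternatives.join(", ")}.`;
  }
}
```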
If your project is declined, you are not charged. The intake check is always free.
Three tiers, each designed for a different type of user:
- Spark (Free) — 3 builds on Forge's API keys. No credit card. Idea and notes stages only. Best for trying Forge with no commitment.
- Builder ($29/month or $9/build) — All 4 intake stages, up to 3 parallel projects, push + email notifications. Forge manages the API keys. Best for non-technical users who want the simplest possible experience.
- Pro ($19/month + your API costs) — Bring your own API keys from Anthropic, OpenAI, or Gemini. You pay those providers directly — we never touch your tokens. We charge only for running the orchestration engine. Best for developers or anyone who already has API access and wants full transparency.
What counts as one build? One project from submission to delivery, including all agent iterations during the build. Revisions requested during the active build window don't count as a new build.
On the Pro tier, instead of paying Forge for AI token usage, you plug in your own API key from Anthropic (Claude), OpenAI, or Google (Gemini). AI costs are billed directly to your own API account — we never mark them up or even see them.
We charge $19/month for running the orchestration engine: the agent coordination, async scheduling, project tracking, and the CEO dashboard.
Why BYOK?
- You see exactly what every build costs, down to the model and token count
- No anxiety about hidden markup — you pay Anthropic $X, you pay Forge $19/month, that's it
- If you already pay for Claude Pro or have an Anthropic API account, Pro tier is almost certainly cheaper than Builder for heavy usage
- You choose which models run on your projects (within Forge's supported list)
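To make the model concrete, here's a rough sketch of what a BYOK setup could look like. The schema, the environment variable, and the model identifier are all illustrative assumptions, not Forge's actual settings format:

```typescript
// Hypothetical sketch of a Pro-tier BYOK configuration; the schema,
// env var name, and model identifier are illustrative, not Forge's.
interface ByokConfig {
  provider: "anthropic" | "openai" | "google";
  apiKey: string;           // billed by the provider directly; Forge never marks it up
  allowedModels?: string[]; // optional restriction within Forge's supported list
}

const config: ByokConfig = {
  provider: "anthropic",
  apiKey: process.env.ANTHROPIC_API_KEY ?? "", // key stays on your provider account
  allowedModels: ["claude-sonnet"],            // example identifier only
};
```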
On Spark and Builder: no. You pay a fixed price per build. One project = one charge. We absorb any internal cost overruns on complex builds — that's our problem, not yours.
On Pro (BYOK): almost certainly no, but it depends on your API provider's pricing. We show you a cost estimate before each build starts (based on project complexity), and your dashboard shows real-time token usage during the build. You can set a spending cap on your API key directly with your provider.
You're right that they're different internally — starting from a raw idea requires more agent work (architecture, planning, specification, then building) than starting from a full spec (which skips straight to building). The cost difference is real.
But we made a deliberate choice: don't make users think about internal complexity. You shouldn't have to optimize your intake stage to save money. The pricing is the same regardless of stage; we absorb the variance internally through off-peak scheduling and model selection optimizations.
In practice, starting from an existing codebase (Stage 4) can be the most expensive internally because it requires reading and understanding your existing code before building. But again — that's our problem to solve, not yours to price-shop around.
When your build is delivered, you rate it. If you're not happy, you flag it — and here's what happens:
1. You tell us what's wrong. A brief description of what the output missed vs. what you asked for.
2. An AI judge reviews the case. We compare your original spec against the delivered output. If the output genuinely didn't meet the spec — not just "I changed my mind" but "this clearly missed the ask" — the judge flags it as a legitimate miss.
3. If the judge agrees: you get your build credit back. No questions, no support ticket queue, no "case-by-case" ambiguity. Automatic.
4. If the judge disagrees: we show you the reasoning. You can escalate to a human review if you still disagree.
Note: "I changed what I wanted" or "I decided I want a different feature" is not a miss — that's a new project. The judge is specifically looking for cases where the output clearly diverged from the original spec you submitted.
The AI judge is a separate LLM evaluation step (not the same agents that built your project). It receives three inputs:
- Your original specification as submitted at intake
- The delivered output (code, file structure, README)
- Your dispute description ("the login page doesn't work", "it ignored the dark mode requirement")
It evaluates whether the output materially satisfies the spec — not whether you're happy in general, but whether specific requirements were met.
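To make that contract concrete, here's a minimal sketch of the judge's inputs and verdict in TypeScript. All names are hypothetical, and the actual model call is deliberately elided; only the shape of the evaluation is shown:

```typescript
// Illustrative sketch of the judge's contract; names are hypothetical
// and the actual LLM call is deliberately elided.
interface DisputeCase {
  originalSpec: string;       // immutable spec as submitted at intake
  deliveredOutput: string;    // code summary, file structure, README
  disputeDescription: string; // e.g. "it ignored the dark mode requirement"
}

type JudgeVerdict =
  | { ruling: "legitimate_miss"; unmetRequirements: string[] } // credit auto-refunded
  | { ruling: "spec_satisfied"; reasoning: string };           // user may escalate to a human

// Builds the single narrow question the judge model is asked.
function buildJudgePrompt(c: DisputeCase): string {
  return [
    "Evaluate the dispute, not general satisfaction.",
    "Question: did the delivered output materially satisfy the original spec?",
    "A changed mind is not a miss; a clear divergence from the spec is.",
    `Original spec (immutable): ${c.originalSpec}`,
    `Delivered output: ${c.deliveredOutput}`,
    `User's dispute: ${c.disputeDescription}`,
  ].join("\n");
}
```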
Can it be gamed? We've thought about this. The spec you submit at intake is immutable — you can't edit it after the fact. That's the contract. If someone submits a vague spec intentionally to claim any output missed it, the judge is specifically trained to evaluate vagueness as a shared ambiguity, not a clear miss. The intake clarification step also exists partly to prevent dangerously underspecified builds from starting.
Those are good tools for specific use cases. Here's how the differences play out in practice:
| | Others | Forge |
|---|---|---|
| Billing model | Credits per interaction | Fixed per build |
| You must watch the build | Yes | No — async |
| Multiple projects at once | Usually no | Yes |
| Charged when AI makes errors | Yes | No |
| Honest about limitations | No — discover at cost | Yes — intake check |
| Refund if build misses spec | "Case-by-case" / none | Automatic via AI judge |
| Bring your own API keys | No | Yes (Pro tier) |
| Multi-agent verification | Single model | Build + verify + review |
Where they're better than us: Lovable and Bolt have polished real-time UIs that let you iterate interactively in the browser. If you want to watch the build happen and guide it step by step, those tools may suit you better. Forge is optimized for "walk away and come back to a finished product."
Three reasons, all in your interest:
- Off-peak scheduling lowers costs. AI model APIs are cheaper and faster during low-demand windows (typically late night). We schedule builds strategically and pass the savings on to you through the flat pricing.
- Quality over speed. The agents do multiple verification passes, not just one generation. A synchronous "watch it build" model pressures the system to show you something fast. Async lets the agents take the time to do it right.
- You shouldn't have to babysit it. Watching a progress bar while your project builds is not a good use of your time. You get a notification when it's done or when you're needed.
For simple projects, builds typically complete in 15–45 minutes. Complex builds with multiple passes can take 2–4 hours. You'll always get a notification when it's done.
When you submit a project, Forge spins up a small team of AI agents — each with a specific role, much like a real software team:
- Architect — Plans the project structure, chooses the tech stack, defines the data model
- Builder — Writes the actual code, file by file
- Verifier — Reviews the output, checks for errors, inconsistencies, and gaps against the spec
They iterate with each other — Builder and Verifier can go back and forth multiple times before declaring the build complete.
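Conceptually, that iteration is a simple loop: build, verify, feed the issues back, rebuild. Here's a minimal sketch, assuming hypothetical agent interfaces; this is not Forge's actual orchestration code, and the pass cap is an invented example:

```typescript
// A minimal sketch of the Builder/Verifier loop, assuming hypothetical
// agent interfaces; not Forge's actual orchestration code.
interface BuildResult { files: Map<string, string>; }
interface Review { passed: boolean; issues: string[]; }

interface Agents {
  build(spec: string, feedback: string[]): BuildResult;
  verify(spec: string, result: BuildResult): Review;
}

const MAX_PASSES = 5; // illustrative cap, not a documented Forge limit

function runBuildLoop(spec: string, agents: Agents): BuildResult {
  let result = agents.build(spec, []);
  for (let pass = 0; pass < MAX_PASSES; pass++) {
    const review = agents.verify(spec, result);
    if (review.passed) break;                   // Verifier signs off: build complete
    result = agents.build(spec, review.issues); // Builder addresses the gaps and tries again
  }
  return result;
}
```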
On Builder tier, Forge selects the best model combination for your project type. We currently use models from Anthropic (Claude), Google (Gemini), and OpenAI. You don't need to configure this.
On Pro tier (BYOK), you choose which provider's key to use and can see exactly which model ran each agent step in your dashboard.
Forge's engine uses an active heartbeat mechanism — it continuously monitors every build, not just schedules it and steps away. If a model hits a rate limit (for example, Anthropic's Pro plan has usage windows), the engine pauses, waits, and resumes automatically when the window resets. If a provider has an outage, the engine can route to an alternative provider for that step.
Your build does not fail silently. If anything requires your input to proceed — or if a genuine blocker occurs that the engine cannot resolve automatically — you will receive a notification explaining what happened and what the options are.
This is one of the core advantages of our orchestration layer over single-model tools: multi-provider fault tolerance is built in, not bolted on.
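In rough pseudocode terms, the heartbeat behavior described above looks something like the sketch below. The error shape, retry logic, and provider handling are illustrative assumptions, not Forge's real implementation:

```typescript
// Illustrative sketch of pause/resume and provider failover; error
// shapes and retry behavior here are assumptions, not Forge's code.
type Provider = "anthropic" | "openai" | "google";

class RateLimitError extends Error {
  constructor(public retryAfterMs: number) { super("rate limited"); }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runStep(
  step: (p: Provider) => Promise<string>,
  providers: Provider[],
): Promise<string> {
  for (const provider of providers) {
    try {
      return await step(provider);
    } catch (err) {
      if (err instanceof RateLimitError) {
        // Usage window hit: pause until it resets, then resume on the
        // same provider instead of failing the build.
        await sleep(err.retryAfterMs);
        return await step(provider);
      }
      // Outage or other provider error: fall through to the next provider.
    }
  }
  // A genuine blocker the engine can't resolve: surface it to the user.
  throw new Error("All providers failed for this step; notifying user with options.");
}
```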
Yes, completely. The code Forge delivers is yours. No license restrictions, no lock-in, no usage royalties. Download it, push it to your own GitHub, hand it to an engineer — it's your project.
You can also submit the delivered codebase back to Forge as a Stage 4 project to continue building. This is how you grow a project incrementally over multiple sessions: start from an idea, get an MVP, review it, then submit it back with "now add user authentication and a dashboard".
Your project is private. Each project is isolated in its own workspace. No other user can see your code, spec, or build progress.
We do not train on your projects. Your code and spec are not used to improve Forge's models. The AI models we use (Claude, GPT, Gemini) are accessed via API — subject to each provider's data usage policies, which for API usage means your data is not used for training by default.
When you delete a project, it is permanently removed from our systems within 30 days. We'll publish a full privacy policy before public launch.
During a build, agents make hundreds of small decisions autonomously — file structure, function names, database schema fields, UI component choices. None of those interrupt you.
You are notified (push or email, depending on your tier) only when an agent reaches a decision that genuinely requires human judgment:
- Your business or product name (agents won't invent this)
- A key design preference (color palette, layout style, branding tone)
- A third-party service choice ("should the app send emails via SendGrid or Resend?")
- An ambiguity in the spec that could go two very different ways
The goal is that your total active involvement in a typical project is under 5 minutes. The rest happens without you.
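As a rough illustration, the escalation policy amounts to a short whitelist of decision categories; the taxonomy and names below are hypothetical, not Forge's internal scheme:

```typescript
// Hypothetical decision categories; only these interrupt you, and the
// names here are illustrative, not Forge's internal taxonomy.
type HumanDecision =
  | { kind: "naming"; prompt: string }                              // business/product name
  | { kind: "design"; prompt: string; options: string[] }           // palette, layout, tone
  | { kind: "third_party"; prompt: string; options: string[] }      // e.g. email provider choice
  | { kind: "spec_ambiguity"; prompt: string; readings: string[] }; // two very different paths

// Everything else (file structure, schema fields, component choices)
// is resolved autonomously and never escalates.
function requiresHuman(kind: string): boolean {
  return ["naming", "design", "third_party", "spec_ambiguity"].includes(kind);
}
```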
Still have a question?
We're in early access. If something isn't covered here, reach out directly.
hello@forge.build