Each additional nine is an order of magnitude more expensive. A working method for deciding which nines your business can afford and which it can't.
Every founder eventually has the same conversation with an enterprise prospect: what's your uptime SLA? The honest answer is usually a shrug. The strategic answer is usually a number that bears no relationship to what the system actually does. Both are bad.
The cost of an additional nine of uptime is not linear. It's roughly an order of magnitude per nine. Three nines (99.9%) is achievable by a competent team with a decent infrastructure stack and reasonable on-call discipline. Four nines (99.99%) requires a different architecture, a different team structure, and roughly ten times the operational cost. Five nines is a different company.
This essay is about deciding which nine to commit to.
What a nine actually buys you
Uptime is measured against a window, usually a month. For a 30-day month, the downtime budgets are:
- 99% (two nines). ~7.2 hours of downtime per month. You can be down for most of a workday and still hit the SLA. Anyone with a cron job and an EC2 instance can hit this.
- 99.9% — three nines. ~43 minutes per month. Achievable with a load balancer, two instances, and a team that responds to pages within 10 minutes.
- 99.99% — four nines. ~4.3 minutes per month. Requires multi-AZ, automated failover, deploys that don't take you down, and a database that survives the loss of a node without human intervention.
- 99.999% — five nines. ~26 seconds per month. Requires multi-region active-active, eventually-consistent semantics, formally verified failover paths, and a team that owns those things as a full-time job.
The thing that bites people is that the step from three to four nines isn't a small percentage bump. Each nine cuts the downtime budget by 10×, so four nines means a tenth of a budget that was already a tenth of what two nines allows, and the operational cost rises by roughly 10× to match. The cost curve is brutal.
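The budgets in the list above are plain arithmetic: the window length times the fraction of time you are allowed to be down. A minimal sketch, assuming a 30-day window (the essay says only "usually a month", and the figures shift slightly with a longer or shorter month); the function name `downtime_budget` is ours, not from any library:

```python
from datetime import timedelta

# Assumption: a 30-day measurement window.
WINDOW = timedelta(days=30)


def downtime_budget(availability_pct: float, window: timedelta = WINDOW) -> timedelta:
    """Return the downtime allowed within `window` at the given availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    allowed_fraction = 1 - availability_pct / 100
    return timedelta(seconds=window.total_seconds() * allowed_fraction)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget(target).total_seconds() / 60
        print(f"{target}% allows {minutes:.2f} minutes of downtime per 30 days")
```

Running the same function against your contract's real window (calendar month, quarter, rolling 30 days) is worth doing before you quote a number, because the window is part of the SLA.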
The questions to ask before you commit to a number
Before you sign an SLA, ask three questions in this order.
What's the cost of an hour of downtime to your customer? If their business processes $50k/hour through your system and you're down for an hour, you've cost them $50k and probably your contract. If their business uses your system to send marketing emails on Tuesdays, an hour of downtime costs them roughly nothing. The SLA should reflect the customer's exposure, not yours.
What's the cost of an hour of downtime to you? This is the credit you'll owe if you miss the SLA. Most SLAs are written in service-credit terms: you refund some fraction of the monthly contract when you breach. Run the math: if every contract refunds 30% of the monthly fee for a missed month and you miss the SLA in 5% of months, the expected cost is 0.30 × 0.05 = 1.5% of revenue, a haircut you've baked into your business permanently. Sometimes that's fine. Sometimes it's not.
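The service-credit math above is a single multiplication, but it is worth keeping as a function so you can rerun it with your own clause and miss rate. A rough sketch, assuming a flat credit that applies to every contract and a miss rate that is the same across customers; both function names and the revenue figure in the usage note are illustrative:

```python
def expected_credit_haircut(credit_fraction: float, miss_rate: float) -> float:
    """Expected share of revenue refunded as service credits.

    credit_fraction: share of a month's fee refunded when the SLA is missed (0.30 = 30%).
    miss_rate: long-run share of months in which the SLA is missed (0.05 = 5%).
    """
    for name, value in (("credit_fraction", credit_fraction), ("miss_rate", miss_rate)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return credit_fraction * miss_rate


def annual_credit_cost(annual_revenue: float, credit_fraction: float, miss_rate: float) -> float:
    """Expected yearly service-credit payout for a given revenue base."""
    return annual_revenue * expected_credit_haircut(credit_fraction, miss_rate)
```

With the essay's numbers, `expected_credit_haircut(0.30, 0.05)` comes to 0.015, or 1.5% of revenue. On a hypothetical $2M of annual revenue, `annual_credit_cost(2_000_000, 0.30, 0.05)` puts the expected payout at about $30k a year.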
What's the cost of buying the nine? Multi-AZ roughly doubles your infra bill. Multi-region roughly triples it. Automated failover that actually works costs a senior engineer's quarter to build and an on-call rotation to operate. Adding a nine usually means adding a person, and that person doesn't ship features.
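Put together, the three questions reduce to one comparison: what the nine costs per year against what it saves in credits and earns in extra revenue. A back-of-the-envelope sketch; the class, the function, and every figure in the usage note are hypothetical placeholders for your own numbers:

```python
from dataclasses import dataclass


@dataclass
class NineCost:
    """Annual cost of moving up one nine. Both fields are inputs you estimate."""

    extra_infra: float      # e.g. the additional AZ or region
    extra_headcount: float  # fully loaded cost of the people who run failover

    def total(self) -> float:
        return self.extra_infra + self.extra_headcount


def net_annual_value(cost: NineCost, credits_avoided: float, extra_revenue: float) -> float:
    """Positive: the nine pays for itself. Negative: you are subsidizing it."""
    return credits_avoided + extra_revenue - cost.total()
```

For example, `net_annual_value(NineCost(120_000, 250_000), credits_avoided=30_000, extra_revenue=200_000)` returns -140,000: with those made-up inputs, the extra nine loses money unless the customers who need it pay more for it.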
The advice we end up giving most often
For most B2B SaaS in the seed-to-Series-B range:
- Commit to three nines. It's defensible, achievable, and honest. It maps cleanly onto what a competent platform team can actually deliver without rearchitecting around the SLA.
- Don't promise four nines unless you've already hit it for two consecutive quarters without trying. If you can't hit it accidentally, you can't hit it on purpose.
- Price the nines. If a customer needs four nines, charge for them. The economics of running a four-nines system are an enterprise pricing tier, not a sales concession.
- Write a service-credit clause that doesn't bankrupt you. A 100% refund clause turns a single bad month into a lost quarter. Cap your exposure.
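The "two consecutive quarters" test in the list above can be checked mechanically if you already record downtime per month. A minimal sketch, assuming 30-day months and reading the rule strictly: every one of the last six months must stay inside the monthly budget for the target. The function names and the input format (a chronological list of downtime minutes per month) are our assumptions:

```python
from typing import Sequence

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day months


def monthly_budget_minutes(target_pct: float) -> float:
    """Downtime minutes allowed per month at the given availability target."""
    return MINUTES_PER_MONTH * (1 - target_pct / 100)


def has_track_record(
    monthly_downtime_minutes: Sequence[float],
    target_pct: float,
    quarters_required: int = 2,
) -> bool:
    """True if each of the most recent 3 * quarters_required months met the target."""
    if quarters_required < 1:
        raise ValueError("quarters_required must be at least 1")
    months_needed = 3 * quarters_required
    if len(monthly_downtime_minutes) < months_needed:
        return False  # not enough history to claim a track record
    budget = monthly_budget_minutes(target_pct)
    recent = monthly_downtime_minutes[-months_needed:]
    return all(minutes <= budget for minutes in recent)
```

A history of `[1, 0, 3, 2, 0, 5]` minutes passes at 99.9% but fails at 99.99%, because the last month's 5 minutes exceeds the roughly 4.3-minute budget. That is the situation in which the advice says to keep selling three nines.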
The frame that helps
Treat uptime as a product feature, not a point of pride. Four nines is not a better engineering culture than three nines. It's a different cost structure for a different customer. The team that ships the right number of nines for their customer base is doing better engineering than the team that ships the maximum number of nines and goes broke doing it.
Pick the nine you can actually deliver, price it accurately, and stop treating the SLA as a marketing document.