Stack: AI Error Boundary
Category: Production Infrastructure
The Problem
AI APIs fail. This isn't a hypothetical — it's a regular occurrence:
- Rate limiting (429) — You hit API limits, especially during traffic spikes
- Server errors (5xx) — Provider infrastructure has issues
- Timeouts — Long-running requests get cut off
- Content filters — Input or output triggers safety filters
- Context overflow — Conversation exceeds token limits
When these errors happen, most applications show users a generic "Something went wrong" message. Or worse, crash entirely.
Why This Matters
A recursive agent loop that keeps retrying on errors can burn through API credits quickly. An unhandled 503 during a demo can lose a customer. Rate limit errors at 3 AM mean angry support tickets in the morning.
The Options
Option 1: Basic try/catch
```typescript
try {
  const result = await generateText({ model, prompt })
  return result.text
} catch (error) {
  console.error(error)
  return "Something went wrong"
}
```

Problem: No retry logic, no differentiation between error types, poor user experience.
Option 2: Manual retry logic
```typescript
// Assumes a sleep(ms) helper that resolves after the given delay
async function withRetry(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn()
    } catch (error) {
      if (i === maxRetries - 1) throw error
      await sleep(Math.pow(2, i) * 1000)
    }
  }
}
```

Problem: Doesn't differentiate error types. Rate limits need exponential backoff; auth errors shouldn't retry at all.
Option 3: Provider fallback
```typescript
const providers = [openai, anthropic, google]
for (const provider of providers) {
  try {
    return await generateText({ model: provider("..."), prompt })
  } catch (error) {
    continue
  }
}
```

Problem: Different providers have different capabilities. Fallback order matters. Need to track which provider succeeded.
The Decision
We built a comprehensive error boundary that:
- Classifies errors — Different errors need different handling
- Retries intelligently — Exponential backoff with jitter for rate limits
- Shows user-friendly messages — Not "Error 429"
- Optionally falls back — To alternative providers if configured
- Prevents runaway costs — Max retries and circuit breaker
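The cost-control point is the easiest to under-specify, so here is a minimal circuit breaker sketch. The class name, threshold, and cooldown values are illustrative assumptions, not our exact implementation:

```typescript
// Minimal circuit breaker sketch (names and thresholds are illustrative).
// After `failureThreshold` consecutive failures the breaker opens and
// rejects calls immediately until `cooldownMs` has elapsed.
class CircuitBreaker {
  private failures = 0
  private openedAt = 0

  constructor(
    private failureThreshold = 5,
    private cooldownMs = 30_000,
  ) {}

  canRequest(now = Date.now()): boolean {
    if (this.failures < this.failureThreshold) return true
    // Open state: allow a trial request once the cooldown has passed
    return now - this.openedAt >= this.cooldownMs
  }

  recordSuccess(): void {
    this.failures = 0
  }

  recordFailure(now = Date.now()): void {
    this.failures++
    if (this.failures === this.failureThreshold) this.openedAt = now
  }
}
```

The breaker sits in front of the retry loop: if it is open, the request fails fast instead of burning retries (and credits) against a provider that is already down.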
Error Classification
| Error Type | Retriable? | User Message |
|---|---|---|
| Rate limit (429) | Yes, with backoff | "High demand. Retrying..." |
| Server error (5xx) | Yes, immediately | "Service temporarily unavailable" |
| Timeout | Yes, with backoff | "Taking longer than usual..." |
| Auth (401/403) | No | Fix configuration |
| Content filter | Maybe | "Please rephrase your message" |
| Context overflow | No | Truncate conversation |
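The table above can be sketched as a pure classification function. The status codes and error-code strings checked here are illustrative assumptions; real SDKs surface these differently (e.g. typed error classes), so treat this as the shape of the logic rather than the exact implementation:

```typescript
// Sketch of the classification table: map an error to retry behavior
// and a user-facing message. Status/code values are illustrative.
type ErrorClass = {
  retriable: boolean
  backoff: boolean // true = exponential backoff, false = retry immediately
  userMessage: string
}

function classifyError(status?: number, code?: string): ErrorClass {
  if (status === 429)
    return { retriable: true, backoff: true, userMessage: "High demand. Retrying..." }
  if (status !== undefined && status >= 500)
    return { retriable: true, backoff: false, userMessage: "Service temporarily unavailable" }
  if (code === "timeout")
    return { retriable: true, backoff: true, userMessage: "Taking longer than usual..." }
  if (status === 401 || status === 403)
    return { retriable: false, backoff: false, userMessage: "Configuration error" }
  if (code === "content_filter")
    // "Maybe" in the table: only worth retrying after the user rephrases
    return { retriable: false, backoff: false, userMessage: "Please rephrase your message" }
  if (code === "context_overflow")
    // Not retriable as-is; caller should truncate the conversation first
    return { retriable: false, backoff: false, userMessage: "Conversation too long" }
  return { retriable: false, backoff: false, userMessage: "Something went wrong" }
}
```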
Retry Strategy
We use exponential backoff with jitter for rate limits:
```typescript
// Base delay doubles each attempt; jitter prevents thundering herd
const delay = Math.min(
  baseDelay * Math.pow(2, attempt) + Math.random() * 1000,
  maxDelay
)
```

Trade-offs
| Decision | Our Choice | Why |
|---|---|---|
| Default max retries | 3 | Balance between recovery and cost |
| Default backoff | Exponential + jitter | Best for rate limits at scale |
| Auto-fallback | Opt-in | Not everyone has multiple providers |
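Putting the pieces together, the retry strategy can be sketched as a standalone wrapper. Here `isRetriable` stands in for the error classifier, and the parameter names and defaults are assumptions for illustration, not our exact API:

```typescript
// Sketch: classification-aware retry with exponential backoff + jitter.
// `isRetriable` stands in for the error classifier; defaults are illustrative.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetriable: (err: unknown) => boolean,
  { maxRetries = 3, baseDelay = 1000, maxDelay = 30_000 } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      // Non-retriable errors (auth, context overflow) fail fast;
      // retriable ones give up after maxRetries attempts total.
      if (!isRetriable(error) || attempt >= maxRetries - 1) throw error
      const delay = Math.min(
        baseDelay * Math.pow(2, attempt) + Math.random() * 1000,
        maxDelay,
      )
      await sleep(delay)
    }
  }
}
```

A caller would wrap the model request itself, e.g. `withBackoff(() => generateText({ model, prompt }), (e) => isRateLimit(e))`, so the retry policy stays separate from the request logic.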