Improve Server Response Time for Better TTFB
Time to First Byte (TTFB) is the foundation of every other web performance metric. TTFB measures how long the browser waits, from the start of the request, until the first byte of the HTML response arrives. Every millisecond of TTFB directly delays FCP and LCP, because the browser cannot begin rendering until it receives the HTML.
Google's guidance rates a TTFB under 800ms as "good" and over 1,800ms as "poor". The HTTP Archive reports that 40% of origins have a TTFB above the 800ms threshold, primarily due to slow database queries, uncached server-side rendering, and hosting on distant origin servers.
This guide covers five server-side optimization layers: response caching, database optimization, edge computing, CDN configuration, and rendering architecture. Together, these techniques can reduce TTFB from 1-3 seconds to under 200ms for most pages.
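Before optimizing, establish a baseline in the field. A minimal sketch of measuring TTFB for real users, assuming the web-vitals package is installed (you can also read responseStart from the Navigation Timing API):
// Runs in the browser; assumes `npm install web-vitals`
import { onTTFB } from 'web-vitals';

onTTFB((metric) => {
  // metric.value is the TTFB for the current navigation, in milliseconds
  console.log('TTFB:', Math.round(metric.value), 'ms');
  // In production, send the value to an analytics endpoint instead:
  // navigator.sendBeacon('/analytics', JSON.stringify(metric));
});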
Expected results
Following all steps in this guide typically produces these improvements:
Before: 1.8s TTFB (Poor) -- origin server running database queries on every request, with no caching
After: 180ms TTFB (Good) -- edge caching with stale-while-revalidate and optimized database queries
Step-by-step fix
Implement server-side response caching
The fastest response is one that does not hit your application server at all. HTTP caching stores rendered responses and serves them directly from memory or CDN edge nodes. For pages that do not change on every request (most content pages, product listings, blog posts), caching eliminates server processing time entirely.
# Static pages (blog posts, docs, landing pages)
# Cache at CDN edge for 1 hour, revalidate in background
Cache-Control: public, s-maxage=3600, stale-while-revalidate=86400
# Dynamic pages with user-specific content
# Cache in browser only, revalidate on every request
Cache-Control: private, no-cache, must-revalidate
# API responses (product data, search results)
# Short edge cache with background revalidation
Cache-Control: public, s-maxage=60, stale-while-revalidate=300
# Truly dynamic (shopping cart, user dashboard)
# No caching, always fresh
Cache-Control: private, no-store
# Static assets (CSS, JS, images with hashed filenames)
Cache-Control: public, max-age=31536000, immutable
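How you attach these headers depends on your stack: frameworks such as Next.js manage them for you (see the ISR example below), while a custom Node server sets them per route. A minimal sketch, assuming an Express server with illustrative routes and responses:
// Apply the caching policies above from a custom Node server (Express assumed)
import express from 'express';

const app = express();

// Content pages: cache at the CDN edge, serve stale while revalidating
app.get('/blog/:slug', (req, res) => {
  res.set('Cache-Control', 'public, s-maxage=3600, stale-while-revalidate=86400');
  res.type('html').send(`<h1>Post: ${req.params.slug}</h1>`); // placeholder HTML
});

// User-specific pages: never store at the edge
app.get('/dashboard', (req, res) => {
  res.set('Cache-Control', 'private, no-store');
  res.type('html').send('<h1>Your dashboard</h1>'); // placeholder HTML
});

app.listen(3000);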
// app/blog/[slug]/page.tsx
// ISR: serve cached page, revalidate in background
import { notFound } from 'next/navigation';
import { db } from '@/lib/db'; // your database client; adjust the path to your project

// Revalidate every 60 seconds
export const revalidate = 60;
// Pre-generate popular pages at build time
export async function generateStaticParams() {
const posts = await db.post.findMany({
where: { published: true },
orderBy: { views: 'desc' },
take: 100,
});
return posts.map(post => ({ slug: post.slug }));
}
export default async function BlogPost({
params,
}: {
params: { slug: string }
}) {
const post = await db.post.findUnique({
where: { slug: params.slug },
});
if (!post) notFound();
return (
<article>
<h1>{post.title}</h1>
<div dangerouslySetInnerHTML={{ __html: post.content }} />
</article>
);
}
// TTFB result:
// First request: ~200ms (SSR + cache miss)
// Subsequent: ~20ms (served from CDN edge cache)
// After revalidation: ~20ms (fresh content, still cached)
Optimize database queries and data fetching
Database queries are the most common cause of slow TTFB in server-rendered applications. A single N+1 query pattern can add 500ms+ to every page load. Optimize by adding database indexes, eliminating N+1 queries, implementing query result caching, and moving to connection pooling.
// BEFORE: N+1 query pattern (50+ queries for 50 posts)
async function getPostsWithAuthors() {
const posts = await db.post.findMany({ take: 50 });
// Each iteration triggers a separate query!
for (const post of posts) {
post.author = await db.user.findUnique({
where: { id: post.authorId },
});
}
return posts; // 51 queries, ~500ms
}
// AFTER: Single query with relation (1 query)
async function getPostsWithAuthors() {
return db.post.findMany({
take: 50,
include: {
author: {
select: { name: true, avatar: true },
},
},
}); // 1 query, ~15ms
}
// Add application-level caching for frequent queries
import { unstable_cache } from 'next/cache';
const getCachedPosts = unstable_cache(
async () => {
return db.post.findMany({
take: 50,
include: { author: true },
orderBy: { publishedAt: 'desc' },
});
},
['posts-list'], // Cache key
{ revalidate: 60 } // Refresh every 60s
);
-- Index for blog post lookups by slug (most common query)
CREATE INDEX idx_posts_slug ON posts(slug)
WHERE published = true;
-- Composite index for listing pages (order + filter)
CREATE INDEX idx_posts_published_date
ON posts(published, published_at DESC);
-- Index for author relation lookups
CREATE INDEX idx_posts_author_id ON posts(author_id);
-- Partial index for active users only
CREATE INDEX idx_users_active ON users(email)
WHERE active = true;
-- EXPLAIN ANALYZE to verify index usage
EXPLAIN ANALYZE
SELECT * FROM posts
WHERE published = true
ORDER BY published_at DESC
LIMIT 50;
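This step also mentions connection pooling. In serverless Node deployments the usual failure mode is creating a new database client (and a new connection pool) on every invocation; a minimal sketch of the standard fix, assuming Prisma and a lib/db.ts path (both are assumptions):
// lib/db.ts -- reuse one PrismaClient (and its connection pool) across invocations
import { PrismaClient } from '@prisma/client';

const globalForPrisma = globalThis as unknown as { prisma?: PrismaClient };

// Reuse the cached client on warm starts instead of opening a new pool per request
export const db = globalForPrisma.prisma ?? new PrismaClient();

if (process.env.NODE_ENV !== 'production') globalForPrisma.prisma = db;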
Deploy to the edge with edge functions or static generation
Traditional server hosting runs your application in one region (e.g., us-east-1). Users on other continents add 100-300ms of network latency per round trip. Edge deployment runs your application on servers distributed worldwide, reducing network latency to 10-50ms for most users. For static content, this means sub-100ms TTFB globally.
// app/api/search/route.ts
// Runs at the CDN edge, closest to the user
export const runtime = 'edge'; // Deploy to 30+ edge locations
export async function GET(request: Request) {
const { searchParams } = new URL(request.url);
const query = searchParams.get('q');
// Use edge-compatible data sources
// (KV stores, edge databases, cached APIs)
const results = await fetch(
`https://api.example.com/search?q=${encodeURIComponent(query ?? '')}`,
{
next: { revalidate: 300 }, // Cache for 5 minutes
}
);
return Response.json(await results.json(), {
headers: {
'Cache-Control': 'public, s-maxage=300, stale-while-revalidate=600',
},
});
}
// TTFB comparison:
// Origin server (us-east-1) from Europe: ~350ms
// Edge function from Europe: ~40ms
// src/worker.ts
export default {
async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
const url = new URL(request.url);
// Try cache first
const cache = caches.default;
const cached = await cache.match(request);
if (cached) return cached; // ~5ms TTFB
// Generate response at the edge
const html = renderPage(url.pathname);
const response = new Response(html, {
headers: {
'Content-Type': 'text/html',
'Cache-Control': 'public, s-maxage=3600, stale-while-revalidate=86400',
},
});
// Store in edge cache for future requests without delaying this response
ctx.waitUntil(cache.put(request, response.clone()));
return response; // ~30ms TTFB (first request)
},
};
Configure CDN for optimal TTFB
A properly configured CDN can serve cached HTML directly from edge nodes with 10-30ms TTFB. The key settings are cache duration, stale-while-revalidate behavior, and cache key configuration. Most TTFB problems with CDNs come from misconfiguration that causes cache misses.
// Vercel: configure caching in middleware
// middleware.ts
import { NextRequest, NextResponse } from 'next/server';
export function middleware(request: NextRequest) {
const response = NextResponse.next();
// Cache HTML pages at the edge
if (!request.nextUrl.pathname.startsWith('/api/')) {
response.headers.set(
'Cache-Control',
'public, s-maxage=3600, stale-while-revalidate=86400'
);
// Vary on cookie for logged-in vs anonymous
// (prevents serving cached logged-in page to anonymous users)
response.headers.set('Vary', 'Cookie');
}
return response;
}
// Cloudflare: Page Rules or Cache Rules
// Rule 1: Cache HTML for /blog/*
// Cache Level: Cache Everything
// Edge Cache TTL: 1 hour
// Browser Cache TTL: 5 minutes
//
// Rule 2: Bypass cache for /dashboard/*
// Cache Level: Bypass
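A frequent source of cache misses is tracking query parameters fragmenting the cache key, so the same page is cached separately for every utm_source variant. A sketch of normalizing the cache key in a Cloudflare Worker (the parameter list is illustrative, not from this guide):
// Strip tracking params from the cache key so ?utm_source=... variants share one cached copy
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'gclid', 'fbclid'];

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url);
    TRACKING_PARAMS.forEach((param) => url.searchParams.delete(param));
    const cacheKey = new Request(url.toString(), request);

    const cache = caches.default;
    const cached = await cache.match(cacheKey);
    if (cached) return cached;

    // Miss: fetch from the origin, then store under the normalized key
    const response = await fetch(request);
    if (response.ok) {
      ctx.waitUntil(cache.put(cacheKey, response.clone()));
    }
    return response;
  },
};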
Choose the right rendering architecture
Your rendering architecture determines the TTFB floor -- the fastest possible response time regardless of other optimizations. Static generation (SSG) achieves the lowest TTFB because HTML is pre-built and served from CDN. Server-side rendering (SSR) adds server processing time. Client-side rendering (CSR) has fast TTFB but delays meaningful content.
Rendering Strategy | Typical TTFB | Best For
---------------------------|-------------|---------------------------
Static (SSG) | 10-50ms | Blog, docs, marketing
ISR (SSG + revalidation) | 10-50ms* | Product pages, listings
Edge SSR | 30-100ms | Personalized content
Streaming SSR | 50-200ms** | Complex pages, dashboards
Traditional SSR | 200-800ms | Legacy, heavy data
Client-side (CSR) | 20-50ms*** | SPAs, logged-in apps
* ISR serves cached HTML; revalidation happens in background
** Streaming sends initial HTML immediately, streams rest
*** Fast TTFB but empty HTML; real content requires JS execution
Recommendation: Use SSG/ISR for >80% of pages. Use Edge SSR for pages that need real-time data. Avoid traditional SSR unless edge deployment is not possible.
// app/dashboard/page.tsx
// Streaming SSR: send shell immediately, stream data as it loads
import { Suspense } from 'react';
export default function Dashboard() {
return (
<main>
{/* Shell renders immediately (fast TTFB) */}
<h1>Dashboard</h1>
<nav>{/* navigation */}</nav>
{/* Each section streams in as its data resolves */}
<Suspense fallback={<MetricsSkeleton />}>
<MetricsSection /> {/* Fetches from DB */}
</Suspense>
<Suspense fallback={<ChartSkeleton />}>
<ChartSection /> {/* Fetches from analytics API */}
</Suspense>
<Suspense fallback={<ActivitySkeleton />}>
<ActivityFeed /> {/* Fetches from activity service */}
</Suspense>
</main>
);
}
// TTFB: ~50ms (shell with loading states)
// Full content: streams in over 200-500ms
// User sees meaningful content immediately
Quick checklist
- HTTP caching headers set for all page types with appropriate TTLs
- stale-while-revalidate enabled for content pages
- Database queries optimized with indexes and eager loading
- N+1 query patterns eliminated with relation includes
- Application deployed to edge or CDN-cached at edge nodes
- CDN cache hit rate above 90% for content pages (spot-check with the sketch after this list)
- Rendering architecture matches content freshness requirements
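The cache hit rate item above is easy to spot-check: fetch a page and inspect the cache headers your CDN adds (cf-cache-status on Cloudflare, x-vercel-cache on Vercel; Age is standard). A small sketch, run with a TypeScript runner such as tsx:
// check-cache.ts -- e.g. `npx tsx check-cache.ts https://example.com/blog/hello-world`
const url = process.argv[2] ?? 'https://example.com/';

const res = await fetch(url);
console.log('status:         ', res.status);
console.log('cache-control:  ', res.headers.get('cache-control'));
console.log('age:            ', res.headers.get('age'));             // > 0 usually means served from cache
console.log('cf-cache-status:', res.headers.get('cf-cache-status')); // Cloudflare: HIT / MISS / DYNAMIC
console.log('x-vercel-cache: ', res.headers.get('x-vercel-cache'));  // Vercel: HIT / MISS / STALE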
Frequently asked questions
What counts as a good TTFB?
Google's guidance rates TTFB as good under 800ms, but aim for under 200ms for the best user experience. Static pages served from a CDN should achieve 10-50ms TTFB. Server-rendered pages with edge caching should achieve 30-100ms. If your TTFB consistently exceeds 500ms, server-side caching should be your first optimization.
Is TTFB a Core Web Vital, and does it affect rankings?
TTFB is not a direct Core Web Vitals metric (Google uses LCP, CLS, and INP for ranking), but it is a strong indirect factor. Slow TTFB delays LCP because the browser cannot start rendering until it receives the HTML. A 500ms TTFB improvement typically translates to a 300-500ms LCP improvement, which directly affects rankings.
Should I cache at the CDN edge or in Redis?
Use both in layers. CDN edge caching (Cache-Control headers) serves cached HTML without hitting your server at all -- this is the fastest option for public content. Redis caching on your application server handles cache misses and dynamic content that cannot be edge-cached. The CDN handles 90%+ of requests; Redis handles the remainder.
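A minimal sketch of that Redis layer, assuming ioredis, a REDIS_URL environment variable, and a placeholder database query (all assumptions, not from this guide):
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!); // e.g. redis://localhost:6379

// Placeholder slow path -- replace with your real database query
async function getProductFromDb(id: string) {
  return { id, name: 'Example product' };
}

export async function getProduct(id: string) {
  const cached = await redis.get(`product:${id}`);
  if (cached) return JSON.parse(cached); // typically ~1-2ms

  const product = await getProductFromDb(id); // slow path on a cache miss
  await redis.set(`product:${id}`, JSON.stringify(product), 'EX', 60); // 60-second TTL
  return product;
}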
How does streaming SSR affect TTFB?
Streaming SSR sends the initial HTML shell immediately (fast TTFB) and then streams additional content as server-side data fetching completes. This means TTFB measures just the shell delivery time (typically 50-100ms), while the full page content arrives progressively. Users see a meaningful layout immediately, with data filling in over the next few hundred milliseconds.
Can I improve TTFB without switching hosting providers?
Yes. The highest-impact improvements are: add Cache-Control headers with stale-while-revalidate (reduces most requests to 0ms server time), optimize database queries (often saves 200-500ms), and put a CDN in front of your origin (Cloudflare's free tier works). These three changes can improve TTFB by 50-80% without changing hosts.
Related resources
Complete TTFB Guide
Deep dive into Time to First Byte -- thresholds, measurement, and optimization.
Fix TTFB in Next.js
Next.js-specific TTFB optimizations with edge runtime and ISR.
Set Up Monitoring
Track TTFB in real-user monitoring dashboards.
Vercel vs Netlify TTFB
Edge deployment TTFB comparison between major platforms.