Production Architecture Blueprint

100K+ Concurrent
Multiplayer Quiz Platform

Complete realtime architecture for a Kahoot-scale quiz platform — WebSockets, Redis pub/sub, distributed game state, anti-cheat, and horizontal scaling.

⚡ Node.js + Socket.io
🔴 Redis Cluster
🐘 PostgreSQL
📡 100K+ Players
⏱ <50ms P99

Scale Requirements

100K+ concurrent players
<50ms P99 latency
±10ms timer sync drift
10K simultaneous rooms
99.9% uptime SLA
<5s reconnect time

Transport

WebSocket First

Socket.io with automatic fallback to long-polling. Binary message packing via msgpack for 40% smaller payloads. HTTP/3 for initial handshake.

State Store

Redis-First State

All hot game state lives in Redis Cluster with TTL. PostgreSQL for durable history. Redis Streams for event sourcing. No single point of failure.

Scaling

Horizontal Pod Scaling

Stateless Node.js pods behind HAProxy. Redis Adapter for Socket.io cross-pod pub/sub. K8s HPA triggers at 60% CPU. Zero-downtime rolling deploys.

Full Architecture Diagram

End-to-end request flow from client to database layer with all intermediate services.

CLIENT LAYER
🌐 Web Client: React / Next.js
📱 Mobile App: React Native / PWA
🎮 Host Client: Dashboard / Controls

EDGE / CDN LAYER
☁️ CDN (CloudFront): Static assets · Edge cache
🛡️ WAF / DDoS: Cloudflare · Rate limit
🌍 DNS / GeoDNS: Route53 · Latency routing

GATEWAY LAYER
⚖️ Load Balancer: HAProxy · sticky sessions · WS-aware
🔀 API Gateway: Rate limit · Auth · Route

APPLICATION LAYER
⚡ WS Pod 1: Socket.io · Node.js · ~3K connections
⚡ WS Pod 2: Socket.io · Node.js · ~3K connections
⚡ WS Pod N: Auto-scale 2→30+ · HPA triggered
🎮 Game Engine: Room state machine · Timer · Scoring · Logic
🔐 Auth Service: JWT · OAuth2 · Session tokens
📊 Analytics: Kafka · ClickHouse · Real-time metrics

DATA LAYER
🔴 Redis Cluster: Pub/Sub · Game state · Leaderboard · 3 primaries · 3 replicas
🐘 PostgreSQL: Users · Quizzes · History · Primary + 2 read replicas
📡 Redis Streams: Event sourcing · Audit log · Game event replay
⚡ ClickHouse: OLAP analytics · Leaderboard history

MONITORING
📈 Prometheus + Grafana
🔍 Jaeger Tracing
📝 ELK Stack (Logs)
🚨 PagerDuty Alerts

CROSS-CUTTING
Pub/Sub ↕ @socket.io/redis-adapter (between WS pods and Redis Cluster)
☸️ Kubernetes Cluster: HPA · Node pools · Namespaces
🌐 Multi-Region: US-East · EU-West · AP
📡
WebSocket Affinity: HAProxy uses sticky sessions so that every request from a player, including the long-polling fallback and the WebSocket upgrade, reaches the same pod. The Redis adapter syncs events across all pods, so a player on Pod 1 receives broadcasts from a host on Pod 3.
⚖️
Stateless Pods: No game state is stored in pod memory. All state is in Redis. This means any pod can crash and players reconnect to a different pod with zero data loss — Redis holds the truth.

WebSocket Event Map

Complete bidirectional event contract between client and server. All events use msgpack binary encoding.

→ Client→Server
← Server→Client
⟳ Broadcast to Room

🔌 Connection & Session

Event | Direction | Payload | Description
connection | C→S | {token, clientVersion, region} | Initial WS handshake. Server validates JWT, sets up socket metadata.
session:ack | S→C | {socketId, serverTime, pingInterval} | Server confirms session. Client syncs clock using serverTime offset.
reconnect:attempt | C→S | {gameCode, playerId, sessionToken} | Player reconnects mid-game. Server restores state from Redis.
reconnect:state | S→C | {gameState, currentQ, timeRemaining, score} | Full game state snapshot sent to reconnecting player.
ping | C→S | {t: Date.now()} | Heartbeat every 5s. Used for RTT measurement and connection health.
pong | S→C | {t, serverTime} | Echo with server timestamp. Client calculates clock offset = serverTime - (t + receivedAt) / 2, where receivedAt is the client clock when the pong arrives.

🏟️ Lobby & Room

Event | Direction | Payload | Description
room:create | C→S | {quizId, settings: {timePerQ, maxPlayers}} | Host creates a game room. Server generates a 6-char PIN, initializes Redis state.
room:created | S→C | {gameCode, roomId, shareUrl} | Room creation confirmed. Host displays PIN to players.
room:join | C→S | {gameCode, playerName, avatarId} | Player joins via PIN. Server validates code, adds to Redis set, emits join broadcast.
room:join:ack | S→C | {playerId, gameCode, players[], quizMeta} | Sent only to joining player with full lobby snapshot.
room:player:joined | ⟳ Broadcast | {player: {id, name, avatar}, totalCount} | Broadcast to all room members when a new player joins lobby.
room:player:left | ⟳ Broadcast | {playerId, reason: 'disconnect'|'kick'} | Broadcast when player disconnects. Host can see live roster.
room:kick | C→S (host) | {targetPlayerId} | Host-only. Removes player from room. Validates host role server-side.

🎮 Gameplay

Event | Direction | Payload | Description
game:start | C→S (host) | {gameCode} | Host triggers game start. Server locks room (no new joins), begins countdown.
game:countdown | ⟳ Broadcast | {startsAt: epochMs, count: 3} | Countdown broadcast with absolute server timestamp. Clients sync to serverTime.
game:question:show | ⟳ Broadcast | {qIndex, question, answers[], endsAt: epochMs} | Question payload. Correct answer NOT included. endsAt is absolute server time.
game:answer:submit | C→S | {gameCode, qIndex, answerIdx, clientTime, answerHash} | Player submits answer. Server validates timing, qIndex match, and HMAC hash.
game:answer:ack | S→C | {received: true, serverTime} | Immediate acknowledgment. Does NOT reveal correct answer yet.
game:answer:progress | ⟳ Broadcast | {answeredCount, totalCount, distribution: [n,n,n,n]} | Throttled to 1/s. Shows how many answered without revealing which option.
game:question:reveal | ⟳ Broadcast | {correctIndex, pointsMap: {playerId: pts}, explanationText} | Reveal broadcast after timer expires or all players answer. Includes earned points.
game:leaderboard | ⟳ Broadcast | {rankings: [{id,name,score,delta,streak}], yourRank} | Sorted leaderboard snapshot after each question. Redis ZREVRANGE, O(log N + M) for M returned entries.
game:end | ⟳ Broadcast | {finalRankings[], xpGained:{}, badges:{}, gameId} | Final results. Server persists to PostgreSQL, clears Redis room after TTL.

🎛️ Host Controls

Event | Direction | Payload | Description
host:pause | C→S (host) | {gameCode} | Pause timer. Server stores remaining time in Redis. Broadcast pause state.
host:resume | C→S (host) | {gameCode} | Resume with new endsAt = now + remainingMs. Clients re-sync timer.
host:skip:question | C→S (host) | {gameCode, reason} | Skip to reveal phase immediately. Server scores based on current answers.
host:extend:time | C→S (host) | {gameCode, addSeconds: 10} | Add time to current question. Server updates endsAt, broadcasts new value.
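
The pause, resume, and extend controls all reduce to arithmetic on the absolute deadline. The sketch below shows that arithmetic as pure functions; the function names are illustrative and not taken from the original design, and the Redis reads and writes around them are left out.

```typescript
// All values are epoch milliseconds on the server clock.

// On host:pause, remember how much time was left (never negative).
function pauseTimer(endsAt: number, now: number): number {
  return Math.max(0, endsAt - now);
}

// On host:resume, build a fresh deadline from the stored remainder.
function resumeTimer(remainingMs: number, now: number): number {
  return now + remainingMs;
}

// On host:extend:time, push the current deadline out by addSeconds.
function extendTimer(endsAt: number, addSeconds: number): number {
  return endsAt + addSeconds * 1000;
}

// Question ends at t=20000; host pauses at t=12000 with 8s left.
const remainingMs = pauseTimer(20000, 12000);      // 8000
// Host resumes at t=50000; the new deadline is 8s later.
const resumedEndsAt = resumeTimer(remainingMs, 50000); // 58000
// Host then adds 10 seconds.
const extendedEndsAt = extendTimer(resumedEndsAt, 10); // 68000
console.log(remainingMs, resumedEndsAt, extendedEndsAt);
```

Because clients only ever receive the new endsAt, they need no pause-specific logic beyond restarting their countdown from the broadcast value.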

Scalable Room System

Room State Machine

1

CREATED LOBBY

Room initialised in Redis. PIN generated. Each joining player is added to the room's player set (SADD) and gets a per-player hash (HSET). Socket.io room created.

2

STARTING LOCKED

Host fires game:start. Room locked — no new joins. 3-2-1 countdown broadcast with absolute epoch timestamps.

3

ACTIVE PLAYING

Question loop running. Answers streamed to Redis. Timer tracked server-side. Anti-cheat validating each submission.

4

REVEALING REVEAL

Timer expired or all answered. Correct answer broadcast, scores calculated, leaderboard updated in Redis ZSET.

5

ENDED COMPLETE

Final results persisted to PostgreSQL. Redis TTL set to 300s for reconnect window, then purged. XP awarded.
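
The five phases above can be enforced with a small transition table checked before every state write. This is a minimal sketch: the state names come from the Redis schema in this document, while the REVEAL → ACTIVE edge (moving on to the next question) is an assumption inferred from the question loop, and `canTransition` is a hypothetical helper.

```typescript
type RoomState = 'LOBBY' | 'STARTING' | 'ACTIVE' | 'REVEAL' | 'ENDED';

// Allowed next states for each state.
const TRANSITIONS: Record<RoomState, RoomState[]> = {
  LOBBY:    ['STARTING'],        // host fires game:start
  STARTING: ['ACTIVE'],          // countdown finished
  ACTIVE:   ['REVEAL'],          // timer expired or everyone answered
  REVEAL:   ['ACTIVE', 'ENDED'], // next question (assumed) or game over
  ENDED:    [],                  // terminal
};

function canTransition(from: RoomState, to: RoomState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

console.log(canTransition('LOBBY', 'STARTING')); // true
console.log(canTransition('LOBBY', 'ACTIVE'));   // false
console.log(canTransition('ENDED', 'LOBBY'));    // false
```

Rejecting an illegal transition on the server means a duplicated or late host event cannot restart a finished game.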

Redis Room Schema

Redis Keys
# Room metadata (Hash)
HSET room:{gameCode}
  hostId       "uid_abc123"
  quizId       "quiz_789"
  state        "ACTIVE"          # LOBBY|STARTING|ACTIVE|REVEAL|ENDED
  currentQ     3
  totalQ       10
  questionEndsAt 1703123456789   # epoch ms
  pausedAt     ""
  podId        "pod-7f3k"        # owning pod
EXPIRE room:{gameCode} 7200

# Players in room (Set)
SADD room:{gameCode}:players uid_p1 uid_p2 ...

# Per-player metadata (Hash)
HSET room:{gameCode}:player:{uid}
  name         "Blaze99"
  avatar       "🦊"
  score        4250
  streak       3
  connected    1
  socketId     "sck_xyz"

# Live leaderboard (Sorted Set — O(log N) updates)
ZADD room:{gameCode}:lb 4250 uid_p1
ZADD room:{gameCode}:lb 3800 uid_p2

# Answers for current question (Hash)
HSET room:{gameCode}:q:{idx}:answers
  uid_p1  "2:1703123442100"   # answerIdx:submittedAt
  uid_p2  "0:1703123443200"

# Answer distribution (for progress bar)
INCR room:{gameCode}:q:{idx}:dist:2  # answer index 2

# Anti-cheat: seen hashes (prevent replay)
SETEX ac:{gameCode}:{uid}:{qIdx} 30 "hash_xyz"

Cross-Pod Broadcasting: When a player on Pod-1 submits an answer, it triggers a leaderboard update. The @socket.io/redis-adapter publishes the broadcast to a Redis pub/sub channel derived from the namespace and room (under the adapter's key prefix, socket.io by default). Every other pod subscribed to that channel forwards the broadcast to its own connected sockets. Pod assignment therefore doesn't matter: every player in the room gets every event.

Timer & State Sync Logic

Clock Synchronisation (NTP-style)

Client TypeScript
import type { Socket } from 'socket.io-client';

// Client-side clock sync using ping/pong
class ClockSync {
  private offset = 0;    // ms delta to server
  private rtt    = 0;    // round-trip time

  sync(socket: Socket) {
    const t0 = Date.now();
    socket.emit('ping', { t: t0 });

    socket.once('pong', ({ t, serverTime }) => {
      const t3 = Date.now();
      this.rtt    = t3 - t0;
      this.offset = serverTime - (t0 + t3) / 2;
    });
  }

  // Convert server timestamp → local display time
  serverToLocal(serverMs: number): number {
    return serverMs - this.offset;
  }

  // Get remaining ms until server deadline
  msUntil(endsAt: number): number {
    const serverNow = Date.now() + this.offset;
    return Math.max(0, endsAt - serverNow);
  }
}

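// Usage: question timer driven by server epoch.
// Assumes `socket` is the connected client and `timerBar` is the UI widget (neither shown).
const clock = new ClockSync();
clock.sync(socket);

socket.on('game:question:show', ({ endsAt }) => {
  const remaining = clock.msUntil(endsAt);
  // All clients see identical countdown regardless of join latency
  timerBar.start(remaining);
});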
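
A single ping/pong exchange can be skewed by a one-off network spike. A common NTP-style refinement is to collect several samples and trust the one with the lowest round-trip time. The sketch below illustrates that idea with pure functions; `ClockSample`, `computeSample`, and `bestOffset` are illustrative names, not part of the class above.

```typescript
// One ping/pong exchange, all values in epoch ms:
// t0 = client send time, t3 = client receive time,
// serverTime = server clock when it replied.
interface ClockSample {
  rtt: number;    // round-trip time
  offset: number; // estimated (server clock - client clock)
}

function computeSample(t0: number, t3: number, serverTime: number): ClockSample {
  return {
    rtt: t3 - t0,
    // Assumes the server replied halfway through the round trip.
    offset: serverTime - (t0 + t3) / 2,
  };
}

// Use the offset from the lowest-RTT sample: the less time a packet
// spent in flight, the less room there is for asymmetric delay.
function bestOffset(samples: ClockSample[]): number {
  if (samples.length === 0) return 0;
  return samples.reduce((best, s) => (s.rtt < best.rtt ? s : best)).offset;
}

const samples = [
  computeSample(1000, 1200, 6150), // rtt 200, offset 5050
  computeSample(2000, 2040, 7020), // rtt 40,  offset 5000
  computeSample(3000, 3120, 8090), // rtt 120, offset 5030
];
console.log(bestOffset(samples)); // 5000
```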

Server Timer Authority

Node.js Server
// Server is single source of truth for time
async function startQuestion(
  gameCode: string, qIdx: number
) {
  const room  = await redis.hgetall(`room:${gameCode}`);
  const quiz  = await getQuiz(room.quizId);
  const q     = quiz.questions[qIdx];
  const endsAt = Date.now() + q.timeMs;   // absolute epoch

  // Store deadline in Redis for reconnect recovery
  await redis.hset(`room:${gameCode}`, {
    state: 'ACTIVE',
    currentQ: qIdx,
    questionEndsAt: endsAt
  });

  // Broadcast WITHOUT answer — clients can't cheat
  io.to(gameCode).emit('game:question:show', {
    qIndex: qIdx,
    question: q.text,
    answers: q.options,    // shuffled, no correct flag
    endsAt,                // epoch ms — clients sync to this
    totalQuestions: quiz.questions.length
  });

  // Server-side timer — authoritative end
  await scheduleReveal(gameCode, qIdx, q.timeMs);
}

// Use Bull queue for reliable timer (survives pod crash)
async function scheduleReveal(
  gameCode: string, qIdx: number, delayMs: number
) {
  await revealQueue.add(
    { gameCode, qIdx },
    { delay: delayMs, attempts: 3 }
  );
}

Answer Processing Pipeline
// Answer submission handler — full validation pipeline
socket.on('game:answer:submit', async ({ gameCode, qIndex, answerIdx, clientTime, answerHash }) => {
  const playerId = socket.data.playerId;  // set from JWT on connect

  // 1. Validate room state
  const room = await redis.hgetall(`room:${gameCode}`);
  if (room.state !== 'ACTIVE' || +room.currentQ !== qIndex) return;

  // 2. Anti-cheat: check not already answered this question
  const alreadyAnswered = await redis.hexists(
    `room:${gameCode}:q:${qIndex}:answers`, playerId
  );
  if (alreadyAnswered) return;

  // 3. Validate server-side deadline (client can't fake timing)
  const serverNow = Date.now();
  if (serverNow > +room.questionEndsAt + 500) return; // 500ms grace

  // 4. Validate HMAC hash (anti-replay / packet tampering)
  const expected = hmac(`${gameCode}:${playerId}:${qIndex}:${answerIdx}`);
  if (answerHash !== expected) return;

  // 5. Atomic store — HSETNX prevents race conditions
  const stored = await redis.hsetnx(
    `room:${gameCode}:q:${qIndex}:answers`,
    playerId,
    `${answerIdx}:${serverNow}`  // store server time, not client time
  );
  if (!stored) return;  // race condition — already set

  // 6. Increment distribution counter for progress bar
  await Promise.all([
    redis.incr(`room:${gameCode}:q:${qIndex}:dist:${answerIdx}`),
    redis.incr(`room:${gameCode}:q:${qIndex}:totalAnswered`),
  ]);

  // 7. Ack immediately (before reveal)
  socket.emit('game:answer:ack', { received: true, serverTime: serverNow });

  // 8. Check if all players answered → early reveal
  const [answered, total] = await Promise.all([
    redis.get(`room:${gameCode}:q:${qIndex}:totalAnswered`),
    redis.scard(`room:${gameCode}:players`),
  ]);
  if (+answered >= total) triggerEarlyReveal(gameCode, qIndex);
});
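
The handler above calls an `hmac` helper that is not shown and compares the result with `!==`. One way to fill that gap, assuming a per-session secret is available on the server, is Node's built-in crypto module with a constant-time comparison. `signAnswer` and `verifyAnswer` are hypothetical names for this sketch.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// HMAC-SHA256 over the canonical string gameCode:playerId:qIndex:answerIdx.
function signAnswer(
  secret: string, gameCode: string, playerId: string,
  qIndex: number, answerIdx: number
): string {
  return createHmac('sha256', secret)
    .update(`${gameCode}:${playerId}:${qIndex}:${answerIdx}`)
    .digest('hex');
}

// Recompute the signature server-side and compare in constant time.
function verifyAnswer(
  secret: string, gameCode: string, playerId: string,
  qIndex: number, answerIdx: number, answerHash: string
): boolean {
  const expected = Buffer.from(
    signAnswer(secret, gameCode, playerId, qIndex, answerIdx), 'hex'
  );
  const received = Buffer.from(answerHash, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

const sig = signAnswer('session-secret', 'ABC123', 'uid_p1', 3, 2);
console.log(verifyAnswer('session-secret', 'ABC123', 'uid_p1', 3, 2, sig)); // true
console.log(verifyAnswer('session-secret', 'ABC123', 'uid_p1', 3, 1, sig)); // false
```

Since qIndex is part of the signed string, a captured signature cannot be reused on a later question.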

Anti-Cheat Strategy

Layer 1

Server-Side Scoring

Correct answers are never sent to the client before reveal. Score calculation only happens server-side. Client-reported scores are ignored entirely.

Critical
Layer 2

HMAC Answer Hashing

Each answer submission includes an HMAC-SHA256 signature of gameCode:playerId:qIdx:answerIdx using a session secret. Prevents packet replay attacks.

High Value
Layer 3

Server Timestamp Authority

Server records submission time using its own clock. Client-sent timestamps are ignored for scoring. 500ms grace period for network jitter. No client can fake speed.

Timer Integrity
Layer 4

One-Answer Atomic Lock

Redis HSETNX ensures exactly-once semantics. First write wins atomically. Subsequent submissions are silently dropped — no double-answering possible.

Race-Proof
Layer 5

Rate Limiting per Player

Redis sliding window rate limiter: max 2 answer events per question per player. Bot detection via submission pattern analysis. Auto-kick after 5 violations.

Bot Defense
Layer 6

Question Index Validation

Server verifies submitted qIndex matches current question. Players can't pre-answer future questions or re-answer past ones.

State Validation
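
Layer 5 names a Redis sliding-window limiter without showing one. The sketch below is an in-memory stand-in that follows the same logic a Redis sorted set would (add a timestamp, drop entries older than the window, count what remains). The class name, the key format, and the 20-second window are assumptions for illustration only.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the event is allowed, false if the caller is over the limit.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// Max 2 answer events per player per question, over a 20s window.
const limiter = new SlidingWindowLimiter(2, 20000);
const key = 'ABC123:uid_p1:3'; // gameCode:playerId:qIndex
console.log(limiter.allow(key, 0));     // true
console.log(limiter.allow(key, 100));   // true
console.log(limiter.allow(key, 200));   // false (third event in the window)
console.log(limiter.allow(key, 20050)); // true  (the event at t=0 has expired)
```

In a multi-pod deployment the counts would have to live in Redis rather than pod memory, since a player's events may reach different pods after a reconnect.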

Score Calculation — Server Side Only

Node.js — Scoring Engine
interface ScoringConfig {
  basePoints:      number;  // 1000
  maxTimeBonus:    number;  // 500 — for answering instantly
  streakMultiplier:number;  // 1.5x for 3+ streak
  wrongPenalty:    number;  // 0 (no negative) or -50 for hard mode
}

async function calculateScore(
  gameCode: string, playerId: string,
  qIdx: number,   correctIdx: number,
  config: ScoringConfig
): Promise<number> {

  // Retrieve player's answer (stored with server timestamp)
  const raw = await redis.hget(
    `room:${gameCode}:q:${qIdx}:answers`, playerId
  );
  if (!raw) return 0;  // didn't answer

  const [answerIdx, submittedAt] = raw.split(':');
  if (+answerIdx !== correctIdx) return config.wrongPenalty;

  // Time bonus: full points for instant, zero at deadline
  const room      = await redis.hgetall(`room:${gameCode}`);
  const qDuration = await getQuestionDuration(room.quizId, qIdx);
  const elapsed   = +submittedAt - (+room.questionEndsAt - qDuration);
  const timeFrac  = Math.max(0, 1 - elapsed / qDuration);
  const timeBonus = Math.floor(config.maxTimeBonus * timeFrac);

  // Streak bonus from player metadata
  const streak = await redis.hget(
    `room:${gameCode}:player:${playerId}`, 'streak'
  );
  const mult = +streak >= 3 ? config.streakMultiplier : 1;

  return Math.floor((config.basePoints + timeBonus) * mult);
}
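
To make the formula concrete, here is the same arithmetic as a pure function with worked numbers. `scoreCorrectAnswer` is an illustrative name; it also clamps the time fraction to at most 1, which the function above does not do.

```typescript
interface ScoreInput {
  basePoints: number;
  maxTimeBonus: number;
  streakMultiplier: number;
  elapsedMs: number;   // time from question start to submission
  durationMs: number;  // total question time
  streak: number;      // current correct-answer streak
}

function scoreCorrectAnswer(i: ScoreInput): number {
  // 1 for an instant answer, 0 at the deadline.
  const timeFrac = Math.min(1, Math.max(0, 1 - i.elapsedMs / i.durationMs));
  const timeBonus = Math.floor(i.maxTimeBonus * timeFrac);
  const mult = i.streak >= 3 ? i.streakMultiplier : 1;
  return Math.floor((i.basePoints + timeBonus) * mult);
}

const base = { basePoints: 1000, maxTimeBonus: 500, streakMultiplier: 1.5, durationMs: 20000 };

// Answered 5s into a 20s question: timeFrac 0.75, bonus 375.
console.log(scoreCorrectAnswer({ ...base, elapsedMs: 5000, streak: 0 }));  // 1375
// Same answer on a 3-streak: floor(1375 * 1.5).
console.log(scoreCorrectAnswer({ ...base, elapsedMs: 5000, streak: 3 }));  // 2062
// Answered exactly at the deadline: no time bonus.
console.log(scoreCorrectAnswer({ ...base, elapsedMs: 20000, streak: 0 })); // 1000
```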

100K Player Scaling Strategy

~3K connections per pod
34 pods for 100K players
6 Redis Cluster nodes
3 PostgreSQL nodes
16 Redis shards
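
The pod count follows directly from the per-pod connection budget. A quick check of the arithmetic, using a hypothetical `podsNeeded` helper:

```typescript
function podsNeeded(players: number, connectionsPerPod: number): number {
  return Math.ceil(players / connectionsPerPod);
}

// At the ~3K hard budget per pod:
console.log(podsNeeded(100_000, 3_000)); // 34
// At the 2,800 autoscaling target used in the HPA config in this document:
console.log(podsNeeded(100_000, 2_800)); // 36
```

The autoscaler will therefore tend to settle slightly above 34 pods at full load, which stays well inside a maxReplicas of 50.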

Kubernetes HPA Config

YAML — HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: quizblaze-ws
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: quizblaze-ws
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
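  # Custom per-pod metric: requires a custom metrics adapter
  # (for example Prometheus Adapter) to expose ws_connections_per_pod
  - type: Pods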
    pods:
      metric:
        name: ws_connections_per_pod
      target:
        type: AverageValue
        averageValue: "2800"    # scale before 3K
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Pods
        value: 5             # add 5 pods at once
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300  # slow drain

Socket.io Redis Adapter

Node.js — Multi-Pod Setup
import { createAdapter } from '@socket.io/redis-adapter';
import { createClient }  from 'redis';

// Two Redis clients — one pub, one sub
const pubClient = createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();

await Promise.all([
  pubClient.connect(),
  subClient.connect()
]);

io.adapter(createAdapter(pubClient, subClient));

// Now io.to(roomId).emit() works across ALL pods
// Pod-1 can broadcast to players on Pod-7, Pod-23, etc.

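// Graceful shutdown: let in-flight events finish, then close
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

process.on('SIGTERM', async () => {
  // Give in-flight handlers time to complete. Assumes the pod is already
  // out of the load balancer rotation, so no new connections arrive.
  await sleep(5000);

  // io.close() disconnects every remaining client and stops the server;
  // clients then auto-reconnect to another pod
  io.close(() => process.exit(0));
});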

Capacity Planning

Scale Tier | Concurrent Players | WS Pods | Redis | PostgreSQL | Estimated Cost/mo
Starter | 1K | 2 pods (2vCPU/4GB) | Single node 4GB | Single db.t3.medium | ~$180
Growth | 10K | 5 pods (4vCPU/8GB) | Cluster 3×6GB | Primary + 1 read replica | ~$850
Scale | 100K | 34 pods (4vCPU/8GB) | Cluster 6×16GB | Primary + 2 replicas + RDS Proxy | ~$6,400
Kahoot-level | 10M+ | Multi-region, 500+ pods | Enterprise cluster + ElastiCache | Aurora Global + Citus sharding | ~$80K+

Redis Cluster Sharding Strategy

Redis Cluster Key Design
// Hash tags force related keys to same shard
// All keys for a room route to same Redis node via {gameCode}
room:{ABC123}
room:{ABC123}:players
room:{ABC123}:lb
room:{ABC123}:q:3:answers
room:{ABC123}:player:uid_xyz

// This means all room operations are local — no cross-shard transactions
// ZADD, HSET, HSETNX, INCR all execute on same shard = fast + atomic

// Leaderboard sorted set: O(log N) insertion, O(log N + M) range query for M results
ZADD room:{ABC123}:lb NX 0 playerId        // init score 0
ZINCRBY room:{ABC123}:lb 850 playerId      // add points atomically
ZREVRANGE room:{ABC123}:lb 0 49 WITHSCORES // top 50 in O(log N + 50)

// Throttled leaderboard broadcast — push every 2s, not on every answer
const lbKey = `lb_throttle:${gameCode}`;
const shouldSend = await redis.set(lbKey, 1, 'NX', 'EX', 2);
if (shouldSend) broadcastLeaderboard(gameCode);
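
Hash tags only take effect when the braces are literally part of the key. The handlers earlier in this document build keys as `room:${gameCode}`, which contains no braces, so a key builder that adds them is worth having. The sketch below shows a hypothetical `roomKey` helper plus a local implementation of the slot calculation (CRC16 of the tag, modulo 16384) to check that related keys land in the same slot. It is meant for verification and tests, not as a replacement for the client library's routing.

```typescript
// CRC16, XMODEM variant (polynomial 0x1021, initial value 0).
function crc16(bytes: Uint8Array): number {
  let crc = 0;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i] << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0
        ? ((crc << 1) ^ 0x1021) & 0xffff
        : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// If the key has a non-empty {...} section, only that section is hashed.
function hashSlot(key: string): number {
  let hashed = key;
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) hashed = key.slice(open + 1, close);
  }
  return crc16(new TextEncoder().encode(hashed)) % 16384;
}

// Build room keys with the game code wrapped in a hash tag.
function roomKey(gameCode: string, suffix?: string): string {
  return suffix ? `room:{${gameCode}}:${suffix}` : `room:{${gameCode}}`;
}

console.log(roomKey('ABC123', 'lb')); // room:{ABC123}:lb
console.log(
  hashSlot(roomKey('ABC123')) === hashSlot(roomKey('ABC123', 'q:3:answers'))
); // true
```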

Failover & Reconnect Strategy

🔄 Client Reconnect Flow

!

Disconnect Detected

Socket.io detects disconnect via heartbeat timeout (30s). Server marks player as connected: 0 in Redis. Game continues — no pause.

Exponential Backoff Retry

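Client retries with exponential backoff: 1s → 2s → 4s → 8s → 16s (max). The Socket.io client does this automatically, but its default cap is 5s, so reconnectionDelayMax must be raised to 16s to get this schedule. The new connection can land on any pod.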

🔑

Session Token Handshake

Client sends reconnect:attempt with JWT + gameCode + playerId. Any pod can validate since state is in Redis.

State Snapshot Restore

Server fetches full room state from Redis: current question, time remaining, player's score, leaderboard. Sends as reconnect:state snapshot.
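
The retry schedule in the flow above can be expressed as Socket.IO client options. The option names below (reconnectionDelay, reconnectionDelayMax, randomizationFactor) are real client options; `nominalDelayMs` is a hypothetical helper that only illustrates the doubling schedule and does not reproduce the library's internal jitter.

```typescript
// Options in the shape accepted by the Socket.IO client,
// e.g. io('wss://ws.quizblaze.com', reconnectOptions).
interface ReconnectOptions {
  reconnection: boolean;
  reconnectionDelay: number;    // first retry delay in ms
  reconnectionDelayMax: number; // cap on the delay in ms
  randomizationFactor: number;  // jitter applied by the client
}

const reconnectOptions: ReconnectOptions = {
  reconnection: true,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 16000,
  // Keeping some jitter spreads out the ~3K clients that all lose
  // their connection at once when a pod dies.
  randomizationFactor: 0.5,
};

// Nominal delay before the given retry (0-based), ignoring jitter.
function nominalDelayMs(attempt: number, opts: ReconnectOptions): number {
  const doubled = opts.reconnectionDelay * Math.pow(2, attempt);
  return Math.min(doubled, opts.reconnectionDelayMax);
}

const schedule = [0, 1, 2, 3, 4, 5].map((n) => nominalDelayMs(n, reconnectOptions));
console.log(schedule); // [1000, 2000, 4000, 8000, 16000, 16000]
```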

Pod Failure Recovery

Node.js — Reconnect Handler
// Server: handle reconnection from any pod
socket.on('reconnect:attempt', async (data) => {
  const { gameCode, playerId, sessionToken } = data;

  // Validate session token (Redis blacklist check)
  const valid = await validateSession(sessionToken);
  if (!valid) { socket.emit('error:session'); return; }

  // Fetch current game state from Redis
  const [room, player, lb] = await Promise.all([
    redis.hgetall(`room:${gameCode}`),
    redis.hgetall(`room:${gameCode}:player:${playerId}`),
    redis.zrevrange(`room:${gameCode}:lb`, 0, 49, 'WITHSCORES'),
  ]);

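  // hgetall returns an empty object (not null) for a missing key,
  // so test a required field instead of the object itself
  if (!room.state || room.state === 'ENDED') {
    socket.emit('error:game_ended'); return;
  }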

  // Rejoin Socket.io room (new pod, same logical room)
  await socket.join(gameCode);

  // Update player connection status
  await redis.hset(`room:${gameCode}:player:${playerId}`, {
    connected: 1, socketId: socket.id
  });

  // Build state snapshot for client (clamped so an expired deadline reads as 0)
  const timeRemaining = Math.max(0, +room.questionEndsAt - Date.now());

  socket.emit('reconnect:state', {
    gameState:     room.state,
    currentQ:      +room.currentQ,
    questionEndsAt:+room.questionEndsAt,  // client re-syncs timer
    timeRemaining,                        // ms left, as listed in the event map
    myScore:       +player.score,
    myStreak:      +player.streak,
    leaderboard:   parseLeaderboard(lb),
    serverTime:    Date.now(),         // for clock sync
  });
});

Bull Queue for Timers: Game timers run as Bull jobs in Redis, not in pod memory. If the pod processing the timer crashes, Bull retries on another pod automatically. Zero timer loss.
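
A delayed reveal job can fire after the situation has changed: everyone may have answered already (early reveal), or the host may have extended the deadline. One way to handle this is for the worker to re-check room state before revealing. The sketch below shows that check as a pure function; `decideReveal` and its types are hypothetical and the Bull worker wiring is omitted.

```typescript
interface RoomSnapshot {
  state: string;          // LOBBY | STARTING | ACTIVE | REVEAL | ENDED
  currentQ: number;
  questionEndsAt: number; // epoch ms
}

interface RevealJob {
  gameCode: string;
  qIdx: number;
}

type RevealDecision =
  | { action: 'reveal' }
  | { action: 'skip' }                    // stale job, nothing to do
  | { action: 'retry'; delayMs: number }; // deadline moved, check again later

function decideReveal(room: RoomSnapshot, job: RevealJob, now: number): RevealDecision {
  // The question was already revealed or the game moved on.
  if (room.state !== 'ACTIVE' || room.currentQ !== job.qIdx) {
    return { action: 'skip' };
  }
  // The deadline was pushed out after this job was scheduled.
  const remaining = room.questionEndsAt - now;
  if (remaining > 0) return { action: 'retry', delayMs: remaining };
  return { action: 'reveal' };
}

const job: RevealJob = { gameCode: 'ABC123', qIdx: 3 };
console.log(decideReveal({ state: 'ACTIVE', currentQ: 3, questionEndsAt: 1000 }, job, 1000)); // reveal
console.log(decideReveal({ state: 'REVEAL', currentQ: 3, questionEndsAt: 1000 }, job, 1000)); // skip
console.log(decideReveal({ state: 'ACTIVE', currentQ: 3, questionEndsAt: 11000 }, job, 1000)); // retry in 10000ms
```

This check also makes a retried job harmless: a second run of the same job sees the REVEAL state and skips.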

Failure Scenarios & Mitigations

Failure | Impact | Detection | Mitigation | RTO
WS Pod crashes | ~3K players disconnect | K8s liveness probe (5s) | Clients auto-reconnect to other pods. State in Redis. New pod spins up in 30s. | ~5s reconnect
Redis primary fails | Operations on that shard stall | Redis Cluster failure detection | Cluster promotes the shard's replica in <30s. Clients see brief error, retry succeeds. | ~30s
PostgreSQL failure | Auth + persistence fail | Healthcheck + RDS Multi-AZ | RDS Multi-AZ automatic failover. Read replicas absorb reads. | ~60s
Load balancer down | All traffic fails | Cloud provider monitor | HAProxy in active-passive pair. DNS failover to secondary. | ~60s
Network partition | Split-brain risk | Redis Cluster voting | Failover needs a majority of primaries (2 of 3 in this layout). Primaries on the minority side stop accepting writes. | Auto
Memory leak in pod | Degraded performance | Prometheus OOM alert | K8s restarts pod (memory limit: 8GB). Graceful drain first via SIGTERM handler. | ~60s

PostgreSQL Schema

SQL — Core Tables
-- gen_random_uuid() is built in from PostgreSQL 13; older versions need the pgcrypto extension
-- Users
CREATE TABLE users (
  id            UUID         PRIMARY KEY DEFAULT gen_random_uuid(),
  username      VARCHAR(32)  UNIQUE NOT NULL,
  email         VARCHAR(255) UNIQUE NOT NULL,
  password_hash TEXT         NOT NULL,
  avatar_id     SMALLINT     DEFAULT 0,
  xp            INT          DEFAULT 0,
  level         SMALLINT     DEFAULT 1,
  streak        SMALLINT     DEFAULT 0,
  streak_last_at TIMESTAMPTZ,
  badges        JSONB        DEFAULT '[]',
  settings      JSONB        DEFAULT '{}',
  created_at    TIMESTAMPTZ  DEFAULT NOW(),
  last_seen_at  TIMESTAMPTZ  DEFAULT NOW()
);
CREATE INDEX idx_users_xp ON users (xp DESC);

-- Quizzes
CREATE TABLE quizzes (
  id          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  creator_id  UUID        REFERENCES users(id) ON DELETE CASCADE,
  title       VARCHAR(120) NOT NULL,
  description TEXT,
  category    VARCHAR(40),
  difficulty  SMALLINT    CHECK (difficulty BETWEEN 1 AND 3),
  is_public   BOOLEAN     DEFAULT TRUE,
  play_count  INT         DEFAULT 0,
  avg_score   NUMERIC(5,2),
  created_at  TIMESTAMPTZ DEFAULT NOW()
);

-- Questions (one row per question; answer options stored as a JSONB array)
CREATE TABLE questions (
  id          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  quiz_id     UUID        REFERENCES quizzes(id) ON DELETE CASCADE,
  position    SMALLINT    NOT NULL,
  question    TEXT        NOT NULL,
  options     JSONB       NOT NULL,  -- ["opt0","opt1","opt2","opt3"]
  correct_idx SMALLINT    NOT NULL,  -- NEVER exposed via API pre-reveal
  explanation TEXT,
  time_ms     INT         DEFAULT 20000,
  points      SMALLINT    DEFAULT 1000
);

-- Game sessions (persistent record)
CREATE TABLE game_sessions (
  id           UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  game_code    CHAR(6)     NOT NULL,
  quiz_id      UUID        REFERENCES quizzes(id),
  host_id      UUID        REFERENCES users(id),
  player_count SMALLINT,
  final_state  JSONB,       -- full leaderboard snapshot
  started_at   TIMESTAMPTZ,
  ended_at     TIMESTAMPTZ,
  duration_ms  INT
);
CREATE INDEX idx_sessions_quiz ON game_sessions(quiz_id, started_at DESC);

-- Player game results
CREATE TABLE player_results (
  id           UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id   UUID        REFERENCES game_sessions(id) ON DELETE CASCADE,
  player_id    UUID        REFERENCES users(id),
  rank         SMALLINT,
  score        INT,
  correct_count SMALLINT,
  best_streak  SMALLINT,
  xp_earned    INT,
  answers      JSONB,       -- [{qIdx, answerIdx, correct, pts, ms}]
  created_at   TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_results_player ON player_results(player_id, created_at DESC);
CREATE INDEX idx_results_session ON player_results(session_id, rank);

Latency Optimisation

📦

Binary Message Packing

Replace JSON with msgpack binary serialization. Answer events drop from ~180 bytes to ~40 bytes. Leaderboard broadcasts shrink 60%. Critical at 100K concurrent senders.

~60% payload reduction
🗜️

Selective Broadcasting

Leaderboard updates throttled to 1 per 2s using Redis NX lock. Answer progress (distribution bars) throttled separately. Full snapshot only on question reveal. Reduces Redis pub/sub pressure 10×.

10× fewer broadcasts

Redis Pipeline Batching

Answer processing pipelines 4 Redis commands (HSETNX, INCR, INCR, HGET) into a single round-trip. Scoring pipelines 6 commands. Reduces Redis RTT overhead by 80% at peak load.

5× Redis throughput
🌍

GeoDNS + Edge Nodes

Route53 latency-based routing directs players to nearest region: US-East, EU-West, AP-Southeast. Average RTT drops from 180ms to 35ms for EU players. Edge terminates TLS.

~145ms saved (EU)
🔌

HTTP/3 + WebTransport

Initial connection uses HTTP/3 QUIC for 0-RTT reconnects after network change (WiFi → cellular). WebTransport as long-term WS replacement — no HoL blocking, multiplexed streams.

0-RTT reconnect
💾

Quiz Preloading

Full quiz loaded from PostgreSQL into Redis on room creation. During gameplay, every question read hits Redis only (~0.3ms). No DB queries during active gameplay. Questions invalidated after game ends.

Zero DB reads mid-game

CDN Strategy for Static Assets

CloudFront + S3 Config
// next.config.js — CDN asset configuration
const nextConfig = {
  assetPrefix: process.env.CDN_URL,  // https://cdn.quizblaze.com
  images: {
    domains: ['cdn.quizblaze.com'],
    formats: ['image/avif', 'image/webp'],
  },

  // Aggressive caching — JS/CSS chunks are content-hashed
  async headers() {
    return [{
      source: '/_next/static/:path*',
      headers: [{
        key: 'Cache-Control',
        value: 'public, max-age=31536000, immutable'  // 1 year
      }]
    }];
  }
};

module.exports = nextConfig;

// CloudFront cache behavior for game assets
// TTL: 1yr for hashed assets, 0 for API routes, 5min for HTML
// Compress: gzip + brotli enabled
// HTTP/3: enabled on all distributions
// Origin shield: us-east-1 (reduces origin hits 90%)
// Price class: PriceClass_All (all edge locations)

// WS connections bypass CDN and go direct to LB via separate subdomain
// wss://ws.quizblaze.com → HAProxy → WS pods
// https://quizblaze.com  → CloudFront → Next.js → API
⚠️
Critical: Keep WebSocket traffic off the CDN path. CloudFront can proxy WebSocket connections, but game sockets gain nothing from edge caching and pick up an extra hop. Use a separate subdomain (e.g. wss://ws.quizblaze.com) pointing directly to HAProxy, so that TLS for sockets terminates at the load balancer, not the CDN.