GitHub user hanishi added a comment to the discussion: Pekko Ad Network (promovolve)

I should be clear about the role the LLM played here. The pacing logic was not 
designed by the model. What it provided was acceleration: a way to iterate 
through reasoning faster, surface edge cases earlier, and pressure-test 
assumptions, particularly around control-theoretic failure modes that would 
otherwise have taken much longer to uncover.

The final system is entirely the product of human understanding and 
mathematical invariants. Every decision, constraint, and trade-off was derived, 
validated, and owned by me. The LLM shortened the path to insight; it did not 
replace the reasoning required to get there. Below are notes produced through 
iterative discussion between the LLM and myself.

# Budget Pacing

Budget pacing spreads ad spend evenly throughout the day so the budget is not 
exhausted early, keeping campaigns from going dark in the afternoon after 
burning through their budget in the morning.

## Overview

Pacing is **always enforced** at the AdServer level using a pluggable 
`PacingStrategy`. The default strategy is `RateAwarePacing`, which uses PI 
(Proportional-Integral) control to dynamically adjust throttling based on:

- **Spend ratio**: actual spend vs expected spend (traffic-shaped or linear)
- **Request arrival rate**: observed requests/sec for accurate throttle 
calculation
- **Traffic shape**: learned or configured hourly traffic patterns

## Architecture

Pacing is modular and configurable:

```
promovolve.publisher.delivery/
  PacingStrategy.scala      -- trait with shouldServe() method + PacingContext
  AdaptivePacing.scala      -- RateAwarePacing factory with PI control
  TrafficShapeTracker.scala -- learns/stores traffic patterns per time bucket

promovolve.publisher.delivery.pacing/
  DayClock.scala            -- real vs simulated day handling (UTC-based)
  TrafficObserver.scala     -- EMA-smoothed request rate tracking
  PacingController.scala    -- coordinates pacing state and day boundaries
```

### Flow Diagram

**IMPORTANT**: The pacing decision (`shouldServe`) happens BEFORE Thompson 
Sampling selection. This prevents exploration bias, where TS would pick an 
exploration arm that pacing then filters out.

```
  AdServer                          CampaignEntity
     │                                    │
     │  ┌─────────────────────┐           │
     │  │ PacingStrategy      │           │
     │  │ .shouldServe()      │           │
     │  └──────────┬──────────┘           │
     │             │                      │
     │      [if shouldServe=false]        │
     │         → return NoSelection       │
     │         → do NOT run TS            │
     │                                    │
     │      [if shouldServe=true]         │
     │             │                      │
     │  ┌─────────────────┐               │
     │  │ Thompson Sample │               │
     │  │ (pick winner)   │               │
     │  └────────┬────────┘               │
     │           │                        │
     │── TryReserve ─────────────────────▶│
     │◀── Reserved / InsufficientBudget ──│
     │                                    │
     ▼                                    ▼
```

**Key design decisions:**
- Pacing is always enforced (no opt-out)
- `shouldServe()` is called BEFORE Thompson Sampling (critical for correct TS 
learning; see the sketch after this list)
- Strategies use `PacingContext` for all state (spend, budget, time, traffic 
shape)
- AdServer orchestrates pacing (not CampaignEntity)
- Easy to swap strategies without code changes
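
To make the ordering concrete, here is a minimal sketch of the gate-then-sample flow. It assumes the `PacingStrategy`/`PacingContext` types shown later in this doc; `ServeResult`, `Selected`, and the `thompsonSelect` parameter are illustrative stand-ins, not the actual AdServer API.

```scala
// Sketch of the gate-before-sampling ordering; result types and the
// thompsonSelect parameter are illustrative, not the real promovolve API.
sealed trait ServeResult
case object NoSelection extends ServeResult
final case class Selected(creativeId: String) extends ServeResult

def serve(
    pacing: PacingStrategy,
    ctx: PacingContext,
    candidates: Seq[String],
    thompsonSelect: Seq[String] => Option[String]
): ServeResult =
  if (!pacing.shouldServe(ctx)) NoSelection // gated out: TS never sees this request
  else thompsonSelect(candidates).fold[ServeResult](NoSelection)(Selected.apply)
```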

## RateAwarePacing (Default Strategy)

PI-controlled pacing that directly adjusts throttle probability based on spend 
error.

### How It Works

1. **Base throttle**: Calculate what throttle would achieve perfect pace
   ```
   baseThrottle = 1 - (targetImpsPerSec / requestRate)
   ```

2. **Error calculation**: Positive when under-paced, negative when over-paced
   ```
   error = 1.0 - spendRatio
   ```

3. **PI adjustment**: Directly added/subtracted from throttle
   ```
   adjustment = Kp × error + Ki × integralError
   finalThrottle = baseThrottle - adjustment
   ```

4. **Traffic shape multiplier**: Scale the target based on expected traffic volume (the combined calculation is sketched below)
   - During peak hours (multiplier > 1): a higher target allows more impressions
   - During valley hours (multiplier < 1): a lower target prevents overspend
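
Combining the four steps, a minimal sketch of the throttle computation. The gain defaults follow the table below; the parameter names, the division guard, and the final clamp to [0, 1] are assumptions about `RateAwarePacing`, not its actual code.

```scala
// Sketch of steps 1-4; parameter names, the division guard, and the [0,1] clamp are assumptions.
def computeThrottle(
    spendRatio: Double,           // actual / expected spend
    baseTargetImpsPerSec: Double, // budget-derived target before shaping
    shapeMultiplier: Double,      // expected traffic volume for the current bucket
    requestRate: Double,          // observed requests/sec from TrafficObserver
    integralError: Double,        // accumulated error from previous evaluations
    kp: Double = 0.5,
    ki: Double = 0.3
): Double = {
  val targetImpsPerSec = baseTargetImpsPerSec * shapeMultiplier                 // step 4
  val baseThrottle     = 1.0 - targetImpsPerSec / math.max(requestRate, 1e-9)  // step 1
  val error            = 1.0 - spendRatio                                      // step 2
  val adjustment       = kp * error + ki * integralError                       // step 3
  math.min(1.0, math.max(0.0, baseThrottle - adjustment))                      // clamp (assumed)
}
```

For example, at `spendRatio = 1.2` with zero integral error, `error = -0.2` and `adjustment = -0.1`, raising the throttle by 10 percentage points (before the asymmetric-gain scaling described below).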

### PI Control Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `Kp` | 0.5 | Proportional gain - immediate response to error |
| `Ki` | 0.3 | Integral gain - accumulated error correction |
| `feedforwardWindow` | 0.0-0.2 | Proactive adjustment near bucket transitions |
| `gracePeriodFraction` | 0.01 | Startup period (1% of day) with no PI adjustment |

### Asymmetric Gains

The controller uses **asymmetric gains** to recover from overspend faster than 
it accelerates during underspend:

- **Over-pacing** (spendRatio > 1.0): gains multiplied by 2.0x
- **Under-pacing** (spendRatio ≤ 1.0): gains unchanged

This is because overspend is costly (budget exhaustion stops delivery), while 
underspend is recoverable (can catch up later).
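
As a sketch, the asymmetry can be expressed as a scaling of both gains before the PI step; the 2.0x factor is the documented value, but where exactly `AdaptivePacing` applies it is an assumption.

```scala
// Sketch: double both gains when over-pacing so overspend is corrected faster.
// The 2.0x factor matches the docs above; the placement is an assumption.
def effectiveGains(spendRatio: Double, kp: Double, ki: Double): (Double, Double) = {
  val multiplier = if (spendRatio > 1.0) 2.0 else 1.0
  (kp * multiplier, ki * multiplier)
}
```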

### Volatility-Adjusted Gains

When created via `AdaptivePacing.forShape()`, PI gains are automatically tuned 
to the volatility of the traffic shape (see the selection sketch after the table):

| Volatility (CV) | Kp | Ki | Feedforward | Use Case |
|-----------------|-----|-----|-------------|----------|
| 0.0 (uniform) | 0.3 | 0.2 | 20% | Flat traffic |
| 0.5 (typical) | 0.5 | 0.3 | 10% | Normal daily pattern |
| 1.0 (high) | 0.8 | 0.5 | 0% | Drastic peaks/valleys |
| 1.5+ (extreme) | 1.0 | 0.6 | 0% | Very spiky traffic |
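
A hedged sketch of how `forShape()` might select gains from this table; only the table rows come from the docs, while the CV thresholds and the absence of interpolation are assumptions.

```scala
// Assumed threshold-based selection mirroring the table rows above;
// the real forShape() may interpolate or use different cut-offs.
final case class PiGains(kp: Double, ki: Double, feedforwardWindow: Double)

def gainsForVolatility(cv: Double): PiGains =
  if (cv >= 1.5) PiGains(1.0, 0.6, 0.0)      // extreme: very spiky traffic
  else if (cv >= 1.0) PiGains(0.8, 0.5, 0.0) // high: drastic peaks/valleys
  else if (cv >= 0.5) PiGains(0.5, 0.3, 0.1) // typical daily pattern
  else PiGains(0.3, 0.2, 0.2)                // near-uniform traffic
```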

### Grace Period (Hybrid: Time + Request Count + Staleness)

The grace period uses a **hybrid** condition requiring BOTH time AND request 
count, plus staleness detection:

```scala
// In AdaptivePacing.throttleProbability()
val initialGraceComplete =
  ctx.elapsedSeconds >= MinGraceSeconds &&    // 10 seconds
  ctx.requestCount >= MinGraceRequests        // 50 requests

// Staleness threshold scales with day duration
val scaledStaleThresholdMs = math.max(
  MinStaleRateThresholdMs,                    // 1000ms minimum
  BaseStaleRateThresholdMs * dayDurationSeconds / 86400
)
val rateIsStale = ctx.msSinceLastRequest > scaledStaleThresholdMs

val inGracePeriod = !initialGraceComplete || rateIsStale
```

**Key constants** (in `AdaptivePacing` object):
```scala
val MinGraceSeconds: Double = 10.0            // Time before PI activates
val MinGraceRequests: Long = 50L              // Requests before PI activates
val BaseStaleRateThresholdMs: Long = 30000L   // Base staleness for real days
val MinStaleRateThresholdMs: Long = 1000L     // Minimum staleness threshold
```

**Scaled staleness examples**:
| Day Duration | Scaled Threshold | Effective |
|--------------|------------------|-----------|
| 86400s (real) | 30000ms | 30 seconds |
| 3600s (1 hour) | 1250ms | 1.25 seconds |
| 600s (10 min) | 208ms → 1000ms | 1 second (clamped to min) |

**During grace period**:
- Uses base throttle only (no PI adjustment)
- Does NOT accumulate integral error
- Prevents step-changes when grace ends

**CRITICAL**: If `requestCount` or `msSinceLastRequest` are not properly passed 
through behavior calls, grace period may never complete, causing pacing to fail 
silently.

**Staleness re-entry**: After a traffic gap longer than the staleness threshold 
(30+ seconds for a real day, scaled down for shorter simulated days), the rate 
data is considered stale and the grace period re-activates. This prevents 
acting on outdated rate estimates.

## Traffic Shape Tracking

`TrafficShapeTracker` learns or stores traffic patterns for traffic-aware 
pacing.

### Problem: Linear Pacing

Linear pacing assumes uniform traffic:
```
Budget: $30/day

Linear Target:
Hour:    0    6    12   18   24
Target:  $0  $7.5  $15  $22.5 $30  (straight line)
```

But real traffic has shape:
```
Traffic Volume:
         ▁▁▂▃▅▆███▇▆▅▄▃▂▁▁
         night  peak  evening
```

This causes over-throttling during peaks and impossible targets during valleys.

### Solution: Traffic-Shaped Targeting

The cumulative distribution function (CDF) of traffic becomes the expected 
spend curve:
```
Traffic-Shaped Target:
Hour:    0    6    12   18   24
Target:  $1   $5   $18   $26  $30
           ╱      ╱╲
          ╱      ╱  ╲
         ╱──────╱    ╲────────
         └─ matches traffic shape ─┘
```
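
A minimal sketch of how the CDF could be derived from the 24 hourly volumes; the function name and the linear fallback for an empty shape are assumptions, not `TrafficShapeTracker`'s actual API.

```scala
// Sketch: normalize hourly volumes, take the running sum (CDF), and interpolate
// within the current bucket. Falls back to linear pacing if the shape is empty.
def expectedSpendFraction(hourlyVolumes: Array[Double], elapsedDayFraction: Double): Double = {
  val total = hourlyVolumes.sum
  if (total <= 0.0) elapsedDayFraction
  else {
    val pos     = math.min(elapsedDayFraction * 24.0, 24.0)
    val full    = pos.toInt                 // fully elapsed hourly buckets
    val partial = pos - full                // fraction of the current bucket
    val spent   = hourlyVolumes.take(full).sum +
      (if (full < 24) hourlyVolumes(full) * partial else 0.0)
    spent / total
  }
}
```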

### Bucket-to-Time Mapping

The `trafficShape` array (24 values) maps to time **relative to `dayStart`**:

| Index | Time from dayStart | Example (dayStart = midnight UTC) |
|-------|-------------------|-----------------------------------|
| 0     | +0h to +1h        | 00:00 - 01:00 UTC                 |
| 6     | +6h to +7h        | 06:00 - 07:00 UTC                 |
| 12    | +12h to +13h      | 12:00 - 13:00 UTC                 |
| 22    | +22h to +23h      | 22:00 - 23:00 UTC (typical peak)  |
| 23    | +23h to +24h      | 23:00 - 00:00 UTC                 |

### Weekday vs Weekend Shapes

Two separate shapes can be configured:
- `weekdayShapeVolumes`: 24 values for Monday-Friday
- `weekendShapeVolumes`: 24 values for Saturday-Sunday

The system automatically selects the appropriate shape based on the current day 
(UTC).
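
A sketch of the per-day selection, assuming Saturday/Sunday in UTC counts as the weekend; the helper name is illustrative.

```scala
import java.time.{DayOfWeek, Instant, ZoneOffset}

// Assumed selection rule: Saturday/Sunday (UTC) use the weekend shape.
def shapeFor(now: Instant, weekday: Array[Double], weekend: Array[Double]): Array[Double] = {
  val dow = now.atZone(ZoneOffset.UTC).getDayOfWeek
  if (dow == DayOfWeek.SATURDAY || dow == DayOfWeek.SUNDAY) weekend else weekday
}
```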

## DayClock

Handles real vs simulated day timing with **consistent UTC timezone**.

### RealDayClock (dayDurationSeconds = 86400)

- `dayStart` is UTC midnight today
- `elapsedSeconds` = time since midnight UTC
- Traffic shape bucket aligns with UTC wall clock hour

Example at 14:00 UTC:
```
elapsedSeconds = 50400  (14 hours)
bucket = 14
```

### SimulatedDayClock (dayDurationSeconds < 86400)

- `dayStart` is when simulation started (or last day rollover)
- `elapsedSeconds` starts from 0
- Elapsed time is **scaled** to 86400 for traffic shape lookup

Example with `dayDurationSeconds = 600` (10-minute day):
```
After 25 seconds (1/24 of 600):
scaledElapsed = 25 × (86400 / 600) = 3600
bucket = 1
```
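
A sketch of the scaling and bucket lookup used in the example above; the helper name and the clamp to bucket 23 are assumptions.

```scala
// Sketch: scale simulated elapsed time to a full 86400-second day,
// then index into the 24-bucket traffic shape.
def trafficBucket(elapsedSeconds: Double, dayDurationSeconds: Int): Int = {
  val scaledElapsed = elapsedSeconds * (86400.0 / dayDurationSeconds)
  math.min(23, (scaledElapsed / 3600.0).toInt)  // clamp at the last bucket
}

// Worked example from above: trafficBucket(25.0, 600) == 1
// Real-day check:            trafficBucket(50400.0, 86400) == 14
```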

### dayDurationSeconds Validation

**`dayDurationSeconds` must not exceed 86400** (24 hours).

- The server rejects values > 86400 with the error: `"dayDurationSeconds cannot exceed 86400 (24 hours)"`
- The client (RunScenario) exits with the same error

This ensures the traffic shape (24 buckets) maps correctly to time.

## TrafficObserver

Tracks request arrival rate using exponential moving average (EMA).

```scala
// Update on each request
observer.recordRequest(now)

// Get smoothed rate
val reqPerSec = observer.smoothedRate  // e.g., 150.3
```

The smoothed rate is used by `RateAwarePacing` to calculate base throttle:
```
baseThrottle = 1 - (targetRate / observedRate)
```

Without rate tracking, high-traffic scenarios would cause severe throttle 
oscillation.
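
A minimal sketch of the EMA update; the smoothing factor, the inter-arrival-based instantaneous rate, and the class name are assumptions rather than `TrafficObserver`'s actual implementation.

```scala
import java.time.Instant

// Sketch of EMA-smoothed requests/sec; alpha = 0.1 is an assumed smoothing factor.
final class EmaRateObserver(alpha: Double = 0.1) {
  private var lastRequest: Option[Instant] = None
  private var rate: Double = 0.0

  def recordRequest(now: Instant): Unit = {
    lastRequest.foreach { prev =>
      val gapSec      = math.max((now.toEpochMilli - prev.toEpochMilli) / 1000.0, 1e-3)
      val instantRate = 1.0 / gapSec                     // one request per observed gap
      rate = alpha * instantRate + (1.0 - alpha) * rate  // exponential moving average
    }
    lastRequest = Some(now)
  }

  def smoothedRate: Double = rate  // e.g., 150.3 requests/sec under steady load
}
```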

## PacingController

Coordinates pacing state across components:

- Tracks `dayStart` for elapsed time calculation
- Detects day boundaries (UTC midnight for real days)
- Manages pacing strategy lifecycle (reset at day boundary)
- Stores/restores traffic shape snapshots

### Day Boundary Detection

```scala
def hasNewDayStarted(newDayStart: Instant): Boolean = {
  val lastDay = LocalDate.ofInstant(lastDayStart, ZoneOffset.UTC)
  val newDay = LocalDate.ofInstant(newDayStart, ZoneOffset.UTC)
  lastDay != newDay
}
```

**All day boundary logic uses UTC** for consistency across server and client.

## UTC Timezone Requirement

**The entire pacing system uses UTC consistently:**

| Component | UTC Usage |
|-----------|-----------|
| `DayClock` | `utcMidnightToday()` for real days |
| `AdServer` | Day boundary detection via `ZoneOffset.UTC` |
| `PacingController` | Day comparison via `ZoneOffset.UTC` |
| `RunScenario` (client) | `LocalTime.now(ZoneOffset.UTC).getHour` for bucket |

This ensures client and server agree on which traffic shape bucket to use.

## PacingContext

Immutable snapshot passed to strategies:

```scala
final case class PacingContext(
    dailyBudget: BigDecimal,
    todaySpend: BigDecimal,
    dayStart: Instant,
    now: Instant,
    requestArrivalRate: Double = 0.0,      // From TrafficObserver
    competingCampaigns: Int = 1,
    avgCpm: Double = 5.0,
    dayDurationSeconds: Int = 86400,       // Must be <= 86400
    trafficShape: Option[TrafficShapeTracker] = None
) {
  def elapsedHours: Double
  def expectedSpendFraction: Double        // Traffic-shaped or linear
  def expectedSpend: BigDecimal
  def spendRatio: Double                   // actual / expected
  def remainingBudget: BigDecimal
  def remainingHours: Double
}
```
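
The derived members follow from the snapshot fields. Below is a sketch of the central ones; the division guard and the linear fallback are assumptions.

```scala
import java.time.Instant

// Sketch of the derived values; the traffic-shaped branch would delegate to
// TrafficShapeTracker, otherwise pacing expects spend to grow linearly.
def elapsedHours(dayStart: Instant, now: Instant): Double =
  (now.toEpochMilli - dayStart.toEpochMilli) / 3600000.0

def linearExpectedSpendFraction(elapsedSeconds: Double, dayDurationSeconds: Int): Double =
  math.min(1.0, elapsedSeconds / dayDurationSeconds)

def spendRatio(todaySpend: BigDecimal, expectedSpend: BigDecimal): Double =
  if (expectedSpend <= 0) 1.0   // assumed guard: no expectation yet -> treat as on-pace
  else (todaySpend / expectedSpend).toDouble
```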

## Configuration

### Site Pacing Config

```bash
# Set pacing config for a site
curl -X PUT http://localhost:8080/v1/publishers/pub-1/sites/site-123/pacing \
  -H "Content-Type: application/json" \
  -d '{
    "dayDurationSeconds": 600,
    "weekdayShapeVolumes": 
[0.3,0.2,1.0,0.2,0.0,0.3,0.5,2.5,0.1,2.0,1.5,2.0,2.5,3.0,2.5,2.0,1.5,1.2,1.2,1.0,0.4,2.0,5.0,0.4],
    "weekendShapeVolumes": 
[0.3,0.2,0.1,0.1,0.1,0.2,0.3,0.5,0.8,1.2,1.5,1.8,2.0,2.2,2.3,2.2,2.0,1.8,2.0,2.5,2.8,2.0,1.2,0.5]
  }'
```

### Test Throttle Override

For testing, you can force a fixed throttle probability:

```json
{
  "testThrottleOverride": 0.5
}
```

This bypasses PI control and uses `FixedThrottlePacing(0.5)`.
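
For reference, a sketch of what a fixed-throttle strategy looks like against the `PacingStrategy` trait shown later in this doc; the actual `FixedThrottlePacing` may differ in detail.

```scala
import scala.util.Random

// Sketch: serve with probability (1 - throttle), ignoring all pacing context.
final class FixedThrottlePacingSketch(throttle: Double) extends PacingStrategy {
  def shouldServe(ctx: PacingContext): Boolean = Random.nextDouble() >= throttle
  def throttleProbability(ctx: PacingContext): Double = throttle
  def reset(): Unit = ()                      // no per-day state to clear
  def name: String = s"fixed-throttle-$throttle"
}
```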

### Changing the Default Strategy

To use a different strategy, modify `AdServer.apply()`:

```scala
AdServer(
  publisherId,
  // ... other params
  pacingStrategy = AdaptivePacing.forShapeVolumes(myShapeArray),
  // or: pacingStrategy = FixedThrottlePacing(0.3),
)
```

## Observing Pacing

### Server-side Stats

```bash
curl http://localhost:8080/test/site-stats/site-123
```

Response:
```json
{
  "siteId": "site-123",
  "selected": 58,
  "pacingSkipped": 42,
  "budgetExhausted": 0,
  "noCandidates": 0,
  "totalSpend": 0.29,
  "elapsedHours": 0.5,
  "expectedSpendFraction": 0.5
}
```

### RunScenario Reports

When running `RunScenario.scala` with `--continuous`, periodic reports include:

```
  ─── Report @ 100 requests (45.2s elapsed) ───

    Requests:         100  (2/sec)
    Selected:          58 (58.0%)
    Pacing skip:       42

    Pacing status:
      Spend ratio:   1.02x → (stable)
      Spend rate:    $0.0064/sec (target: $0.0067/sec)
      Rate status:   ON PACE
```

## Outcome Types

| Outcome | Description |
|---------|-------------|
| `selected` | Ad was served successfully |
| `pacingSkipped` | Rejected by `shouldServe()` to control spend rate |
| `budgetExhausted` | Campaign has no remaining budget |
| `noCandidates` | No eligible ads for this request |

## Testing

Run a simulation with budget constraints to observe pacing:

```bash
# 1. Start the server
sbt "api/run"

# 2. Run pacing test with traffic shapes (10-minute simulated day)
scala-cli scripts/RunScenario.scala -- \
  --scenario scenarios/continuous.json \
  --continuous

# 3. Or run with real-day timing (aligns with UTC wall clock)
# Edit continuous.json: "dayDurationSeconds": 86400
```

The periodic reports will show:
- Spend ratio converging to 1.0x
- Pacing skip rate adjusting to maintain pace
- Traffic shape bucket changes (for short days)

## Custom Strategies

Implement the `PacingStrategy` trait:

```scala
trait PacingStrategy {
  /** Called BEFORE Thompson Sampling - return true to serve, false to skip */
  def shouldServe(ctx: PacingContext): Boolean

  /** Calculate throttle probability [0.0, 1.0] */
  def throttleProbability(ctx: PacingContext): Double

  /** Reset state at day boundary */
  def reset(): Unit

  /** Strategy name for logging */
  def name: String
}
```

Example custom strategy:
```scala
class TimeOfDayPacing extends PacingStrategy {
  // Serve with probability (1 - throttle); pacing skips the rest
  def shouldServe(ctx: PacingContext): Boolean =
    scala.util.Random.nextDouble() >= throttleProbability(ctx)
  def throttleProbability(ctx: PacingContext): Double = {
    val hour = (ctx.elapsedHours % 24).toInt
    if (hour >= 9 && hour <= 17) 0.0  // No throttle during business hours
    else 0.8                          // Heavy throttle off-hours
  }
  def reset(): Unit = ()              // No per-day state to clear
  def name = "time-of-day"
}
```

## Debugging Pacing Issues

### Issue: 0% Pacing Despite Overspend

**Symptoms**: `pacingSkipped = 0` even with 2x-4x overspend ratio

**Root cause**: Usually the grace period never completes.

**Check 1: Behavior parameter propagation**

The `behavior()` function in `AdServer.scala` has many parameters with default 
values:

```scala
def behavior(
  ...
  requestCount: Long = 0L,        // Default: 0
  lastRequestTimeMs: Long = 0L    // Default: 0
)
```

**CRITICAL BUG PATTERN**: If a `behavior()` call doesn't pass these parameters, 
they silently reset to 0:

```scala
// BAD - missing requestCount and lastRequestTimeMs
behavior(
  cachedDomainBlocklist,
  creativeStats,
  serveStats,
  lastDayStart,
  pacingStrategy,
  smoothedReqRate,
  pendingSpendByCampaign,
  dayDurationSeconds,
  spendInfoCache,
  trafficShapeTracker,
  rolloverGraceUntilMs,
  warmupMode
  // MISSING: requestCount, lastRequestTimeMs → both become 0!
)

// GOOD - all parameters passed
behavior(
  cachedDomainBlocklist,
  creativeStats,
  serveStats,
  lastDayStart,
  pacingStrategy,
  smoothedReqRate,
  pendingSpendByCampaign,
  dayDurationSeconds,
  spendInfoCache,
  trafficShapeTracker,
  rolloverGraceUntilMs,
  warmupMode,
  requestCount,       // Preserve state
  lastRequestTimeMs   // Preserve state
)
```

With `requestCount = 0`, the grace period condition `requestCount >= 50` is 
never met, so pacing uses only `baseThrottle` (which is 0 when request rate is 
low).

**How to find this bug**: Search for all `behavior(` calls in `AdServer.scala` 
and verify each one passes all 14 parameters.

**Check 2: SpendInfo cache empty**

If `spendInfoCache` is always empty, the pacing gate goes through the 
`SpendInfoFetched` path. Check the logs for:
```
PACING GATE: Cache empty, fetching spend info from N campaigns
```

If this appears on every request, spend updates aren't being cached.

**Check 3: Stale SpendUpdate filtering**

For simulated days, SpendUpdates are filtered if their `dayStart` is too old:
```scala
val isStale = dayDurationSeconds != 86400 &&
  lastDayStart.exists(currentDayStart =>
    su.dayStart.toEpochMilli < currentDayStart.toEpochMilli - 5000
  )
```

Check logs for: `Ignoring stale SpendUpdate`

### Issue: Pacing Too Aggressive

**Symptoms**: Very few impressions served, high pacingSkipped

**Possible causes**:
1. PI gains too high for traffic pattern
2. Traffic shape mismatch
3. Rate estimate too high

### Key Log Messages

Enable debug logging and look for:

```
// Pacing decisions
PACING GATE: Cache empty, fetching spend info from N campaigns
PACING GATE: Request throttled (aggregateThrottle=X%)
PACING GATE: Request passes (aggregateThrottle=X%)

// Day boundary
Day rollover detected, resetting pacing for new day

// Grace period
Grace period ended: fresh SpendUpdate received

// SpendUpdate handling
Ignoring stale SpendUpdate: campaign=X updateDayStart=Y currentDayStart=Z
SpendUpdate received: campaign=X spend=Y budget=Z dayStart=W
```



GitHub link: https://github.com/apache/pekko/discussions/2608#discussioncomment-15526511
