+1 -Lari
On 2026/04/24 16:46:56 PengHui Li wrote:
> Hi all,
>
> I'd like to start a discussion on PIP-470, which proposes a broker option
> to close (unload) inactive topics from broker memory without deleting their
> data.
>
> PIP: https://github.com/apache/pulsar/pull/25574/files#diff-pip-470
> Prototype PR: https://github.com/apache/pulsar/pull/25574
>
> Problem
>
> Deployments with very large numbers of mostly-idle topics (commonly tens of
> thousands to millions, with a long tail of low-traffic topics) face two
> recurring problems:
>
> 1. Broker memory pressure. Every loaded topic pins a managed ledger with
>    its cache, subscription/cursor state, rate limiters, dispatchers, and
>    schema references. An idle topic that hasn't been produced/consumed for
>    hours still occupies all of that memory.
> 2. Metrics cardinality. Per-topic metric series grow linearly with the
>    number of loaded topics, inflating scrape payloads and monitoring cost.
>
> Today the only built-in remedy is brokerDeleteInactiveTopicsEnabled, but
> that deletes the data — which many operators explicitly do not want. Their
> remaining options are:
>
> - Leave every idle topic loaded and pay the memory/metrics cost, or
> - Run an external cron that polls topics stats and calls pulsar-admin
>   topics unload per topic — awkward, reimplements the existing inactivity
>   detection, and adds a moving part to the deployment.
>
> Proposal
>
> Add a new dynamic broker configuration:
>
>     brokerCloseInactiveTopicsEnabled = false # default
>
> When enabled, the existing inactivity monitor reuses its current detection
> (mode, frequency, max-inactive-duration) but performs a close — the same
> code path as pulsar-admin topics unload — instead of a delete. Ledgers in
> BookKeeper, subscriptions, cursors, and topic policies are all preserved;
> only the in-memory topic and its broker-cache entry are released. The next
> produce/consume reconnect transparently reloads the topic.
>
> The new flag is mutually exclusive with brokerDeleteInactiveTopicsEnabled;
> broker startup fails fast if both are set.
>
> Design highlights
>
> - Reuses brokerDeleteInactiveTopicsMode / FrequencySeconds /
>   MaxInactiveDurationSeconds for detection — no new detection surface.
> - Wires into the existing PersistentTopic.checkGC() /
>   NonPersistentTopic.checkGC() by swapping the terminal action. The
>   retention-window guard is bypassed in the close branch because it exists to
>   prevent data loss, which is moot when nothing is deleted.
> - No admin-API, wire-protocol, or schema changes.
> - Default is false, so the change is behavior-preserving for existing
>   deployments.
>
> Out of scope for v1
>
> - Per-topic or per-namespace overrides (broker-level only in v1; a
>   follow-up can extend InactiveTopicPolicies with an action field if
>   operators want per-namespace control).
> - Changes to InactiveTopicDeleteMode or InactiveTopicPolicies schema.
> - A new admin endpoint — manual unload remains available for ad-hoc use.
>
>
> Regards,
> Penghui
>
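
For anyone skimming the thread, the behavior described in the proposal (mutual exclusion of the two flags, and swapping the terminal action while skipping the retention-window guard on close) could be sketched roughly as below. This is only an illustration of the decision logic, not code from the prototype PR: the class, the Action enum, and the method names are hypothetical, and the real detection in checkGC() (mode, frequency, max inactive duration) is collapsed into a single boolean.

```java
// Illustrative sketch only; none of these names come from the Pulsar code base.
class InactiveTopicActionSketch {

    /** Terminal action the inactivity monitor takes for one topic. */
    enum Action { NONE, DELETE, CLOSE }

    /**
     * Mirrors the "fail fast at startup" rule: the delete flag and the
     * proposed close flag must not both be enabled.
     */
    static void validateFlags(boolean deleteEnabled, boolean closeEnabled) {
        if (deleteEnabled && closeEnabled) {
            throw new IllegalArgumentException(
                "brokerDeleteInactiveTopicsEnabled and "
                + "brokerCloseInactiveTopicsEnabled are mutually exclusive");
        }
    }

    /**
     * Picks the terminal action for a topic.
     *
     * @param deleteEnabled        value of brokerDeleteInactiveTopicsEnabled
     * @param closeEnabled         value of the proposed brokerCloseInactiveTopicsEnabled
     * @param inactive             assumed result of the existing inactivity detection
     * @param retentionAllowsDelete assumed result of the retention-window guard
     */
    static Action decide(boolean deleteEnabled, boolean closeEnabled,
                         boolean inactive, boolean retentionAllowsDelete) {
        validateFlags(deleteEnabled, closeEnabled);
        if (!inactive) {
            return Action.NONE;
        }
        if (closeEnabled) {
            // Close branch: nothing is deleted, so the retention guard is skipped.
            return Action.CLOSE;
        }
        if (deleteEnabled && retentionAllowsDelete) {
            return Action.DELETE;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // Close mode, idle topic, retention window still open.
        System.out.println(decide(false, true, true, false));
        // Delete mode, same topic: the retention guard blocks the delete.
        System.out.println(decide(true, false, true, false));
    }
}
```

The point of the sketch is that the close branch ignores the retention guard while the delete branch still honors it, which matches the reasoning in the design highlights that the guard exists only to prevent data loss.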
