Hi all, I'd like to start a discussion on PIP-470, which proposes a broker option to close (unload) inactive topics from broker memory without deleting their data.
PIP: https://github.com/apache/pulsar/pull/25574/files#diff-pip-470
Prototype PR: https://github.com/apache/pulsar/pull/25574

Problem

Deployments with very large numbers of mostly-idle topics (commonly tens of thousands to millions, with a long tail of low-traffic topics) face two recurring problems:

1. Broker memory pressure. Every loaded topic pins a managed ledger with its cache, subscription/cursor state, rate limiters, dispatchers, and schema references. An idle topic that hasn't been produced to or consumed from for hours still occupies all of that memory.
2. Metrics cardinality. Per-topic metric series grow linearly with the number of loaded topics, inflating scrape payloads and monitoring cost.

Today the only built-in remedy is brokerDeleteInactiveTopicsEnabled, but that deletes the data, which many operators explicitly do not want. Their remaining options are:

- Leave every idle topic loaded and pay the memory/metrics cost, or
- Run an external cron job that polls topics stats and calls pulsar-admin topics unload per topic, which is awkward, reimplements the existing inactivity detection, and adds a moving part to the deployment.

Proposal

Add a new dynamic broker configuration:

    brokerCloseInactiveTopicsEnabled = false # default

When enabled, the existing inactivity monitor reuses its current detection (mode, frequency, max-inactive-duration) but performs a close (the same code path as pulsar-admin topics unload) instead of a delete. Ledgers in BookKeeper, subscriptions, cursors, and topic policies are all preserved; only the in-memory topic and its broker-cache entry are released. The next produce/consume reconnect transparently reloads the topic.

The new flag is mutually exclusive with brokerDeleteInactiveTopicsEnabled; broker startup fails fast if both are set.

Design highlights

- Reuses brokerDeleteInactiveTopicsMode, brokerDeleteInactiveTopicsFrequencySeconds, and brokerDeleteInactiveTopicsMaxInactiveDurationSeconds for detection, so there is no new detection surface.
- Wires into the existing PersistentTopic.checkGC() / NonPersistentTopic.checkGC() by swapping the terminal action. The retention-window guard is bypassed in the close branch because it exists to prevent data loss, which is moot when nothing is deleted.
- No admin-API, wire-protocol, or schema changes.
- Default is false, so the change is behavior-preserving for existing deployments.

Out of scope for v1

- Per-topic or per-namespace overrides (broker-level only in v1; a follow-up can extend InactiveTopicPolicies with an action field if operators want per-namespace control).
- Changes to InactiveTopicDeleteMode or InactiveTopicPolicies schema.
- A new admin endpoint; manual unload remains available for ad-hoc use.

Regards,
Penghui
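
P.S. To put the flag semantics in one place, here is a minimal Java sketch of how the two flags could map to a terminal action, including the fail-fast check and the retention-guard bypass. It is illustrative only and is not taken from the prototype PR: the class InactiveTopicActionSketch, the Action enum, resolve(), shouldAct(), and the withinRetentionWindow parameter are all hypothetical names, and the sketch does not attempt to mirror the real checkGC() code.

```java
/** Illustrative only: none of these types or methods exist in Pulsar. */
final class InactiveTopicActionSketch {

    /** What the inactivity monitor does once a topic is judged inactive. */
    enum Action { NONE, DELETE, CLOSE }

    /**
     * Maps the two broker flags to a terminal action.
     * Throws if both are enabled, modelling the proposed fail-fast startup check.
     */
    static Action resolve(boolean deleteEnabled, boolean closeEnabled) {
        if (deleteEnabled && closeEnabled) {
            throw new IllegalArgumentException(
                "brokerDeleteInactiveTopicsEnabled and "
                    + "brokerCloseInactiveTopicsEnabled are mutually exclusive");
        }
        if (deleteEnabled) {
            return Action.DELETE;
        }
        if (closeEnabled) {
            return Action.CLOSE;
        }
        return Action.NONE;
    }

    /**
     * Decides whether the monitor should act on a topic.
     * The retention-window guard (hypothetical boolean here) only blocks DELETE,
     * because a close removes no data.
     */
    static boolean shouldAct(Action action, boolean inactive, boolean withinRetentionWindow) {
        if (action == Action.NONE || !inactive) {
            return false;
        }
        if (action == Action.DELETE && withinRetentionWindow) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Action action = resolve(false, true);
        System.out.println(action);                               // CLOSE
        System.out.println(shouldAct(action, true, true));        // true
        System.out.println(shouldAct(Action.DELETE, true, true)); // false
    }
}
```

In this sketch an inactive topic inside its retention window is still closed but would not be deleted, which is the asymmetry described under "Design highlights".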
