315157973 opened a new issue, #18128:
URL: https://github.com/apache/pulsar/issues/18128

   ### Motivation
   
   The broker uses the `Trimledgers` thread to clean up outdated ledgers. During 
cleanup, each broker traverses the topic metadata held in memory to find the 
ledgers that have reached the retention or TTL threshold. 
   However, this approach has a problem. When a topic has no producers or 
consumers, the broker removes the topic's metadata from memory. As a result, 
the ledgers of such topics can never be deleted.
   Therefore, we need a way to scan and clean up all outdated ledgers.
   
   ### Goal
   
   A full scan causes a large number of requests to ZooKeeper.
   Therefore, the existing cleanup mode will be retained, and a full scan mode 
will be added alongside it.
   
   
   ### API Changes
   
   1. Add a new scheduling thread pool
   
   2. Add the following configuration items:
   # Full scan interval. This parameter is enabled only when the value > 0.
   fullScanTrimLedgerInterval=0
   # Maximum number of ZooKeeper requests per second during scanning
   fullScanMaximumZooKeeperRequestsPerSecond=200
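
As a rough illustration of how the interval setting could gate the new scheduling thread pool, here is a minimal sketch. The class `FullScanScheduler` and its methods are hypothetical and not part of Pulsar, and the proposal does not state the unit of `fullScanTrimLedgerInterval`, so seconds are assumed here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: schedules the full scan only when the interval is > 0.
final class FullScanScheduler {
    // Assumption: the interval is expressed in seconds (the proposal gives no unit).
    private final long intervalSeconds;
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    FullScanScheduler(long intervalSeconds) {
        this.intervalSeconds = intervalSeconds;
    }

    /** Returns true if the task was scheduled, false if full scan is disabled. */
    boolean start(Runnable fullScanTask) {
        if (intervalSeconds <= 0) {
            return false; // the default value 0 keeps full scan switched off
        }
        // Fixed delay, so a slow scan never overlaps with the next one.
        executor.scheduleWithFixedDelay(
                fullScanTask, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return true;
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With the default `fullScanTrimLedgerInterval=0`, `start` returns `false` and nothing is scheduled, so existing deployments would keep the current behaviour.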
   
   ### Implementation
   
   1. Only the leader broker performs the full scan.
   2. The leader broker traverses the `managedLedger` metadata of each 
namespace in ZK. Ledger metadata contains the creation time, so if the time 
elapsed since creation is greater than the retention time + TTL time, the 
ledger should be deleted. 
   Only the ledger metadata is parsed; topics are not loaded into memory.
   The ZK request rate is limited using a semaphore.
   
   3. When a topic meets the conditions, the leader broker loads the topic 
and invokes its `TrimLedger` method. After cleanup is done, the leader closes 
the topic to release memory.
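
The expiry check and the semaphore-based throttling in the steps above could look roughly like the sketch below. Everything named here (`FullScanSketch`, `isExpired`, `PerSecondLimiter`) is hypothetical and not existing Pulsar code. It assumes the comparison is made on the ledger's age (current time minus creation time) and that the semaphore is refilled once per second up to `fullScanMaximumZooKeeperRequestsPerSecond`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of step 2: age check plus a per-second request budget.
final class FullScanSketch {

    /** A ledger is a deletion candidate once its age exceeds retention + TTL. */
    static boolean isExpired(long createdAtMs, long nowMs, long retentionMs, long ttlMs) {
        return nowMs - createdAtMs > retentionMs + ttlMs;
    }

    /** Semaphore whose permits are topped up to the cap once per second. */
    static final class PerSecondLimiter implements AutoCloseable {
        private final int maxPerSecond;
        private final Semaphore permits;
        private final ScheduledExecutorService refiller =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "full-scan-permit-refill");
                    t.setDaemon(true); // never keeps the JVM alive on its own
                    return t;
                });

        PerSecondLimiter(int maxPerSecond) {
            this.maxPerSecond = maxPerSecond;
            this.permits = new Semaphore(maxPerSecond);
            refiller.scheduleAtFixedRate(this::refill, 1, 1, TimeUnit.SECONDS);
        }

        private void refill() {
            // Only this thread releases permits, so the cap is never exceeded.
            int missing = maxPerSecond - permits.availablePermits();
            if (missing > 0) {
                permits.release(missing);
            }
        }

        /** Blocks until a request slot is available; call before each ZK read. */
        void acquire() throws InterruptedException {
            permits.acquire();
        }

        /** Non-blocking variant; returns false when this second's budget is spent. */
        boolean tryAcquire() {
            return permits.tryAcquire();
        }

        @Override
        public void close() {
            refiller.shutdownNow();
        }
    }
}
```

In such a design the scan loop would call `acquire()` before every metadata read and load a topic only when `isExpired` reports at least one candidate ledger, which keeps memory use bounded as described in step 3.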
   
   ### Alternatives
   
   _No response_
   
   ### Anything else?
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
