Hi there,
   I'm preparing the patch for the new 64-bit ZooKeeper-based ledger manager 
with a radix tree (BOOKKEEPER-553), but I have run into a problem.

   A LedgerManager implementation needs to implement a method called 
"getLedgerRanges", which scans all ledger ids in order from the metadata 
storage so that the bookie can do local Scan-And-Compare GC. However, for a 
radix tree ledger manager implementation, scanning all ledgers in order (in a 
BFS way) may require a large amount of memory.

   After digging into the Scan-And-Compare GC, I find that no ordering 
requirement is actually necessary. All the Scan-And-Compare GC approach needs 
to know is which ledger ids exist on the local bookie server but not in the 
metadata storage. So we can do it another way:
   1. Take a snapshot of the local ledger id list "L".
   2. Get all the ledger ids from the metadata storage and remove each of them 
from list "L" (here we do not require the metadata storage to return ledger 
ids in any particular order).
   3. Once step 2 has finished looping over all ledger ids in the metadata 
storage, GC the ledger ids remaining in list "L".
   With this approach we no longer need an "ORDER SCAN"; 
LedgerManager#asyncProcessLedgers is enough to do the job.
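
   The three steps above can be sketched roughly as follows. This is only a 
minimal illustration, not the actual patch: the class name 
UnorderedScanCompareGc and the helper findGarbage are hypothetical, and the 
"batches" simply stand in for whatever groups of ledger ids the metadata 
storage hands back, in arbitrary order.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of order-independent Scan-And-Compare GC.
// None of these names come from the BookKeeper code base.
public class UnorderedScanCompareGc {

    /**
     * Returns the ledger ids that exist locally but were never seen
     * while scanning the metadata storage.
     *
     * @param localLedgers    ledger ids currently stored on this bookie
     * @param metadataBatches ledger ids from the metadata storage,
     *                        delivered in batches of arbitrary order
     */
    static Set<Long> findGarbage(Collection<Long> localLedgers,
                                 Iterable<? extends Collection<Long>> metadataBatches) {
        // Step 1: snapshot of the local ledger id list "L".
        Set<Long> candidates = new TreeSet<>(localLedgers);

        // Step 2: remove every ledger id found in the metadata storage.
        // No ordering between or within batches is assumed.
        for (Collection<Long> batch : metadataBatches) {
            candidates.removeAll(batch);
        }

        // Step 3: whatever is left in "L" is safe to garbage collect.
        return candidates;
    }

    public static void main(String[] args) {
        List<Long> local = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        // Metadata ids arrive unordered; 9 is unknown to this bookie.
        List<List<Long>> metadata = Arrays.asList(
                Arrays.asList(5L, 2L),
                Arrays.asList(9L, 1L));
        System.out.println(findGarbage(local, metadata)); // prints [3, 4]
    }
}
```

   Because the snapshot is taken before the scan starts, a ledger created 
locally during the scan is never in "L" and so cannot be collected by 
mistake in this sketch.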

   Of course, this implementation has one drawback: GC can only take place 
after all ledger ids in the metadata storage have been iterated. Still, I 
don't think we need a specific ordering guarantee for Scan-And-Compare GC, and 
there are already other, improved GC approaches.

- Jiannan
