[
https://issues.apache.org/jira/browse/BOOKKEEPER-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506496#comment-13506496
]
Fangmin Lv commented on BOOKKEEPER-249:
---------------------------------------
An interesting idea, but I don't think it is appropriate for our scenario. It
is hard and costly to maintain the greatest lower bound (GLB): whenever a
ledger is deleted, we need to scan ZooKeeper and the meta store, partially or
totally, to decide the new GLB.
There may also be a large waste of space. Consider an extreme case: suppose we
have ledgers L1, L2, ... L(2n-1), L(2n), and afterwards only the odd ledgers
are deleted. The GLB can then never advance past L2, so the deleted ledgers
above it will never be reclaimed.
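To make the extreme case concrete, here is a rough Python sketch. It assumes
that the GLB is the smallest ledger id still alive and that a bookie may only
reclaim ledgers below it. The function names are made up for illustration and
are not BookKeeper APIs.

```python
def glb(live_ledgers):
    """Assumed definition: the smallest ledger id that still exists."""
    return min(live_ledgers)


def reclaimable(stored, live_ledgers):
    """Ledgers a bookie could reclaim under a GLB-only rule."""
    bound = glb(live_ledgers)
    return {lid for lid in stored if lid < bound}


n = 5
stored = set(range(1, 2 * n + 1))             # L1 .. L(2n) on the bookie
deleted = {lid for lid in stored if lid % 2}  # only the odd ledgers are deleted
live = stored - deleted

wasted = deleted - reclaimable(stored, live)
print(sorted(reclaimable(stored, live)))  # ledgers below the GLB
print(sorted(wasted))                     # deleted, but still occupying space
```

Under these assumptions only L1 is ever freed. L3, L5, and so on stay on disk
even though they were deleted, because L2 pins the GLB.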
It is also difficult to keep things correct. Take the same example:
1. Initially the GLB is 0 and a bookie holds L1, L5, and L6. L5 is then
deleted.
2. A gc cycle starts, finds that the GLB is 0, and skips gc.
3. L1 is deleted, and the GLB is updated to L5.
4. A new gc cycle starts, finds that the GLB is L5, and is about to delete all
ledgers less than L5. In the meantime, L2 is added to the bookie, so the bookie
deletes L2, which should not happen.
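The interleaving in steps 1-4 can be replayed in a small Python sketch. The
Bookie class and its method names are hypothetical, and the GLB is passed in by
hand to mirror the values in the steps above. It only illustrates the race and
is not how the bookie is implemented.

```python
class Bookie:
    """Toy bookie that only tracks which ledger ids it stores."""

    def __init__(self, ledgers):
        self.ledgers = set(ledgers)

    def add_ledger(self, ledger_id):
        self.ledgers.add(ledger_id)

    def gc(self, glb):
        """GLB-only rule: drop every stored ledger with id < glb."""
        removed = {lid for lid in self.ledgers if lid < glb}
        self.ledgers -= removed
        return removed


bookie = Bookie({1, 5, 6})           # step 1: L1, L5, L6 stored, GLB is 0
first_cycle = bookie.gc(glb=0)       # step 2: nothing is below 0, nothing removed

glb_seen_by_gc = 5                   # step 3: L1 deleted, GLB advances to L5
bookie.add_ledger(2)                 # step 4: L2 arrives before gc applies the GLB
removed = bookie.gc(glb_seen_by_gc)

print(sorted(removed))               # contains 2, a ledger that is still live
```

The second cycle cannot tell L1 (really deleted) from L2 (just created), since
both are simply ids below the GLB it read.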
Correct me if I'm wrong or have misread your idea, thanks.
> Revisit garbage collection algorithm in Bookie server
> -----------------------------------------------------
>
> Key: BOOKKEEPER-249
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-249
> Project: Bookkeeper
> Issue Type: Improvement
> Components: bookkeeper-server
> Reporter: Sijie Guo
> Fix For: 4.2.0
>
> Attachments: gc_revisit.pdf
>
>
> Per discussion in BOOKKEEPER-181, it would be better to revisit the garbage
> collection algorithm in the bookie server, so this subtask was created to
> focus on it.