[ https://issues.apache.org/jira/browse/BOOKKEEPER-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864545#comment-15864545 ]
ASF GitHub Bot commented on BOOKKEEPER-944:
-------------------------------------------

GitHub user reddycharan opened a pull request:

    https://github.com/apache/bookkeeper/pull/108

    BOOKKEEPER-944: LowWaterMark Storage Threshold

    - The current implementation toggles the READONLY status of the bookie
      as soon as a directory's usage falls below the disk storage threshold.
      Added a LowWaterMark parameter that limits such switches:
      1. The Bookie transitions from RW to RONLY only when the usage of all
         ledger dirs is > HWM (storage threshold).
      2. The Bookie transitions from RONLY to RW only when total system disk
         usage (ledger/index disks) is < LWM.
      3. When the Bookie is in RW mode, all disks which are < HWM (storage
         threshold) are RW.
    - Refactored code to remove the circular dependency between
      LedgerDirsManager and LedgerDirsMonitor.
    - Currently the Bookie won't start as read-only if disk usage is above
      the threshold for all disks; instead, start the Bookie in read-only
      mode.
    - Relevant test cases.

    Author: Andrey Yegorov <ayego...@salesforce.com>
    Co-Author: Charan Reddy Guttapalem <cguttapa...@salesforce.com>

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/reddycharan/bookkeeper lwmhwm

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bookkeeper/pull/108.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #108

----
commit ed8f31d8d501426c7744fb95b9c1b8eae7a2b6d4
Author: Andrey Yegorov <ayego...@salesforce.com>
Date:   2016-08-29T23:07:00Z

    BOOKKEEPER-944: LowWaterMark Storage Threshold

    - The current implementation toggles the READONLY status of the bookie
      as soon as a directory's usage falls below the disk storage threshold.
      Added a LowWaterMark parameter that limits such switches:
      1. The Bookie transitions from RW to RONLY only when the usage of all
         ledger dirs is > HWM (storage threshold).
      2. The Bookie transitions from RONLY to RW only when total system disk
         usage (ledger/index disks) is < LWM.
      3. When the Bookie is in RW mode, all disks which are < HWM (storage
         threshold) are RW.
    - Refactored code to remove the circular dependency between
      LedgerDirsManager and LedgerDirsMonitor.
    - Currently the Bookie won't start as read-only if disk usage is above
      the threshold for all disks; instead, start the Bookie in read-only
      mode.
    - Relevant test cases.

    Author: Andrey Yegorov <ayego...@salesforce.com>
    Co-Author: Charan Reddy Guttapalem <cguttapa...@salesforce.com>

----

> Multiple issues and improvements to BK Compaction.
> --------------------------------------------------
>
>                 Key: BOOKKEEPER-944
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-944
>             Project: Bookkeeper
>          Issue Type: Improvement
>          Components: bookkeeper-server
>    Affects Versions: 4.4.0
>            Reporter: Venkateswararao Jujjuri (JV)
>            Assignee: Venkateswararao Jujjuri (JV)
>
> We have identified multiple issues with BK compaction.
> This issue is to list all of them in one Jira ticket.
> 1.
> MajorCompaction and MinorCompaction are very basic. Either they do it or
> won’t do it. The proposal is to add a Low Water Mark (LWM) and a High
> Water Mark (HWM) for the disk space, with different compaction
> frequencies and reclaim percentages when the disk space is < LWM,
> between LWM and HWM, and > HWM.
> 2.
> MajorCompaction and MinorCompaction are strictly frequency based. They
> should at least be time-of-day based and run during low system load;
> if the system load rises, compaction should be reduced depending on
> disk availability.
> 3.
> Current code disables compaction when disk space grows beyond the
> configured threshold. There is no exit from this point. Have an option
> to keep reserved space for compaction, at least 2 entryLog file sizes,
> when isForceGCAllowWhenNoSpace is enabled.
> 4.
> Current code toggles the READONLY status of the bookie as soon as it
> falls below the disk storage threshold. Imagine we keep 95% as the
> threshold: the Bookie becomes RW as soon as it falls below 95%, a few
> more writes push it above 95%, and it turns back to RONLY. Use a set of
> defines (another set of LWM/HWM?) where the Bookie turns RO on the high
> end and won't become RW until it hits the low end.
> 5.
> Current code never checks whether compaction is enabled or disabled
> once major/minor compaction has started. If the bookie goes above the
> disk threshold (95%) while compaction is in progress, it never checks
> until it finishes, but there may not be disk space available for
> compaction to take place. So check whether compaction is enabled after
> processing every EntryLog.
> 6.
> Current code changes the Bookie Cookie value even when new storage is
> added. When the cookie changes, the Bookie becomes a new one, and the
> BK cluster treats it as a new bookie. If we have a mechanism to keep a
> valid cookie even after adding additional disk space, we may have a
> chance to bring the bookie back to healthy mode and have compaction
> going.
> 7. Bug
> CheckPoint completion is never attempted again after a single sync
> failure. There is a TODO in the code for this area.
> 8.
> When the disk is above the threshold, the Bookie goes to RO. If we
> have to restart the bookie, on the way back the bookie tries to create
> a new entrylog and other files, which fails because disk usage is
> above the threshold; hence the bookie refuses to come up.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
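
For readers following the LWM/HWM discussion, the three switching rules listed in the pull request can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the class `WatermarkState`, its methods `update` and `writableDirs`, and the representation of usage as fractions between 0 and 1 are all hypothetical names and choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the RW/RO hysteresis; not taken from the patch.
class WatermarkState {
    private final double lwm;   // low water mark, as a usage fraction (assumed)
    private final double hwm;   // high water mark / storage threshold (assumed)
    private boolean readOnly;

    WatermarkState(double lwm, double hwm, boolean startReadOnly) {
        if (lwm > hwm) {
            throw new IllegalArgumentException("LWM must not exceed HWM");
        }
        this.lwm = lwm;
        this.hwm = hwm;
        this.readOnly = startReadOnly;
    }

    // Applies rules 1 and 2 and returns true if the bookie is read-only afterwards.
    boolean update(double[] dirUsage, double totalUsage) {
        if (!readOnly) {
            // Rule 1: go read-only only when every ledger dir is above the HWM.
            boolean allAboveHwm = dirUsage.length > 0;
            for (double usage : dirUsage) {
                if (usage <= hwm) {
                    allAboveHwm = false;
                    break;
                }
            }
            if (allAboveHwm) {
                readOnly = true;
            }
        } else if (totalUsage < lwm) {
            // Rule 2: return to read-write only once total usage drops below the LWM.
            readOnly = false;
        }
        return readOnly;
    }

    // Rule 3: while read-write, the dirs below the HWM are the writable ones.
    List<Integer> writableDirs(double[] dirUsage) {
        List<Integer> writable = new ArrayList<>();
        if (readOnly) {
            return writable;
        }
        for (int i = 0; i < dirUsage.length; i++) {
            if (dirUsage[i] < hwm) {
                writable.add(i);
            }
        }
        return writable;
    }
}
```

With an LWM of 0.90 and an HWM of 0.95, a bookie that has gone read-only stays read-only while total usage sits at 0.94, which is the flapping case described in item 4 of the issue; it becomes writable again only after usage drops below 0.90.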