[ https://issues.apache.org/jira/browse/HBASE-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677551#comment-13677551 ]
Ted Yu commented on HBASE-8665:
-------------------------------

{code}
+  private List<CompactionRequest> requestCompactionInternal(final HRegion r, final String why,
+      int p, List<Pair<CompactionRequest, Store>> requests, boolean selectNow) throws IOException {
{code}
Mind adding javadoc for the selectNow parameter?

{code}
+      // Store priority decreased while we were in queue (due to some other compaction?),
+      // requeue with new priority to avoid blocking potential higher priorities.
+      this.parent.execute(this);
{code}
Add a debug log for the above case?

Please add a license header for StatefulStoreMockMaker.java.

{code}
+  public class BlockingStoreMockMaker extends StatefulStoreMockMaker {
{code}
Can BlockingStoreMockMaker be private?

> bad compaction priority behavior in queue can cause store to be blocked
> -----------------------------------------------------------------------
>
>                 Key: HBASE-8665
>                 URL: https://issues.apache.org/jira/browse/HBASE-8665
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HBASE-8665-v0.patch
>
> Note that this can be solved by bumping up the number of compaction threads,
> but it still seems like this priority "inversion" should be dealt with.
> There's a store with 1 big file and 3 flushes (1 2 3 4) sitting around and
> minding its own business when it decides to compact. Compaction (2 3 4) is
> created and put in the queue; it's low priority, so it doesn't get out of the
> queue for some time - other stores are compacting. Meanwhile more files are
> flushed, and at (1 2 3 4 5 6 7) the store decides to compact (5 6 7). This
> compaction now has a higher priority than the first one. After that, if the
> load is high, the store enters a vicious cycle of compacting files as they
> arrive, being blocked on and off, with the (2 3 4) compaction staying in the
> queue for up to ~20 minutes (that I've seen).
> I wonder why we do this thing where we queue the compaction and compact
> separately. Perhaps we should take a snapshot of all store priorities, then
> do the select in that order and execute the first compaction we find. This
> will need a starvation safeguard too, but should probably be better.
> Btw, the exploring compaction policy may be more prone to this, as it can
> select files from the middle, not just the beginning. Given the treatment of
> already-selected files, which was not changed from the old ratio-based
> policy (all files with lower seqNums than the ones selected are also
> ineligible for further selection), this will make more files ineligible
> (e.g. imagine 10 blocking files, with 8 present (1-8), (6 7 8) being
> selected and getting stuck). Today I see a case that would also apply to the
> old policy, but yesterday I saw a file distribution something like this:
> 4,5g, 2,1g, 295,9m, 113,3m, 68,0m, 67,8m, 1,1g, 295,1m, 100,4m,
> unfortunately w/o enough logs to figure out how it resulted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
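To illustrate the behavior the patch snippet above is going for, here is a minimal, self-contained sketch (not HBase code; the `Request` class, store names, and priority values are all hypothetical) of requeueing a stale request when the store's priority has changed while it sat in the queue, with the suggested debug log shown as a comment. As in HBase, a lower number means a more urgent priority.

```java
import java.util.PriorityQueue;

public class RequeueSketch {
    // Hypothetical stand-in for a queued compaction request.
    static class Request implements Comparable<Request> {
        final String store;
        final int priority; // lower value = more urgent, as in HBase

        Request(String store, int priority) {
            this.store = store;
            this.priority = priority;
        }

        @Override
        public int compareTo(Request other) {
            return Integer.compare(this.priority, other.priority);
        }
    }

    public static void main(String[] args) {
        PriorityQueue<Request> queue = new PriorityQueue<>();
        queue.add(new Request("storeA", 7)); // enqueued when storeA was not urgent
        queue.add(new Request("storeB", 3));

        // Suppose storeA's priority dropped to 1 while its request waited
        // (e.g. it is now close to the blocking-file count).
        int currentPriorityOfStoreA = 1;

        Request next = queue.poll(); // storeB (priority 3) comes out first
        // ... storeB's compaction would run here ...

        next = queue.poll(); // storeA's stale request (priority 7)
        if (next.priority != currentPriorityOfStoreA) {
            // The debug log asked for in the review would go roughly here, e.g.:
            // LOG.debug("Priority of " + next.store + " changed from "
            //     + next.priority + " to " + currentPriorityOfStoreA + "; requeueing");
            queue.add(new Request(next.store, currentPriorityOfStoreA));
        }

        // The requeued request now sits at the head with its fresh priority.
        System.out.println(queue.peek().store + ":" + queue.peek().priority);
    }
}
```

Running this prints `storeA:1`: the stale priority-7 entry is replaced by a fresh priority-1 entry rather than executing out of order, which is the "requeue with new priority" idea in the patch.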