[ https://issues.apache.org/jira/browse/HBASE-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617829#comment-13617829 ]
Sergey Shelukhin commented on HBASE-7842:
-----------------------------------------

Looks good as such; one question left is:
bq. Regardless, it appears that this patch both replaces default policy and hijacks the default test, so the default ratio algorithm becomes a poor bastard child, not used and not tested
Should we delete it altogether and swap it with yours in place? Do you want to just replace the default one?

> Add compaction policy that explores more storefile groups
> ---------------------------------------------------------
>
>                 Key: HBASE-7842
>                 URL: https://issues.apache.org/jira/browse/HBASE-7842
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>         Attachments: HBASE-7842-0.patch, HBASE-7842-2.patch,
> HBASE-7842-3.patch, HBASE-7842-4.patch, HBASE-7842-5.patch
>
>
> Some workloads that are not as stable can have compactions that are too large
> or too small using the current storefile selection algorithm.
> Currently:
> * Find the first file where FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> * Ensure that there are the min number of files (if there aren't then bail
> out)
> * If there are too many files keep the larger ones.
> I would propose something like:
> * Find all sets of storefiles where every file satisfies
> ** FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> ** Num files in set <= max
> ** Num files in set >= min
> * Then pick the set of files that maximizes ((# storefiles in set) /
> Sum(FileSize(fx)))
> The thinking is that the above algorithm is pretty easy to reason about, all
> files satisfy the ratio, and should rewrite the least amount of data to get
> the biggest impact in seeks.
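
For illustration only, the proposed selection in the issue description could be sketched roughly as follows. This is not the code from any of the attached patches. The class and method names are hypothetical, and the sketch rests on three assumptions that the description does not spell out: candidate sets are contiguous runs of the storefile list, the ratio check is read as "no file is larger than the sum of the other files in the set" (a ratio of 1.0), and file sizes are plain longs rather than real storefile objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed selection; not taken from the HBASE-7842 patches.
final class ExploringSelectionSketch {

    /**
     * Returns the sizes of the chosen files, or an empty list if no candidate qualifies.
     * Assumes minFiles >= 1 and that candidates are contiguous windows of the input list.
     */
    static List<Long> select(List<Long> sizes, int minFiles, int maxFiles) {
        List<Long> best = new ArrayList<>();
        double bestScore = -1.0;
        for (int start = 0; start < sizes.size(); start++) {
            // Grow the window from minFiles up to maxFiles entries.
            for (int end = start + minFiles;
                 end <= sizes.size() && end - start <= maxFiles;
                 end++) {
                List<Long> window = sizes.subList(start, end);
                long total = 0;
                for (long size : window) {
                    total += size;
                }
                if (total == 0 || !allFilesInRatio(window, total)) {
                    continue;
                }
                // Score from the description: (# storefiles in set) / Sum(FileSize(fx)).
                double score = (double) window.size() / total;
                if (score > bestScore) {
                    bestScore = score;
                    best = new ArrayList<>(window);
                }
            }
        }
        return best;
    }

    /** Assumed ratio check: every file is at most the sum of the other files in the window. */
    static boolean allFilesInRatio(List<Long> window, long total) {
        for (long size : window) {
            if (size > total - size) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, calling select on the sizes 100, 10, 10, 10, 50 with min 3 and max 4 picks the three 10-unit files: every window containing the 100-unit or 50-unit file fails the ratio check, so the small files are the only qualifying set.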