[ https://issues.apache.org/jira/browse/HBASE-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594187#comment-13594187 ]
Hadoop QA commented on HBASE-7842:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12572209/HBASE-7842-0.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4684//console

This message is automatically generated.

> Add compaction policy that explores more storefile groups
> ---------------------------------------------------------
>
>                 Key: HBASE-7842
>                 URL: https://issues.apache.org/jira/browse/HBASE-7842
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>         Attachments: HBASE-7842-0.patch
>
>
> For workloads that are less stable, the current storefile selection
> algorithm can produce compactions that are too large or too small.
> Currently:
> * Find the first file fi such that FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> * Ensure that there are at least the min number of files (if there aren't,
> then bail out)
> * If there are too many files, keep the larger ones.
> I would propose something like:
> * Find all sets of storefiles where every file satisfies
> ** FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> ** Num files in set <= max
> ** Num files in set >= min
> * Then pick the set of files that maximizes ((# storefiles in set) /
> Sum(FileSize(fx)))
> The thinking is that the above algorithm is pretty easy to reason about, all
> files satisfy the ratio, and it should rewrite the least amount of data to
> get the biggest impact in seeks.
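
The proposed selection can be sketched roughly as follows. This is an illustration only, not the code in the attached patch. It assumes that candidate sets are contiguous windows in the store's file order, that the indices in Sum(0, i-1, FileSize(fx)) are relative to the window, and that the first file of a window is exempt from the ratio check because nothing precedes it. The names `satisfies_ratio` and `select_files` are hypothetical.

```python
from typing import List, Optional, Tuple


def satisfies_ratio(sizes: List[int]) -> bool:
    """Check FileSize(fi) <= Sum(0, i-1, FileSize(fx)) for every file after the first.

    Assumption: the first file in the window is exempt, since its prefix sum is empty.
    """
    running = 0
    for i, size in enumerate(sizes):
        if i > 0 and size > running:
            return False
        running += size
    return True


def select_files(
    sizes: List[int], min_files: int, max_files: int
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the best window, end exclusive, or None if none qualifies.

    Assumption: candidate sets are contiguous windows of the file list.
    The score is (# storefiles in set) / Sum(FileSize(fx)), as in the proposal.
    """
    if min_files < 1:
        raise ValueError("min_files must be at least 1")
    best: Optional[Tuple[int, int]] = None
    best_score = 0.0
    for start in range(len(sizes)):
        last_end = min(start + max_files, len(sizes))
        for end in range(start + min_files, last_end + 1):
            window = sizes[start:end]
            total = sum(window)
            if total == 0 or not satisfies_ratio(window):
                continue
            score = len(window) / total
            if score > best_score:  # ties keep the earliest window found
                best, best_score = (start, end), score
    return best


if __name__ == "__main__":
    print(select_files([100, 50, 23, 12, 12], min_files=3, max_files=5))
```

With sizes [100, 50, 23, 12, 12], min 3 and max 5, every window passes the ratio check under these assumptions, and the sketch picks the last three files because 3/47 is the highest files-per-byte score. The search is brute force, which keeps the selection easy to reason about at the cost of examining every window.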