[
https://issues.apache.org/jira/browse/HBASE-6472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars George resolved HBASE-6472.
--------------------------------
Resolution: Invalid
> Bulk load files can cause minor compactions with single files
> -------------------------------------------------------------
>
> Key: HBASE-6472
> URL: https://issues.apache.org/jira/browse/HBASE-6472
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Lars George
>
> This is visible in the logs:
> {noformat}
> 2012-07-17 00:00:29,081 INFO org.apache.hadoop.hbase.regionserver.Store:
> Started compaction of 1 file(s) in cf=m into
> hdfs://nn:9000/hbase/t1/27889b3c9257e8dff4cddae5eedfd064/.tmp,
> seqid=1918186410, totalSize=124.7m
> 2012-07-17 00:00:29,081 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> Compacting
> hdfs://nn:9000/hbase/t1/27889b3c9257e8dff4cddae5eedfd064/m/5478928132066935241,
> keycount=1435991, bloomtype=NONE, size=124.7m
> 2012-07-17 00:00:34,538 INFO org.apache.hadoop.hbase.regionserver.Store:
> Completed major compaction of 1 file(s), new
> file=hdfs://nn:9000/hbase/t1/27889b3c9257e8dff4cddae5eedfd064/m/7554093203376406441,
> size=124.7m; total size for store is 124.7m
> {noformat}
> I assume the reason is that the two checks in Store.java run in the wrong order:
> {code}
> // skip selection algorithm if we don't have enough files
> if (compactSelection.getFilesToCompact().size() < this.minFilesToCompact) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Not compacting files because we only have " +
>       compactSelection.getFilesToCompact().size() +
>       " files ready for compaction. Need " + this.minFilesToCompact +
>       " to initiate.");
>   }
>   compactSelection.emptyFileList();
>   return compactSelection;
> }
> // remove bulk import files that request to be excluded from minors
> compactSelection.getFilesToCompact().removeAll(Collections2.filter(
>     compactSelection.getFilesToCompact(),
>     new Predicate<StoreFile>() {
>       public boolean apply(StoreFile input) {
>         return input.excludeFromMinorCompaction();
>       }
>     }));
> {code}
> The bulk files should be removed first, and +then+ the code should check
> whether enough files remain.
> Other places follow the same pattern, i.e. removing the bulk files and then
> checking what needs to be done, so this needs some extra care to get it all
> right.
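The ordering proposed in the description can be sketched as follows. This is only an illustration, not HBase code: `CompactionSelectionSketch`, `FileInfo`, and `selectForMinorCompaction` are hypothetical names, and `FileInfo` stands in for `StoreFile` with just the exclusion flag. Note that the issue was resolved as Invalid, so the sketch shows what the report asks for and does not claim that Store.java needed this change.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionSelectionSketch {

    /** Hypothetical stand-in for StoreFile; carries only the flag used here. */
    static final class FileInfo {
        final String name;
        final boolean excludeFromMinorCompaction;

        FileInfo(String name, boolean excludeFromMinorCompaction) {
            this.name = name;
            this.excludeFromMinorCompaction = excludeFromMinorCompaction;
        }
    }

    /**
     * Drops bulk-loaded files that asked to be excluded from minor
     * compactions first, and only then applies the minimum-file check.
     * Returns an empty list when too few files remain.
     */
    static List<FileInfo> selectForMinorCompaction(List<FileInfo> candidates,
                                                   int minFilesToCompact) {
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo f : candidates) {
            if (!f.excludeFromMinorCompaction) {
                selected.add(f);
            }
        }
        // The count is checked after the exclusion, not before it.
        if (selected.size() < minFilesToCompact) {
            return new ArrayList<>();
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("flushed-1", false));
        files.add(new FileInfo("bulk-1", true));
        files.add(new FileInfo("bulk-2", true));
        // Three candidates meet a minimum of 3 before filtering, but only
        // one remains afterwards, so nothing is selected.
        System.out.println(selectForMinorCompaction(files, 3).size());
    }
}
```

With the count check placed before the filter, the same input would pass the check with three files and then be reduced to a single file, which is the single-file compaction the description reports.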