[ 
https://issues.apache.org/jira/browse/HBASE-14263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704018#comment-14704018
 ] 

Vladimir Rodionov commented on HBASE-14263:
-------------------------------------------

Basically, the current implementation won't work with a custom max file size 
(*hbase.hstore.compaction.max.size*), and it does not work as expected when 
selecting small store files (smaller than 
*hbase.hstore.compaction.min.size*). 
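
To make the difference concrete, here is a minimal sketch of the two ways a 
size limit can be applied: to the total size of a candidate selection (what 
the snippet quoted in the issue description does) or to each file on its own. 
The class and method names are hypothetical and the sizes are made-up numbers; 
this is not HBase code and not a proposed patch.

```java
import java.util.List;

// Hypothetical illustration only; none of these names exist in HBase.
final class SizeLimitSketch {

  // Sum of all file sizes in a candidate selection.
  static long totalSize(List<Long> fileSizes) {
    long total = 0;
    for (long size : fileSizes) {
      total += size;
    }
    return total;
  }

  // Max limit compared against the selection's total size.
  static boolean selectionWithinMax(List<Long> fileSizes, long maxCompactSize) {
    return totalSize(fileSizes) <= maxCompactSize;
  }

  // Max limit compared against each file individually.
  static boolean everyFileWithinMax(List<Long> fileSizes, long maxCompactSize) {
    for (long size : fileSizes) {
      if (size > maxCompactSize) {
        return false;
      }
    }
    return true;
  }

  // Min limit compared against the selection's total size:
  // the ratio check is skipped only when the total is below the limit.
  static boolean ratioCheckSkippedByTotal(List<Long> fileSizes, long minCompactSize) {
    return totalSize(fileSizes) < minCompactSize;
  }

  // Min limit compared against each file individually.
  static boolean everyFileBelowMin(List<Long> fileSizes, long minCompactSize) {
    for (long size : fileSizes) {
      if (size >= minCompactSize) {
        return false;
      }
    }
    return true;
  }
}
```

With three files of 40 units each and a max of 100, the selection-level check 
rejects the set (total 120) even though every file is under the limit. With 
three files of 60 units each and a min of 128, the selection-level check still 
runs the ratio test (total 180) even though every file is below the min.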

> ExploringCompactionPolicy logic around file selection is broken
> ---------------------------------------------------------------
>
>                 Key: HBASE-14263
>                 URL: https://issues.apache.org/jira/browse/HBASE-14263
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0
>
>
> It seems that logic around selection of store file candidates is broken:
> {code}
>         // Compute the total size of files that will
>         // have to be read if this set of files is compacted.
>         long size = getTotalStoreSize(potentialMatchFiles);
>         // Store the smallest set of files.  This stored set of files will be used
>         // if it looks like the algorithm is stuck.
>         if (mightBeStuck && size < smallestSize) {
>           smallest = potentialMatchFiles;
>           smallestSize = size;
>         }
>         if (size > comConf.getMaxCompactSize()) {
>           continue;
>         }
>         ++opts;
>         if (size >= comConf.getMinCompactSize()
>             && !filesInRatio(potentialMatchFiles, currentRatio)) {
>           continue;
>         }
> {code}
> This is from the applyCompactionPolicy method. As you can see, both the min 
> compaction size and the max compaction size are applied to a *selection* of 
> files and not to individual files. It mostly works as expected only because 
> nobody seems to be using a non-default hbase.hstore.compaction.max.size 
> (the default is Long.MAX_VALUE), and it is not that easy to figure out what is 
> going on at the opposite end (why do small files not get included?).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
