[ 
https://issues.apache.org/jira/browse/HBASE-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26242.
-------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> Allow split when store file count larger than the configured blocking file 
> count
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-26242
>                 URL: https://issues.apache.org/jira/browse/HBASE-26242
>             Project: HBase
>          Issue Type: Wish
>          Components: regionserver
>    Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.10
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>             Fix For: 2.4.11, 3.0.0-alpha-3, 2.5.0
>
>
> Currently, a region will not split once the number of store files reaches the 
> blocking count configured by `hbase.hstore.blockingStoreFiles`.
> The relevant code is as follows. 
> CompactSplit#requestSplit() (called by the MemStoreFlusher and 
> CompactionRunner) checks the compaction priority of the region; if the 
> compact priority is < PRIORITY_USER, the region will not split.
> {code:java}
> public synchronized boolean requestSplit(final Region r) {
>   // don't split regions that are blocking
>   HRegion hr = (HRegion)r;
>   try {
>     if (shouldSplitRegion() && hr.getCompactPriority() >= PRIORITY_USER) {
>       byte[] midKey = hr.checkSplit().orElse(null);
>       if (midKey != null) {
>         requestSplit(r, midKey);
>         return true;
>       }
>     }
> .... {code}
> However, the region's compact priority is the minimum over all of its stores. 
> When the number of store files in a store is larger than the configured 
> `hbase.hstore.blockingStoreFiles`, the compact priority becomes a negative 
> number, while PRIORITY_USER = 1.
> {code:java}
> public int getStoreCompactionPriority() {
>   int priority = blockingFileCount - storefiles.size();
>   return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
> } {code}
> As a result, when a region's size reaches the split limit but compaction 
> reduces the number of files more slowly than new files are generated 
> (e.g. by compacting L0 files into stripes, bulk loads, or memstore 
> flushes), the region will never split. 
> The problem is most obvious with StripeStoreEngine. Although memstore flushing 
> is held back once the store file count reaches the blocking count, each L0 
> compaction may still generate as many new files as there are stripes. In this 
> scenario, because compaction always takes priority over splitting, the stripe 
> count keeps growing, each compaction generates more and more new files, and 
> the region never splits. Splitting, however, would divide the compaction 
> pressure (1 parent compaction + 2 child compactions can be reduced to 2 child 
> compactions).
> We can add a configuration option to enable splitting while blocked, which 
> both preserves the original behavior and allows flexible control. 
>  
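
The interaction described in the quoted report can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the HBase implementation or the committed patch: the class `SplitGateSketch`, its method names, and the `splitWhenBlocking` flag are hypothetical, and the blocking count of 16 is just an example value.

```java
public class SplitGateSketch {
  // Value stated in the report: PRIORITY_USER = 1.
  static final int PRIORITY_USER = 1;

  // Mirrors the getStoreCompactionPriority() snippet quoted in the report.
  static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
    int priority = blockingFileCount - storeFileCount;
    return (priority == PRIORITY_USER) ? priority + 1 : priority;
  }

  // The region's priority is the minimum over all of its stores.
  static int regionCompactPriority(int blockingFileCount, int... storeFileCounts) {
    int min = Integer.MAX_VALUE;
    for (int count : storeFileCounts) {
      min = Math.min(min, storeCompactionPriority(blockingFileCount, count));
    }
    return min;
  }

  // Hypothetical gate: the existing check, plus an opt-in flag (an assumption,
  // not a real HBase setting name) that allows a split while blocked.
  static boolean maySplit(int regionPriority, boolean splitWhenBlocking) {
    return splitWhenBlocking || regionPriority >= PRIORITY_USER;
  }

  public static void main(String[] args) {
    int blockingFileCount = 16; // example value only
    // One store is healthy (3 files), one is past the blocking count (20 files).
    int priority = regionCompactPriority(blockingFileCount, 3, 20);
    System.out.println("region compact priority = " + priority);
    System.out.println("may split, flag off = " + maySplit(priority, false));
    System.out.println("may split, flag on  = " + maySplit(priority, true));
  }
}
```

A single store above the blocking count is enough to pull the region's priority below PRIORITY_USER, which is why the unmodified check never lets such a region split.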



