[ https://issues.apache.org/jira/browse/HBASE-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-26242.
-------------------------------
Hadoop Flags: Reviewed
Resolution: Fixed
> Allow split when store file count larger than the configured blocking file
> count
> --------------------------------------------------------------------------------
>
> Key: HBASE-26242
> URL: https://issues.apache.org/jira/browse/HBASE-26242
> Project: HBase
> Issue Type: Wish
> Components: regionserver
> Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.10
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
> Fix For: 2.4.11, 3.0.0-alpha-3, 2.5.0
>
>
> Currently, a region will not split once the number of store files in one of
> its stores reaches the blocking count configured by
> `hbase.hstore.blockingStoreFiles`.
> The relevant code is as follows.
> CompactSplit#requestSplit() (called by the MemstoreFlusher and
> CompactionRunner) checks the compaction priority of the region; if the
> compaction priority is < PRIORITY_USER, the region will not split.
> {code:java}
> public synchronized boolean requestSplit(final Region r) {
>   // don't split regions that are blocking
>   HRegion hr = (HRegion) r;
>   try {
>     if (shouldSplitRegion() && hr.getCompactPriority() >= PRIORITY_USER) {
>       byte[] midKey = hr.checkSplit().orElse(null);
>       if (midKey != null) {
>         requestSplit(r, midKey);
>         return true;
>       }
>     }
>     .... {code}
> But the region's compaction priority is the minimum over all of its stores.
> When the number of store files in a store is larger than the configured
> `hbase.hstore.blockingStoreFiles`, the compaction priority is a negative
> number, while PRIORITY_USER = 1.
> {code:java}
> public int getStoreCompactionPriority() {
>   int priority = blockingFileCount - storefiles.size();
>   return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
> } {code}
> As a result, when a region has reached the split size limit but compaction
> reduces its file count more slowly than new files are generated (e.g. by
> compacting L0 files into stripes, bulk loads, or memstore flushes), the
> region will never split.
> The problem is most visible with StripeStoreEngine. Although memstore
> flushes are held back once the store file count reaches the blocking count,
> each L0 compaction may still generate as many new files as there are
> stripes, one per stripe. In this scenario, because compaction always takes
> priority over splitting, the stripe count keeps growing, each compaction
> generates more and more new files, and the region never splits. A split, by
> contrast, would divide the compaction pressure (1 parent compaction +
> 2 child compactions can be reduced to 2 child compactions).
> We can add a configuration option to enable splitting while blocked, which
> both preserves the original behavior and supports flexible control.
>
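The arithmetic in the description can be checked with a small standalone model. This is only a sketch and not the HBase classes themselves: the class name `SplitGateSketch`, the helper names, and the `splitWhenBlocking` flag are hypothetical, the `shouldSplitRegion()` check is left out, and 16 is used purely as an example blocking count.

```java
// Simplified, standalone model of the checks quoted in the issue description.
// All names here are illustrative; this is not the HBase implementation.
final class SplitGateSketch {
  // Value stated in the issue description.
  static final int PRIORITY_USER = 1;

  // Mirrors the quoted getStoreCompactionPriority(): headroom left before blocking.
  static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
    int priority = blockingFileCount - storeFileCount;
    return (priority == PRIORITY_USER) ? priority + 1 : priority;
  }

  // The region's priority is the minimum over all of its stores.
  static int regionCompactPriority(int blockingFileCount, int... storeFileCounts) {
    int min = Integer.MAX_VALUE;
    for (int count : storeFileCounts) {
      min = Math.min(min, storeCompactionPriority(blockingFileCount, count));
    }
    return min;
  }

  // splitWhenBlocking is a hypothetical switch standing in for the proposed
  // configuration; false reproduces the original gate in requestSplit().
  static boolean maySplit(int regionPriority, boolean splitWhenBlocking) {
    return splitWhenBlocking || regionPriority >= PRIORITY_USER;
  }

  public static void main(String[] args) {
    // Two stores with 5 and 20 files, example blocking count of 16.
    int priority = regionCompactPriority(16, 5, 20);
    System.out.println("region priority = " + priority);
    System.out.println("original gate allows split: " + maySplit(priority, false));
    System.out.println("with flag enabled: " + maySplit(priority, true));
  }
}
```

With one store at 20 files against a blocking count of 16, the region priority is negative, so the original gate refuses the split no matter how large the region is; only the extra switch lets the split request through.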
--
This message was sent by Atlassian Jira
(v8.20.10#820010)