[ https://issues.apache.org/jira/browse/HBASE-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-26242.
-------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> Allow split when store file count larger than the configured blocking file
> count
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-26242
>                 URL: https://issues.apache.org/jira/browse/HBASE-26242
>             Project: HBase
>          Issue Type: Wish
>          Components: regionserver
>    Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.10
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>             Fix For: 2.4.11, 3.0.0-alpha-3, 2.5.0
>
>
> Currently, a region will not split once the number of store files reaches
> the configured blocking count, set by `hbase.hstore.blockingStoreFiles`.
> The relevant code is as follows.
> CompactSplit#requestSplit() (called by the MemstoreFlusher and
> CompactionRunner) checks the compaction priority of the region; if the
> compaction priority is < PRIORITY_USER, the region will not split.
> {code:java}
>   public synchronized boolean requestSplit(final Region r) {
>     // don't split regions that are blocking
>     HRegion hr = (HRegion)r;
>     try {
>       if (shouldSplitRegion() && hr.getCompactPriority() >= PRIORITY_USER) {
>         byte[] midKey = hr.checkSplit().orElse(null);
>         if (midKey != null) {
>           requestSplit(r, midKey);
>           return true;
>         }
>       }
>       .... {code}
> But the region's compaction priority is the minimum over all of its stores.
> When the number of store files in a store is larger than the configured
> `hbase.hstore.blockingStoreFiles`, the compaction priority is a negative
> number, while PRIORITY_USER = 1.
> {code:java}
> public int getStoreCompactionPriority() {
>   int priority = blockingFileCount - storefiles.size();
>   return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
> } {code}
> As a result, when a region's size reaches the split limit but compaction
> reduces the number of files more slowly than new files are generated (e.g.
> by compacting L0 files to stripes, bulk load, or memstore flush), the region
> will never split.
> The problem is obvious with StripeStoreEngine: although memstore flushes are
> left pending once the store file count reaches the blocking count, each L0
> compaction may generate as many new files as there are stripes, one for each
> stripe. In this scenario, since compaction always takes priority over
> splitting for the store, the stripe count keeps growing, each compaction
> generates more and more new files, and in the end the region never splits.
> A split, however, can relieve the compaction pressure (1 parent compaction +
> 2 child compactions can be reduced to 2 child compactions).
> We can add a configuration option to enable splitting while blocked, which
> both preserves the original behavior and allows flexible control.
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
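
For reference, the priority arithmetic described in the issue can be modeled in isolation. The following is a minimal sketch, not the HBase implementation: the class SplitPrioritySketch, its method names, the blocking value of 16, and the splitWhenBlocking flag are all hypothetical, and the flag merely stands in for whatever configuration the fix introduces. Only the formula blockingFileCount - storefiles.size(), the minimum over stores, and PRIORITY_USER = 1 are taken from the issue text.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model only; not the HBase implementation. */
public class SplitPrioritySketch {
    // Value stated in the issue: PRIORITY_USER = 1.
    public static final int PRIORITY_USER = 1;

    /** Same arithmetic as the getStoreCompactionPriority() snippet quoted in the issue. */
    public static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
        int priority = blockingFileCount - storeFileCount;
        return (priority == PRIORITY_USER) ? priority + 1 : priority;
    }

    /** Per the issue, a region's priority is the minimum over its stores. */
    public static int regionCompactPriority(int blockingFileCount, List<Integer> storeFileCounts) {
        int min = Integer.MAX_VALUE;
        for (int count : storeFileCounts) {
            min = Math.min(min, storeCompactionPriority(blockingFileCount, count));
        }
        return min;
    }

    /**
     * Split gate. splitWhenBlocking is a hypothetical flag standing in for the
     * proposed configuration; passing false reproduces the quoted check.
     */
    public static boolean maySplit(int regionPriority, boolean splitWhenBlocking) {
        return splitWhenBlocking || regionPriority >= PRIORITY_USER;
    }

    public static void main(String[] args) {
        int blocking = 16; // hypothetical blocking file count, for illustration only
        // One store is under the limit (4 files), one is over it (20 files).
        int priority = regionCompactPriority(blocking, Arrays.asList(4, 20));
        System.out.println("region priority = " + priority);
        System.out.println("may split (original check) = " + maySplit(priority, false));
        System.out.println("may split (flag enabled) = " + maySplit(priority, true));
    }
}
```

With these assumed numbers, a single store that is over the limit pulls the region's priority below PRIORITY_USER, so the original check refuses the split regardless of region size; the hypothetical flag bypasses only that check.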