Please check https://issues.apache.org/jira/browse/HBASE-21355 - if I remember correctly, it tracks a store-size accounting bug in HStore that makes regions split long before hbase.hregion.max.filesize is reached, which would line up with regions splitting at 1-2GB against your 20GB (21474836480-byte) limit right after the 1.4.7 to 1.4.8 upgrade.
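
Until you're on a release with the fix, keeping splits off is the right call. If you want to do it programmatically instead of through the shell, a sketch like this should work against the 1.4.x client API (it assumes hbase-site.xml is on the classpath; the class name is just a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.MasterSwitchType;

public class DisableSplits {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml / hbase-default.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Flip the cluster-wide split switch off; the second argument
            // makes the call synchronous. This is the API counterpart of
            // `splitormerge_switch 'SPLIT', false` in the shell.
            admin.setSplitOrMergeEnabled(false, true, MasterSwitchType.SPLIT);
            System.out.println("Splits still enabled? "
                + admin.isSplitOrMergeEnabled(MasterSwitchType.SPLIT));
        }
    }
}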
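
It's also worth double-checking the table descriptors themselves: a split policy set on the table (e.g. at creation time for the bulk-load tables) takes precedence over hbase.regionserver.region.split.policy in hbase-site.xml. A rough sketch for inspecting and pinning the policy per table ('my_table' and the method name are placeholders):

import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

public class CheckSplitPolicy {
    // Assumes an open Admin handle, as in the previous sketch.
    static void pinSplitPolicy(Admin admin, String tableName) throws Exception {
        TableName table = TableName.valueOf(tableName);
        HTableDescriptor desc = admin.getTableDescriptor(table);

        // null means the table inherits the cluster-wide policy.
        System.out.println("Table-level policy: "
            + desc.getRegionSplitPolicyClassName());

        // Pin the constant-size policy on the descriptor so a per-table
        // setting can't silently override the site-wide config.
        desc.setRegionSplitPolicyClassName(
            "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");
        admin.modifyTable(table, desc);
    }
}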
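
On the log4j question: these are my best guesses for the categories that log split decisions in 1.4.x (treat the class names as assumptions - CompactSplitThread schedules the splits, and the policy classes log why a region qualified):

# log4j.properties on the regionservers
log4j.logger.org.apache.hadoop.hbase.regionserver.CompactSplitThread=DEBUG
log4j.logger.org.apache.hadoop.hbase.regionserver.RegionSplitPolicy=DEBUG
log4j.logger.org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy=DEBUG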
- Ravi Singal

On Wed, 6 Feb 2019 at 4:18 AM, Austin Heyne <[email protected]> wrote:

> Hey all,
>
> We've just recently completed a bulk load of data into a number of
> tables, and once complete we restarted the cluster and migrated from EMR
> 5.19 to 5.20, moving our HBase version from 1.4.7 to 1.4.8.
> Everything seemed stable at the time, and there was no activity across
> the cluster for a few days. Yesterday we activated our live ingest
> pipeline, which is now pushing more data into those bulk-loaded tables.
> What we're seeing now, after about 24hrs of live ingest (~450 requests/s
> across 80 regionservers), is that HBase is splitting regions like crazy.
> In the last 24hrs we've doubled our region count, going from 24k to 51k
> regions.
>
> Looking at the tables, the regions being split are relatively small.
> One example table had 136 regions yesterday, with about 8-10GB per
> region. That table now has 1446 regions, each at 1-2GB, but has only
> grown by ~700GB.
>
> The current configuration we're using has the following values set, which
> I was under the impression would prevent this situation:
>
> <property>
>   <name>hbase.hregion.max.filesize</name>
>   <value>21474836480</value>
> </property>
>
> <property>
>   <name>hbase.regionserver.regionSplitLimit</name>
>   <value>256</value>
> </property>
>
> <property>
>   <name>hbase.regionserver.region.split.policy</name>
>   <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
> </property>
>
> For now I've disabled splits through the shell. Any insight into what may
> be causing this would be really appreciated. Also, if anyone is aware of
> a log4j config for this situation, that would be very useful.
>
> Thanks,
> Austin
