Hi, we are using CDH 5.7 with HBase 1.2 and are performance testing a regular load into HBase on a cluster with 4 Region Servers.
The input data is compressed binary files, around 2 TB, which we process and write to HBase as key-value pairs. The output in HBase is almost four times larger, around 8 TB, because we write the values as text. The process is a MapReduce job.

During the load we observed a lot of GC activity on the Region Servers, so we changed a couple of parameters to reduce GC time. We increased the flush size from 128 MB to 1 GB, compactionThreshold to 50, and regionserver.maxlogs to 42. These are the settings we changed from their defaults:

hbase.hregion.memstore.flush.size = 1 GB
hbase.hstore.max.filesize = 10 GB
hbase.hregion.preclose.flush.size = 50 MB
hbase.hstore.compactionThreshold = 50
hbase.regionserver.maxlogs = 42

After the load, we observed that the HBase table has only 4 regions, each around 2.5 TB in size. I am trying to understand which configuration parameter caused this.

I was going through this article: http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/

The region split policy in our HBase is org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy. According to this policy, the Region Server should split a region once the region size limit is exceeded.

Can someone explain the root cause?

Thanks,
Yeshwanth
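
P.S. For reference, here is a rough sketch of the split threshold arithmetic as I understand it for IncreasingToUpperBoundRegionSplitPolicy in HBase 1.x. It rests on several assumptions: that the initial size defaults to twice the memstore flush size, that the threshold grows with the cube of the number of regions of the table on the same Region Server, and that the 10 GB upper bound we set is actually the one the policy reads (the documented property name I have seen is hbase.hregion.max.filesize, which is not the name in our list). The function name split_size_to_check is hypothetical and used only for illustration.

```python
# Sketch only: approximates the size check described for
# IncreasingToUpperBoundRegionSplitPolicy. The formula is an assumption
# based on my reading, not code copied from HBase.

GB = 1024 ** 3

def split_size_to_check(regions_on_server, flush_size, max_file_size):
    """Return the store size (bytes) above which a split would be requested."""
    # Assumed safety rule: fall back to the upper bound for 0 or very many regions.
    if regions_on_server == 0 or regions_on_server > 100:
        return max_file_size
    # Assumed default initial size: twice the memstore flush size.
    initial_size = 2 * flush_size
    # Threshold grows with the cube of the region count, capped at the upper bound.
    return min(max_file_size, initial_size * regions_on_server ** 3)

# Our settings: flush size 1 GB, upper bound 10 GB.
for regions in (1, 2, 3):
    size = split_size_to_check(regions, flush_size=1 * GB, max_file_size=10 * GB)
    print(f"{regions} region(s) on server: split check at {size / GB:.0f} GB")
```

If these assumptions hold, the threshold would be 2 GB with one region per server and capped at 10 GB from two regions onward, which is far below the 2.5 TB regions we ended up with. So the size threshold by itself does not seem to explain the behaviour.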