We also pre-split our tables before bulk loading, but we don't use the
RegionSplitter. We split manually (we did some testing and found the
optimal split points) by putting a new HRegionInfo into the .META. table,
assigning that region (HBaseAdmin.assign("region name")), and the split is
in place after you finish assigning all of the regions.
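Not from the thread, but for anyone wanting to derive split points themselves: since MD5 row keys are uniformly distributed, evenly spaced points in the 128-bit hex key space are a reasonable starting set (this is similar in spirit to what RegionSplitter's HexStringSplit produces). A minimal sketch, with the class name and region count chosen here for illustration:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class HexSplitPoints {
    // Compute (numRegions - 1) evenly spaced split keys over the
    // 128-bit MD5 key space, left-padded to 32 hex characters so the
    // keys sort lexicographically the same way they sort numerically.
    static List<String> splitPoints(int numRegions) {
        BigInteger max = BigInteger.ONE.shiftLeft(128); // 2^128
        List<String> points = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i))
                                  .divide(BigInteger.valueOf(numRegions));
            points.add(String.format("%032x", point));
        }
        return points;
    }

    public static void main(String[] args) {
        // e.g. 4 regions -> boundaries at 0x4000..., 0x8000..., 0xc000...
        for (String p : splitPoints(4)) {
            System.out.println(p);
        }
    }
}
```

You would then hand these strings to whatever mechanism creates the regions, whether that is createTable with split keys or the manual .META. route described above.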
Hi,
I'm tuning HBase for storage of a few billion rows, loaded more or less in
bulk. I'm using MD5 strings as row ids to get an evenly distributed,
non-sequential key range during loading, and this is working relatively
well for us.
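To make the row-id scheme concrete, here is a small sketch (not from the thread; the method and key names are illustrative) of hashing a natural key into a fixed-width MD5 hex string, which is what spreads sequential inserts across the key space:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Md5RowKey {
    // Hash a natural key into a 32-character hex string. Because MD5
    // output is uniformly distributed, consecutive natural keys land in
    // unrelated parts of the table, avoiding a single hot region during
    // bulk loads.
    static String rowKey(String naturalKey) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(naturalKey.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(32);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rowKey("user:12345"));
    }
}
```

The trade-off, of course, is that range scans over the natural key order are no longer possible; for a pure bulk-load-and-point-lookup workload that is usually acceptable.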
I've pre-split my tables using org.apache.hadoop.hba