Hi!
http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/ says that a split just places 'reference' files. Why should there be massive splitting?
2014-10-02 23:08 GMT+04:00 Jean-Marc Spaggiari <[email protected]>:
> Hi Serega,
>
> Bulk load just "pushes" the file into an HBase region, so there should not be
> any issue. The split, however, might take some time because HBase will have to
> split it again and again until it becomes small enough. So if your max file
> size is 10GB, it will split it to 100GB, then 50GB, then 25GB, then 12GB, then
> 6GB... Each time, everything will be rewritten. A LOT of wasted IO.
>
> So the answer is: yes, HBase can handle it, BUT it's not good practice. Better
> to split the table beforehand and generate the bulk load files based on the
> split regions. Also, it might affect the other tables, because HBase will have
> to do massive IO, which in the end might impact performance.
>
> JM
>
> 2014-10-02 15:03 GMT-04:00 Serega Sheypak <[email protected]>:
>
> > Hi, I'm doing an HBase bulk load into an empty table.
> > Input data size is 200GB.
> > Is it OK to load the data into one default region and then wait while HBase
> > splits the 200GB region?
> >
> > I don't have any SLA for the initial load. I can wait until HBase splits the
> > initial load files.
> > This table is READ only.
> >
> > The only consideration is to not affect other tables and not cause HBase
> > cluster degradation.
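For reference, the halving sequence JM describes (200GB of input, 10GB max file size) can be worked out with a few lines of Python. This is only a rough sketch: it assumes every split cuts a region exactly in half, which a real cluster will not do precisely, and it says nothing about how much IO each round costs. The function name split_cascade is made up for this example and is not part of HBase.

```python
def split_cascade(total_gb, max_region_gb):
    """Return the region size after each round of halving, until it fits.

    Assumption: each split divides a region into two equal halves.
    """
    sizes = []
    size = float(total_gb)
    while size > max_region_gb:
        size /= 2
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    sizes = split_cascade(200, 10)
    print(sizes)       # [100.0, 50.0, 25.0, 12.5, 6.25]
    print(len(sizes))  # 5 rounds of splitting
```

Under the same even-halving assumption, five rounds leave 2**5 = 32 regions of about 6.25GB each, so a table pre-split into roughly that many regions before the bulk load would never need to go through the cascade.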
