The reference files will be rewritten during compaction, which normally happens right after splits.
You did not mention whether your 200GB of data is one file or many HFiles.

Jerry

On Oct 2, 2014 12:26 PM, "Serega Sheypak" <[email protected]> wrote:

> Sorry, massive IO.
> This table is read-only. So HBase should just place reference files; why
> would HBase rewrite the files?
>
> 2014-10-02 23:24 GMT+04:00 Serega Sheypak <[email protected]>:
>
> > Hi!
> > http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
> > says that splitting is just placing a 'reference' file.
> > Why should there be massive splitting?
> >
> > 2014-10-02 23:08 GMT+04:00 Jean-Marc Spaggiari <[email protected]>:
> >
> >> Hi Serega,
> >>
> >> Bulk load just "pushes" the file into an HBase region, so there should
> >> not be any issue. The split, however, might take some time because HBase
> >> will have to split it again and again until it becomes small enough. So
> >> if your max file size is 10GB, it will split it to 100GB then 50GB then
> >> 25GB then 12GB then 6GB... Each time, everything will be rewritten. A LOT
> >> of wasted IO.
> >>
> >> So the response is: yes, HBase can handle it, BUT it's not a good
> >> practice. Better to split the table beforehand and generate the bulk
> >> load files based on the split regions. Also, it might affect the other
> >> tables, because HBase will have to do massive IO, which in the end
> >> might impact performance.
> >>
> >> JM
> >>
> >> 2014-10-02 15:03 GMT-04:00 Serega Sheypak <[email protected]>:
> >>
> >> > Hi, I'm doing an HBase bulk load into an empty table.
> >> > Input data size is 200GB.
> >> > Is it OK to load the data into one default region and then wait while
> >> > HBase splits the 200GB region?
> >> >
> >> > I don't have any SLA for the initial load. I can wait until HBase
> >> > splits the initial load files.
> >> > This table is READ only.
> >> >
> >> > The only consideration is to not affect other tables and not cause
> >> > HBase cluster degradation.
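
For a rough sense of the IO cost discussed in this thread, here is a back-of-the-envelope sketch in Python. It is only an illustration under simplifying assumptions that HBase does not guarantee: every region halves exactly, all regions split in lockstep, and the compaction following each round of splits rewrites the full data set once. The function name `split_rounds` is made up for this example and is not part of any HBase API.

```python
def split_rounds(total_gb, max_region_gb):
    """Halve a single region repeatedly until each piece fits under the limit.

    Returns (number of split rounds, final region count, final region size in GB).
    Assumes ideal, even halving, which real splits only approximate.
    """
    rounds = 0
    regions = 1
    region_gb = float(total_gb)
    while region_gb > max_region_gb:
        region_gb /= 2      # each region is cut in half
        regions *= 2        # so the region count doubles
        rounds += 1
    return rounds, regions, region_gb


# Numbers from the thread: 200GB loaded into one region, 10GB max file size.
rounds, regions, region_gb = split_rounds(200, 10)

# Assumption: compaction after each round rewrites all 200GB once.
rewritten_gb = rounds * 200

print(f"{rounds} split rounds, {regions} regions of {region_gb}GB each")
print(f"roughly {rewritten_gb}GB rewritten by post-split compactions")
```

Under these assumptions the sizes follow the same sequence JM describes (100, 50, 25, 12.5, 6.25), and the total rewritten volume is several times the size of the original load, which is the argument for creating the table pre-split before the bulk load.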
