Bulk importing requires the table you are importing into to already exist, because the MapReduce job needs to extract the region start/end keys in order to drive the reducers. That means you need to create your table beforehand with the appropriate pre-splits, then run your bulk ingest and bulk load to get the data into the table. If you don't pre-split the table, you end up with a single reducer in your bulk ingest job. It also means your bulk ingest cluster needs to be able to communicate with your HBase instance.
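
To make that concrete, here is a rough, untested sketch of the setup side using the HBase 2.x Java client. The table name "mytable", the column family "cf", and the split keys are just placeholders for illustration; substitute whatever matches your own key space.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class BulkImportSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {

            // 1. Create the table beforehand with pre-split region boundaries.
            //    These split keys are placeholders; pick them from your own row keys.
            byte[][] splits = {
                Bytes.toBytes("row-25000"),
                Bytes.toBytes("row-50000"),
                Bytes.toBytes("row-75000")
            };
            TableName name = TableName.valueOf("mytable");
            admin.createTable(
                TableDescriptorBuilder.newBuilder(name)
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
                    .build(),
                splits);

            // 2. Configure the bulk ingest MapReduce job. configureIncrementalLoad
            //    reads the region start/end keys from the table and sets up one
            //    reducer per region with a total-order partitioner, so each reducer
            //    writes HFiles covering exactly one region.
            Job job = Job.getInstance(conf, "bulk-ingest");
            try (Table table = conn.getTable(name);
                 RegionLocator locator = conn.getRegionLocator(name)) {
                HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
            }
            // ... set your mapper, input format, and HFile output path, then run
            // the job and finish with the bulk load step to move the HFiles into
            // the table's regions.
        }
    }
}

The final bulk load (LoadIncrementalHFiles, i.e. the "completebulkload" tool) only moves the generated HFiles into the existing regions, which is why the table and its region boundaries have to exist before the ingest job runs.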

-Austin

On 7/18/19 4:39 AM, Michael wrote:
Hi,

I looked at the possibility of bulk importing into HBase, but somehow I
don't get it. I am not able to presplit the data, so
does bulk importing work without presplitting?
As I understand it, instead of putting the data, I create the HBase
region files, but all the tutorials I read mention presplitting...

So, is presplitting essential for bulk importing?

It would be really helpful if someone could point me to a demo
implementation of a bulk import.

Thanks for helping
  Michael

