RE: Optimizing external table structure

2016-02-14 Thread Riesland, Zack
of the data. I believe that there is a very similar CsvBulkLoad tool in the HBase jars.
-----Original Message-----
From: Liu, Ming (Ming) [mailto:ming@esgyn.cn]
Sent: Saturday, February 13, 2016 7:06 PM
To: user@hive.apache.org
Subject: re: Optimizing external table structure
Hi, Zack, Can

Optimizing external table structure

2016-02-13 Thread Riesland, Zack
On a daily basis, we move large amounts of data from Hive to HBase via Phoenix. To do this, we create an external Hive table with the data we need to move (all a subset of one compressed ORC table), and then use the Phoenix CsvBulkUpload utility. From everything I've read, this is the
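A minimal sketch of the pipeline described above, assuming hypothetical table names, paths, and a ZooKeeper quorum (the export table, HDFS paths, and Phoenix client jar version are placeholders, not from the thread):

```shell
# 1. Stage the subset of the ORC table as an external Hive table backed by CSV,
#    so the files are readable by Phoenix's CSV bulk loader.
hive -e "
CREATE EXTERNAL TABLE export_subset (id BIGINT, payload STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/export_subset';

INSERT OVERWRITE TABLE export_subset
SELECT id, payload FROM big_orc_table WHERE ds = '2016-02-13';
"

# 2. Bulk-load the staged CSV files into the Phoenix/HBase table.
#    CsvBulkLoadTool writes HFiles via MapReduce and hands them to HBase,
#    bypassing the normal write path.
hadoop jar phoenix-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table MY_PHOENIX_TABLE \
  --input /tmp/export_subset \
  --zookeeper zk1:2181
```

This matches the approach in the thread: stage once in Hive, then let the bulk loader build HFiles directly rather than issuing row-by-row upserts.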

Re: Optimizing external table structure

2016-02-13 Thread Jörn Franke
How many disk drives do you have per node? Generally one node should have 12 drives, configured neither as RAID nor as LVM. Files could be a bit larger (4 GB, or better 40 GB; your NameNode will thank you), or use Hadoop Archive (HAR). I am not sure about the latest status of
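For the HAR suggestion above, a sketch of packing many small files into a single archive to reduce NameNode metadata pressure (paths and archive name are hypothetical):

```shell
# Pack everything under /user/hive/warehouse/small_files into one archive.
# -p sets the parent path; the last argument is the destination directory.
# This runs a MapReduce job and produces data.har containing the originals.
hadoop archive -archiveName data.har \
  -p /user/hive/warehouse/small_files \
  /user/hive/archives

# The archived files remain readable through the har:// filesystem scheme,
# e.g. for listing or as an external table LOCATION:
hadoop fs -ls har:///user/hive/archives/data.har
```

The trade-off: one HAR consumes a handful of NameNode objects instead of one per small file, but the archive is immutable, so it suits cold, append-never data like daily export snapshots.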

RE: Optimizing external table structure

2016-02-13 Thread Riesland, Zack
Thanks. We have 16 disks per node, to answer your question.
From: Jörn Franke [jornfra...@gmail.com]
Sent: Saturday, February 13, 2016 9:46 AM
To: user@hive.apache.org
Subject: Re: Optimizing external table structure
How many disk drives do you have

re: Optimizing external table structure

2016-02-13 Thread Liu, Ming (Ming)
@hive.apache.org
Subject: RE: Optimizing external table structure
Thanks. We have 16 disks per node, to answer your question.
From: Jörn Franke [jornfra...@gmail.com]
Sent: Saturday, February 13, 2016 9:46 AM
To: user@hive.apache.org
Subject: Re: Optimizing external table