Re: HBaseContext with Spark

2017-01-27 Thread Chetan Khatri
storage handler bulk load:
SET hive.hbase.bulk=true; INSERT OVERWRITE TABLE users SELECT … ;
But for now, you have to do some work and issue multiple Hive commands:
- Sample source data for range partitioning
- Save sampling results to a file
- Run CLUSTER BY query using HiveHFileOutputFormat and

Re: HBaseContext with Spark

2017-01-27 Thread Chetan Khatri
@Ted, I don't think so.

On Thu, Jan 26, 2017 at 6:35 AM, Ted Yu wrote:
> Does the storage handler provide bulk load capability?
>
> Cheers
>
> On Jan 25, 2017, at 3:39 AM, Amrit Jangid wrote:
>
> Hi chetan,
>
> If you just need HBase Data into

Re: HBaseContext with Spark

2017-01-25 Thread Amrit Jangid
Hi chetan,

If you just need to get HBase data into Hive, you can use a Hive EXTERNAL TABLE with STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'. Try this and see whether it solves your problem: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

Regards
Amrit

On Wed, Jan 25,
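
To make the storage-handler suggestion above concrete, here is a minimal sketch in Python that only builds the HiveQL text for such an external table. The helper name `hbase_external_table_ddl`, the table names `users_hive` and `users`, and the `info` column family are all hypothetical and chosen for illustration. The `hbase.columns.mapping` and `hbase.table.name` properties are the ones described on the Hive HBaseIntegration wiki page linked above. The sketch assumes the HBase table already exists and, for simplicity, requires exactly one `:key` entry in the mapping.

```python
def hbase_external_table_ddl(hive_table, hbase_table, columns, mapping):
    """Build a HiveQL statement that maps an existing HBase table into Hive.

    columns: list of (hive_column_name, hive_type) pairs.
    mapping: list of HBase column specs (":key" or "family:qualifier"),
             one per Hive column, in the same order.
    """
    if len(columns) != len(mapping):
        raise ValueError("need exactly one HBase mapping entry per Hive column")
    # Simplifying assumption of this sketch: a single row-key column.
    if mapping.count(":key") != 1:
        raise ValueError("exactly one column must map to the HBase row key (:key)")
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({cols})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{','.join(mapping)}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


# Hypothetical table and column names, for illustration only.
ddl = hbase_external_table_ddl(
    "users_hive",
    "users",
    [("id", "string"), ("name", "string"), ("age", "int")],
    [":key", "info:name", "info:age"],
)
print(ddl)
```

The generated statement would then be run from the Hive CLI or Beeline. Note that this route reads HBase through the storage handler at query time; it is not a bulk load.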

Re: HBaseContext with Spark

2017-01-25 Thread Ted Yu
The references are vendor specific. I suggest contacting the vendor's mailing list about your PR. My initial interpretation of "HBase repository" was the Apache one.

Cheers

On Wed, Jan 25, 2017 at 7:38 AM, Chetan Khatri wrote:
> @Ted Yu, Correct but HBase-Spark module

Re: HBaseContext with Spark

2017-01-25 Thread Chetan Khatri
@Ted Yu, Correct, but the HBase-Spark module available in the HBase repository seems too old and the code is not optimized yet; I have already submitted a PR for the same. I don't know if it is clearly mentioned that it is now part of HBase itself, then people are committing to older repo where

Re: HBaseContext with Spark

2017-01-25 Thread Ted Yu
Though no HBase release has the hbase-spark module, you can find the backport patch on HBASE-14160 (for Spark 1.6). You can build the hbase-spark module yourself.

Cheers

On Wed, Jan 25, 2017 at 3:32 AM, Chetan Khatri wrote:
> Hello Spark Community Folks,
>

HBaseContext with Spark

2017-01-25 Thread Chetan Khatri
Hello Spark Community Folks,

Currently I am using HBase 1.2.4 and Hive 1.2.1, and I am looking for a way to bulk load from HBase to Hive. I have seen a couple of good examples in the HBase GitHub repo: https://github.com/apache/hbase/tree/master/hbase-spark

If I would like to use HBaseContext with HBase 1.2.4,