Re: SPARK - DataFrame for BulkLoad

2016-05-18 Thread Michael Segel
Yes, but he’s using Phoenix, which may not work cleanly with your HBase Spark module. The key issue here may be Phoenix, which is separate from HBase.

> On May 18, 2016, at 5:36 AM, Ted Yu wrote:
>
> Please see HBASE-14150
>
> The hbase-spark module would be available

Re: SPARK - DataFrame for BulkLoad

2016-05-18 Thread Ted Yu
Please see HBASE-14150.

The hbase-spark module would be available in the upcoming HBase 2.0 release.

On Tue, May 17, 2016 at 11:48 PM, Takeshi Yamamuro wrote:
> Hi,
>
> Have you checked this?
>
>

Re: SPARK - DataFrame for BulkLoad

2016-05-18 Thread Takeshi Yamamuro
Hi,

Have you checked this?
http://mail-archives.apache.org/mod_mbox/spark-user/201311.mbox/%3ccacyzca3askwd-tujhqi1805bn7sctguaoruhd5xtxcsul1a...@mail.gmail.com%3E

// maropu

On Wed, May 18, 2016 at 1:14 PM, Mohanraj Ragupathiraj <mohanaug...@gmail.com> wrote:
> I have 100 million records to

SPARK - DataFrame for BulkLoad

2016-05-17 Thread Mohanraj Ragupathiraj
I have 100 million records to be inserted into an HBase table (Phoenix) as the result of a Spark job. I would like to know: if I convert them to a DataFrame and save it, will that do a bulk load, or is it not an efficient way to write data to an HBase table?

--
Thanks and Regards
Mohan