Re: about spark on hbase

2015-12-18 Thread Akhil Das
*First you create the HBase configuration:*

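  import org.apache.hadoop.hbase.HBaseConfiguration
  import org.apache.hadoop.hbase.client.{HBaseAdmin, Mutation, Put, Result}
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable
  import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableOutputFormat}
  import org.apache.hadoop.hbase.util.Bytes
  import org.apache.hadoop.mapreduce.OutputFormat

  val hbaseTableName = "paid_daylevel"
  val hbaseColumnName = "paid_impression"
  val hconf = HBaseConfiguration.create()
  hconf.set("hbase.zookeeper.quorum", "sigmoid-dev-master")
  hconf.set("hbase.zookeeper.property.clientPort", "2182")
  hconf.set("hbase.defaults.for.version.skip", "true")
  // Table that saveAsNewAPIHadoopDataset will write to
  hconf.set(TableOutputFormat.OUTPUT_TABLE, hbaseTableName)
  hconf.setClass("mapreduce.job.outputformat.class",
    classOf[TableOutputFormat[String]], classOf[OutputFormat[String, Mutation]])
  val admin = new HBaseAdmin(hconf)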

*Then you read the values:*

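// TableInputFormat needs to know which table to scan
hconf.set(TableInputFormat.INPUT_TABLE, hbaseTableName)
val values = sparkContext.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result]).map {
  case (key, row) =>
    val rowkey = Bytes.toString(key.get())
    // The cell is stored as a string, so decode it before converting to Int
    val valu = Bytes.toString(row.getValue(Bytes.toBytes("CF"),
      Bytes.toBytes(hbaseColumnName)))
    (rowkey, valu.toInt)
}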



*Then you modify or do whatever you want with the values using the rdd
transformations and then save the values:*

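values.map { case (rowkey, value) =>
  val record = new Put(Bytes.toBytes(rowkey))
  // Put.add is deprecated since HBase 1.0 in favor of addColumn (same arguments)
  record.add(Bytes.toBytes("CF"), Bytes.toBytes(hbaseColumnName),
    Bytes.toBytes(value.toString))
  // TableOutputFormat only uses the value, so an empty key is enough
  (new ImmutableBytesWritable, record)
}.saveAsNewAPIHadoopDataset(hconf)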



You can also look at
http://spark-packages.org/package/nerdammer/spark-hbase-connector
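
For the get-then-update-then-put case, another option is to call the HBase client directly inside foreachPartition. The following is only a minimal sketch, assuming the HBase 1.x client API (ConnectionFactory, Table) and that hbase-site.xml is on the executors' classpath. The names GetThenPut, bump and updateCounters are hypothetical, and the cell is assumed to hold the number as a string, as in the code above. The Get and the Put are separate calls, so the update is not atomic.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD

object GetThenPut {
  // Hypothetical pure helper: new cell value from the old one (a missing cell counts as 0)
  def bump(current: Option[String], delta: Int): String =
    (current.map(_.toInt).getOrElse(0) + delta).toString

  // Hypothetical helper: for every (rowkey, delta), Get the cell, add delta, Put it back
  def updateCounters(updates: RDD[(String, Int)], tableName: String,
                     family: String, qualifier: String): Unit =
    updates.foreachPartition { rows =>
      // Connections are not serializable, so open one per partition on the executor
      val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = connection.getTable(TableName.valueOf(tableName))
      try {
        val cf = Bytes.toBytes(family)
        val col = Bytes.toBytes(qualifier)
        rows.foreach { case (rowkey, delta) =>
          val row = Bytes.toBytes(rowkey)
          // getValue returns null when the cell does not exist
          val current = Option(table.get(new Get(row)).getValue(cf, col))
            .map(b => Bytes.toString(b))
          val put = new Put(row)
          put.addColumn(cf, col, Bytes.toBytes(bump(current, delta)))
          table.put(put)
        }
      } finally {
        table.close()
        connection.close()
      }
    }
}
```

It would be called as GetThenPut.updateCounters(rdd, "paid_daylevel", "CF", "paid_impression"), where rdd holds (rowkey, delta) pairs.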




Thanks
Best Regards

On Tue, Dec 15, 2015 at 6:08 PM, censj  wrote:

> Hi all,
> How can I use Spark to get a value from HBase, update that value, and
> then put it into HBase?
>


Re: About Spark On Hbase

2015-12-15 Thread Josh Mahonin
And as yet another option, there is
https://phoenix.apache.org/phoenix_spark.html

It however requires that you are also using Phoenix in conjunction with
HBase.
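
A rough sketch of what a read-update-write could look like with the phoenix-spark DataFrame integration, assuming Spark 1.4+ and an existing Phoenix table. The table name, column names and ZooKeeper URL below are made up for illustration, and PhoenixBump and phoenixOptions are hypothetical names.

```scala
import org.apache.spark.sql.{SQLContext, SaveMode}

object PhoenixBump {
  // Options understood by the phoenix-spark data source
  def phoenixOptions(table: String, zkUrl: String): Map[String, String] =
    Map("table" -> table, "zkUrl" -> zkUrl)

  // Hypothetical job: add 1 to PAID_IMPRESSION for every row and write the result back
  def run(sqlContext: SQLContext): Unit = {
    // Assumed names: table PAID_DAYLEVEL(ID, PAID_IMPRESSION), ZooKeeper at zk-host:2181
    val opts = phoenixOptions("PAID_DAYLEVEL", "zk-host:2181")
    val df = sqlContext.read.format("org.apache.phoenix.spark").options(opts).load()
    val updated = df.select(df("ID"), (df("PAID_IMPRESSION") + 1).as("PAID_IMPRESSION"))
    // phoenix-spark expects SaveMode.Overwrite; rows are written as upserts
    updated.write.format("org.apache.phoenix.spark")
      .mode(SaveMode.Overwrite).options(opts).save()
  }
}
```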

On Tue, Dec 15, 2015 at 4:16 PM, Ted Yu  wrote:

> There is also
> http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase
>
> FYI


Re: About Spark On Hbase

2015-12-15 Thread Ted Yu
There is also
http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase

FYI

On Tue, Dec 15, 2015 at 11:51 AM, Zhan Zhang  wrote:

> If you want DataFrame support, you can refer to
> https://github.com/zhzhan/shc, which I am working to integrate into
> upstream HBase alongside the existing support.
>
> Thanks.
>
> Zhan Zhang


Re: About Spark On Hbase

2015-12-15 Thread Zhan Zhang
If you want DataFrame support, you can refer to https://github.com/zhzhan/shc,
which I am working to integrate into upstream HBase alongside the existing support.

Thanks.

Zhan Zhang
On Dec 15, 2015, at 4:34 AM, censj <ce...@lotuseed.com> wrote:


Hi fight fate,
Can I use Get inside the bulkPut() function to read a value first, and then
put this value into HBase?




Re: About Spark On Hbase

2015-12-15 Thread censj

Hi fight fate,
Can I use Get inside the bulkPut() function to read a value first, and then
put this value into HBase?


> On Dec 9, 2015, at 16:02, censj wrote:
> 
> Thank you! I know



Re: About Spark On Hbase

2015-12-09 Thread censj
Thank you! I know
> On Dec 9, 2015, at 15:59, fightf...@163.com wrote:
> 
> If you are using Maven, you can add the Cloudera Maven repo to the
> repositories in pom.xml and add the spark-hbase dependency.
> I just found this:
> http://spark-packages.org/package/nerdammer/spark-hbase-connector
> As Feng Dongyu recommended, you can try this as well, but I have no
> experience with it.
> 
> 
> fightf...@163.com
>  



Re: Re: About Spark On Hbase

2015-12-09 Thread fightf...@163.com
If you are using Maven, you can add the Cloudera Maven repo to the repositories
in pom.xml and add the spark-hbase dependency.
I just found this:
http://spark-packages.org/package/nerdammer/spark-hbase-connector
As Feng Dongyu recommended, you can try this as well, but I have no experience
with it.




fightf...@163.com
 
From: censj
Date: 2015-12-09 15:44
To: fightf...@163.com
Cc: user@spark.apache.org
Subject: Re: About Spark On Hbase
So how do I get this jar? I build my project with sbt package, but I could not find an sbt library for it.



Re: About Spark On Hbase

2015-12-08 Thread censj
So how do I get this jar? I build my project with sbt package, but I could not find an sbt library for it.
> On Dec 9, 2015, at 15:42, fightf...@163.com wrote:
> 
> I don't think it really needs any CDH component. Just use the API.
> 
> fightf...@163.com
>  



Re: Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
I don't think it really needs any CDH component. Just use the API.
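
For an sbt build, a minimal build.sbt sketch for using the plain HBase API without CDH might look like this. The version numbers are placeholders and should be matched to the cluster; in HBase 1.x, TableInputFormat and TableOutputFormat ship in the hbase-server artifact.

```scala
// build.sbt (versions are assumptions; align them with your cluster)
scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"   % "1.5.2" % "provided",
  "org.apache.hbase" %  "hbase-client" % "1.1.2",
  "org.apache.hbase" %  "hbase-common" % "1.1.2",
  // TableInputFormat / TableOutputFormat live here in HBase 1.x
  "org.apache.hbase" %  "hbase-server" % "1.1.2"
)
```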



fightf...@163.com
 
From: censj
Date: 2015-12-09 15:31
To: fightf...@163.com
Cc: user@spark.apache.org
Subject: Re: About Spark On Hbase
But this depends on CDH. I have not installed CDH.



Re: About Spark On Hbase

2015-12-08 Thread censj
Can you give me an example?
I want to update HBase data.
> On Dec 9, 2015, at 15:19, Fengdong Yu wrote:
> 
> https://github.com/nerdammer/spark-hbase-connector 
> 
> 
> This is better and easy to use.



Re: About Spark On Hbase

2015-12-08 Thread censj
But this depends on CDH. I have not installed CDH.
> On Dec 9, 2015, at 15:18, fightf...@163.com wrote:
> 
> Actually, you can refer to https://github.com/cloudera-labs/SparkOnHBase
> Also, HBASE-13992 (https://issues.apache.org/jira/browse/HBASE-13992)
> already integrates that feature into HBase itself, but it has not been
> released yet.
> 
> Best,
> Sun.
> 
> fightf...@163.com
>  



Re: About Spark On Hbase

2015-12-08 Thread Fengdong Yu
https://github.com/nerdammer/spark-hbase-connector

This is better and easy to use.
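
A short sketch of a read-update-write with this connector, based on the usage shown in its README as far as I recall it, so please check the calls against the project page. The table, column family and column names are made up, ConnectorBump and increment are hypothetical names, and the HBase host is assumed to be set through the spark.hbase.host property on the SparkConf.

```scala
import it.nerdammer.spark.hbase._
import org.apache.spark.SparkContext

object ConnectorBump {
  // Pure update step applied to every (rowkey, value) pair
  def increment(row: (String, Int)): (String, Int) = (row._1, row._2 + 1)

  def run(sc: SparkContext): Unit = {
    // Assumed names: table "mytable", column family "mycf", column "counter".
    // The first tuple element is the row key, the second the selected column.
    val rows = sc.hbaseTable[(String, Int)]("mytable")
      .select("counter")
      .inColumnFamily("mycf")

    // Write the updated values back to the same rows
    rows.map(r => increment(r))
      .toHBaseTable("mytable")
      .toColumns("counter")
      .inColumnFamily("mycf")
      .save()
  }
}
```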





> On Dec 9, 2015, at 3:04 PM, censj  wrote:
> 
> Hi all,
> I am now using Spark, but I have not found an open-source project for
> working with HBase from Spark. Can anyone point me to one?
>  



Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
Actually, you can refer to https://github.com/cloudera-labs/SparkOnHBase
Also, HBASE-13992 (https://issues.apache.org/jira/browse/HBASE-13992) already
integrates that feature into HBase itself, but it has not been released yet.

Best,
Sun.



fightf...@163.com
 
From: censj
Date: 2015-12-09 15:04
To: user@spark.apache.org
Subject: About Spark On Hbase
Hi all,
I am now using Spark, but I have not found an open-source project for
working with HBase from Spark. Can anyone point me to one?