about spark on hbase problem

2021-08-17 Thread igyu
System.setProperty("java.security.krb5.conf", config.getJSONObject("auth").getString("krb5")) val conf = HBaseConfiguration.create() val zookeeper = config.getString("zookeeper") val port = config.getString("port") conf.set(HConstants.ZOOKEEPER_QUORUM, zookeeper)
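
For readers following along, the snippet above roughly expands to the following. This is a minimal sketch: the JSON `config` object is replaced by hard-coded placeholder values, and the two `*.security.authentication` properties are assumptions about what a Kerberized setup typically needs, not part of the quoted snippet.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, HConstants}

object KerberosHBaseConf {
  def main(args: Array[String]): Unit = {
    // Point the JVM at the krb5.conf (placeholder path)
    System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")

    // Build an HBase configuration with the ZooKeeper quorum and port
    val conf = HBaseConfiguration.create()
    conf.set(HConstants.ZOOKEEPER_QUORUM, "zk1,zk2,zk3")      // placeholder hosts
    conf.set(HConstants.ZOOKEEPER_CLIENT_PORT, "2181")

    // Assumed extras for a Kerberized cluster
    conf.set("hadoop.security.authentication", "kerberos")
    conf.set("hbase.security.authentication", "kerberos")
  }
}
```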

Re: Spark submit hbase issue

2021-04-14 Thread Mich Talebzadeh
Try adding the hbase-site.xml file to %SPARK_HOME%\conf and see if it works. HTH

Spark submit hbase issue

2021-04-14 Thread KhajaAsmath Mohammed
Hi, Spark submit is connecting to localhost instead of the ZooKeeper quorum mentioned in hbase-site.xml. The same program works in the IDE, where it picks up hbase-site.xml. What am I missing in spark-submit? spark-submit --driver-class-path

Re: Spark RDD + HBase: adoption trend

2021-01-20 Thread Sean Owen
into that API for certain operations. If that's a connector to read data from HBase - you probably do want to return DataFrames ideally. Unless you're relying on very specific APIs from very specific versions, I wouldn't think a distro's Spark or HBase is much different? On Wed, Jan 20, 2021 at 7:44 AM Marco

Re: Spark RDD + HBase: adoption trend

2021-01-20 Thread Jacek Laskowski
/bigtable/spark where you can find a demo that I worked on last year and made sure that: "Apache HBase™ Spark Connector implements the DataSource API for Apache HBase and allows executing relational queries on data stored in Cloud Bigtable." That makes hbase-rdd even more obsolete but not n

Spark RDD + HBase: adoption trend

2021-01-20 Thread Marco Firrincieli
t sure how much of the community still works/uses RDDs. Also, for lack of time, we always mainly worked using Cloudera-flavored Hadoop/HBase & Spark versions. We were thinking the community would then help us organize the project in a more "generic" way, but that didn't happen. So I

Re: Needed some best practices to integrate Spark with HBase

2020-07-20 Thread YogeshGovi
I also need good docs on this, especially on integrating PySpark with Hive to read tables from HBase. -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/ - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Spark with HBase on Spark Runtime 2.2.1

2018-05-05 Thread SparkUser6
I wrote a simple program to read data from HBase; the program works fine in Cloudera backed by HDFS, on Spark runtime 1.6 on Cloudera. But it does NOT work on EMR with Spark runtime 2.2.1; I am getting an exception while testing data on EMR with S3. // Spark conf

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-09-30 Thread Debabrata Ghosh
Ayan, did you get the HBase connection working through PySpark as well? I have got the Spark-HBase connection working with Scala (via HBaseContext), but I eventually want to get this working within PySpark code. Would you have some suitable code snippets

Needed some best practices to integrate Spark with HBase

2017-09-29 Thread Debabrata Ghosh
Dear All, Greetings! I need some best practices for integrating Spark with HBase. Would you be able to point me to some useful resources/URLs at your convenience, please? Thanks, Debu

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-06-28 Thread ayan guha
2.1 and earlier - The ability to configure closure serializer - HTTPBroadcast - TTL-based metadata cleaning - *Semi-private class org.apache.spark.Logging. We suggest you use slf4j directly.* - SparkContext.metricsSystem

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-06-26 Thread Weiqing Yang
- The ability to configure closure serializer - HTTPBroadcast - TTL-based metadata cleaning - *Semi-private class org.apache.spark.Logging. We suggest you use slf4j directly.* - SparkContext.metricsSystem Thanks,

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-06-26 Thread ayan guha
From: ayan guha [mailto:guha.a...@gmail.com] Sent: Monday, June 26, 2017 6:26 AM To: Weiqing Yang Cc: user Subject: Re: HDP 2.5 - Python - Spark-On-Hbase Hi, I am using the following: --packages com.hortonwork

RE: HDP 2.5 - Python - Spark-On-Hbase

2017-06-25 Thread Mahesh Sawaiker
Yang Cc: user Subject: Re: HDP 2.5 - Python - Spark-On-Hbase Hi I am using following: --packages com.hortonworks:shc:1.0.0-1.6-s_2.10 --repositories http://repo.hortonworks.com/content/groups/public/ Is it compatible with Spark 2.X? I would like to use it Best Ayan On Sat, Jun 24, 2017 at 2

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-06-25 Thread ayan guha
Hi I am using following: --packages com.hortonworks:shc:1.0.0-1.6-s_2.10 --repositories http://repo.hortonworks.com/content/groups/public/ Is it compatible with Spark 2.X? I would like to use it Best Ayan On Sat, Jun 24, 2017 at 2:09 AM, Weiqing Yang wrote: >

Re: HDP 2.5 - Python - Spark-On-Hbase

2017-06-23 Thread Weiqing Yang
Yes. What SHC version were you using? If you hit any issues, you can post them in the SHC GitHub issues. There are some threads about this. On Fri, Jun 23, 2017 at 5:46 AM, ayan guha wrote: Hi, Is it possible to use SHC from Hortonworks with pyspark? If so, any working

HDP 2.5 - Python - Spark-On-Hbase

2017-06-23 Thread ayan guha
Hi Is it possible to use SHC from Hortonworks with pyspark? If so, any working code sample available? Also, I faced an issue while running the samples with Spark 2.0 "Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging" Any workaround? Thanks in advance -- Best

Re: How to set NameSpace while storing from Spark to HBase using saveAsNewAPIHadoopDataSet

2016-12-19 Thread Rabin Banerjee
Thanks, it worked!! On Mon, Dec 19, 2016 at 5:55 PM, Dhaval Modi <dhavalmod...@gmail.com> wrote: Replace with ":" Regards, Dhaval Modi On 19 December 2016 at 13:10, Rabin Banerjee <dev.rabin.baner...@gmail.com> wrote:

Re: How to set NameSpace while storing from Spark to HBase using saveAsNewAPIHadoopDataSet

2016-12-19 Thread Dhaval Modi
Replace the separator with ":" (i.e. qualify the table as namespace:table). Regards, Dhaval Modi On 19 December 2016 at 13:10, Rabin Banerjee <dev.rabin.baner...@gmail.com> wrote: HI All, I am trying to save data from Spark into HBase using the saveHadoopDataSet API. Please refer to the code below. The code is working fine
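
For context, the fix being discussed is to qualify the output table name with its namespace using a colon (namespace:table). A minimal sketch of how that looks in the job configuration consumed by saveAsNewAPIHadoopDataset; the namespace and table names here are made up for illustration:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.mapreduce.Job

object NamespacedOutput {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    // "myns:wordcount" targets table "wordcount" in namespace "myns";
    // a bare "wordcount" would land in the default namespace.
    conf.set(TableOutputFormat.OUTPUT_TABLE, "myns:wordcount")

    val job = Job.getInstance(conf)
    job.setOutputFormatClass(classOf[TableOutputFormat[_]])
    // rdd.saveAsNewAPIHadoopDataset(job.getConfiguration)
  }
}
```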

How to set NameSpace while storing from Spark to HBase using saveAsNewAPIHadoopDataSet

2016-12-19 Thread Rabin Banerjee
Hi All, I am trying to save data from Spark into HBase using the saveHadoopDataSet API. Please refer to the code below. The code is working fine, but the table is getting stored in the default namespace. How do I set the namespace in the code below? wordCounts.foreachRDD ( rdd => { val conf

Re: Spark to HBase Fast Bulk Upload

2016-09-19 Thread Kabeer Ahmed
Hi, Without using Spark there are a couple of options. You can refer to the link: http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/. The gist is that you convert the data into HFiles and use the bulk upload option to get the data quickly into HBase. HTH Kabeer. On
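
The Spark-side version of that approach writes the RDD out as HFiles with HFileOutputFormat2 and then hands the directory to the bulk loader. A rough sketch assuming HBase 1.x client APIs; the table name and output path are placeholders, and the commented-out write assumes an RDD of (ImmutableBytesWritable, KeyValue) pairs sorted by row key:

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{HFileOutputFormat2, LoadIncrementalHFiles}
import org.apache.hadoop.mapreduce.Job

object BulkLoadSketch {
  def main(args: Array[String]): Unit = {
    val conf  = HBaseConfiguration.create()
    val conn  = ConnectionFactory.createConnection(conf)
    val table = TableName.valueOf("mytable") // placeholder

    // Sets up the total-order partitioner and per-region splits for the job
    val job = Job.getInstance(conf)
    HFileOutputFormat2.configureIncrementalLoad(
      job, conn.getTable(table), conn.getRegionLocator(table))

    // sortedKvs: RDD[(ImmutableBytesWritable, KeyValue)], sorted by row key
    // sortedKvs.saveAsNewAPIHadoopFile("/tmp/hfiles",
    //   classOf[ImmutableBytesWritable], classOf[KeyValue],
    //   classOf[HFileOutputFormat2], job.getConfiguration)

    // Move the generated HFiles into the table's regions
    new LoadIncrementalHFiles(conf).doBulkLoad(
      new Path("/tmp/hfiles"), conn.getAdmin,
      conn.getTable(table), conn.getRegionLocator(table))

    conn.close()
  }
}
```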

Spark to HBase Fast Bulk Upload

2016-09-19 Thread Punit Naik
Hi Guys I have a huge dataset (~ 1TB) which has about a billion records. I have to transfer it to an HBase table. What is the fastest way of doing it? -- Thank You Regards Punit Naik

Re: Issues with Spark On Hbase Connector and versions

2016-08-30 Thread Weiqing Yang
Subject: Re: Issues with Spark On Hbase Connector and versions There is a connection leak problem with the Hortonworks HBase connector if you use HBase 1.2.0. I tried to use Hortonworks' connector and ran into the same problem. Have a look at this

Re: Issues with Spark On Hbase Connector and versions

2016-08-29 Thread Sachin Jain
wer these 1. What versions of Hbase & Spark are expected? I could not run the examples provided using spark 1.6.0 & hbase 1.2.0 2. I get an error when I run the example provided here <https://github.com/hortonworks/shc/blob/master/src/main/scala/org/apache/sp

Issues with Spark On Hbase Connector and versions

2016-08-27 Thread spats
Regarding the HBase connector by Hortonworks, https://github.com/hortonworks-spark/shc, it would be great if someone could answer these: 1. What versions of HBase & Spark are expected? I could not run the examples provided using Spark 1.6.0 & HBase 1.2.0. 2. I get an error when I run the example provided here

RE: Spark with HBase Error - Py4JJavaError

2016-07-08 Thread Puneet Tripathi
Hi Ram, Thanks very much it worked. Puneet From: ram kumar [mailto:ramkumarro...@gmail.com] Sent: Thursday, July 07, 2016 6:51 PM To: Puneet Tripathi Cc: user@spark.apache.org Subject: Re: Spark with HBase Error - Py4JJavaError Hi Puneet, Have you tried appending --jars $SPARK_HOME/lib/spark

Re: Spark with HBase Error - Py4JJavaError

2016-07-07 Thread ram kumar
From: Puneet Tripathi [mailto:puneet.tripa...@dunnhumby.com] Sent: Thursday, July 07, 2016 12:42 PM To: user@spark.apache.org Subject: Spark with HBase Error - Py4JJavaError Hi, We are running Hbase in fully distributed mode.

RE: Spark with HBase Error - Py4JJavaError

2016-07-07 Thread Puneet Tripathi
Guys, Please can anyone help on the issue below? Puneet From: Puneet Tripathi [mailto:puneet.tripa...@dunnhumby.com] Sent: Thursday, July 07, 2016 12:42 PM To: user@spark.apache.org Subject: Spark with HBase Error - Py4JJavaError Hi, We are running Hbase in fully distributed mode. I tried

Spark with HBase Error - Py4JJavaError

2016-07-07 Thread Puneet Tripathi
Hi, We are running HBase in fully distributed mode. I tried to connect to HBase via pyspark and then write to HBase using saveAsNewAPIHadoopDataset, but it failed; the error says: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.saveAsHadoopDataset. :

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-04-14 Thread Teng Qiu
Forwarding you these mails; hope they can help you. You can take a look at this post: http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html 2016-03-04 3:30 GMT+01:00 Divya Gehlot <divya.htco...@gmail.com>: Hi Teng, Thanks for the link you shared, it helpe

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-04-08 Thread Wojciech Indyk
Hello Divya! Have you solved the problem? I suppose the log comes from driver. You need to look also at logs on worker JVMs, there can be an exception or something. Do you have Kerberos on your cluster? It could be similar to a problem http://issues.apache.org/jira/browse/SPARK-14115 Based on

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-03-01 Thread Teng Qiu
And also make sure that hbase-site.xml is set in your classpath on all nodes, both master and workers, and also the client. Normally I put it into $SPARK_HOME/conf/, then the Spark cluster will be started with this conf file. BTW @Ted, did you try inserting into an HBase table with Spark's HiveContext?

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-03-01 Thread Ted Yu
16/03/01 01:36:31 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal): java.lang.RuntimeException: hbase-default.xml file seems to be for an older version of HBase (null), this version is 1.1.2.2.3.4.0-3485 The above was likely caused by some

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-02-29 Thread Ted Yu
16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=9 watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase Since baseZNode didn't match what you set in hbase-site.xml, the cause was likely that hbase-site.xml being

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-02-29 Thread Ted Yu
16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error) Is your cluster a secure cluster? bq. Trace : Was there any output after 'Trace :'? Was hbase-site.xml accessible to your Spark job

[ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

2016-02-29 Thread Divya Gehlot
Hi, I am getting error when I am trying to connect hive table (which is being created through HbaseIntegration) in spark Steps I followed : *Hive Table creation code *: CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH

Re: Spark and HBase RDD join/get

2016-01-14 Thread Ted Yu
For #1, yes it is possible. You can find some example in hbase-spark module of hbase where hbase as DataSource is provided. e.g. https://github.com/apache/hbase/blob/master/hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/HBaseRDDFunctions.scala Cheers On Thu, Jan 14, 2016 at 5:04 AM
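
A minimal sketch of option #1, point lookups from inside each partition rather than a full table scan plus join. The table, column family/qualifier, and an RDD of user-id strings are all assumptions for illustration:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD

object GetPerPartition {
  // ids: RDD[String] of user ids; returns (id, looked-up value) pairs
  def enrich(ids: RDD[String]): RDD[(String, String)] =
    ids.mapPartitions { iter =>
      // One connection per partition, not per element
      val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("users")) // placeholder
      val out = iter.map { id =>
        val res = table.get(new Get(Bytes.toBytes(id)))
        (id, Bytes.toString(res.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"))))
      }.toList // materialize before closing the connection
      table.close(); conn.close()
      out.iterator
    }
}
```

Whether this beats a scan-plus-join depends on how many keys you look up relative to the table size; per-key gets win when the RDD is small compared to the table.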

Spark and HBase RDD join/get

2016-01-14 Thread Kristoffer Sjögren
Hi, We have an RDD that needs to be mapped with information from HBase, where the exact key is the user id. What are the different alternatives for doing this? - Is it possible to do HBase.get() requests from a map function in Spark? - Or should we join RDDs with a full HBase table scan? I ask

Re: Spark and HBase RDD join/get

2016-01-14 Thread Kristoffer Sjögren
Thanks Ted! On Thu, Jan 14, 2016 at 4:49 PM, Ted Yu <yuzhih...@gmail.com> wrote: For #1, yes it is possible. You can find some examples in the hbase-spark module of HBase, where HBase as a DataSource is provided, e.g. https://github.com/apache/hbase/blob/master

Re: about spark on hbase

2015-12-18 Thread Akhil Das
hen save the values: values.map(valu => (new ImmutableBytesWritable, { val record = new Put(Bytes.toBytes(valu._1)); record.add(Bytes.toBytes("CF"), Bytes.toBytes(hbaseColumnName), Bytes.toBytes(valu._2.toString)); record })).s

Re: About Spark On Hbase

2015-12-15 Thread Zhan Zhang
You can add the Cloudera Maven repo to the repositories in pom.xml and add the dependency on spark-hbase. I just found this: http://spark-packages.org/package/nerdammer/spark-hbase-connector. As Feng Dongyu recommended, you can try this as well, but I have no experience using it.

Re: About Spark On Hbase

2015-12-15 Thread Ted Yu
There is also http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase FYI On Tue, Dec 15, 2015 at 11:51 AM, Zhan Zhang <zzh...@hortonworks.com> wrote: > If you want dataframe support, you can refer to > https://github.com/zhzhan/shc, which I am working on to integrat

Re: About Spark On Hbase

2015-12-15 Thread Josh Mahonin
Huawei-Spark/Spark-SQL-on-HBase FYI On Tue, Dec 15, 2015 at 11:51 AM, Zhan Zhang <zzh...@hortonworks.com> wrote: If you want dataframe support, you can refer to https://github.com/zhzhan/shc, which I am working on to integrate to HBase up

about spark on hbase

2015-12-15 Thread censj
hi, all: How could I use Spark to get a value from HBase, update it, and then put the updated value back to HBase?

Re: About Spark On Hbase

2015-12-15 Thread censj
If you are using Maven, you can add the Cloudera Maven repo to the repositories in pom.xml and add the dependency on spark-hbase. I just found this: http://spark-packages.org/package/nerdammer/spark-hbase-connector <http://spark-packages.o

Re: About Spark On Hbase

2015-12-09 Thread fightf...@163.com
If you are using Maven, you can add the Cloudera Maven repo to the repositories in pom.xml and add the dependency on spark-hbase. I just found this: http://spark-packages.org/package/nerdammer/spark-hbase-connector. As Feng Dongyu recommended, you can try this as well, but I have no experience

Re: About Spark On Hbase

2015-12-09 Thread censj
Thank you! I know. On Dec 9, 2015, at 15:59, fightf...@163.com wrote: If you are using Maven, you can add the Cloudera Maven repo to the repositories in pom.xml and add the dependency on spark-hbase. I just found this: http://spark-packages.org/package/nerdammer/

Re: About Spark On Hbase

2015-12-08 Thread censj
Can you give me an example? I want to update base data. On Dec 9, 2015, at 15:19, Fengdong Yu <fengdo...@everstring.com> wrote: https://github.com/nerdammer/spark-hbase-connector This is better and easy to u

Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
I don't think it really needs a CDH component. Just use the API. fightf...@163.com From: censj Sent: 2015-12-09 15:31 To: fightf...@163.com Cc: user@spark.apache.org Subject: Re: About Spark On Hbase But this depends on CDH. I have not installed CDH. On Dec 9, 2015, at 15:18, fightf...@163.com wrote: Actually

Re: About Spark On Hbase

2015-12-08 Thread censj
tuseed.com> Sent: 2015-12-09 15:31 To: fightf...@163.com <mailto:fightf...@163.com> Cc: user@spark.apache.org <mailto:user@spark.apache.org> Subject: Re: About Spark On Hbase But this depends on CDH. I have not installed CDH. On Dec 9, 2015, at 15:18, fightf...@16

Re: About Spark On Hbase

2015-12-08 Thread fightf...@163.com
Spark On Hbase hi all, I am using Spark now, but I have not found an open-source Spark-HBase integration. Can anyone tell me of one?

Re: About Spark On Hbase

2015-12-08 Thread censj
12-09 15:04 To: user@spark.apache.org Subject: About Spark On Hbase hi all, I am using Spark now, but I have not found an open-source Spark-HBase integration. Can anyone tell me of one?

Re: About Spark On Hbase

2015-12-08 Thread Fengdong Yu
https://github.com/nerdammer/spark-hbase-connector This is better and easier to use. On Dec 9, 2015, at 3:04 PM, censj <ce...@lotuseed.com> wrote: hi all, I am using Spark now, but I have not found an open-source Spark-HBase integration. Can anyone tell me of one?

About Spark On Hbase

2015-12-08 Thread censj
hi all, I am using Spark now, but I have not found an open-source Spark-HBase integration. Can anyone tell me of one?

Re: Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Ruslan Dautkhanov
erberized cluster and ticket was generated using kinit command > before running spark job. That's why Spark on hbase worked but when phoenix > is used to get the connection to hbase, it does not pass the authentication > to all nodes. Probably it is not handled in Phoenix version 4.3 or

Re: Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Akhilesh Pathodia
Yes, its a kerberized cluster and ticket was generated using kinit command before running spark job. That's why Spark on hbase worked but when phoenix is used to get the connection to hbase, it does not pass the authentication to all nodes. Probably it is not handled in Phoenix version 4.3

Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Akhilesh Pathodia
Hi, I am running a Spark job on YARN in cluster mode in a secured cluster. I am trying to run Spark on HBase using Phoenix, but Spark executors are unable to get an HBase connection using Phoenix. I am running the kinit command to get the ticket before starting the job, and also the keytab file and principal

Re: Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Ruslan Dautkhanov
That error is not directly related to spark nor hbase javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] Is this a kerberized cluster? You likely don't have a good (non-expired

spark to hbase

2015-10-27 Thread jinhong lu
Hi, I write my result to HDFS, and it works well: val model = lines.map(pairFunction).groupByKey().flatMap(pairFlatMapFunction).aggregateByKey(new TrainFeature())(seqOp, combOp).values model.map(a => (a.toKey() + "\t" + a.totalCount + "\t" + a.positiveCount)).saveAsTextFile(modelDataPath); But

Re: There is any way to write from spark to HBase CDH4?

2015-10-27 Thread Adrian Tanase
Also I just remembered about cloudera’s contribution http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/ From: Deng Ching-Mallete Date: Tuesday, October 27, 2015 at 12:03 PM To: avivb Cc: user Subject: Re: There is any way to write from spark to HBase

Re: There is any way to write from spark to HBase CDH4?

2015-10-27 Thread Adrian Tanase
ey.com> wrote: I have already tried it with https://github.com/unicredit/hbase-rdd and https://github.com/nerdammer/spark-hbase-connector, and in both cases I get a timeout. So I would like to know about other options to write from Spark to HBase on CDH4. Thanks!

Re: spark to hbase

2015-10-27 Thread Deng Ching-Mallete
Hi, It would be more efficient if you configure the table and flush the commits by partition instead of per element in the RDD. The latter works fine because you only have 4 elements, but it won't bode well for large data sets IMO. Thanks, Deng On Tue, Oct 27, 2015 at 5:22 PM, jinhong lu
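
A sketch of the suggested pattern: one connection per partition and a single batched write, rather than a put per element. The table, column family/qualifier, and the assumption that the RDD holds (String, String) pairs are all made up for illustration:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD
import scala.collection.JavaConverters._

object BatchedWrite {
  def write(rdd: RDD[(String, String)]): Unit =
    rdd.foreachPartition { iter =>
      // One connection and table handle per partition
      val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("mytable")) // placeholder
      try {
        val puts = iter.map { case (k, v) =>
          val p = new Put(Bytes.toBytes(k))
          p.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(v))
          p
        }.toList
        table.put(puts.asJava) // one batched write per partition
      } finally {
        table.close()
        conn.close()
      }
    }
}
```

For very large partitions you would chunk the list (e.g. every 10k puts) instead of buffering the whole partition in memory.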

Re: There is any way to write from spark to HBase CDH4?

2015-10-27 Thread Deng Ching-Mallete
PM, Adrian Tanase <atan...@adobe.com> wrote: Also I just remembered about cloudera's contribution: http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/ From: Deng Ching-Mallete Date: Tuesday, October 27, 2015 at 12:03 PM

Re: There is any way to write from spark to HBase CDH4?

2015-10-27 Thread Adrian Tanase
to write from spark to HBase CDH4? It's still in HBase's trunk, scheduled for the 2.0.0 release based on the Jira ticket. -Deng On Tue, Oct 27, 2015 at 6:35 PM, Fengdong Yu <fengdo...@everstring.com> wrote: Was this released with Spark 1.x, or is it still kept

Re: spark to hbase

2015-10-27 Thread fightf...@163.com
Hi, I notice that you configured the following: configuration.set("hbase.master", "192.168.1:6"); Did you mistype the host IP? Best, Sun. fightf...@163.com From: jinhong lu Sent: 2015-10-27 17:22 To: spark users Subject: spark to hbase Hi, I write my result to hd

Re: There is any way to write from spark to HBase CDH4?

2015-10-27 Thread Fengdong Yu
Was this released with Spark 1.x, or is it still kept in the trunk? On Oct 27, 2015, at 6:22 PM, Adrian Tanase <atan...@adobe.com> wrote: Also I just remembered about cloudera's contribution http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-

There is any way to write from spark to HBase CDH4?

2015-10-27 Thread avivb
I have already tried it with https://github.com/unicredit/hbase-rdd and https://github.com/nerdammer/spark-hbase-connector, and in both cases I get a timeout. So I would like to know about other options to write from Spark to HBase on CDH4. Thanks!

Re: spark to hbase

2015-10-27 Thread Ted Yu
Jinghong: The hadmin variable is not used; you can omit that line. Which HBase release are you using? As Deng said, don't flush per row. Cheers On Oct 27, 2015, at 3:21 AM, Deng Ching-Mallete wrote: Hi, It would be more efficient if you configure the table and

Re: spark to hbase

2015-10-27 Thread Ted Yu
Jinghong: In one of earlier threads on storing data to hbase, it was found that htrace jar was not on classpath, leading to write failure. Can you check whether you are facing the same problem ? Cheers On Tue, Oct 27, 2015 at 5:11 AM, Ted Yu wrote: > Jinghong: > Hadmin

Re: spark to hbase

2015-10-27 Thread jinhong lu
Hi Ted, thanks for your help. I checked the jar; it is in the classpath, and now the problem is: 1. The following code runs fine, and it puts the result to HBase: val res = lines.map(pairFunction).groupByKey().flatMap(pairFlatMapFunction).aggregateByKey(new TrainFeature())(seqOp,

Re: spark to hbase

2015-10-27 Thread Ted Yu
For #2, have you checked task log(s) to see if there was some clue ? You may want to use foreachPartition to reduce the number of flushes. In the future, please remove color coding - it is not easy to read. Cheers On Tue, Oct 27, 2015 at 6:53 PM, jinhong lu wrote: > Hi,

Re: spark to hbase

2015-10-27 Thread jinhong lu
I wrote a demo, but still no response, no error, no log. My HBase is 0.98, Hadoop 2.3, Spark 1.4, and I run in yarn-client mode. Any idea? Thanks. package com.lujinhong.sparkdemo import org.apache.spark._ import org.apache.spark.rdd.NewHadoopRDD import org.apache.hadoop.conf.Configuration;

Re: spark to hbase

2015-10-27 Thread Fengdong Yu
Also, please move the HBase-related code out of the Scala object; this will resolve the serialization issue and avoid opening connections repeatedly. And remember to close the table after the final flush. On Oct 28, 2015, at 10:13 AM, Ted Yu wrote: For #2, have you checked task

Re: Python, Spark and HBase

2015-08-03 Thread ericbless
I wanted to confirm whether this is now supported, such as in Spark v1.3.0. I've read varying info online; just thought I'd verify. Thanks

Re: Problem Run Spark Example HBase Code Using Spark-Submit

2015-06-26 Thread Akhil Das
Try to add them in the SPARK_CLASSPATH in your conf/spark-env.sh file Thanks Best Regards On Thu, Jun 25, 2015 at 9:31 PM, Bin Wang binwang...@gmail.com wrote: I am trying to run the Spark example code HBaseTest from command line using spark-submit instead run-example, in that case, I can

Problem Run Spark Example HBase Code Using Spark-Submit

2015-06-25 Thread Bin Wang
I am trying to run the Spark example code HBaseTest from command line using spark-submit instead run-example, in that case, I can learn more how to run spark code in general. However, it told me CLASS_NOT_FOUND about htrace since I am using CDH5.4. I successfully located the htrace jar file but I

Kerberos authentication exception when spark access hbase with yarn-cluster mode on a kerberos yarn Cluster

2015-06-17 Thread 马元文
Hi all, I have a question about Spark accessing HBase in yarn-cluster mode on a Kerberos YARN cluster. Is distributing the keytab to each NodeManager the only way to enable Spark to access HBase? It seems that Spark doesn't provide a delegation token like an MR job does; am I right

Spark and HBase join issue

2015-03-14 Thread francexo83
Hi all, I have the following cluster configurations: - 5 nodes on a cloud environment. - Hadoop 2.5.0. - HBase 0.98.6. - Spark 1.2.0. - 8 cores and 16 GB of ram on each host. - 1 NFS disk with 300 IOPS mounted on host 1 and 2. - 1 NFS disk with 300 IOPS mounted on host

Re: Spark and HBase join issue

2015-03-14 Thread Ted Yu
...@gmail.com wrote: Hi all, I have the following cluster configurations: - 5 nodes on a cloud environment. - Hadoop 2.5.0. - HBase 0.98.6. - Spark 1.2.0. - 8 cores and 16 GB of ram on each host. - 1 NFS disk with 300 IOPS mounted on host 1 and 2. - 1 NFS disk with 300 IOPS

Re: Spark with HBase

2014-12-15 Thread Aniket Bhatnagar
the same. Can someone please guide me through the steps to accomplish this? Thanks a lot for helping.

Re: Spark with HBase

2014-12-03 Thread Akhil Das
You could go through these to start with http://www.vidyasource.com/blog/Programming/Scala/Java/Data/Hadoop/Analytics/2014/01/25/lighting-a-spark-with-hbase http://stackoverflow.com/questions/25189527/how-to-process-a-range-of-hbase-rows-using-spark Thanks Best Regards On Wed, Dec 3, 2014

Re: Spark with HBase

2014-12-03 Thread Ted Yu
for some links regarding the same. Can someone please guide me through the steps to accomplish this? Thanks a lot for helping.

Re: spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-04 Thread Nick Pentreath
I installed hbase-0.98.6-hadoop2. It's working, no problem with that. When I try to run the Spark HBase Python examples (the wordcount examples work, so it's not a Python issue): ./bin/spark-submit --master local --driver-class-path ./examples/target/spark-examples_2.10-1.1.0.jar ./examples

spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-03 Thread serkan.dogan
Hi, I installed hbase-0.98.6-hadoop2. It's working, no problem with that. When I try to run the Spark HBase Python examples (the wordcount examples work, so it's not a Python issue): ./bin/spark-submit --master local --driver-class-path ./examples/target/spark-examples_2.10-1.1.0.jar ./examples

Re: Spark with HBase

2014-08-07 Thread Akhil Das
(the version is quite old). Attached is a piece of code (Spark Java API) to connect to HBase. Thanks Best Regards On Thu, Aug 7, 2014 at 1:48 PM, Deepa Jayaveer deepa.jayav...@tcs.com wrote: Hi, I read your white paper about . We wanted to do a Proof of Concept on Spark with HBase. Documents

Re: Spark with HBase

2014-08-07 Thread chutium
These two posts should be good for setting up a spark+hbase environment and using the results of an HBase table scan as an RDD. Settings: http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html Some samples: http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html

Use Spark with HBase' HFileOutputFormat

2014-07-16 Thread Jianshi Huang
Hi, I want to use Spark with HBase and I'm confused about how to ingest my data using HBase' HFileOutputFormat. It recommends calling configureIncrementalLoad which does the following: - Inspects the table to configure a total order partitioner - Uploads the partitions file to the cluster

RE: Spark with HBase

2014-07-04 Thread N . Venkata Naga Ravi
Hi, Any update on the solution? We are still facing this issue. We were able to connect to HBase with independent code, but are getting an issue with the Spark integration. Thx, Ravi From: nvn_r...@hotmail.com To: u...@spark.incubator.apache.org; user@spark.apache.org Subject: RE: Spark with HBase

Spark with HBase

2014-06-29 Thread N . Venkata Naga Ravi
I am using the following versions: spark-1.0.0-bin-hadoop2 hbase-0.96.1.1-hadoop2 When executing the HBase test, I am facing the following exception. Looks like some version incompatibility; can you please help with it? NERAVI-M-70HY:spark-1.0.0-bin-hadoop2 neravi$ ./bin/run-example

RE: Spark with HBase

2014-06-29 Thread N . Venkata Naga Ravi
+user@spark.apache.org From: nvn_r...@hotmail.com To: u...@spark.incubator.apache.org Subject: Spark with HBase Date: Sun, 29 Jun 2014 15:28:43 +0530 I am using the following versions: spark-1.0.0-bin-hadoop2 hbase-0.96.1.1-hadoop2 When executing the HBase test, I am facing

Re: Problem using Spark with Hbase

2014-05-30 Thread Vibhor Banga
Thanks Mayur for the reply. Actually the issue was that I was running the Spark application on hadoop-2.2.0, where the HBase version was 0.95.2, but Spark by default gets built against an older HBase version. So I had to build Spark again with the HBase version set to 0.95.2 in the Spark build file, and it worked. Thanks

Re: Python, Spark and HBase

2014-05-29 Thread Nick Pentreath
format(self._fqn + name)) 660 661 def __call__(self, *args): Py4JError: org.apache.spark.api.python.PythonRDDnewAPIHadoopFile does not exist in the JVM

Problem using Spark with Hbase

2014-05-28 Thread Vibhor Banga
Hi all, I am facing issues while using spark with HBase. I am getting NullPointerException at org.apache.hadoop.hbase.TableName.valueOf (TableName.java:288) Can someone please help to resolve this issue. What am I missing ? I am using following snippet of code - Configuration config

Re: Problem using Spark with Hbase

2014-05-28 Thread Vibhor Banga
Anyone who has used Spark this way or has faced a similar issue, please help. Thanks, -Vibhor On Wed, May 28, 2014 at 6:03 PM, Vibhor Banga vibhorba...@gmail.com wrote: Hi all, I am facing issues while using Spark with HBase. I am getting a NullPointerException

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
reference to the class org.apache.spark.api.python.PythonRDDnewAPIHadoopFile Any ideas? Also, do you have a working example of HBase access with the new code? Thanks Tommer

Re: Python, Spark and HBase

2014-05-28 Thread Matei Zaharia
to the class org.apache.spark.api.python.PythonRDDnewAPIHadoopFile Any ideas? Also, do you have a working example of HBase access with the new code? Thanks Tommer

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
='org.apache.hadoop.hbase.client.Result' Is it possible that the typo is coming from inside the spark code? Tommer

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
format(self._fqn + name)) 660 661 def __call__(self, *args): Py4JError: org.apache.spark.api.python.PythonRDDnewAPIHadoopFile does not exist in the JVM

Re: Spark on HBase vs. Spark on HDFS

2014-05-23 Thread Mayur Rustagi
Also, I am unsure whether Spark on HBase leverages locality. When you cache/process data, do you see NODE_LOCAL tasks in the process list? Spark on HDFS leverages locality quite well and can really boost performance by 3-4x in my experience. If you are loading all your data from HBase to Spark then you
