Btw, there are some examples in the Spark GitHub repo that you may find
helpful. Here's one
<https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala>
related to HBase.
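
That example is in Scala; a rough Java translation of the same flow (the class
name here is made up, and it assumes the HBase 0.98 / Spark 1.1 setup discussed
below) would look something like:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class JavaHBaseCount {
  public static void main(String[] args) {
    SparkConf sconf = new SparkConf().setAppName("JavaHBaseCount").setMaster("local");
    JavaSparkContext sc = new JavaSparkContext(sconf);

    // Point TableInputFormat at the table named on the command line
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, args[0]);

    // Each record is a (row key, Result) pair read straight out of HBase
    JavaPairRDD<ImmutableBytesWritable, Result> rdd =
        sc.newAPIHadoopRDD(conf, TableInputFormat.class,
            ImmutableBytesWritable.class, Result.class);

    System.out.println("rows: " + rdd.count());
    sc.stop();
  }
}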

On Tue, Sep 16, 2014 at 1:22 PM, <abraham.ja...@thomsonreuters.com> wrote:

> Hi,
>
> I had a similar situation in which I needed to read data from HBase and
> work with the data inside of a Spark context. After much googling, I
> finally got mine to work. There are a bunch of steps you need to do to
> get this working:
>
> The problem is that the Spark context does not know anything about HBase,
> so you have to provide all the information about the HBase classes to both
> the driver code and the executor code…
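>
> One gotcha, given the subject line: TableInputFormat is not in the
> hbase-client API at all. It lives in the org.apache.hadoop.hbase.mapreduce
> package, which in 0.98 ships in the hbase-server artifact, so the code
> below needs
>
>   import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
>
> and hbase-server.jar on the compile classpath (which is why it shows up in
> the --jars list further down).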
>
>
>
>
>
> SparkConf sconf = new SparkConf().setAppName("App").setMaster("local");
>
> // Tell the executors where the HBase classes live. This has to be set on
> // the SparkConf before the context is created, and Java will not expand
> // $(hbase classpath) for you, so paste in the actual output of running
> // `hbase classpath` in a shell.
> sconf.set("spark.executor.extraClassPath", "$(hbase classpath)");
>
> JavaSparkContext sc = new JavaSparkContext(sconf);
>
> Configuration conf = HBaseConfiguration.create();
> conf.set(TableInputFormat.INPUT_TABLE, "Article");
>
> JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD =
>     sc.newAPIHadoopRDD(conf, TableInputFormat.class,
>         org.apache.hadoop.hbase.io.ImmutableBytesWritable.class,
>         org.apache.hadoop.hbase.client.Result.class);
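>
> Once you have the RDD, the Result objects work like any other records. A
> minimal sketch of pulling one column out of each row (the family "cf" and
> qualifier "title" are placeholders for your table's actual schema; it needs
> org.apache.hadoop.hbase.util.Bytes, org.apache.spark.api.java.JavaRDD,
> org.apache.spark.api.java.function.Function, and scala.Tuple2 imported):
>
> JavaRDD<String> titles = hBaseRDD.map(
>     new Function<Tuple2<ImmutableBytesWritable, Result>, String>() {
>       public String call(Tuple2<ImmutableBytesWritable, Result> row) {
>         // getValue returns the raw cell bytes for the given family/qualifier
>         return Bytes.toString(
>             row._2().getValue(Bytes.toBytes("cf"), Bytes.toBytes("title")));
>       }
>     });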
>
>
>
>
>
> Then, when you submit the Spark job:
>
>
>
>
>
> spark-submit --driver-class-path $(hbase classpath) --jars
> /usr/lib/hbase/hbase-server.jar,/usr/lib/hbase/hbase-client.jar,/usr/lib/hbase/hbase-common.jar,/usr/lib/hbase/hbase-protocol.jar,/usr/lib/hbase/lib/protobuf-java-2.5.0.jar,/usr/lib/hbase/lib/htrace-core.jar
> --class YourClassName --master local App.jar
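>
> Alternatively, if you would rather not hard-code the executor classpath in
> the program, spark-submit in 1.1 accepts arbitrary config via --conf, so
> you can set the same property at submit time and let the shell expand
> $(hbase classpath):
>
> spark-submit --driver-class-path $(hbase classpath) \
>   --conf spark.executor.extraClassPath="$(hbase classpath)" \
>   --class YourClassName --master local App.jar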
>
>
>
>
>
> Try this and see if it works for you.
>
>
>
>
>
> From: Y. Dong [mailto:tq00...@gmail.com]
> Sent: Tuesday, September 16, 2014 8:18 AM
> To: user@spark.apache.org
> Subject: HBase and non-existent TableInputFormat
>
>
>
> Hello,
>
>
>
> I'm currently using spark-core 1.1 and hbase 0.98.5 and I want to simply
> read from HBase. The Java code is attached. However, the problem is that
> TableInputFormat does not even exist in the hbase-client API. Is there any
> other way I can read from HBase? Thanks
>
>
>
> SparkConf sconf = new SparkConf().setAppName("App").setMaster("local");
>
> JavaSparkContext sc = new JavaSparkContext(sconf);
>
> Configuration conf = HBaseConfiguration.create();
> conf.set(TableInputFormat.INPUT_TABLE, "Article");
>
> JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD =
>     sc.newAPIHadoopRDD(conf, TableInputFormat.class,
>         org.apache.hadoop.hbase.io.ImmutableBytesWritable.class,
>         org.apache.hadoop.hbase.client.Result.class);