Hi Roy,

The cloudera-labs project SparkOnHBase makes it really simple to 
read HBase data into Spark:

https://github.com/cloudera-labs/SparkOnHBase

This blog post explains how to use the package:

http://blog.cloudera.com/blog/2014/12/new-in-cloudera-labs-sparkonhbase/

It has also been committed to the HBase project as the hbase-spark module:

http://blog.cloudera.com/blog/2015/08/apache-spark-comes-to-apache-hbase-with-hbase-spark-module/

HBase Jira link: https://issues.apache.org/jira/browse/HBASE-13992


Guru Medasani
gdm...@gmail.com



> On Oct 8, 2015, at 9:29 PM, Roy Wang <roywang1...@163.com> wrote:
> 
> 
> I want to load an HBase table into Spark:
> JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD =
> sc.newAPIHadoopRDD(conf, TableInputFormat.class,
> ImmutableBytesWritable.class, Result.class);
> 
> *When I call hBaseRDD.count(), I get an error:*
> 
> Caused by: java.lang.IllegalStateException: The input format instance has
> not been properly initialized. Ensure you call initializeTable either in
> your constructor or initialize method
>       at
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:389)
>       at
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:158)
>       ... 11 more
> 
> *But when the job starts, I see this in the logs:*
> 2015-10-09 09:17:00[main] WARN  TableInputFormatBase:447 - initializeTable
> called multiple times. Overwriting connection and table reference;
> TableInputFormatBase will not close these old references when done.
> 
> Does anyone know why this happens?
> 
> Thanks! 
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-load-hbase-on-spark-tp24986.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
