Hi Ted, yes, I checked HBase's GitHub and the docs, but there doesn't seem to be an available dependency for the hbase-spark module. I searched http://mvnrepository.com/search?q=HBase-Spark and no result matches. I also checked https://hbase.apache.org/book.html#spark; there is no hbase-spark (or related) dependency listed there either.
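(For completeness: since no hbase-spark artifact has been published to Maven Central, the only way to get these classes appears to be building HBase from source (`mvn install` on the master branch) and then depending on the locally installed artifact. A sketch of what that dependency would presumably look like — the `2.0.0-SNAPSHOT` version is an assumption based on the master branch at the time, not a released coordinate:

```xml
<!-- Hypothetical coordinates for a locally built hbase-spark module;
     not available from Maven Central. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-spark</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>
```
)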
-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: June 29, 2016 11:24
To: user@hbase.apache.org
Subject: [Marketing Mail] Re: question about SparkSQL loading hbase tables

There is no hbase release with full support for SparkSQL yet.

For #1, the classes / directories are (master branch):

./hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/example/hbasecontext
./hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/hbasecontext
hbase-spark/src/main/scala/org/apache/spark/sql/datasources/hbase/HBaseTableCatalog.scala
./hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/datasources/HBaseSparkConf.scala

For documentation, see HBASE-15473.

On Tue, Jun 28, 2016 at 7:13 PM, 罗辉 <luo...@ifeng.com> wrote:
> Hi there,
>
> I am using SparkSQL to read from HBase; however:
>
> 1. I find some APIs are not available in my dependencies. Where do I
>    add these?
>
>    org.apache.hadoop.hbase.spark.example.hbasecontext
>    org.apache.spark.sql.datasources.hbase.HBaseTableCatalog
>    org.apache.hadoop.hbase.spark.datasources.HBaseSparkConf
>
> 2. Is there a complete code example of how to use SparkSQL to read
>    from and write to HBase?
>
> The document I referred to is
> http://hbase.apache.org/book.html#_sparksql_dataframes. It seems that
> this is a snapshot for 2.0, while I am using hbase 1.2.1 + spark 1.6.1
> + hadoop 2.7.1.
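(Regarding #2: in the master-branch module, the read path is driven by a JSON catalog string passed under `HBaseTableCatalog.tableCatalog` — this is what the `writeCatalog` value in the code below would need to hold. A minimal sketch of that catalog in the format documented for HBASE-15473, where the table name `pv_table`, the column family `cf`, and the columns are hypothetical examples, not anything defined in this thread:

```scala
// A JSON catalog mapping DataFrame columns to HBase row key / columns.
// All table, family, and column names here are made-up placeholders.
object CatalogExample {
  val catalog: String =
    """{
      |  "table":   {"namespace": "default", "name": "pv_table"},
      |  "rowkey":  "key",
      |  "columns": {
      |    "key": {"cf": "rowkey", "col": "key", "type": "string"},
      |    "pv":  {"cf": "cf",     "col": "pv",  "type": "long"}
      |  }
      |}""".stripMargin
}
```

This string would then be supplied via `.options(Map(HBaseTableCatalog.tableCatalog -> CatalogExample.catalog))` on the reader.)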
>
> In my app, I want to load an entire HBase table into SparkSQL.
>
> My code:
>
> import org.apache.spark._
> import org.apache.hadoop.hbase._
> import org.apache.hadoop.hbase.HBaseConfiguration
> import org.apache.hadoop.hbase.spark.example.hbasecontext
> import org.apache.spark.sql.datasources.hbase.HBaseTableCatalog
> import org.apache.hadoop.hbase.spark.datasources.HBaseSparkConf
>
> object HbaseConnector {
>   def main(args: Array[String]) {
>     val tableName = args(0)
>     val sparkMasterUrlDev = "spark://hadoopmaster:7077"
>     val sparkMasterUrlLocal = "local[2]"
>
>     val sparkConf = new SparkConf()
>       .setAppName("HbaseConnector for table " + tableName)
>       .setMaster(sparkMasterUrlDev)
>       .set("spark.executor.memory", "10g")
>     val sc = new SparkContext(sparkConf)
>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>     val conf = HBaseConfiguration.create()
>     conf.set("hbase.zookeeper.quorum", "z1,z2,z3")
>     conf.set("hbase.zookeeper.property.clientPort", "2181")
>     conf.set("hbase.rootdir", "hdfs://hadoopmaster:8020/hbase")
>     // val hbaseContext = new HBaseContext(sc, conf)
>
>     // NOTE: writeCatalog (a JSON catalog string) and tsSpecified (a
>     // timestamp) are referenced below but are not defined in this snippet.
>     val pv = sqlContext.read
>       .options(Map(HBaseTableCatalog.tableCatalog -> writeCatalog,
>         HBaseSparkConf.TIMESTAMP -> tsSpecified.toString))
>       .format("org.apache.hadoop.hbase.spark")
>       .load()
>     pv.write.saveAsTable(tableName)
>   }
> }
>
> My POM file is attached as well.
>
> Thanks for the help.
>
> San.Luo