For org.apache.hadoop.hbase.client.Result, there is this method:

  public byte[] getValue(byte [] family, byte [] qualifier) {

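which lets you retrieve the value of a designated column, given its
family and qualifier. Result.value(), which your code calls, returns only
the value of the first column in the Result, which is why you see just
the price.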
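
As a rough sketch (not run against your cluster), assuming the resultRDD
from your session and the price_info family and qualifiers shown in your
scan output, something along these lines should pull all three columns.
The name rowsRDD and the local column helper are mine, for illustration
only:

```scala
import org.apache.hadoop.hbase.util.Bytes

// Assumes resultRDD: RDD[org.apache.hadoop.hbase.client.Result] from the quoted session.
// rowsRDD and column are hypothetical names, not part of any HBase or Spark API.
val rowsRDD = resultRDD.map { result =>
  // Column family taken from the scan output in the question.
  val family = Bytes.toBytes("price_info")

  // getValue returns null when the row has no such cell, so wrap it in Option.
  def column(qualifier: String): Option[String] =
    Option(result.getValue(family, Bytes.toBytes(qualifier)))
      .map(bytes => Bytes.toString(bytes))

  // Same row key handling as in the original code.
  val rowKey = Bytes.toString(result.getRow()).split(" ")(0)

  (rowKey, column("ticker"), column("timecreated"), column("price"))
}

rowsRDD.take(2).foreach(println)
```

Everything is built inside the closure so nothing from the shell session
(such as conf) has to be serialized along with it.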


On Mon, Oct 10, 2016 at 2:08 PM, Mich Talebzadeh wrote:

> Hi,
> I am trying to do some operation on an Hbase table that is being populated
> by Spark Streaming.
> Now this is just Spark on Hbase as opposed to Spark on Hive -> view on
> Hbase etc. I also have Phoenix view on this Hbase table.
> This is sample code
> scala>     val tableName = "marketDataHbase"
> scala>     val conf = HBaseConfiguration.create()
> conf: org.apache.hadoop.conf.Configuration = Configuration:
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml,
> hbase-default.xml, hbase-site.xml
> scala>     conf.set(TableInputFormat.INPUT_TABLE, tableName)
> scala>         //create rdd
> scala> val hBaseRDD = sc.newAPIHadoopRDD(conf,
>   classOf[TableInputFormat], classOf[ImmutableBytesWritable],
>   classOf[org.apache.hadoop.hbase.client.Result])
> hBaseRDD: org.apache.spark.rdd.RDD[(ImmutableBytesWritable,
> org.apache.hadoop.hbase.client.Result)] = NewHadoopRDD[4] at
> newAPIHadoopRDD at <console>:64
> scala> hBaseRDD.count
> res11: Long = 22272
> scala>     // transform (ImmutableBytesWritable, Result) tuples into an RDD
> of Results
> scala> val resultRDD = hBaseRDD.map(tuple => tuple._2)
> resultRDD: org.apache.spark.rdd.RDD[org.apache.hadoop.hbase.client.Result]
> = MapPartitionsRDD[8] at map at <console>:41
> scala>  // transform into an RDD of (RowKey, ColumnValue)s; the RowKey has
> the time removed
> scala> val keyValueRDD = resultRDD.map(result =>
> (Bytes.toString(result.getRow()).split(" ")(0),
> Bytes.toString(result.value)))
> keyValueRDD: org.apache.spark.rdd.RDD[(String, String)] =
> MapPartitionsRDD[9] at map at <console>:43
> scala> keyValueRDD.take(2).foreach(kv => println(kv))
> (000055e2-63f1-4def-b625-e73f0ac36271,43.89760813529593664528)
> (000151e9-ff27-493d-a5ca-288507d92f95,57.68882040742382868990)
> OK, above I am only getting the rowkey (the UUID) and the last
> attribute (price).
> However, I have the rowkey and 3 more columns there in the HBase table!
> scan 'marketDataHbase', "LIMIT" => 1
> ROW                                    COLUMN+CELL
>  000055e2-63f1-4def-b625-e73f0ac36271  column=price_info:price, timestamp=1476133232864, value=43.89760813529593664528
>  000055e2-63f1-4def-b625-e73f0ac36271  column=price_info:ticker, timestamp=1476133232864, value=S08
>  000055e2-63f1-4def-b625-e73f0ac36271  column=price_info:timecreated, timestamp=1476133232864, value=2016-10-10T17:12:22
> 1 row(s) in 0.0100 seconds
> So how can I get the other columns?
> Thanks
> Dr Mich Talebzadeh
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.