Hi,

This might be helpful:
https://github.com/GenTang/spark_hbase/blob/master/src/main/scala/examples/pythonConverters.scala
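
If all you need is the scan string itself, a sketch along these lines via
PySpark's py4j gateway may also work (untested; it leans on the private
sc._jvm handle and assumes the HBase client jars are on the driver's
classpath):

    # Build the Scan on the JVM side and serialize it exactly as the
    # Scala snippet quoted below does (Base64-encoded protobuf Scan).
    # sc is an existing SparkContext; the row keys are illustrative.
    jvm = sc._jvm
    scan = jvm.org.apache.hadoop.hbase.client.Scan()
    scan.setStartRow(bytearray(b"test_domain\x00email"))  # py4j maps bytearray to byte[]
    scan.setStopRow(bytearray(b"test_domain\x00email~"))
    proto = jvm.org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(scan)
    scan_str = jvm.org.apache.hadoop.hbase.util.Base64.encodeBytes(proto.toByteArray())

scan_str can then be passed as the "hbase.mapreduce.scan" value in the conf dict.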

Cheers
Gen

On Thu, Apr 2, 2015 at 1:50 AM, Eric Kimbrel <eric.kimb...@soteradefense.com> wrote:

> I am attempting to read an HBase table in PySpark with a range scan.
>
> conf = {
>     "hbase.zookeeper.quorum": host,
>     "hbase.mapreduce.inputtable": table,
>     "hbase.mapreduce.scan": scan
> }
> hbase_rdd = sc.newAPIHadoopRDD(
>         "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
>         "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
>         "org.apache.hadoop.hbase.client.Result",
>         keyConverter=keyConv,
>         valueConverter=valueConv,
>         conf=conf)
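> # keyConv and valueConv above are assumed to be the fully qualified
> # class names of key/value converter classes (such as the converters
> # Gen links above), and scan the Base64-encoded protobuf Scan string
> # discussed below.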
>
> If I jump over to Scala or Java, generate a Base64-encoded protobuf Scan
> object, and convert it to a string, I can use that value for
> "hbase.mapreduce.scan" and everything works: the RDD correctly performs
> the range scan and I am happy. The problem is that I cannot find any
> reasonable way to generate that range-scan string in Python. The Scala
> code required is:
>
> import org.apache.hadoop.hbase.client.Scan
> import org.apache.hadoop.hbase.protobuf.ProtobufUtil
> import org.apache.hadoop.hbase.util.Base64
>
> // build a Scan over the desired row-key range
> val scan = new Scan()
> scan.setStartRow("test_domain\0email".getBytes)
> scan.setStopRow("test_domain\0email~".getBytes)
>
> // serialize it to the Base64-encoded protobuf string TableInputFormat expects
> def scanToString(scan: Scan): String =
>   Base64.encodeBytes(ProtobufUtil.toScan(scan).toByteArray())
>
> scanToString(scan)
>
>
> Is there another way to perform an HBase range scan from PySpark, or is
> that functionality something that might be supported in the future?
>
