Have you looked at http://happybase.readthedocs.org/en/latest/ ?
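If plain client-side access is enough, happybase can do the range scan directly over HBase's Thrift gateway, without going through Spark at all. A minimal sketch (the hostname and table name below are placeholders, and the Thrift server must be running on the HBase side):

```python
import happybase

# Connect to the HBase Thrift gateway (hostname is a placeholder).
connection = happybase.Connection("hbase-thrift-host")
table = connection.table("my_table")

# Range scan over the same bounds used in the Scala snippet below.
for key, data in table.scan(row_start=b"test_domain\x00email",
                            row_stop=b"test_domain\x00email~"):
    print(key, data)
```

This sidesteps the scan-string problem entirely, at the cost of not getting an RDD back.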

Cheers



> On Apr 1, 2015, at 4:50 PM, Eric Kimbrel <eric.kimb...@soteradefense.com> 
> wrote:
> 
> I am attempting to read an HBase table in PySpark with a range scan.
> 
> conf = {
>    "hbase.zookeeper.quorum": host, 
>    "hbase.mapreduce.inputtable": table,
>    "hbase.mapreduce.scan" : scan
> }
> hbase_rdd = sc.newAPIHadoopRDD(
>        "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
>        "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
>        "org.apache.hadoop.hbase.client.Result",
>        keyConverter=keyConv,
>        valueConverter=valueConv,
>        conf=conf)
> 
> If I jump over to Scala or Java, generate a base64-encoded protobuf Scan
> object, and convert it to a string, I can use that value for
> "hbase.mapreduce.scan" and everything works: the RDD correctly performs
> the range scan and I am happy. The problem is that I cannot find any
> reasonable way to generate that range-scan string in Python. The Scala
> code required is:
> 
> import org.apache.hadoop.hbase.util.Base64;
> import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
> import org.apache.hadoop.hbase.client.{Delete, HBaseAdmin, HTable, Put,
> Result => HBaseResult, Scan}
> 
> val scan = new Scan()
> scan.setStartRow("test_domain\0email".getBytes)
> scan.setStopRow("test_domain\0email~".getBytes)
> def scanToString(scan: Scan): String =
>   Base64.encodeBytes(ProtobufUtil.toScan(scan).toByteArray())
> scanToString(scan)
> 
> 
> Is there another way to perform an HBase range scan from PySpark, or is
> that functionality something that might be supported in the future?
> 
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/pyspark-hbase-range-scan-tp22348.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
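For what it's worth, the base64 scan string the Scala snippet produces can also be assembled in pure Python, since the protobuf wire format for two bytes fields is easy to emit by hand. This is a sketch, not a drop-in replacement: it assumes start_row and stop_row are fields 3 and 4 of the Scan message in HBase's Client.proto (verify against your HBase version), and it encodes only those two fields.

```python
import base64


def _varint(n):
    # Encode a non-negative integer as a protobuf varint.
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)


def _bytes_field(field_number, value):
    # Length-delimited field: tag (wire type 2), length varint, payload.
    tag = _varint((field_number << 3) | 2)
    return tag + _varint(len(value)) + value


def scan_to_string(start_row, stop_row):
    # Build the base64 value for "hbase.mapreduce.scan".
    # ASSUMPTION: start_row = field 3, stop_row = field 4 in Client.proto.
    msg = _bytes_field(3, start_row) + _bytes_field(4, stop_row)
    return base64.b64encode(msg).decode("ascii")


scan_str = scan_to_string(b"test_domain\x00email", b"test_domain\x00email~")
```

The resulting `scan_str` would go into the conf dict as `"hbase.mapreduce.scan": scan_str`, the same way the Scala-generated string is used above.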

