Re: Generalised Spark-HBase integration

2015-07-28 Thread Michal Haris
Hi Ted, yes, the Cloudera blog and your code were my starting point, but I needed something more Spark-centric rather than HBase-centric. Basically I was doing a lot of ad-hoc transformations with RDDs that were based on HBase tables and then mutating them after a series of iterative (BSP-like) steps. On 28 July

Re: Generalised Spark-HBase integration

2015-07-28 Thread Michal Haris
Oops, yes, I'm still messing with the repo on a daily basis. Fixed. On 28 July 2015 at 17:11, Ted Yu yuzhih...@gmail.com wrote: I got a compilation error: [INFO] /home/hbase/s-on-hbase/src/main/scala:-1: info: compiling [INFO] Compiling 18 source files to

Re: Generalised Spark-HBase integration

2015-07-28 Thread Michal Haris
Cool, will revisit. Is your latest code publicly visible somewhere? On 28 July 2015 at 17:14, Ted Malaska ted.mala...@cloudera.com wrote: Yup, you should be able to do that with the APIs that are going into HBase. Let me know if you need to chat about the problem and how to implement it with

Generalised Spark-HBase integration

2015-07-28 Thread Michal Haris
Hi all, for the last couple of months I've been working on large graph analytics and along the way have written an HBase-Spark integration from scratch, as none of the ones out there worked, either in terms of scale or in the way they integrated with the RDD interface. This week I have generalised it into

Re: Generalised Spark-HBase integration

2015-07-28 Thread Ted Yu
I got a compilation error: [INFO] /home/hbase/s-on-hbase/src/main/scala:-1: info: compiling [INFO] Compiling 18 source files to /home/hbase/s-on-hbase/target/classes at 1438099569598 [ERROR] /home/hbase/s-on-hbase/src/main/scala/org/apache/spark/hbase/examples/simple/HBaseTableSimple.scala:36: