GroupBy Key and then sort values with the group

2014-09-17 Thread abraham.jacob
Hi Group, I am quite new to the Spark world. There is a particular use case that I just cannot understand how to accomplish in Spark. I am using Cloudera CDH5/YARN/Java 7. I have a dataset that has the following characteristics - A JavaPairRDD that represents the following - Key = {int ID}

RE: GroupBy Key and then sort values with the group

2014-09-17 Thread abraham.jacob
Thanks Sean, Makes total sense. I guess I was so caught up with RDDs and all the wonderful transformations they can do, that I did not think about plain old Java Collections.sort(list, comparator). Thanks, __ Abraham -Original Message- From: Sean Owen
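
For readers following this thread: the preview above is cut off, so the code that was actually exchanged is not shown. What follows is only a minimal sketch of the Collections.sort(list, comparator) idea, written in plain Java 7 with no Spark dependency. The class name SortGroupValues, the helper sortedCopy, and the sample integers are all hypothetical. In a Spark job, a helper like this could presumably be applied to each group's Iterable after groupByKey (for example inside mapValues), but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortGroupValues {

    // Copies one group's values into a new list and sorts that copy.
    // A grouped value arrives as an Iterable, which cannot be sorted in place,
    // so the values are materialized into an ArrayList first.
    static <V> List<V> sortedCopy(Iterable<V> values, Comparator<V> comparator) {
        List<V> list = new ArrayList<V>();
        for (V v : values) {
            list.add(v);
        }
        Collections.sort(list, comparator);
        return list;
    }

    public static void main(String[] args) {
        // Hypothetical values belonging to a single key.
        List<Integer> group = new ArrayList<Integer>();
        group.add(42);
        group.add(7);
        group.add(19);

        // Java 7 style comparator (anonymous class, ascending order).
        Comparator<Integer> ascending = new Comparator<Integer>() {
            @Override
            public int compare(Integer a, Integer b) {
                return a.compareTo(b);
            }
        };

        System.out.println(sortedCopy(group, ascending));
    }
}
```

Note that this approach holds all values for one key in memory at once, so it only suits groups that are small enough to fit on a single executor.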

RE: Stable spark streaming app

2014-09-17 Thread abraham.jacob
Nice write-up... very helpful! -Original Message- From: Tim Smith [mailto:secs...@gmail.com] Sent: Wednesday, September 17, 2014 1:11 PM Cc: spark users Subject: Re: Stable spark streaming app I don't have anything in production yet but I now at least have a stable (running for more

RE: HBase and non-existent TableInputFormat

2014-09-16 Thread abraham.jacob
Hi, I had a similar situation in which I needed to read data from HBase and work with the data inside of a Spark context. After much googling, I finally got mine to work. There are a bunch of steps that you need to do to get this working - The problem is that the Spark context does not know

RE: HBase and non-existent TableInputFormat

2014-09-16 Thread abraham.jacob
Yes, that was very helpful… ☺ Here are a few more I found on my quest to get HBase working with Spark – This one covers HBase dependencies and Spark classpaths: http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html This one has a code overview –