Re: unable to do group by with 1st column

2014-12-28 Thread Michael Albert
: Re: unable to do group by with 1st column Here is a sketch of what you need to do off the top of my head and based on a guess of what your RDD is like: val in: RDD[(K,Seq[(C,V)])] = ... in.flatMap { case (key, colVals) => colVals.map { case (col, value) => (col, (key, value)) } }.groupByKey So

Re: unable to do group by with 1st column

2014-12-28 Thread Sean Owen
5g? Anyway, thanks for the info. Best wishes, Mike From: Sean Owen so...@cloudera.com To: Michael Albert m_albert...@yahoo.com Cc: user@spark.apache.org Sent: Friday, December 26, 2014 3:23 PM Subject: Re: unable to do group by with 1st column Here

RE: unable to do group by with 1st column

2014-12-26 Thread Sean Owen
:* Tobias Pfeiffer [mailto:t...@preferred.jp] *Sent:* Friday, December 26, 2014 6:35 AM *To:* Amit Behera *Cc:* u...@spark.incubator.apache.org *Subject:* Re: unable to do group by with 1st column Hi, On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera amit.bd...@gmail.com wrote: How can I do

Re: unable to do group by with 1st column

2014-12-26 Thread Amit Behera
...@spark.incubator.apache.org *Subject:* Re: unable to do group by with 1st column Hi, On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera amit.bd...@gmail.com wrote: How can I do it? Please help me to do. Have you considered using groupByKey? http://spark.apache.org/docs/latest

Re: unable to do group by with 1st column

2014-12-26 Thread Michael Albert
Greetings! I'm trying to do something similar, and having a very bad time of it. What I start with is key1: (col1: val-1-1, col2: val-1-2, col3: val-1-3, col4: val-1-4, ...) key2: (col1: val-2-1, col2: val-2-2, col3: val-2-3, col4: val-2-4, ...) What I want (what I have been asked to produce

Re: unable to do group by with 1st column

2014-12-26 Thread Sean Owen
Here is a sketch of what you need to do off the top of my head and based on a guess of what your RDD is like: val in: RDD[(K,Seq[(C,V)])] = ... in.flatMap { case (key, colVals) => colVals.map { case (col, value) => (col, (key, value)) } }.groupByKey So the problem with both input and output
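
For readers who want to try the reshaping without a cluster, here is a minimal sketch of the same idea on plain Scala collections. It is not the poster's code: the object name PivotByColumn, the helper pivot, and the sample data are hypothetical, and groupBy on a local Seq stands in for groupByKey on an RDD.

```scala
object PivotByColumn {
  // Hypothetical sample data: each row key maps to its (column, value) pairs.
  val in: Seq[(String, Seq[(String, Int)])] = Seq(
    "key1" -> Seq("col1" -> 11, "col2" -> 12),
    "key2" -> Seq("col1" -> 21, "col2" -> 22)
  )

  // Re-key every cell by its column, then gather the (rowKey, value)
  // pairs that belong to each column.
  def pivot[K, C, V](rows: Seq[(K, Seq[(C, V)])]): Map[C, Seq[(K, V)]] =
    rows
      .flatMap { case (key, colVals) =>
        // `value` is used instead of `val`, which is a reserved word in Scala.
        colVals.map { case (col, value) => (col, (key, value)) }
      }
      .groupBy { case (col, _) => col }
      .map { case (col, cells) => col -> cells.map { case (_, kv) => kv } }

  def main(args: Array[String]): Unit =
    pivot(in).toSeq.sortBy(_._1).foreach(println)
}
```

On an RDD[(K, Seq[(C, V)])] the flatMap step is written the same way, and a single groupByKey call replaces the groupBy and map steps. Note that groupByKey collects all values for a column into one place, so very wide columns may run into memory pressure.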

Re: unable to do group by with 1st column

2014-12-25 Thread Tobias Pfeiffer
Hi, On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera amit.bd...@gmail.com wrote: How can I do it? Please help me to do. Have you considered using groupByKey? http://spark.apache.org/docs/latest/programming-guide.html#transformations Tobias

RE: unable to do group by with 1st column

2014-12-25 Thread Somnath Pandeya
; } }); From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Friday, December 26, 2014 6:35 AM To: Amit Behera Cc: u...@spark.incubator.apache.org Subject: Re: unable to do group by with 1st column Hi, On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera amit.bd...@gmail.com