Key-Value Operations

2014-08-28 Thread Deep Pradhan
Hi, I have an RDD of key-value pairs. I want to find the key whose value has the largest number of elements, that is, the key for which the number of items in the value is the largest. How should I do that? Thank you.

Re: Key-Value Operations

2014-08-28 Thread Sean Owen
If you mean your values are all a Seq or similar already, then you just take the top 1 ordered by the size of the value:

rdd.top(1)(Ordering.by(_._2.size))

On Thu, Aug 28, 2014 at 9:34 AM, Deep Pradhan pradhandeep1...@gmail.com wrote:
Hi, I have a RDD of key-value pairs. Now I want to find
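
To see what that ordering does without a cluster, here is a minimal sketch on a plain Scala collection. The object name, the sample data, and the choice of String keys with Seq[Int] values are all assumptions made for illustration; nothing in the thread says what the actual types are.

```scala
object LargestValueKey {
  // Hypothetical sample data: each key maps to a sequence of values.
  val pairs: Seq[(String, Seq[Int])] = Seq(
    "a" -> Seq(1, 2),
    "b" -> Seq(1, 2, 3, 4),
    "c" -> Seq(7)
  )

  // Order pairs by the number of elements in the value,
  // the same ordering that is passed to rdd.top(1)(...) in the reply.
  val bySize: Ordering[(String, Seq[Int])] =
    Ordering.by[(String, Seq[Int]), Int](_._2.size)

  // Local stand-in for top(1): the pair whose value has the most elements.
  val largest: (String, Seq[Int]) = pairs.max(bySize)

  def main(args: Array[String]): Unit =
    println(largest._1)
}
```

On an actual RDD, top(1) returns an Array with at most one pair, so the key itself would come from something like rdd.top(1)(bySize).headOption.map(_._1). If several keys tie for the largest size, which one comes back is not something to rely on.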