You just need to call mapValues() to turn the Iterable of values for each key into a sorted Iterable. The function you pass to it is ordinary Java, no different from any other Java program. You will probably need to copy the input Iterable into an ArrayList (unfortunately), sort it with whatever Comparator you want, and return the result.
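A minimal sketch of that copy-and-sort step, written against Java 7 without lambdas. The names Reading and SortGroupedValues are made up for illustration and stand in for your {date effectiveFrom, float value} type; the Spark wiring is shown only as a comment and assumes your grouped RDD is a JavaPairRDD<Integer, Iterable<Reading>>.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

// Hypothetical value type standing in for {date effectiveFrom, float value}.
// Serializable so that Spark could ship it between executors.
final class Reading implements Serializable {
    final Date effectiveFrom;
    final float value;

    Reading(Date effectiveFrom, float value) {
        this.effectiveFrom = effectiveFrom;
        this.value = value;
    }
}

final class SortGroupedValues {
    // Copies the grouped values into a new list and sorts that copy by
    // effectiveFrom, oldest first. The input Iterable is left untouched.
    static List<Reading> sortByEffectiveFrom(Iterable<Reading> values) {
        List<Reading> sorted = new ArrayList<Reading>();
        for (Reading r : values) {
            sorted.add(r);
        }
        Collections.sort(sorted, new Comparator<Reading>() {
            @Override
            public int compare(Reading a, Reading b) {
                return a.effectiveFrom.compareTo(b.effectiveFrom);
            }
        });
        return sorted;
    }

    // With Spark on the classpath, the wiring would look roughly like this
    // ("grouped" is an assumed JavaPairRDD<Integer, Iterable<Reading>>, and
    // Function is org.apache.spark.api.java.function.Function):
    //
    //   JavaPairRDD<Integer, Iterable<Reading>> sortedRdd = grouped.mapValues(
    //       new Function<Iterable<Reading>, Iterable<Reading>>() {
    //           @Override
    //           public Iterable<Reading> call(Iterable<Reading> values) {
    //               return SortGroupedValues.sortByEffectiveFrom(values);
    //           }
    //       });
}
```

Since mapValues() leaves the keys alone, each key stays where groupByKey() put it; only the order of the values within each key changes. The copy holds all values for one key in memory at once, which is already the case after groupByKey().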
On Wed, Sep 17, 2014 at 4:37 PM, <abraham.ja...@thomsonreuters.com> wrote:
> Hi Group,
>
> I am quite fresh in the spark world. There is a particular use case that I
> just cannot understand how to accomplish in spark. I am using Cloudera
> CDH5/YARN/Java 7.
>
> I have a dataset that has the following characteristics –
>
> A JavaPairRDD that represents the following –
>
> Key => {int ID}
> Value => {date effectiveFrom, float value}
>
> Let’s say that the data I have is the following –
>
> Partition – 1
> [K=> 1, V=> {09-17-2014, 2.8}]
> [K=> 1, V=> {09-11-2014, 3.9}]
> [K=> 3, V=> {09-18-2014, 5.0}]
> [K=> 3, V=> {09-10-2014, 7.4}]
>
> Partition – 2
> [K=> 2, V=> {09-13-2014, 2.5}]
> [K=> 4, V=> {09-07-2014, 6.2}]
> [K=> 2, V=> {09-12-2014, 1.8}]
> [K=> 4, V=> {09-22-2014, 2.9}]
>
> Grouping by key gives me the following RDD
>
> Partition – 1
> [K=> 1, V=> Iterable({09-17-2014, 2.8}, {09-11-2014, 3.9})]
> [K=> 3, V=> Iterable({09-18-2014, 5.0}, {09-10-2014, 7.4})]
>
> Partition – 2
> [K=> 2, Iterable({09-13-2014, 2.5}, {09-12-2014, 1.8})]
> [K=> 4, Iterable({09-07-2014, 6.2}, {09-22-2014, 2.9})]
>
> Now I would like to sort by the values and the result should look like this
> –
>
> Partition – 1
> [K=> 1, V=> Iterable({09-11-2014, 3.9}, {09-17-2014, 2.8})]
> [K=> 3, V=> Iterable({09-10-2014, 7.4}, {09-18-2014, 5.0})]
>
> Partition – 2
> [K=> 2, Iterable({09-12-2014, 1.8}, {09-13-2014, 2.5})]
> [K=> 4, Iterable({09-07-2014, 6.2}, {09-22-2014, 2.9})]
>
> What is the best way to do this in spark? If so desired, I can even move the
> “effectiveFrom” (the field that I want to sort on) into the key field.
>
> A code snippet or some pointers on how to solve this would be very helpful.
>
> Regards,
> Abraham