Thanks, Marco. This solved the order problem. I have another question, which is a precursor to this one.
As you can see below, ID2, ID1 and ID3 are in order, and I need to maintain this index order as well. But when we do the groupByKey operation (*rdd.distinct.groupByKey().mapValues(v => v.toArray)*), everything is *jumbled*. Is there any way we can maintain this order as well?

scala> RDD.foreach(println)
(ID2,18159)
(ID1,18159)
(ID3,18159)
(ID2,18159)
(ID1,18159)
(ID3,18159)
(ID2,36318)
(ID1,36318)
(ID3,36318)
(ID2,54477)
(ID1,54477)
(ID3,54477)

*Jumbled version:*
Array(
(ID1,Array(*18159*, 308703, 72636, 64544, 39244, 107937, *54477*, 145272, 100079, *36318*, 160992, 817, 89366, 150022, 19622, 44683, 58866, 162076, 45431, 100136)),
(ID3,Array(100079, 19622, *18159*, 212064, 107937, 44683, 150022, 39244, 100136, 58866, 72636, 145272, 817, 89366, *54477*, *36318*, 308703, 160992, 45431, 162076)),
(ID2,Array(308703, *54477*, 89366, 39244, 150022, 72636, 817, 58866, 44683, 19622, 160992, 107937, 100079, 100136, 145272, 64544, *18159*, 45431, *36318*, 162076))
)

*Expected output:*
Array(
(ID1,Array(*18159*, *36318*, *54477*, ...)),
(ID3,Array(*18159*, *36318*, *54477*, ...)),
(ID2,Array(*18159*, *36318*, *54477*, ...))
)

As you can see, after the *groupByKey* operation completes, item 18159 is at index 0 for ID1, index 2 for ID3 and index 16 for ID2, whereas the expected index is 0 in every case.

On Sun, Jul 24, 2016 at 12:43 PM, Marco Mistroni <mmistr...@gmail.com> wrote:

> Hello
> Uhm you have an array containing 3 tuples?
> If all the arrays have same length, you can just zip all of them,
> creating a list of tuples
> then you can scan the list 5 by 5...?
>
> so something like
>
> (Array(0)._2,Array(1)._2,Array(2)._2).zipped.toList
>
> this will give you a list of tuples of 3 elements containing each items
> from ID1, ID2 and ID3 ... sample below
> res: List((18159,100079,308703), (308703, 19622, 54477), (72636,18159,
> 89366)..........)
>
> then you can use a recursive function to compare each element such as
>
> def iterate(lst: List[(Int, Int, Int)]): T = {
>   if (lst.isEmpty) ??? // return your comparison
>   else {
>     val splits = lst.splitAt(5)
>     // do something about it using splits._1
>     iterate(splits._2)
>   }
> }
>
> will this help? or am i still missing something?
>
> kr
>
> On 24 Jul 2016 5:52 pm, "janardhan shetty" <janardhan...@gmail.com> wrote:
>
>> Array(
>> (ID1,Array(18159, 308703, 72636, 64544, 39244, 107937, 54477, 145272,
>> 100079, 36318, 160992, 817, 89366, 150022, 19622, 44683, 58866, 162076,
>> 45431, 100136)),
>> (ID3,Array(100079, 19622, 18159, 212064, 107937, 44683, 150022, 39244,
>> 100136, 58866, 72636, 145272, 817, 89366, 54477, 36318, 308703, 160992,
>> 45431, 162076)),
>> (ID2,Array(308703, 54477, 89366, 39244, 150022, 72636, 817, 58866, 44683,
>> 19622, 160992, 107937, 100079, 100136, 145272, 64544, 18159, 45431, 36318,
>> 162076))
>> )
>>
>> I need to compare first 5 elements of ID1 with first five element of ID3
>> next first 5 elements of ID1 to ID2. Similarly next 5 elements in that
>> order until the end of number of elements.
>> Let me know if this helps
>>
>>
>> On Sun, Jul 24, 2016 at 7:45 AM, Marco Mistroni <mmistr...@gmail.com>
>> wrote:
>>
>>> Apologies I misinterpreted.... could you post two use cases?
>>> Kr
>>>
>>> On 24 Jul 2016 3:41 pm, "janardhan shetty" <janardhan...@gmail.com>
>>> wrote:
>>>
>>>> Marco,
>>>>
>>>> Thanks for the response. It is indexed order and not ascending or
>>>> descending order.
>>>> On Jul 24, 2016 7:37 AM, "Marco Mistroni" <mmistr...@gmail.com> wrote:
>>>>
>>>>> Use map values to transform to an rdd where values are sorted?
>>>>> Hth
>>>>>
>>>>> On 24 Jul 2016 6:23 am, "janardhan shetty" <janardhan...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I have a key,value pair rdd where value is an array of Ints. I need
>>>>>> to maintain the order of the value in order to execute downstream
>>>>>> modifications.
>>>>>> How do we maintain the order of values?
>>>>>> Ex:
>>>>>> rdd = (id1,[5,2,3,15],
>>>>>> Id2,[9,4,2,5]....)
>>>>>>
>>>>>> Followup question how do we compare between one element in rdd with
>>>>>> all other elements ?
>>>>>>
>>>>>
>>
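
For reference, one possible way to keep the per-key order asked about at the top of this message is to tag every pair with its original position before grouping and then sort each group by that position. groupByKey does not guarantee the order of the values inside a group, so the order has to be carried in the data itself. The sketch below uses plain Scala collections so that it runs without a cluster; OrderedGroupSketch and groupPreservingOrder are hypothetical names made up for this illustration. On an RDD the corresponding calls would presumably be zipWithIndex(), groupByKey() and mapValues, and that only helps if the RDD's own element order is already the one you want, which is an assumption here.

```scala
object OrderedGroupSketch {
  // Hypothetical helper: group (key, value) pairs by key while keeping each
  // key's values in the order in which they first appeared in the input.
  def groupPreservingOrder(pairs: Seq[(String, Int)]): Map[String, Vector[Int]] =
    pairs.distinct                          // drop duplicate pairs, keeping first occurrences
      .zipWithIndex                         // remember each pair's original position
      .groupBy { case ((id, _), _) => id }  // group by key
      .map { case (id, group) =>
        // A local groupBy already keeps encounter order; the explicit sort by
        // the remembered position is the step that would matter after a
        // shuffle-based grouping, where no order is guaranteed.
        id -> group.sortBy(_._2).map { case ((_, value), _) => value }.toVector
      }

  def main(args: Array[String]): Unit = {
    // Same shape as the RDD.foreach(println) output in the message above.
    val pairs = Seq(
      "ID2" -> 18159, "ID1" -> 18159, "ID3" -> 18159,
      "ID2" -> 18159, "ID1" -> 18159, "ID3" -> 18159,
      "ID2" -> 36318, "ID1" -> 36318, "ID3" -> 36318,
      "ID2" -> 54477, "ID1" -> 54477, "ID3" -> 54477
    )
    groupPreservingOrder(pairs).foreach { case (id, values) =>
      println(s"$id -> ${values.mkString(", ")}")
    }
  }
}
```

With the sample pairs, every ID ends up with the values 18159, 36318, 54477 in that order. The order in which the IDs themselves are printed is not fixed, since the result is a Map.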