Hi,
I wanted to ask: what's the best way to achieve per-key auto-increment
numbering after sorting? For example:
raw file:
1,a,b,c,1,1
1,a,b,d,0,0
1,a,b,e,1,0
2,a,e,c,0,0
2,a,f,d,1,0
post-output (the last column is the position number after grouping on
first three fields and reverse sorting on last
Thanks Davies,
groupByKey was throwing the error: unpack requires a string
argument of length 4
Interestingly, I replaced it with sortByKey (which, as I read, also
shuffles so that data for the same key ends up on the same partition)
and it ran fine. Is this a bug in groupByKey in Spark 1.3?
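
For reference, here is a minimal sketch of the per-key numbering asked about in the original question, written in plain Python rather than Spark so the logic can be checked locally. It rests on several assumptions: the key is the first three fields, rows within a key are ordered descending on the last two fields (the original description of the sort is cut off, so this is a guess), and positions start at 1. The name number_within_key is hypothetical. In PySpark, the inner sort-and-enumerate step could presumably be applied to each group produced by groupByKey().

```python
from itertools import groupby


def number_within_key(lines):
    """Append a 1-based position to each row, restarting for every key.

    Assumed layout: comma-separated rows, key = first three fields,
    order within a key = descending on the last two (numeric) fields.
    """
    rows = [line.split(",") for line in lines]

    def key(row):
        return tuple(row[:3])

    # groupby only merges adjacent rows, so bring equal keys together first.
    rows.sort(key=key)

    out = []
    for _, group in groupby(rows, key=key):
        # Reverse sort on the last two fields, compared as integers.
        ordered = sorted(
            group, key=lambda r: (int(r[4]), int(r[5])), reverse=True
        )
        for position, row in enumerate(ordered, start=1):
            out.append(",".join(row + [str(position)]))
    return out


raw = [
    "1,a,b,c,1,1",
    "1,a,b,d,0,0",
    "1,a,b,e,1,0",
    "2,a,e,c,0,0",
    "2,a,f,d,1,0",
]

for line in number_within_key(raw):
    print(line)
```

With the sample rows from the question, the three rows sharing key 1,a,b get positions 1 to 3, and the two rows with distinct keys each get position 1.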
What's the issue with groupByKey()?
On Mon, Oct 19, 2015 at 1:11 AM, fahad shah wrote:
> Hi
>
> I wanted to ask whats the best way to achieve per key auto increment
> numerals after sorting, for eg. :
>
> raw file:
>
> 1,a,b,c,1,1
> 1,a,b,d,0,0
> 1,a,b,e,1,0
> 2,a,e,c,0,0
>