I suspect this is another instance of case classes not working as
expected between the driver and executor when used with spark-shell.
Search JIRA for some back story.
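If the spark-shell case class problem is indeed the cause, one possible way to sidestep it is to key on a plain tuple, since tuple classes come from the Scala library rather than from the shell session. Below is a minimal sketch that uses local collections in place of an RDD. The names TupleKeyWorkaround, TupleKey and toTupleKey are made up for illustration, and the field layout is assumed from the Mykey definition and the sample data elsewhere in this thread.

```scala
object TupleKeyWorkaround {
  // Assumed layout: uname, lo, five single-character flags, and a date string.
  type TupleKey = (String, String, Char, Char, Char, Char, Char, String)

  // Build a tuple key from the comma-separated part before the '|'.
  def toTupleKey(keyPart: String): TupleKey = {
    val f = keyPart.split(',')
    (f(0), f(1), f(2).head, f(3).head, f(4).head, f(5).head, f(6).head, f(7))
  }

  // Three of the sample records; the first and third share a key.
  val records: Seq[(TupleKey, String)] = Seq(
    "p1,lo1,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo2,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo1,8,0,4,0,5,20150901|5,1,1.0"
  ).map { line =>
    val parts = line.split('|')
    (toTupleKey(parts(0)), parts(1))
  }

  // Local stand-in for groupByKey: collect the values under each distinct key.
  val grouped: Map[TupleKey, Seq[String]] =
    records.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }
}
```

With an RDD, the same key construction would presumably be applied in a map step before calling groupByKey.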
On Tue, Jan 5, 2016 at 12:42 AM, Arun Luthra wrote:
> Spark 1.5.0
>
> data:
>
>
Can you give a bit more information?
Release of Spark you're using
Minimal dataset that shows the problem
Cheers
On Mon, Jan 4, 2016 at 3:55 PM, Arun Luthra wrote:
> I tried groupByKey and noticed that it did not group all values into the
> same group.
>
> In my test
Could you please post the associated code and output?
On Mon, Jan 4, 2016 at 3:55 PM Arun Luthra wrote:
> I tried groupByKey and noticed that it did not group all values into the
> same group.
>
> In my test dataset (a Pair rdd) I have 16 records, where there are only 4
>
I tried groupByKey and noticed that it did not group all values into the
same group.
In my test dataset (a Pair rdd) I have 16 records, where there are only 4
distinct keys, so I expected there to be 4 records in the groupByKey
object, but instead there were 8. Each of the 4 distinct keys appear
Spark 1.5.0
data:
p1,lo1,8,0,4,0,5,20150901|5,1,1.0
p1,lo2,8,0,4,0,5,20150901|5,1,1.0
p1,lo3,8,0,4,0,5,20150901|5,1,1.0
p1,lo4,8,0,4,0,5,20150901|5,1,1.0
p1,lo1,8,0,4,0,5,20150901|5,1,1.0
p1,lo2,8,0,4,0,5,20150901|5,1,1.0
Could you try simplifying the key and seeing if that makes any difference?
Make it just a string or an int so we can rule out any issues in object
equality.
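As a rough illustration of what the simplified-key test should produce, here is a sketch that groups the six sample records from this thread by a plain String key, using local collections as a stand-in for groupByKey. The object and method names are hypothetical, and it is an assumption that the text before the '|' is the key material and that the second field (lo1 to lo4) is the simplified key.

```scala
object SimplifiedKeyCheck {
  // The six sample records posted in the thread (the full dataset has 16).
  val lines: Seq[String] = Seq(
    "p1,lo1,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo2,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo3,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo4,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo1,8,0,4,0,5,20150901|5,1,1.0",
    "p1,lo2,8,0,4,0,5,20150901|5,1,1.0"
  )

  // Assumption: the simplified key is the second comma-separated field.
  def simpleKey(line: String): String = line.split('|')(0).split(',')(1)

  // Group the records by the plain String key.
  val grouped: Map[String, Seq[String]] = lines.groupBy(simpleKey)
}
```

If the grouped result on the cluster has more entries than there are distinct key strings, the problem is unlikely to be in the key type itself.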
On Mon, Jan 4, 2016 at 4:42 PM Arun Luthra wrote:
> Spark 1.5.0
>
> data:
>
>
If I simplify the key to a String column with values lo1, lo2, lo3, lo4, it
works correctly.
On Mon, Jan 4, 2016 at 4:49 PM, Daniel Imberman
wrote:
> Could you try simplifying the key and seeing if that makes any difference?
> Make it just a string or an int so we can
That's interesting.
I would try
case class Mykey(uname:String)
case class Mykey(uname:String, c1:Char)
case class Mykey(uname:String, lo:String, f1:Char, f2:Char, f3:Char,
f4:Char, f5:Char, f6:String)
In that order. It seems like there is some issue with equality between keys.
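A quick way to check equality directly, before involving groupByKey, might look like the sketch below. It uses the widest of the three definitions, builds two keys from the same field values, and compares them. The object name KeyEqualityCheck and the mapping of the sample data fields onto the constructor arguments are assumptions for illustration.

```scala
case class Mykey(uname: String, lo: String, f1: Char, f2: Char, f3: Char,
                 f4: Char, f5: Char, f6: String)

object KeyEqualityCheck {
  // Two keys built independently from identical field values.
  val a = Mykey("p1", "lo1", '8', '0', '4', '0', '5', "20150901")
  val b = Mykey("p1", "lo1", '8', '0', '4', '0', '5', "20150901")

  // Grouping relies on both equals and hashCode agreeing for equal keys.
  val sameKey: Boolean = a == b
  val sameHash: Boolean = a.hashCode == b.hashCode

  // Local stand-in for groupByKey: equal keys should land in one group.
  val groups: Int = Seq(a -> 1, b -> 2).groupBy(_._1).size
}
```

When compiled normally, both comparisons should hold and there should be a single group. Running the equivalent comparisons on keys produced inside a spark-shell job would show whether equality behaves differently there.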
On Mon, Jan 4,