Re: Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-23 Thread Nirav Patel
SO it was indeed my merge function. I created new result object for every merge and its working now. Thanks On Wed, Jun 22, 2016 at 3:46 PM, Nirav Patel wrote: > PS. In my reduceByKey operation I have two mutable object. What I do is > merge mutable2 into mutable1 and

Re: Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-22 Thread Nirav Patel
PS. In my reduceByKey operation I have two mutable object. What I do is merge mutable2 into mutable1 and return mutable1. I read that it works for aggregateByKey so thought it will work for reduceByKey as well. I might be wrong here. Can someone verify if this will work or be un predictable? On

Re: Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-22 Thread Nirav Patel
Hi, I do not see any indication of errors or executor getting killed in spark UI - jobs, stages, event timelines. No task failures. I also don't see any errors in executor logs. Thanks On Wed, Jun 22, 2016 at 2:32 AM, Ted Yu wrote: > For the run which returned incorrect

Re: Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-22 Thread Ted Yu
For the run which returned incorrect result, did you observe any error (on workers) ? Cheers On Tue, Jun 21, 2016 at 10:42 PM, Nirav Patel wrote: > I have an RDD[String, MyObj] which is a result of Join + Map operation. It > has no partitioner info. I run reduceByKey

Re: Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-22 Thread Takeshi Yamamuro
Hi, Could you check the issue also occurs in v1.6.1 and v2.0? // maropu On Wed, Jun 22, 2016 at 2:42 PM, Nirav Patel wrote: > I have an RDD[String, MyObj] which is a result of Join + Map operation. It > has no partitioner info. I run reduceByKey without passing any

Spark 1.5.2 - Different results from reduceByKey over multiple iterations

2016-06-21 Thread Nirav Patel
I have an RDD[String, MyObj] which is a result of Join + Map operation. It has no partitioner info. I run reduceByKey without passing any Partitioner or partition counts. I observed that output aggregation result for given key is incorrect sometime. like 1 out of 5 times. It looks like reduce