What's the type of the key? If the hash of the key differs across slaves, you can get confusing results like these. We ran into similar results in Python, because the hash of None is different across machines.
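To illustrate the idea, here is a minimal sketch of a partition function that depends only on the key's value, so every worker would pick the same partition for the same key. The names stable_hash and partition_for are made up for this example and are not part of the Spark API. Hashing repr(key) with CRC32 is just one possible choice, and it assumes the keys are simple values (None, numbers, strings, tuples of those) whose repr is the same everywhere.

```python
import zlib


def stable_hash(key):
    """Hash based only on the key's value, not on the process or machine.

    Assumption: repr(key) is identical on every worker. That holds for
    None, ints, strings, and tuples of those, but not for objects whose
    default repr embeds a memory address.
    """
    return zlib.crc32(repr(key).encode("utf-8"))


def partition_for(key, num_partitions=8):
    """Map a key to a partition index in [0, num_partitions)."""
    return stable_hash(key) % num_partitions


# Hypothetical sample keys, including None, which was the problem case.
keys = [None, 1, "a", ("x", 2)]
assignment = {repr(k): partition_for(k) for k in keys}
print(assignment)
```

If the per-partition counts stop changing between runs with a function like this in place of the default hash, the key's hash is the likely cause.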
Davies

On Mon, Sep 8, 2014 at 8:16 AM, redocpot <julien19890...@gmail.com> wrote:
> Update:
>
> Just test with HashPartitioner(8) and count on each partition:
>
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591*), (*6,658327*), (*7,658434*)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657594)*, (6,658326), (*7,658434*)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (*7,658435*)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591)*, (6,658326), (7,658434)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591)*, (6,658326), (7,658435))
>
> The result is not identical for each execution.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/groupBy-gives-non-deterministic-results-tp13698p13702.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org