What's the type of the key?

If the hash of the key differs across slaves, you could get confusing
results like these. We have seen similar results in Python, because the hash
of None is different across machines.
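
To illustrate the idea, here is a minimal sketch in plain Python (not Spark
code). A hash partitioner picks a partition as hash(key) modulo the number of
partitions, so two workers that hash the same key differently will send it to
different partitions. The names partition_for, make_worker_hash, stable_hash
and disagreements are hypothetical, and the per-worker offset is only an
assumed stand-in for a machine-dependent hash:

```python
import zlib

NUM_PARTITIONS = 8  # same partition count as the HashPartitioner(8) test


def partition_for(key, hash_fn, num_partitions=NUM_PARTITIONS):
    """Pick a partition the way a hash partitioner does: hash modulo count."""
    return hash_fn(key) % num_partitions


def stable_hash(key):
    """Deterministic hash that depends only on the key's repr."""
    return zlib.crc32(repr(key).encode("utf-8"))


def make_worker_hash(offset):
    """Hypothetical per-worker hash: the stable hash shifted by an offset.

    This only simulates a hash whose value depends on the machine.
    """
    def worker_hash(key):
        return stable_hash(key) + offset
    return worker_hash


def disagreements(keys, hash_fns):
    """Return the keys that the hash functions send to different partitions."""
    return [k for k in keys
            if len({partition_for(k, h) for h in hash_fns}) > 1]


keys = [None, "a", "b", 1, 2, 3]

# Two simulated workers whose hashes differ for the same key.
unstable = disagreements(keys, [make_worker_hash(0), make_worker_hash(1)])

# Two workers that share one deterministic hash.
stable = disagreements(keys, [stable_hash, stable_hash])

print("split across partitions (per-worker hash):", unstable)
print("split across partitions (stable hash):", stable)
```

With the simulated per-worker hash every key lands in a different partition on
each worker, while the shared deterministic hash produces no disagreement. If
the key type has a machine-dependent hash, mapping it to a value with a stable
hash before grouping is one possible workaround.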

Davies

On Mon, Sep 8, 2014 at 8:16 AM, redocpot <julien19890...@gmail.com> wrote:
> Update:
>
> Just tested with HashPartitioner(8) and counted the records in each partition:
>
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591)*, *(6,658327)*, *(7,658434)*),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657594)*, (6,658326), *(7,658434)*),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), *(7,658435)*),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591)*, (6,658326), (7,658434)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657592)*, (6,658326), (7,658435)),
> List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
> *(5,657591)*, (6,658326), (7,658435))
>
> The result is not identical for each execution.
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/groupBy-gives-non-deterministic-results-tp13698p13702.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
