Shuffle guarantees

Corey Nolet Tue, 01 Mar 2016 12:01:37 -0800

If I'm using reduceByKey() with a HashPartitioner, my understanding is that
the hashCode() of each key decides which underlying shuffle file the record
is written to.


Is anything other than hashCode() used to tell keys apart when the shuffle
data is pulled into the reducers and run through the reduce function? I'm
asking because two different key objects can have the same hashCode(),
right?
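
To make the concern concrete, here is a minimal sketch in plain Java. It is
not Spark code: partitionOf() and reduceByKey() are hypothetical stand-ins,
assuming the partition is the non-negative hashCode() modulo the partition
count and that the reduce side merges values in a map that also checks
equals(). Under those assumptions, two colliding keys land in the same
partition but are still reduced separately.

```java
import java.util.HashMap;
import java.util.Map;

public class ShuffleCollisionSketch {

    // Hypothetical stand-in for hash partitioning:
    // non-negative hashCode() modulo the number of partitions.
    static int partitionOf(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hypothetical stand-in for the reduce side: sum the values per key.
    // HashMap uses hashCode() to pick a bucket and equals() to tell keys apart.
    static Map<String, Integer> reduceByKey(String[] keys, int[] values) {
        Map<String, Integer> sums = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            sums.merge(keys[i], values[i], Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same hashCode().
        System.out.println("Aa".hashCode() == "BB".hashCode());           // true
        System.out.println(partitionOf("Aa", 4) == partitionOf("BB", 4)); // true

        Map<String, Integer> sums = reduceByKey(
                new String[] {"Aa", "BB", "Aa"},
                new int[] {1, 10, 2});
        // The colliding keys keep separate sums because equals() differs.
        System.out.println(sums.get("Aa") + " " + sums.get("BB"));        // 3 10
    }
}
```

If the reduce side compared keys by hashCode() alone, the last line would
show a single merged sum of 13 instead, which is the case I'm worried about.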
