If I'm using reduceByKey() with a HashPartitioner, I understand that the
hashCode() of my key decides which underlying shuffle file each record
is written to.

Is anything other than hashCode() used when the shuffled data is pulled
into the reducers and run through the reduce function? I'm asking because
hashCode() can collide: two different key objects can end up hashing to
the same number, right?
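
To make the collision concern concrete, here is a minimal sketch in plain
Java, not Spark code. The class HashCollisionSketch and the helpers
partitionFor and sumByKey are hypothetical names made up for this
illustration. partitionFor assumes a partitioner that takes a non-negative
hashCode() modulo the partition count, and java.util.HashMap stands in for
whatever equals()-aware map the reduce side might use.

```java
import java.util.HashMap;
import java.util.Map;

class HashCollisionSketch {
    // Hypothetical stand-in for a hash partitioner:
    // non-negative hashCode() modulo the number of partitions.
    static int partitionFor(Object key, int numPartitions) {
        int mod = key.hashCode() % numPartitions;
        return mod < 0 ? mod + numPartitions : mod;
    }

    // Hypothetical stand-in for the reduce side: sums values per key.
    // HashMap uses hashCode() to find a bucket and equals() to match keys.
    static Map<String, Integer> sumByKey(String[] keys, int[] values) {
        Map<String, Integer> totals = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            totals.merge(keys[i], values[i], Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";

        // Two different strings with the same hash code (both 2112).
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(b));                  // false

        // Same hash code, so the same partition.
        System.out.println(partitionFor(a, 4) == partitionFor(b, 4)); // true

        // The colliding keys are still reduced separately.
        Map<String, Integer> totals =
            sumByKey(new String[] {"Aa", "BB", "Aa"}, new int[] {1, 10, 100});
        System.out.println(totals.get("Aa")); // 101
        System.out.println(totals.get("BB")); // 10
    }
}
```

Whether Spark's reduce side behaves like the HashMap here is exactly what
I'm asking. The sketch only shows that equal hash codes put two keys in
the same partition, and that this alone does not have to merge them.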
