Hello,

I was going through the SparkStrategies class and found that, by default,
sort merge join is preferred over shuffled hash join. The
preferSortMergeJoin flag (spark.sql.join.preferSortMergeJoin) has to be
explicitly set to false if we want to force a shuffled hash join.
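
For concreteness, here is a minimal sketch of the kind of setup I have in
mind. The app name, the local master, and the two DataFrames are hypothetical
placeholders for illustration, not taken from a real job. Only the
spark.sql.join.preferSortMergeJoin key is the setting in question.

```scala
import org.apache.spark.sql.SparkSession

object ShuffledHashJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffled-hash-join-sketch") // hypothetical app name
      .master("local[*]")                   // assumption: local test run
      // Tell the planner not to prefer sort merge join over shuffled hash join.
      .config("spark.sql.join.preferSortMergeJoin", "false")
      .getOrCreate()

    // Hypothetical inputs: one larger and one smaller table keyed on "id".
    val large = spark.range(0L, 1000000L).toDF("id")
    val small = spark.range(0L, 1000L).toDF("id")

    // Print the physical plan to see which join strategy was chosen.
    large.join(small, "id").explain()

    spark.stop()
  }
}
```

As far as I can tell, turning the flag off only makes shuffled hash join
eligible. The plan printed by explain() may still show a different join
depending on the size estimates and the broadcast threshold
(spark.sql.autoBroadcastJoinThreshold).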

1) Why is sort merge join preferred over shuffled hash join?
2) Are there any performance implications we need to watch out for when we
force shuffled hash joins?

-- 
Sent from my iphone
