Hi Team,

Assume we have a large dataset, and sort-merge join is the join Spark applies to it by default.
Now, I want to understand the internal working of joins. How does this join (or any join) actually work?

Assume the data is already shuffled and sorted by key. Say Table A has two partitions, where rows are assigned to a partition based on the hash of the key and are sorted within each partition. My question is: how does Spark know which partition of Table A has to be joined (or searched) against which partition of Table B?

TIA,
Sid
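To make the question concrete, here is a minimal sketch in plain Python (not Spark's actual code; the function names are illustrative) of my understanding so far: both tables are partitioned with the same hash function and the same partition count, so a given key can only land in the partition with the same index in both tables, and partition i of A would then be merge-joined with partition i of B.

```python
NUM_PARTITIONS = 2

def partition_by_key(rows, num_partitions=NUM_PARTITIONS):
    """Bucket (key, value) rows by hash(key) % num_partitions and
    sort each bucket by key, as after a shuffle + sort.
    (Illustrative helper, not a Spark API.)"""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[hash(key) % num_partitions].append((key, value))
    for p in parts:
        p.sort(key=lambda kv: kv[0])
    return parts

table_a = [(1, "a1"), (2, "a2"), (3, "a3")]
table_b = [(2, "b2"), (3, "b3"), (4, "b4")]

parts_a = partition_by_key(table_a)
parts_b = partition_by_key(table_b)

# Partition i of A is paired only with partition i of B: identical
# keys can only sit in partitions with the same index, so no
# cross-partition search is needed.
joined = []
for pa, pb in zip(parts_a, parts_b):
    i = j = 0  # two-pointer merge over the two sorted runs
    while i < len(pa) and j < len(pb):
        ka, va = pa[i]
        kb, vb = pb[j]
        if ka == kb:
            joined.append((ka, va, vb))
            i += 1
            j += 1
        elif ka < kb:
            i += 1
        else:
            j += 1

print(sorted(joined))  # → [(2, 'a2', 'b2'), (3, 'a3', 'b3')]
```

Is this roughly what happens internally, i.e. the pairing falls out of both sides sharing the same partitioner, rather than Spark searching for matching partitions?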