Hi Team,

Assume we have a large dataset, and sort-merge join is the join Spark applies to it by default.
Now, I want to understand the internal working of joins. How does this join (or any join) actually work?

Assume the data is already shuffled and sorted by key. Say Table A has two partitions, where rows are assigned to a partition based on the hash of the key and are sorted within each partition. My question is: how does Spark know which partition of Table A has to be joined (or searched) against which partition of Table B?

TIA,
Sid
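To make the question concrete, here is a minimal sketch in plain Python (not Spark's actual code; the function names are illustrative) of my understanding so far: both tables are partitioned with the same hash function and the same partition count, so a given key can only land in the partition with the same index in both tables, and partition i of A would then be merge-joined with partition i of B.

```python
NUM_PARTITIONS = 2

def partition_by_key(rows, num_partitions=NUM_PARTITIONS):
    """Bucket (key, value) rows by hash(key) % num_partitions and
    sort each bucket by key, as after a shuffle + sort.
    (Illustrative helper, not a Spark API.)"""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[hash(key) % num_partitions].append((key, value))
    for p in parts:
        p.sort(key=lambda kv: kv[0])
    return parts

table_a = [(1, "a1"), (2, "a2"), (3, "a3")]
table_b = [(2, "b2"), (3, "b3"), (4, "b4")]

parts_a = partition_by_key(table_a)
parts_b = partition_by_key(table_b)

# Partition i of A is paired only with partition i of B: identical
# keys can only sit in partitions with the same index, so no
# cross-partition search is needed.
joined = []
for pa, pb in zip(parts_a, parts_b):
    i = j = 0  # two-pointer merge over the two sorted runs
    while i < len(pa) and j < len(pb):
        ka, va = pa[i]
        kb, vb = pb[j]
        if ka == kb:
            joined.append((ka, va, vb))
            i += 1
            j += 1
        elif ka < kb:
            i += 1
        else:
            j += 1

print(sorted(joined))  # → [(2, 'a2', 'b2'), (3, 'a3', 'b3')]
```

Is this roughly what happens internally, i.e. the pairing falls out of both sides sharing the same partitioner, rather than Spark searching for matching partitions?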