Hi Gourav,

The static table is broadcast before the join, so the shuffle is there primarily
to avoid an OutOfMemoryError (OOME) during the UDF.

It's not quite a Cartesian product, but yes, the join results in multiple records
per input record. The number of output records depends on how many duplicates
the static table holds for that record's non-unique key. The key is a single
column.
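
To illustrate the fan-out, here is a minimal plain-Python sketch. It is not
Spark code and not our actual job; the function name `join_fanout` and the
sample data are made up for illustration. It only shows how duplicates of a
key in the static table multiply the output rows for a matching input record.

```python
from collections import defaultdict

def join_fanout(stream_rows, static_rows):
    """Inner-join two lists of (key, value) pairs on a single key column."""
    # Index the small static table by key, roughly like a broadcast hash map.
    lookup = defaultdict(list)
    for key, static_val in static_rows:
        lookup[key].append(static_val)
    # Each input row emits one output row per matching static row.
    return [(key, val, static_val)
            for key, val in stream_rows
            for static_val in lookup.get(key, [])]

# Hypothetical data: key "a" appears three times in the static table, "b" once.
static_rows = [("a", 1), ("a", 2), ("a", 3), ("b", 4)]
stream_rows = [("a", "x"), ("b", "y"), ("c", "z")]

result = join_fanout(stream_rows, static_rows)
print(len(result))  # 4 output rows: 3 for "a", 1 for "b", 0 for "c"
```

So the output size is the sum, over input records, of the number of static
rows sharing that record's key, which is well short of a full cross product.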

Thanks,
Chris

On Fri, 2022-02-11 at 20:10 +0000, Gourav Sengupta wrote:
> Gourav Sengupta
