Hi, I want to broadcast a DataFrame to all executors and perform an operation similar to a join, except that it may return a different number of rows than each partition contains and may combine multiple input rows into one output row. I am trying to write a custom join operator for this use case, and it would be great if you could point me to similar code. My plan is to build a HashedRelation from the DataFrame, broadcast that HashedRelation, and then extract the internal rows on each partition at the executor level.
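To make the idea concrete, here is a rough sketch of the semantics I have in mind. It is plain Python, not Spark code: the dict stands in for the broadcast HashedRelation, the nested lists stand in for partitions, and the row shape and the names `build_index` and `custom_join_partition` are made up for illustration. In PySpark, I assume the same per-partition function could be driven by `SparkContext.broadcast` and `RDD.mapPartitions`, but I have not verified that this is the best route.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (join key, value)


def build_index(small_rows: Iterable[Row]) -> Dict[str, List[int]]:
    """Group the small side by key, roughly what a hashed relation provides."""
    index: Dict[str, List[int]] = defaultdict(list)
    for key, value in small_rows:
        index[key].append(value)
    return dict(index)


def custom_join_partition(
    partition: Iterable[Row], index: Dict[str, List[int]]
) -> Iterator[Tuple[str, int, int]]:
    """Process one partition and emit zero or more rows per key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in partition:
        totals[key] += value  # several input rows fold into one
    for key, total in totals.items():
        # A key may match no rows, one row, or many rows on the small side.
        for match in index.get(key, []):
            yield (key, total, match)


# Stand-in for the broadcast side (assumed small enough to fit in memory).
index = build_index([("a", 10), ("a", 20), ("b", 30)])

# Stand-in for two partitions of the large side.
partitions = [[("a", 1), ("a", 2), ("c", 5)], [("b", 7)]]
result = [list(custom_join_partition(p, index)) for p in partitions]
print(result)
```

The first partition has three input rows and produces two output rows, and the key "c" produces none, which is the variable-row behaviour a regular join operator does not give me directly.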
Thanks,
Mura