Hi, I have 40+ structured datasets stored in an S3 bucket as Parquet files.
I am going to use 20 of those tables in this use case. There is a main table that drives the whole flow; it contains about 1k records. For every record in the main table, I need to process the remaining tables (joins and group-bys that depend on fields of that main-table record). How can I parallelize this? What I did was read the main table, call `toLocalIterator()` on the DataFrame, and then do the rest of the processing per record — but that runs one record at a time. Please share your ideas. Thank you.
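For context, the sequential loop I described, and the kind of parallel variant I am asking about, can be sketched as below. This is a minimal, self-contained sketch: `process_record` is a hypothetical stand-in for the per-record joins/group-bys, and the main-table rows are mocked as a plain list instead of a real Spark `toLocalIterator()` so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the per-record work (joins/group-bys
# against the other tables, driven by fields of one main-table row).
def process_record(row):
    return row["id"] * 2  # placeholder computation

# Stand-in for rows pulled from the main table (in Spark this would
# come from df.toLocalIterator()); here a plain list of dicts.
main_rows = [{"id": i} for i in range(10)]

# Current approach: sequential, one record at a time.
sequential = [process_record(r) for r in main_rows]

# Parallel variant: submit several records at once from the driver
# with a thread pool, so multiple per-record jobs run concurrently
# instead of waiting for each other.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(process_record, main_rows))

assert parallel == sequential  # same results, just computed concurrently
```

This only illustrates the shape of the question (sequential iterator vs. concurrent submission); whether threads, a broadcast join, or restructuring the work as one big join is the right answer is exactly what I am asking.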