Hi, I have a Dataset with 10 columns, created by reading a Parquet file. I want to perform some operations on each column.
I create 10 datasets via dsBig.select(col) and submit 10 jobs, one per column. Will these jobs block each other, given that all of them read from the same Parquet file? In other words, does selecting different datasets from the same Parquet file cause blocking? Would it be better to cache the first read, i.e. dsBig.cache() and then dsBig.select(col1)?
Regards,
Rohit
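To make the setup concrete, here is a minimal sketch of what I mean (the file path, the per-column operation, and the thread-pool submission are placeholders, not my actual code):

```scala
import java.util.concurrent.Executors
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext, Future}
import org.apache.spark.sql.SparkSession

object ColumnJobs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ColumnJobs").getOrCreate()

    // Read the Parquet file once; cache() marks it for caching so the
    // 10 per-column jobs can share one materialized copy instead of
    // each re-scanning the file.
    val dsBig = spark.read.parquet("/path/to/data.parquet").cache()

    // Submit one job per column from separate threads, so Spark can
    // schedule them concurrently (jobs submitted from a single thread
    // run one after another).
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(10))

    val jobs = dsBig.columns.toSeq.map { c =>
      // describe() stands in for whatever per-column operation I run.
      Future { dsBig.select(c).describe().show() }
    }
    Await.result(Future.sequence(jobs), Duration.Inf)

    spark.stop()
  }
}
```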