Hi

I have a dataset with 10 columns, created from a Parquet file.
I want to perform some operations on each column.

I create 10 datasets, one per column, as dsBig.select(col).
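
For reference, this is roughly what I am doing. It is only a minimal sketch; the path, the app name and the count() action are placeholders for my real operations:

import org.apache.spark.sql.SparkSession
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

val spark = SparkSession.builder().appName("per-column-jobs").getOrCreate()
val dsBig = spark.read.parquet("/path/to/big.parquet")

// One dataset per column; these are lazy, nothing has run yet.
val perColumn = dsBig.columns.map(c => dsBig.select(c))

// Submit one action per column from separate threads, so all 10 jobs
// reach the scheduler at roughly the same time.
val jobs = perColumn.map(ds => Future(ds.distinct().count()))
jobs.foreach(j => Await.result(j, Duration.Inf))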

When I submit these 10 jobs, will they block each other, since all of them 
read from the same Parquet file? Is selecting different datasets from the 
same Parquet file a blocking operation?

Or would it be better to cache the dataset on the first read, e.g.
dsBig.cache().select(col1)
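
i.e. something like the sketch below (continuing from the snippet above and reusing its imports; again the count() action is just a placeholder), where the full dataset is cached and materialized once so the per-column jobs read from the cache instead of scanning the Parquet file each time:

// Cache the whole dataset and materialize it up front.
val dsCached = dsBig.cache()
dsCached.count()  // first action populates the cache

// The same 10 per-column jobs, now running against the cached data.
val perColumnCached = dsCached.columns.map(c => dsCached.select(c))
val cachedJobs = perColumnCached.map(ds => Future(ds.distinct().count()))
cachedJobs.foreach(j => Await.result(j, Duration.Inf))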

Regards
Rohit
