Hi, I have an application which uses 3 parquet files , 2 of which are large and another one is small. These files are in hdfs and are partitioned by column "col1". Now I create 3 data-frames one for each parquet file but I pass col1 value so that it reads the relevant data. I always read from the hdfs as files are too big and almost all queries uses the col1 and thus read very small data around 10-20 MB.
Then I join these data-frames. Now the data-frames from large tables are joined using sort-merged and then join with 3rd one using broadcast hash join. I have 3 executors and each with 10 cores. Only application running on cluster is mine. The overall performance is quite good. But there is large scheduler delay especially when it reads the smaller data-frame from hdfs and in final step when it is using the broadcast hash join. Usually compute time is just 50% of the scheduler delay time. The size of 3rd smaller dataframe is less than 1 MB. I am not sure why there is so much scheduler delay. Is there a diagnostic tools which can be used to check why there is scheduler delay or point to possible causes of scheduler delay? Thanks