Could you share your code? Are you sure your Spark 2.4 cluster actually read anything? It looks like the Input Size field is empty under 2.4.

-- ND

On 6/27/20 7:58 PM, Sanjeev Mishra wrote:

I have a large number of JSON files that Spark 2.4 can read in 36 seconds, but Spark 3.0 takes almost 33 minutes to read the same files. On closer analysis, it looks like Spark 3.0 is choosing a different DAG than Spark 2.4. Does anyone have any idea what is going on? Is there a configuration problem with Spark 3.0?

Here are the details:

*Spark 2.4*


        Summary Metrics for 2203 Completed Tasks
        <http://10.0.0.8:4040/stages/stage/?id=0&attempt=0#tasksTitle>

Metric     Min      25th percentile   Median   75th percentile   Max
Duration   0.0 ms   0.0 ms            0.0 ms   1.0 ms            62.0 ms
GC Time    0.0 ms   0.0 ms            0.0 ms   0.0 ms            11.0 ms


        Aggregated Metrics by Executor

Executor ID   Logs   Address          Task Time   Total Tasks   Failed Tasks   Killed Tasks   Succeeded Tasks   Blacklisted
driver               10.0.0.8:49159   36 s        2203          0              0              2203              false



*Spark 3.0*


        Summary Metrics for 8 Completed Tasks
        <http://10.0.0.8:4040/stages/stage/?id=1&attempt=0&task.eventTimelinePageNumber=1&task.eventTimelinePageSize=47#tasksTitle>

Metric                 Min                25th percentile    Median             75th percentile    Max
Duration               3.8 min            4.0 min            4.1 min            4.4 min            5.0 min
GC Time                3 s                3 s                3 s                4 s                4 s
Input Size / Records   15.6 MiB / 51028   16.2 MiB / 53303   16.8 MiB / 55259   17.8 MiB / 58148   20.2 MiB / 71624


        Aggregated Metrics by Executor

Executor ID   Logs   Address          Task Time   Total Tasks   Failed Tasks   Killed Tasks   Succeeded Tasks   Blacklisted   Input Size / Records
driver               10.0.0.8:50224   33 min      8             0              0              8                 false         136.1 MiB / 451999
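
To put the two executor summaries side by side, here is a minimal sketch in plain Python (no Spark required) that derives per-task time and record rate from the figures reported above. The dictionary and function names are made up for illustration, and the inputs are the rounded values shown by the UI, so the results are only approximate. Task Time is summed over tasks, so it is not wall-clock time.

```python
# Figures copied from the "Aggregated Metrics by Executor" tables above.
# Task Time is the sum over all tasks, not elapsed wall-clock time.
# Spark 2.4 reported no Input Size / Records column, hence None.
SPARK_24 = {"task_time_s": 36, "tasks": 2203, "input_mib": None, "records": None}
SPARK_30 = {"task_time_s": 33 * 60, "tasks": 8, "input_mib": 136.1, "records": 451999}


def seconds_per_task(run):
    """Average task duration in seconds."""
    return run["task_time_s"] / run["tasks"]


def records_per_second(run):
    """Records read per second of task time, or None if no input was reported."""
    if run["records"] is None:
        return None
    return run["records"] / run["task_time_s"]


for name, run in (("Spark 2.4", SPARK_24), ("Spark 3.0", SPARK_30)):
    rate = records_per_second(run)
    rate_text = "no input reported" if rate is None else f"{rate:.0f} records/s"
    print(f"{name}: {seconds_per_task(run):.3f} s per task, {rate_text}")
```

The Spark 3.0 average of 247.5 s per task is about 4.1 minutes, which agrees with the median duration in the summary table. The Spark 2.4 stage averages roughly 16 ms per task with no input recorded, so the two stages may not be doing the same work.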



The DAG is also different.
Spark 2.4 DAG

Screenshot 2020-06-27 16.30.26.png

Spark 3.0 DAG

Screenshot 2020-06-27 16.32.32.png

