Hi, I am using Spark 1.6.1 and looking closely at the Event Timeline on the "Details for Stage" page of the Spark web UI.
I think the "scheduler delay" shown on the event timeline is misrepresented, and I want to confirm whether my understanding is correct. Here is the detailed description:

In Spark's code, "SCHEDULER_DELAY" is defined as follows: "scheduler delay includes time to ship the task from the scheduler to the executor, and time to send the task result from the executor to the scheduler. If scheduler delay is large, consider decreasing the size of tasks or decreasing the size of task results."

My interpretation of this definition is that the scheduler delay has two components. The first occurs at the beginning of a task, when the scheduler ships the task to the executor. The second occurs at the end of a task, when the executor sends the task result back to the scheduler.

However, on the event timeline there is only one scheduler delay segment, drawn at the beginning of each task, and its length represents the SUM of these two components. This means that the segments that follow it ("Task Deserialization Time", "Shuffle Read Time", "Executor Computing Time", etc.) are drawn later than they actually occurred. They should start earlier on the timeline, by the length of the second component.

Best,
Xiaoye
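
P.S. To make the concern concrete, here is a minimal Python sketch of the two layouts. It is not Spark code: the `layout` and `start_of` helpers, the segment names, and all durations are hypothetical and chosen only for illustration. It stacks the phases of one task in drawing order, once with the whole scheduler delay at the start (how I read the timeline) and once with the delay split into its two components (how I read the definition).

```python
def layout(segments, launch_time=0):
    """Stack (name, duration) pairs end to end; return (name, start, end) tuples."""
    timeline = []
    t = launch_time
    for name, duration in segments:
        timeline.append((name, t, t + duration))
        t += duration
    return timeline


def start_of(timeline, name):
    """Return the start time of the first segment with the given name."""
    for seg_name, start, _end in timeline:
        if seg_name == name:
            return start
    raise KeyError(name)


# Hypothetical durations in milliseconds, not measured from a real job.
ship_task = 20      # scheduler -> executor, at the beginning of the task
send_result = 30    # executor -> scheduler, at the end of the task
deserialize = 10
compute = 100

# Layout as it appears on the event timeline: one delay segment up front.
as_drawn = layout([
    ("scheduler delay", ship_task + send_result),
    ("task deserialization", deserialize),
    ("executor computing", compute),
])

# Layout implied by the definition: the delay split around the task.
split = layout([
    ("scheduler delay (ship task)", ship_task),
    ("task deserialization", deserialize),
    ("executor computing", compute),
    ("scheduler delay (send result)", send_result),
])

# How much later deserialization appears to start in the drawn layout.
shift = start_of(as_drawn, "task deserialization") - start_of(split, "task deserialization")
print(shift)
```

Under these assumptions both layouts end at the same time, but every phase after the scheduler delay is shifted later by exactly the result-sending component in the drawn layout.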