[ https://issues.apache.org/jira/browse/SPARK-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994213#comment-15994213 ]
Josh Rosen commented on SPARK-4836:
-----------------------------------

[~ckadner], I'm pretty sure that this is still a problem.

Regarding reproductions, you should be able to trigger this by causing a fetch failure: run a shuffle stage, then delete a random portion of the shuffle outputs while the reduce stage is running. The reduce stage should fail and re-run the previous map stage, leading to a second stage attempt.

> Web UI should display separate information for all stage attempts
> -----------------------------------------------------------------
>
>                 Key: SPARK-4836
>                 URL: https://issues.apache.org/jira/browse/SPARK-4836
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.1.1, 1.2.0
>            Reporter: Josh Rosen
>
> I've run into some cases where the web UI job page will say that a job took
> 12 minutes but the sum of that job's stage times is something like 10
> seconds. In this case, it turns out that my job ran a stage to completion
> (which took, say, 5 minutes) then lost some partitions of that stage and had
> to run a new stage attempt to recompute one or two tasks from that stage. As
> a result, the latest attempt for that stage reports only one or two tasks.
> In the web UI, it seems that we only show the latest stage attempt, not all
> attempts, which can lead to confusing / misleading displays for jobs with
> failed / partially-recomputed stages.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
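
To make the mismatch described in the issue concrete, here is a toy model in plain Python. It does not use Spark or reflect the web UI's actual code; the `StageAttempt` record, the helper names, and all numbers are hypothetical and chosen only to mirror the "5 minutes, then a retry of one or two tasks" scenario. It compares the stage times a reader would see if only the latest attempt per stage is kept against the times across all attempts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageAttempt:
    # Hypothetical record for illustration; not a Spark class.
    stage_id: int
    attempt_id: int
    num_tasks: int
    duration_s: float


def latest_attempt_only(attempts):
    """Keep the highest-numbered attempt per stage (the view the issue describes)."""
    latest = {}
    for a in attempts:
        current = latest.get(a.stage_id)
        if current is None or a.attempt_id > current.attempt_id:
            latest[a.stage_id] = a
    return sorted(latest.values(), key=lambda a: a.stage_id)


def total_duration(attempts):
    """Sum the durations of the given attempts, in seconds."""
    return sum(a.duration_s for a in attempts)


# Made-up numbers: the map stage runs fully once, then a second attempt
# recomputes two lost tasks; the reduce stage has a single attempt.
attempts = [
    StageAttempt(stage_id=0, attempt_id=0, num_tasks=100, duration_s=300.0),
    StageAttempt(stage_id=0, attempt_id=1, num_tasks=2, duration_s=4.0),
    StageAttempt(stage_id=1, attempt_id=0, num_tasks=10, duration_s=6.0),
]

shown = latest_attempt_only(attempts)
print("latest attempts only:", total_duration(shown), "s")
print("all attempts:", total_duration(attempts), "s")
```

With only the latest attempts, the 5-minute first attempt of stage 0 disappears from the totals and the stage appears to have just two tasks, which is the kind of gap between job time and summed stage times that the report describes.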