[ https://issues.apache.org/jira/browse/SPARK-23237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341730#comment-16341730 ]
Alex Bozarth commented on SPARK-23237:
--------------------------------------

Unlike the related task, I'm not sure about this one. I see the need as you stated it, but also as you stated, it would be difficult to go about it. I'm willing to look at a PR for this but I wouldn't hold out hope for convincing me.

> Add UI / endpoint for threaddumps for executors with active tasks
> -----------------------------------------------------------------
>
>                 Key: SPARK-23237
>                 URL: https://issues.apache.org/jira/browse/SPARK-23237
>             Project: Spark
>          Issue Type: New Feature
>          Components: Web UI
>    Affects Versions: 2.3.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> Frequently, when there are a handful of straggler tasks, users want to know
> what is going on in the executors running those stragglers. Currently, that
> is a bit of a pain to do: you have to go to the page for your active stage,
> find the task, figure out which executor it's on, then go to the executors
> page, and get the thread dump. Or maybe you just go to the executors page,
> find the executor with an active task, and then click on that, but that
> doesn't work if you've got multiple stages running.
> Users could figure this out by extracting the info from the stage rest
> endpoint, but it's such a common thing to do that we should make it easy.
> I realize that figuring out a good way to do this is a little tricky. We
> don't want to make it easy to end up pulling thread dumps from 1000 executors
> back to the driver. So we've got to come up with a reasonable heuristic for
> choosing which executors to poll. And we've also got to find a suitable
> place to put this.
> My suggestion is that the stage page always has a link to the thread dumps
> for the *one* executor with the longest running task. And there would be a
> corresponding endpoint in the rest api with the same info, maybe at
> {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/slowestTaskThreadDump}}.
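For illustration, the heuristic suggested in the description (poll only the one executor with the longest running task) might be sketched as below. This is a minimal sketch and not Spark code: the function name and the record fields (task_id, executor_id, status, launch_time_ms) are hypothetical simplifications of the per-task info a stage exposes, and do not match the REST API's actual JSON layout.

```python
from typing import Iterable, Mapping, Optional, Tuple


def slowest_active_executor(
    tasks: Iterable[Mapping], now_ms: int
) -> Optional[Tuple[str, int, int]]:
    """Return (executor_id, task_id, elapsed_ms) for the longest-running
    active task, or None if no task is currently running.

    The field names used here are assumptions made for this sketch.
    """
    best = None
    for task in tasks:
        # Only tasks that are still running are candidates for a thread dump.
        if task["status"] != "RUNNING":
            continue
        elapsed = now_ms - task["launch_time_ms"]
        if best is None or elapsed > best[2]:
            best = (task["executor_id"], task["task_id"], elapsed)
    return best


# Hypothetical task records for a single stage attempt.
tasks = [
    {"task_id": 1, "executor_id": "3", "status": "RUNNING", "launch_time_ms": 1_000},
    {"task_id": 2, "executor_id": "7", "status": "RUNNING", "launch_time_ms": 400},
    {"task_id": 3, "executor_id": "5", "status": "SUCCESS", "launch_time_ms": 100},
]

print(slowest_active_executor(tasks, now_ms=5_000))  # ('7', 2, 4600)
```

Selecting a single executor keeps the cost bounded at one thread dump request per page load, however many executors the stage is using, which addresses the concern about pulling dumps from 1000 executors back to the driver.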