[ 
https://issues.apache.org/jira/browse/SPARK-23237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344434#comment-16344434
 ] 

Imran Rashid commented on SPARK-23237:
--------------------------------------

Can you expand a bit about what you are worried about?

Confusing UI?  Too tempting for users to refresh it constantly, not realizing
the impact it has?

If it's just the UI, I'd be fine with only making it an endpoint in the API.

And on the expense, well, the existing thread dump page would already have
that problem.  Perhaps it's just so inconvenient that nobody has been tempted to
abuse it :P.  But I think it's better to warn in the docs and then let the user
shoot themselves in the foot if they choose to.

> Add UI / endpoint for threaddumps for executors with active tasks
> -----------------------------------------------------------------
>
>                 Key: SPARK-23237
>                 URL: https://issues.apache.org/jira/browse/SPARK-23237
>             Project: Spark
>          Issue Type: New Feature
>          Components: Web UI
>    Affects Versions: 2.3.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> Frequently, when there are a handful of straggler tasks, users want to know 
> what is going on in those executors running the stragglers.  Currently, that 
> is a bit of a pain to do: you have to go to the page for your active stage, 
> find the task, figure out which executor it's on, then go to the executors 
> page, and get the thread dump.  Or maybe you just go to the executors page, 
> find the executor with an active task, and then click on that, but that 
> doesn't work if you've got multiple stages running.
> Users could figure this out by extracting the info from the stage REST endpoint, 
> but it's such a common thing to do that we should make it easy.
> I realize that figuring out a good way to do this is a little tricky.  We 
> don't want to make it easy to end up pulling thread dumps from 1000 executors 
> back to the driver.  So we've got to come up with a reasonable heuristic for 
> choosing which executors to poll.  And we've also got to find a suitable 
> place to put this.
> My suggestion is that the stage page always has a link to the thread dumps 
> for the *one* executor with the longest running task.  And there would be a 
> corresponding endpoint in the REST API with the same info, maybe at 
> {{/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/slowestTaskThreadDump}}.
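
To make the heuristic in the description concrete, here is a minimal sketch of
picking the *one* executor with the longest running task. It is an
illustration only, not Spark code: the function name and the task fields
(`status`, `launch_ms`, `executor_id`) are hypothetical and only loosely
modeled on the kind of per-task info a stage endpoint exposes.

```python
from typing import Iterable, Mapping, Optional


def slowest_running_executor(tasks: Iterable[Mapping]) -> Optional[str]:
    """Return the id of the executor hosting the longest-running active task.

    Each task is a mapping with hypothetical keys:
      status      -- task state, e.g. "RUNNING" or "SUCCESS"
      launch_ms   -- launch time in epoch milliseconds
      executor_id -- id of the executor running the task
    Returns None when no task is currently running.
    """
    running = [t for t in tasks if t["status"] == "RUNNING"]
    if not running:
        return None
    # Among running tasks, the earliest launch has been running the longest.
    slowest = min(running, key=lambda t: t["launch_ms"])
    return slowest["executor_id"]


# Example data (made up): executor "7" hosts the oldest still-running task.
example_tasks = [
    {"status": "SUCCESS", "launch_ms": 1_000, "executor_id": "3"},
    {"status": "RUNNING", "launch_ms": 5_000, "executor_id": "12"},
    {"status": "RUNNING", "launch_ms": 2_000, "executor_id": "7"},
]
chosen = slowest_running_executor(example_tasks)
```

Because exactly one executor id comes back, a caller would request at most one
thread dump per stage, no matter how many executors are active.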


