[ 
https://issues.apache.org/jira/browse/SPARK-26399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260856#comment-17260856
 ] 

Ron Hu edited comment on SPARK-26399 at 1/8/21, 5:18 AM:
---------------------------------------------------------

The initial description of this jira has this statement:  "filtering for task 
status, and returning tasks that match (for example, FAILED tasks)"

To support this, we need a new endpoint like this: 
/applications/[app-id]/stages?taskstatus=[FAILED|KILLED|SUCCESS]

If a user specifies /applications/[app-id]/stages?taskstatus=KILLED, then we 
generate a JSON file containing the information for all killed tasks across 
all stages.  This helps users monitor all the killed tasks.  For example, 
when a Spark user enables speculation, they need the information on all the 
killed tasks so that they can monitor the benefit/cost of speculation.

I have attached a sample JSON file  [^lispark230_restapi_ex2_stages_failedTasks.json]  
which contains the failed tasks and their corresponding stages, for reference.
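
As a rough illustration of the proposed filter, here is a minimal Python sketch of how a client might build the request URL and tally the returned tasks per stage. The endpoint is only a proposal at this point, and several things here are assumptions rather than part of this Jira: the helper names, the example base URL, and the response shape (each stage object carrying a "stageId" and an optional "tasks" map).

```python
from urllib.parse import urlencode

# Task statuses listed in the proposal above.
ALLOWED_TASK_STATUSES = ("FAILED", "KILLED", "SUCCESS")


def build_stages_url(base_url: str, app_id: str, task_status: str) -> str:
    """Build the proposed /stages URL filtered by task status.

    base_url is assumed to already include the REST API prefix,
    for example "http://localhost:18080/api/v1" (hypothetical host).
    """
    status = task_status.upper()
    if status not in ALLOWED_TASK_STATUSES:
        raise ValueError(f"unsupported task status: {task_status!r}")
    query = urlencode({"taskstatus": status})
    return f"{base_url.rstrip('/')}/applications/{app_id}/stages?{query}"


def count_tasks_per_stage(stages: list) -> dict:
    """Count the tasks returned for each stage.

    Assumption: every stage object has a "stageId" and, when tasks
    matched the filter, a "tasks" map keyed by task id.
    """
    return {stage["stageId"]: len(stage.get("tasks") or {}) for stage in stages}


if __name__ == "__main__":
    # "app-1" is a made-up application id.
    print(build_stages_url("http://localhost:18080/api/v1", "app-1", "KILLED"))
```

With speculation enabled, the per-stage counts of KILLED tasks give a quick view of how many speculative attempts were discarded in each stage.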


> Add new stage-level REST APIs and parameters
> --------------------------------------------
>
>                 Key: SPARK-26399
>                 URL: https://issues.apache.org/jira/browse/SPARK-26399
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Edward Lu
>            Priority: Major
>         Attachments: executorMetricsSummary.json, 
> lispark230_restapi_ex2_stages_failedTasks.json, 
> lispark230_restapi_ex2_stages_withSummaries.json, 
> stage_executorSummary_image1.png
>
>
> Add the peak values for the metrics to the stages REST API. Also add a new 
> executorSummary REST API, which will return executor summary metrics for a 
> specified stage:
> {code:java}
> curl http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorMetricsSummary{code}
> Add parameters to the stages REST API to specify:
>  * filtering for task status, and returning tasks that match (for example, 
> FAILED tasks).
>  * task metric quantiles, and adding the task summary if specified
>  * executor metric quantiles, and adding the executor summary if specified
> Note that the above description is too brief to be clear.  Ron Hu added the 
> additional details to explain the use cases from the downstream products.  
> See the comments dated 1/07/2021 with a couple of sample json files.
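
For reference, the executorMetricsSummary path in the description can be assembled as in the Python sketch below. The function name and the sample argument values are hypothetical. Only the path layout and the 18080 port come from the curl example in the description.

```python
def build_executor_metrics_summary_url(
    host: str,
    app_id: str,
    app_attempt: str,
    stage_id: int,
    stage_attempt: int,
    port: int = 18080,
) -> str:
    """Build the executorMetricsSummary URL for one stage attempt.

    The path segments follow the curl example in the issue description:
    application id, application attempt, stage id, stage attempt.
    """
    return (
        f"http://{host}:{port}/api/v1/applications/{app_id}/{app_attempt}"
        f"/stages/{stage_id}/{stage_attempt}/executorMetricsSummary"
    )


if __name__ == "__main__":
    # All argument values below are made up for illustration.
    print(build_executor_metrics_summary_url("localhost", "app-1", "1", 2, 0))
```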


