[ 
https://issues.apache.org/jira/browse/SPARK-26399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Hu updated SPARK-26399:
---------------------------
    Comment: was deleted

(was: Hi [~Baohe Zhang], this ticket proposes a new REST API:

http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorSummary

It is meant to display the percentile distribution of peak memory metrics among 
the executors used in a given stage, which can help Spark users debug or 
monitor a bottleneck in that stage.

The ticket https://issues.apache.org/jira/browse/SPARK-32446 proposes a REST 
API that displays the percentile distribution of peak memory metrics for all 
executors used in an application.  That REST API is:

http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/executorSummary

Hence this ticket displays executorSummary for a given stage inside an 
application, whereas SPARK-32446 displays executorSummary for the entire 
application.  The two are different.

 )

> Define query parameters to support various filtering conditions in REST API 
> for overall stages
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26399
>                 URL: https://issues.apache.org/jira/browse/SPARK-26399
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Edward Lu
>            Priority: Major
>         Attachments: executorMetricsDistributions.json, 
> executorMetricsSummary.json, lispark230_restapi_ex2_stages_failedTasks.json, 
> lispark230_restapi_ex2_stages_withSummaries.json, 
> stage_executorSummary_image1.png, taskMetricsDistributions.json, 
> taskMetricsDistributions.json
>
>
> [~angerszhuuu] and [~ron8hu] discussed a generic and consistent way to filter 
> overall stages, i.e. the endpoint /application/{app-id}/stages.  It can be:
> /application/{app-id}/stages?details=[true|false]&status=[ACTIVE|COMPLETE|FAILED|PENDING|SKIPPED]&withSummaries=[true|false]&taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]
> where
>  * query parameter details=true shows the detailed task information within 
> each stage.  The default value is details=false.
>  * query parameter status selects those stages with the specified status.  
> When the status parameter is not specified, a list of all stages is generated.
>  * query parameter withSummaries=true shows both the task metrics summary 
> in percentile distribution (see the single-stage example in 
> [^taskMetricsDistributions.json]) and the executor metrics summary in 
> percentile distribution (see the single-stage example in 
> [^executorMetricsDistributions.json]).  The default value is 
> withSummaries=false.
>  * query parameter taskStatus shows only those tasks with the specified 
> status within their corresponding stages.  This parameter takes effect only 
> when details=true (i.e. it is ignored when details=false).
> The output is an aggregate of all stages meeting the filtering conditions for 
> a given application.   
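
As a rough illustration of the query parameters described above, the following sketch builds a request path on the client side. The function name, its keyword arguments, and the choice to drop taskStatus on the client when details=false are assumptions made for illustration; the path and parameter names are copied from the description as written, and nothing here reflects how the server is actually implemented.

```python
from urllib.parse import urlencode

# Allowed values, copied from the proposed query string in the description.
STAGE_STATUSES = {"ACTIVE", "COMPLETE", "FAILED", "PENDING", "SKIPPED"}
TASK_STATUSES = {"RUNNING", "SUCCESS", "FAILED", "KILLED", "PENDING"}


def stages_query(app_id, details=False, status=None,
                 with_summaries=False, task_status=None):
    """Build the proposed stages path with its filters (hypothetical helper)."""
    params = [("details", str(details).lower())]

    if status is not None:
        if status not in STAGE_STATUSES:
            raise ValueError(f"unknown stage status: {status}")
        params.append(("status", status))

    params.append(("withSummaries", str(with_summaries).lower()))

    if task_status is not None:
        if task_status not in TASK_STATUSES:
            raise ValueError(f"unknown task status: {task_status}")
        # The description says taskStatus is ignored when details=false,
        # so this sketch simply omits it in that case.
        if details:
            params.append(("taskStatus", task_status))

    return f"/application/{app_id}/stages?{urlencode(params)}"


if __name__ == "__main__":
    # Failed tasks inside failed stages, with task details included.
    print(stages_query("app-1", details=True, status="FAILED", task_status="FAILED"))
```

Leaving status unset corresponds to the "list of all stages" case, and the defaults mirror the stated defaults of details=false and withSummaries=false.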



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
