[ https://issues.apache.org/jira/browse/SPARK-26399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ron Hu updated SPARK-26399:
---------------------------
    Comment: was deleted

(was: Hi [~Baohe Zhang],

This ticket proposes a new REST API:
http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>/executorSummary

It is meant to display the percentile distribution of peak memory metrics among the executors used in a given stage. It can help Spark users debug or monitor a bottleneck in a stage.

SPARK-32446 (https://issues.apache.org/jira/browse/SPARK-32446) proposes a REST API that displays the percentile distribution of peak memory metrics for all executors used in an application. That REST API is:
http://<spark history server>:18080/api/v1/applications/<application_id>/<application_attempt>/executorSummary

Hence this ticket displays executorSummary for a given stage inside an application, whereas SPARK-32446 displays executorSummary for the entire application. They are different.
)

> Define query parameters to support various filtering conditions in REST API for overall stages
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26399
>                 URL: https://issues.apache.org/jira/browse/SPARK-26399
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Edward Lu
>            Priority: Major
>         Attachments: executorMetricsDistributions.json, executorMetricsSummary.json, lispark230_restapi_ex2_stages_failedTasks.json, lispark230_restapi_ex2_stages_withSummaries.json, stage_executorSummary_image1.png, taskMetricsDistributions.json, taskMetricsDistributions.json
>
>
> [~angerszhuuu] and [~ron8hu] discussed a generic and consistent way to query overall stages, i.e. endpoint /application/\{app-id}/stages.
> It can be:
> /application/\{app-id}/stages?details=[true|false]&status=[ACTIVE|COMPLETE|FAILED|PENDING|SKIPPED]&withSummaries=[true|false]&taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]
> where
> * query parameter details=true shows the detailed task information within each stage. The default value is details=false.
> * query parameter status selects those stages with the specified status. When the status parameter is not specified, a list of all stages is generated.
> * query parameter withSummaries=true shows both the task metrics summary in percentile distribution (see the single-stage example in [^taskMetricsDistributions.json]) and the executor metrics summary in percentile distribution (see the single-stage example in [^executorMetricsDistributions.json]). The default value is withSummaries=false.
> * query parameter taskStatus shows only those tasks with the specified status within their corresponding stages. This parameter takes effect only when details=true (i.e. it is ignored when details=false).
> The output is an aggregate of all stages meeting the filtering conditions for a given application.
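As a rough illustration of how a client might assemble the proposed query string, here is a minimal Python sketch. The helper name `build_stages_url`, the base URL, and the application id are hypothetical and not part of Spark. The path segment follows the spelling used in this ticket (/application/{app-id}/stages), which may differ from what a given deployment exposes. The parameter names, allowed values, and defaults are taken from the description above. Dropping taskStatus when details=false is a client-side choice that mirrors the stated rule that the server ignores it in that case.

```python
from urllib.parse import urlencode

# Allowed values as listed in the ticket description.
STAGE_STATUSES = {"ACTIVE", "COMPLETE", "FAILED", "PENDING", "SKIPPED"}
TASK_STATUSES = {"RUNNING", "SUCCESS", "FAILED", "KILLED", "PENDING"}


def build_stages_url(base_url, app_id, details=False, status=None,
                     with_summaries=False, task_status=None):
    """Build a URL for the proposed stages endpoint (hypothetical helper)."""
    if status is not None and status not in STAGE_STATUSES:
        raise ValueError(f"unknown stage status: {status}")
    if task_status is not None and task_status not in TASK_STATUSES:
        raise ValueError(f"unknown task status: {task_status}")

    # Booleans are rendered as the lowercase strings the ticket uses.
    params = [("details", str(details).lower())]
    if status is not None:
        # Omitting status asks for stages in every status.
        params.append(("status", status))
    params.append(("withSummaries", str(with_summaries).lower()))
    # taskStatus is ignored when details=false, so only send it with details=true.
    if task_status is not None and details:
        params.append(("taskStatus", task_status))

    # Path spelling follows the ticket text; this is an assumption.
    return f"{base_url.rstrip('/')}/application/{app_id}/stages?{urlencode(params)}"
```

For example, `build_stages_url("http://localhost:18080/api/v1", "app-1", details=True, status="FAILED", task_status="KILLED")` would request the failed stages of a placeholder application `app-1`, with per-stage task details restricted to killed tasks.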