AngersZhuuuu opened a new pull request #31611:
URL: https://github.com/apache/spark/pull/31611


   ### What changes were proposed in this pull request?
   For a specific stage, it is useful to show the task metrics in percentile 
distribution.  This information can help users know whether or not there is a 
skew/bottleneck among tasks in a given stage.  We list an example in 
taskMetricsDistributions.json
   
   Similarly, it is useful to show the executor metrics in percentile 
distribution for a specific stage. This information can show whether or not 
there is a skewed load on some executors.  We list an example in 
executorMetricsDistributions.json
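
To illustrate why a percentile distribution exposes skew, here is a minimal sketch that summarises a list of per-task values at a few quantiles. The function names, the quantile set, and the simple index-based selection rule are assumptions made for illustration; they are not taken from this patch and may differ from how Spark computes its summaries.

```python
def percentile_summary(values, quantiles=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Return {quantile: value} using a simple index-based rule (illustrative only)."""
    ordered = sorted(values)
    if not ordered:
        return {}
    last = len(ordered) - 1
    # Pick the element at position q * n, clamped to the last valid index.
    return {q: ordered[min(last, int(q * len(ordered)))] for q in quantiles}


def skew_ratio(summary):
    """Ratio of the 95th percentile to the median; a large ratio hints at skew."""
    median = summary.get(0.5)
    return summary[0.95] / median if median else float("inf")


# Hypothetical task durations in ms: nine similar tasks and one straggler.
durations = [10] * 9 + [100]
summary = percentile_summary(durations)
print(summary)
print(skew_ratio(summary))
```

With these hypothetical durations the median stays at 10 while the 95th percentile jumps to 100, which is the kind of gap that points to a straggler task or an overloaded executor.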
   
   We define a withSummaries query parameter in the REST API for a specific 
stage as:
   
   
applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>?withSummaries=[true|false]
   
   When withSummaries=true, the REST API output includes the percentile 
distributions of both task metrics and executor metrics.  The default value of 
withSummaries is false, in which case no percentile distributions are included 
in the output.
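
As a rough sketch of how a client might call this endpoint, the snippet below builds the stage URL with the withSummaries parameter and fetches the JSON. The base URL (`http://localhost:4040/api/v1`), the helper names, and the sample IDs are assumptions for illustration; the path layout follows the pattern given in this description.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def stage_url(base, app_id, app_attempt, stage_id, stage_attempt, with_summaries=False):
    """Build the stage endpoint URL following the path pattern in this PR."""
    path = "/".join([
        base.rstrip("/"), "applications", str(app_id), str(app_attempt),
        "stages", str(stage_id), str(stage_attempt),
    ])
    # withSummaries defaults to false, matching the default described above.
    query = urlencode({"withSummaries": "true" if with_summaries else "false"})
    return path + "?" + query


def fetch_stage(url):
    """Fetch and decode the JSON response (requires a reachable Spark REST endpoint)."""
    with urlopen(url) as resp:
        return json.load(resp)


# Hypothetical application and stage identifiers.
url = stage_url("http://localhost:4040/api/v1", "app-123", 1, 3, 0, with_summaries=True)
print(url)
```

Calling `fetch_stage(url)` against a running application or history server would return the stage data; the exact field names of the added summaries are best taken from the example JSON files referenced in this description.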
   
    
   
   
   ### Why are the changes needed?
   For a specific stage, it is useful to show the task metrics in percentile 
distribution.  This information can help users know whether or not there is a 
skew/bottleneck among tasks in a given stage.  We list an example in 
taskMetricsDistributions.json
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Users can call the REST API below to get the task metrics distribution 
and executor metrics distribution for an individual stage:
   ```
   
applications/<application_id>/<application_attempt>/stages/<stage_id>/<stage_attempt>?withSummaries=[true|false]
   ```
   
   ### How was this patch tested?
   Added unit tests.

