GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/23105

    [SPARK-26140] Pull TempShuffleReadMetrics creation out of shuffle reader

    ## What changes were proposed in this pull request?
    This patch defines an internal Spark interface for reporting shuffle 
metrics and uses it in the shuffle reader. Before this patch, shuffle metrics were 
tied to a specific implementation (a thread-local temporary data 
structure plus accumulators). After this patch, callers that define their own 
shuffle RDDs can supply a custom metrics implementation.
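
As a rough illustration of the decoupling described above, here is a minimal sketch in which the reader receives its metrics reporter from the caller instead of creating one itself. All names here (`ShuffleReadReporter`, `CountingReporter`, `SketchShuffleReader`) are hypothetical and are not taken from the patch; the actual interface and its methods may differ.

```scala
// Hypothetical names throughout: an illustration, not the interface from the patch.

// The reporting interface: the reader only knows how to report increments.
trait ShuffleReadReporter {
  def incRecordsRead(n: Long): Unit
  def incBytesRead(n: Long): Unit
}

// One possible implementation: plain in-memory counters.
// A caller could equally back this with accumulators or per-operator metrics.
final class CountingReporter extends ShuffleReadReporter {
  private var records = 0L
  private var bytes = 0L
  override def incRecordsRead(n: Long): Unit = records += n
  override def incBytesRead(n: Long): Unit = bytes += n
  def recordsRead: Long = records
  def bytesRead: Long = bytes
}

// A stand-in reader: the reporter is passed in by the caller,
// so the reader is not tied to any one metrics implementation.
final class SketchShuffleReader(
    blocks: Seq[Array[Byte]],
    reporter: ShuffleReadReporter) {

  // Reports one record and its size for each block as it is consumed.
  def read(): Iterator[Array[Byte]] = blocks.iterator.map { block =>
    reporter.incRecordsRead(1L)
    reporter.incBytesRead(block.length.toLong)
    block
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val reporter = new CountingReporter
    val reader = new SketchShuffleReader(
      Seq(Array[Byte](1, 2, 3), Array[Byte](4, 5)), reporter)
    reader.read().foreach(_ => ()) // drain the iterator so metrics are reported
    println(s"records=${reporter.recordsRead} bytes=${reporter.bytesRead}")
  }
}
```

Under this arrangement, a caller that wants different bookkeeping only needs to pass a different `ShuffleReadReporter`; the reader itself does not change.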
    
    With this patch, we will be able to build better metrics for the SQL 
layer, e.g. reporting shuffle metrics in the SQL UI for each exchange operator.
    
    ## How was this patch tested?
    No behavior change expected, as it is a straightforward refactoring. 
Updated all existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-26140

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23105
    
----
commit da253b57c14bc0174f0330ae6fa5d3a61647269b
Author: Reynold Xin <rxin@...>
Date:   2018-11-21T14:56:23Z

    [SPARK-26140] Pull TempShuffleReadMetrics creation out of shuffle reader

----


