[ 
https://issues.apache.org/jira/browse/SPARK-31646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416656#comment-17416656
 ] 

Manu Zhang edited comment on SPARK-31646 at 9/17/21, 12:40 PM:
---------------------------------------------------------------

[~yzhangal],

Please check this comment  
[https://github.com/apache/spark/pull/28416#discussion_r418357988] for more 
background.

The counter reverted in this PR was never used; the PR simply removed dead 
code.

I didn't mean to use registeredConnections for anything different. It's 
eventually registered into ShuffleMetrics here:

[https://github.com/apache/spark/blob/master/common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java#L248]
{code:java}
blockHandler.getAllMetrics().getMetrics().put("numRegisteredConnections",
    shuffleServer.getRegisteredConnections());
{code}
 

As I understand it, registeredConnections (and IdleConnections) are monitored 
at the channel level (TransportChannelHandler), while activeConnections 
(along with blockTransferRateBytes, etc.) is monitored at the RPC level 
(ExternalShuffleBlockHandler). Hence, these metrics are kept in two places.

You could register your backloggedConnections metric in ShuffleMetrics and 
compute it as "registeredConnections - activeConnections" in 
ShuffleMetrics#getMetrics.
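
To illustrate the suggestion, here is a minimal, self-contained sketch of a derived gauge. It is not Spark's actual ShuffleMetrics code: the class BackloggedConnectionsSketch, its event methods, and the metric names "numActiveConnections" and "numBackloggedConnections" are hypothetical, and LongSupplier stands in for whatever gauge type the real metric set uses. Clamping the difference at zero is my own assumption, since the two counters are updated in different places and are not read atomically.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a metric set such as ShuffleMetrics.
final class BackloggedConnectionsSketch {
    // Stand-in for the channel-level counter (registeredConnections).
    private final AtomicLong registeredConnections = new AtomicLong();
    // Stand-in for the RPC-level counter (activeConnections).
    private final AtomicLong activeConnections = new AtomicLong();
    // Metric name -> gauge; each gauge is evaluated when it is read.
    private final Map<String, LongSupplier> allMetrics = new LinkedHashMap<>();

    BackloggedConnectionsSketch() {
        allMetrics.put("numRegisteredConnections", registeredConnections::get);
        allMetrics.put("numActiveConnections", activeConnections::get);
        // Derived gauge: registered minus active, computed on every read.
        allMetrics.put("numBackloggedConnections", this::backloggedConnections);
    }

    long backloggedConnections() {
        // The counters are read separately, so the raw difference can be
        // transiently negative; clamp it at zero (an assumption of this sketch).
        return Math.max(0L, registeredConnections.get() - activeConnections.get());
    }

    Map<String, LongSupplier> getMetrics() {
        return allMetrics;
    }

    // Hypothetical hooks for the channel-level and RPC-level events.
    void channelRegistered() { registeredConnections.incrementAndGet(); }
    void channelUnregistered() { registeredConnections.decrementAndGet(); }
    void requestStarted() { activeConnections.incrementAndGet(); }
    void requestFinished() { activeConnections.decrementAndGet(); }
}
```

With three registered channels and one request in flight, reading "numBackloggedConnections" from getMetrics() yields 2. Because the value is computed at read time, no extra bookkeeping is needed at either the channel level or the RPC level.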

 

Your understanding of how executors register with the Shuffle Service is 
correct, but I don't see how it relates to your question.


> Remove unused registeredConnections counter from ShuffleMetrics
> ---------------------------------------------------------------
>
>                 Key: SPARK-31646
>                 URL: https://issues.apache.org/jira/browse/SPARK-31646
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy, Shuffle, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Manu Zhang
>            Assignee: Manu Zhang
>            Priority: Minor
>             Fix For: 3.0.0
>
>



