Manu Zhang created SPARK-31219:
----------------------------------

             Summary: YarnShuffleService doesn't close idle netty channel
                 Key: SPARK-31219
                 URL: https://issues.apache.org/jira/browse/SPARK-31219
             Project: Spark
          Issue Type: Improvement
          Components: Shuffle
    Affects Versions: 2.4.5, 3.0.0
            Reporter: Manu Zhang


Recently, we found that our YarnShuffleService has a lot of [half-open 
connections|https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html],
 where the shuffle server still holds a connection as established although the 
client has already closed it. 

For example, the output of `ss -nt sport = :7337` on the server shows
{code:java}
ESTAB 0 0 server:7337 client:port
{code}
However, on the client, `ss -nt dport = :7337 | grep server` returns nothing.

Looking at the code, `YarnShuffleService` creates a `TransportContext` through 
the two-argument constructor, which sets `closeIdleConnections` to false.
{code:java}
public class YarnShuffleService extends AuxiliaryService {
  ...
  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    ...
    // Two-argument constructor: closeIdleConnections is left at false.
    transportContext = new TransportContext(transportConf, blockHandler);
    ...
  }
  ...
}

public class TransportContext implements Closeable {
  ...

  public TransportContext(TransportConf conf, RpcHandler rpcHandler) {
    // closeIdleConnections is passed as false here.
    this(conf, rpcHandler, false, false);
  }

  public TransportContext(
      TransportConf conf, RpcHandler rpcHandler, boolean closeIdleConnections) {
    this(conf, rpcHandler, closeIdleConnections, false);
  }
  ...
}
{code}
Hence, the channel may never be closed on the server side if the server misses 
the event signalling that the client has closed the connection.
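
To illustrate what closing idle connections would change here, below is a minimal sketch in plain Java of a server-side reaper that drops channels which have seen no traffic for longer than a timeout. It is not Spark's or Netty's implementation: `IdleChannelReaper`, `onActivity`, `reap` and `openChannels` are hypothetical names, and channels are represented by plain string ids rather than real sockets.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for idle-connection handling; not Spark's actual code. */
public class IdleChannelReaper {
  // Timestamp (in nanoseconds) of the last traffic seen on each channel id.
  private final Map<String, Long> lastActivityNanos = new ConcurrentHashMap<>();
  private final long timeoutNanos;
  private final boolean closeIdleConnections;

  public IdleChannelReaper(long timeoutNanos, boolean closeIdleConnections) {
    this.timeoutNanos = timeoutNanos;
    this.closeIdleConnections = closeIdleConnections;
  }

  /** Records traffic on a channel, registering the channel if it is new. */
  public void onActivity(String channelId, long nowNanos) {
    lastActivityNanos.put(channelId, nowNanos);
  }

  /**
   * Drops every channel that has been idle for longer than the timeout.
   * Does nothing when closeIdleConnections is false.
   * Returns the number of channels dropped.
   */
  public int reap(long nowNanos) {
    if (!closeIdleConnections) {
      return 0;
    }
    int closed = 0;
    Iterator<Map.Entry<String, Long>> it = lastActivityNanos.entrySet().iterator();
    while (it.hasNext()) {
      long idleNanos = nowNanos - it.next().getValue();
      if (idleNanos > timeoutNanos) {
        it.remove(); // a real server would close the underlying socket here
        closed++;
      }
    }
    return closed;
  }

  /** Number of channels the server still considers open. */
  public int openChannels() {
    return lastActivityNanos.size();
  }
}
```

With the flag off, a channel whose client disappeared silently stays in the table indefinitely, which matches the `ESTAB` entries seen on the server. With the flag on, the entry is dropped once the timeout elapses, regardless of whether the close event from the client was ever observed.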

I found that this parameter is set to true for `ExternalShuffleService`.

Is there any reason for this difference? Would it be valuable to add a 
configuration option to enable closeIdleConnections?

 


