[ https://issues.apache.org/jira/browse/SPARK-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-11866:
------------------------------
    Target Version/s: 1.6.0

> RpcEnv RPC timeouts can lead to errors, leak in transport library.
> ------------------------------------------------------------------
>
>                 Key: SPARK-11866
>                 URL: https://issues.apache.org/jira/browse/SPARK-11866
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0
>            Reporter: Marcelo Vanzin
>
> The {{RpcEnv}} code in spark-core has its own timeout handling capabilities,
> which can clash with the transport library's timeout handling in two ways
> when replies to an RPC message are never sent.
> - if the channel has been idle for a while, the transport library will close
>   the channel because it may think it's hung; this could cause other errors
>   since the {{RpcEnv}}-based code might not expect those channels to be closed.
> - if the reply never arrives and the channel is not idle, there's state kept
>   in the network library that will never be cleaned up. The {{RpcEnv}}-level
>   timeout code should clean up that state since it's not interested in that RPC
>   anymore.
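
To illustrate the second point in the description, here is a minimal sketch of a timeout that also removes the per-request state it was waiting on. It is not Spark's actual code: the class PendingRpcTable and its methods send, onReply, and outstanding are hypothetical names standing in for whatever table of outstanding RPCs the network library keeps, and the String reply type is an assumption made for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the transport layer's table of outstanding RPCs. */
class PendingRpcTable {
    // requestId -> future completed when the reply arrives (or the timeout fires)
    private final ConcurrentHashMap<Long, CompletableFuture<String>> pending =
        new ConcurrentHashMap<>();

    // Daemon timer thread so the sketch never keeps the JVM alive.
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "rpc-timeout");
            t.setDaemon(true);
            return t;
        });

    /** Registers an outstanding RPC and arms a timeout that cleans up after itself. */
    CompletableFuture<String> send(long requestId, long timeoutMs) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(requestId, reply);

        ScheduledFuture<?> timeout = timer.schedule(() -> {
            // Remove the entry first: once the caller has given up on this RPC,
            // nothing should keep its state alive.
            CompletableFuture<String> f = pending.remove(requestId);
            if (f != null) {
                f.completeExceptionally(
                    new TimeoutException("RPC " + requestId + " timed out"));
            }
        }, timeoutMs, TimeUnit.MILLISECONDS);

        // If the reply arrives first, the timeout task is no longer needed.
        reply.whenComplete((value, error) -> timeout.cancel(false));
        return reply;
    }

    /** Called when a reply arrives; a reply for an already timed-out RPC is ignored. */
    void onReply(long requestId, String body) {
        CompletableFuture<String> f = pending.remove(requestId);
        if (f != null) {
            f.complete(body);
        }
    }

    /** Number of RPCs still waiting for a reply. */
    int outstanding() {
        return pending.size();
    }
}

class RpcTimeoutDemo {
    public static void main(String[] args) {
        PendingRpcTable table = new PendingRpcTable();
        CompletableFuture<String> lost = table.send(1L, 50); // no reply is ever sent
        try {
            lost.join();
        } catch (Exception e) {
            System.out.println("failed with: " + e.getCause());
        }
        System.out.println("outstanding after timeout: " + table.outstanding());
    }
}
```

The point of the sketch is only the ordering: the timeout path removes the entry from the table before failing the caller, so a reply that never arrives cannot leave state behind. It does not address the first point (idle channels being closed by the transport library), which would need coordination between the two timeout mechanisms.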