[ https://issues.apache.org/jira/browse/SPARK-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-704:
----------------------------
    Component/s: Spark Core

> ConnectionManager sometimes cannot detect loss of sending connections
> ---------------------------------------------------------------------
>
>                 Key: SPARK-704
>                 URL: https://issues.apache.org/jira/browse/SPARK-704
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Charles Reiss
>            Assignee: Henry Saputra
>
> ConnectionManager currently detects that a SendingConnection has 
> disconnected only when it tries to send through it. As a result, a node 
> failure just after a connection is initiated, but before any acknowledgement 
> messages can be sent, may result in a hang.
> ConnectionManager has code intended to handle this case by detecting the 
> failure of a corresponding ReceivingConnection, but this code assumes that 
> the remote host:port of the ReceivingConnection is the same as the 
> ConnectionManagerId, which is almost never true. Additionally, there does not 
> appear to be any reason to assume that a corresponding ReceivingConnection 
> will exist.
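
The two observations in the description can be reproduced with plain TCP sockets. The following is a minimal sketch, not Spark code: it assumes that a node's identity (standing in for ConnectionManagerId) is the address of its listening socket, and every name in it (listening_socket, node_a_id, peer_addr, and so on) is hypothetical. Python is used only for brevity. The first half shows why the remote host:port seen on an accepted connection does not match the connecting node's identity; the second half shows one way a connection that is otherwise only written to could notice that its peer has gone away, by also watching it for readability.

```python
import select
import socket


def listening_socket():
    """Bind a TCP listener on an OS-chosen loopback port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s


# Assumption: each "node" is identified by its listener's (host, port).
node_a_listener = listening_socket()
node_b_listener = listening_socket()
node_a_id = node_a_listener.getsockname()
node_b_id = node_b_listener.getsockname()

# Node A opens an outbound ("sending") connection to node B.
outbound = socket.create_connection(node_b_id)
inbound, peer_addr = node_b_listener.accept()

# On B's side, the remote address of the accepted connection carries
# A's ephemeral source port, not the port A listens on.
ids_match = (peer_addr == node_a_id)
print("node A id:        ", node_a_id)
print("peer addr seen by B:", peer_addr)
print("match:", ids_match)

# Simulate B going away before any data or acknowledgement is exchanged.
inbound.close()

# A never writes, so a write error will never surface. Watching the
# socket for readability exposes the close as end-of-stream instead.
readable, _, _ = select.select([outbound], [], [], 1.0)
peer_closed = bool(readable) and outbound.recv(1) == b""
print("peer close detected without sending:", peer_closed)

# Release the remaining sockets.
outbound.close()
node_a_listener.close()
node_b_listener.close()
```

In this sketch the host parts of the two addresses agree but the ports differ, so a lookup keyed on the accepted connection's remote host:port would not find node A's identity. Whether registering read interest on sending connections is the right fix for ConnectionManager itself is not something this message establishes; the sketch only illustrates the socket-level behaviour.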



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
