[ 
https://issues.apache.org/jira/browse/STORM-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Zhong updated STORM-510:
-----------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: STORM-329

> Netty messaging client blocks transfer thread on reconnect
> ----------------------------------------------------------
>
>                 Key: STORM-510
>                 URL: https://issues.apache.org/jira/browse/STORM-510
>             Project: Apache Storm
>          Issue Type: Sub-task
>    Affects Versions: 0.9.2-incubating
>            Reporter: Robert Joseph Evans
>            Priority: Critical
>
> The latest netty client code attempts to reestablish the connection on 
> failure as part of the send method call.  It blocks until the connection 
> is established or a timeout occurs.  By default this timeout is about 30 
> seconds, which is also the default tuple timeout.  
> This is exacerbated by the read lock that is held during the send, which 
> prevents the node->socket mapping from changing while we are sending.  This 
> is mostly so that we don't close connections while we are trying to write to 
> them, which would cause an exception.  But it means that if multiple 
> workers on a node all get rescheduled, we will wait the full 30 seconds 
> to time out for each worker.
> send must be non-blocking in the current design of the worker, or it will 
> prevent other messages from being delivered, and it is likely to cause many 
> messages to time out on a reschedule.
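
To illustrate the non-blocking requirement described above, here is a minimal
sketch of one possible approach: send() only enqueues the message, and a
separate reconnect path drains the queue once the connection is back. This is
not Storm's actual Netty client. The class NonBlockingClient and its methods
(send, onConnected, onDisconnected, delivered, pendingCount) are hypothetical
names chosen for illustration, and the "channel" is simulated with an
in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a client whose send() never waits for a reconnect. */
class NonBlockingClient {
    // Messages accepted by send() but not yet written to the channel.
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    // Stand-in for the remote side; a real client would write to a socket.
    private final List<String> delivered = new ArrayList<>();
    private final AtomicBoolean connected = new AtomicBoolean(false);

    /** Enqueues the message and returns immediately; flushes only if connected. */
    void send(String message) {
        pending.add(message);
        if (connected.get()) {
            flush();
        }
    }

    /** Assumed to be called by a background reconnect task once the channel is up. */
    void onConnected() {
        connected.set(true);
        flush();
    }

    /** Assumed to be called when the channel fails; later sends are only queued. */
    void onDisconnected() {
        connected.set(false);
    }

    /** Drains the pending queue in FIFO order into the simulated channel. */
    private synchronized void flush() {
        String message;
        while ((message = pending.poll()) != null) {
            delivered.add(message);
        }
    }

    /** Returns a snapshot of everything written so far. */
    synchronized List<String> delivered() {
        return new ArrayList<>(delivered);
    }

    /** Returns the number of messages still waiting for a connection. */
    int pendingCount() {
        return pending.size();
    }
}
```

With this shape, the transfer thread pays only the cost of an enqueue while a
connection is down, so a slow reconnect to one worker does not delay messages
bound for other workers. A real implementation would also need to bound the
queue and decide what to do with messages that outlive the tuple timeout.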



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
