[ 
https://issues.apache.org/jira/browse/S4-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204014#comment-13204014
 ] 

Karthik Kambatla commented on S4-7:
-----------------------------------

Update on the synchronization issues in TCPEmitter - the working branch is at 
https://github.com/kambatla/s4/tree/S4-7

As Matthieu has pointed out earlier, there are some synchronization issues in 
the committed patch. These are exposed by the MultiPartitionDeliveryTest where 
each partition sends messages to every other partition. 

With a larger number of partitions running (say 6), messages from a 
partition A to a partition B occasionally end up at partition C. I suspect 
this happens because ClusterNode C claims to be ClusterNode B. 

Even the UDPEmitter behaves oddly with a high enough number of partitions.

Kishore and Matthieu - any pointers on how to go about this?

Thanks
                
> Netty to tolerate network glitches and connection loss
> ------------------------------------------------------
>
>                 Key: S4-7
>                 URL: https://issues.apache.org/jira/browse/S4-7
>             Project: Apache S4
>          Issue Type: Bug
>            Reporter: Leo Neumeyer
>            Assignee: Karthik Kambatla
>             Fix For: 0.5
>
>         Attachments: S4-7-Robust-TCPEmitter-asynchronous-ordered.patch, 
> s4-7.patch, s4-7.patch
>
>
> NettyEmitter connects to different partitions and creates channels over which 
> it communicates to other listeners.
> It suffers from the following issues -- 
> 1. If the underlying topology changes, the channels and the associated 
> connections are not updated.
> 2. If a connection gets disconnected, it stays disconnected.
> 3. If for any reason, a connection can't be made, send() drops the message to 
> be sent.
> The solution is to - 
> 1. Maintain a bounded messageQueue for each destination partition - if a 
> connection does not exist, the message should be queued.
> 2. Maintain a map of the channel used for each destination partition - update 
> this map on changes to topology, or on send() in case of disconnections.
> 3. Every time a (re-)connection is made, send the queued messages first.
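
For reference, the three-part solution in the issue description could be 
sketched roughly as below. This is only an illustration, not the code in the 
attached patches or on the working branch: SketchChannel, SketchConnector and 
QueueingEmitterSketch are hypothetical names, the channel interface is a 
stand-in rather than the Netty API, and dropping the newest message when a 
queue is full is an assumed policy.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for a network channel; NOT the Netty Channel API. */
interface SketchChannel {
    boolean isConnected();
    void write(byte[] message);
}

/** Hypothetical connector; returns null when no connection can be made. */
interface SketchConnector {
    SketchChannel connect(int partitionId);
}

class QueueingEmitterSketch {
    private final int queueCapacity;
    private final SketchConnector connector;
    // Item 2: the channel currently used for each destination partition.
    private final Map<Integer, SketchChannel> channels = new HashMap<>();
    // Item 1: a bounded queue of undelivered messages per destination partition.
    private final Map<Integer, Queue<byte[]>> pending = new HashMap<>();

    QueueingEmitterSketch(int queueCapacity, SketchConnector connector) {
        this.queueCapacity = queueCapacity;
        this.connector = connector;
    }

    /** Returns false only if the message was dropped because the queue is full. */
    synchronized boolean send(int partitionId, byte[] message) {
        Queue<byte[]> queue =
                pending.computeIfAbsent(partitionId, id -> new ArrayDeque<>());
        SketchChannel channel = channels.get(partitionId);
        if (channel == null || !channel.isConnected()) {
            // Item 2: refresh the map entry on send() after a disconnection.
            channel = connector.connect(partitionId);
            if (channel == null || !channel.isConnected()) {
                channels.remove(partitionId);
                if (queue.size() >= queueCapacity) {
                    return false; // assumed policy: drop the newest message
                }
                queue.add(message); // Item 1: queue instead of dropping
                return true;
            }
            channels.put(partitionId, channel);
        }
        // Item 3: after a (re-)connection, send the queued messages first.
        while (!queue.isEmpty()) {
            channel.write(queue.poll());
        }
        channel.write(message);
        return true;
    }

    /** Item 2: forget the channel when the topology changes for a partition. */
    synchronized void onTopologyChange(int partitionId) {
        channels.remove(partitionId);
    }

    synchronized int pendingCount(int partitionId) {
        Queue<byte[]> queue = pending.get(partitionId);
        return queue == null ? 0 : queue.size();
    }
}
```

The single lock around send() keeps queued and new messages in order for a 
given partition, at the cost of serializing senders; a real emitter would 
likely want per-partition locking and asynchronous writes.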

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
