[ https://issues.apache.org/jira/browse/CASSANDRA-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown updated CASSANDRA-5393:
-----------------------------------

    Attachment: 5393-v2.patch

v2 addresses a potential race condition between disconnecting the bad socket 
and re-enqueueing the failed message.
                
> Add an Ack/Retry for merkle tree sending
> ----------------------------------------
>
>                 Key: CASSANDRA-5393
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5393
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.10, 1.2.4, 2.0
>            Reporter: Jeremiah Jordan
>            Assignee: Jason Brown
>             Fix For: 1.1.11, 1.2.5, 2.0
>
>         Attachments: 5393.patch, 5393-v2.patch
>
>
> Can we add an Ack/Retry around sending Merkle trees in repair?  If the 
> following send fails, the repair hangs forever on the coordinating node.
> https://github.com/apache/cassandra/blob/cassandra-1.1.10/src/java/org/apache/cassandra/service/AntiEntropyService.java#L242
> {noformat}
>             Message message = TreeResponseVerbHandler.makeVerb(local, validator);
>             if (!validator.request.endpoint.equals(FBUtilities.getBroadcastAddress()))
>                 logger.info(String.format("[repair #%s] Sending completed merkle tree to %s for %s", validator.request.sessionid, validator.request.endpoint, validator.request.cf));
>             ms.sendOneWay(message, validator.request.endpoint);
> {noformat}
> If the message requesting the Merkle trees gets lost, the coordinating node 
> hangs forever as well.
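
For illustration only, the ack/retry idea requested above could be sketched
roughly as follows. This is not the attached patch and does not use
Cassandra's MessagingService; AckRetrySketch, sendWithRetry and sendOnce are
hypothetical names, and sendOnce stands in for "send the tree and return a
future that completes when the peer acks it". The point is that the sender
gets a definite true/false after a bounded number of attempts instead of
waiting indefinitely after a fire-and-forget send.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical sketch of an ack/retry wrapper; all names here are assumptions. */
final class AckRetrySketch
{
    /**
     * Calls sendOnce up to maxAttempts times, waiting ackTimeoutMillis for
     * each attempt's ack future to complete.
     *
     * @return true once an ack arrives; false if every attempt timed out or
     *         failed, or if the calling thread was interrupted
     */
    static boolean sendWithRetry(Supplier<CompletableFuture<Void>> sendOnce,
                                 int maxAttempts,
                                 long ackTimeoutMillis)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            // Assumption: each call re-sends the message and returns a fresh ack future.
            CompletableFuture<Void> ack = sendOnce.get();
            try
            {
                ack.get(ackTimeoutMillis, TimeUnit.MILLISECONDS);
                return true; // peer acknowledged the message
            }
            catch (TimeoutException | ExecutionException e)
            {
                ack.cancel(true); // give up on this attempt and retry
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return false; // caller can now fail the repair session instead of hanging
    }
}
```

A caller would treat a false return as a failed repair session and report it,
which is the behaviour the description asks for in place of an indefinite hang.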

