[ 
https://issues.apache.org/jira/browse/CASSANDRA-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143320#comment-14143320
 ] 

Branimir Lambov commented on CASSANDRA-5902:
--------------------------------------------

Because this appears to be the same process that a normal write follows, I 
created a new version of the patch (on the same GitHub branch, 
https://github.com/blambov/cassandra/compare/handoff-topology) that relies on 
StorageProxy.sendToHintedEndpoints to do the replication and to write new 
hints as necessary. As a side benefit, messages to other datacentres will now 
be combined.

A special WriteOrHintResponseHandler ensures that the hint is deleted only 
after every endpoint has either responded or been hinted.
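
To illustrate the idea only (this is not the code from the patch), here is a 
minimal Java sketch of a handler that deletes the hint once every endpoint has 
been accounted for. The class name HintCompletionTracker, its methods, and the 
deleteHint callback are all hypothetical; the real WriteOrHintResponseHandler 
plugs into Cassandra's write response handling and may differ substantially.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; not the class from the patch. */
final class HintCompletionTracker {
    // Endpoints that have neither responded nor been hinted yet.
    private final AtomicInteger outstanding;
    // Assumed callback that removes the original hint from storage.
    private final Runnable deleteHint;

    HintCompletionTracker(int endpointCount, Runnable deleteHint) {
        if (endpointCount <= 0)
            throw new IllegalArgumentException("endpointCount must be positive");
        this.outstanding = new AtomicInteger(endpointCount);
        this.deleteHint = deleteHint;
    }

    /** An endpoint acknowledged the write. */
    void onResponse() {
        settle();
    }

    /** An endpoint did not answer, and a new hint was written for it. */
    void onHinted() {
        settle();
    }

    // Both outcomes count equally: the endpoint is accounted for.
    private void settle() {
        int left = outstanding.decrementAndGet();
        if (left == 0)
            deleteHint.run(); // only now is it safe to drop the original hint
        else if (left < 0)
            throw new IllegalStateException("more outcomes than endpoints");
    }
}
```

With three endpoints, the callback runs only on the third call, regardless of 
whether each call was onResponse() or onHinted().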

The previous version offered fine-grained rate control, which is much more 
difficult to implement now. The new version still respects the rate limit when 
averaged over a longer period, but sends all copies of a hint in a single burst.
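
A rough illustration of what respecting the rate over the longer term could 
mean, assuming a throttle expressed in bytes per second: all copies go out at 
once, and the sender then waits long enough that the average throughput stays 
within the limit. BurstPacer and its parameters are hypothetical and are not 
taken from the patch, which may account for the rate differently.

```java
/** Hypothetical helper; not part of the patch or of Cassandra. */
final class BurstPacer {
    private BurstPacer() {}

    /**
     * Pause needed after sending `copies` copies of a hint in one burst so
     * that the average rate does not exceed `bytesPerSecond`.
     */
    static long pauseMillisAfterBurst(long hintBytes, int copies, long bytesPerSecond) {
        if (hintBytes < 0 || copies < 0 || bytesPerSecond <= 0)
            throw new IllegalArgumentException("invalid arguments");
        long burstBytes = Math.multiplyExact(hintBytes, (long) copies);
        long scaled = Math.multiplyExact(burstBytes, 1000L);
        // Round up so the average never exceeds the configured rate.
        return (scaled + bytesPerSecond - 1) / bytesPerSecond;
    }
}
```

For example, a 1000-byte hint sent to 3 replicas under a 1500 bytes/second 
limit goes out immediately as a 3000-byte burst, followed by a 2000 ms pause.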

> Dealing with hints after a topology change
> ------------------------------------------
>
>                 Key: CASSANDRA-5902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5902
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Branimir Lambov
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> Hints are stored and delivered by destination node id.  This allows them to 
> survive IP changes in the target, while making "scan all the hints for a 
> given destination" an efficient operation.  However, we do not detect and 
> handle a new node assuming responsibility for the hinted row via bootstrap 
> before the hint can be delivered.
> I think we have to take a performance hit in this case -- we need to deliver 
> such a hint to *all* replicas, since we don't know which is the "new" one.  
> This happens infrequently enough, however -- requiring first the target node 
> to be down to create the hint, then the hint owner to be down long enough for 
> the target to both recover and stream to a new node -- that this should be 
> okay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
