[ 
https://issues.apache.org/jira/browse/CASSANDRA-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143474#comment-14143474
 ] 

Benedict commented on CASSANDRA-5902:
-------------------------------------

I don't think this behaves as you expect right now: it looks like no new 
hinting will be done under any circumstances, and the original hint will not 
be deleted if any endpoint fails to respond. It's possible I'm missing 
something obvious, though.

Take a look at:
  Hint writing: WriteCallbackInfo.shouldHint(), MessagingService.expiringMap
  Hint deletion: CallbackInfo.isFailureCallback(), IAsyncCallbackWithFailure, 
  MessagingService.expiringMap

It seems that what's necessary is a new IAsyncCallbackWithFailure that both 
hints and decrements the callback count, so that the deletion is guaranteed to 
be called eventually.
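
To make the idea concrete, here is a rough sketch of such a callback. It is 
not the Cassandra API: FailureAwareCallback is a hypothetical stand-in for 
IAsyncCallbackWithFailure, and RehintingCallback, writeHint and 
deleteOriginalHint are names invented for illustration. The only point is that 
every endpoint counts down exactly once, whether it replied or failed, so the 
deletion always runs in the end.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical stand-in for a callback that hears about replies and failures. */
interface FailureAwareCallback {
    void onResponse(String endpoint);
    void onFailure(String endpoint);
}

/**
 * Sketch only: counts down once per endpoint regardless of outcome.
 * A failed endpoint gets a fresh hint; the original hint is deleted
 * when the count reaches zero.
 */
final class RehintingCallback implements FailureAwareCallback {
    private final AtomicInteger outstanding;
    private final Consumer<String> writeHint;  // assumed: stores a new hint for an endpoint
    private final Runnable deleteOriginalHint; // assumed: removes the hint being replayed

    RehintingCallback(int endpoints, Consumer<String> writeHint, Runnable deleteOriginalHint) {
        this.outstanding = new AtomicInteger(endpoints);
        this.writeHint = writeHint;
        this.deleteOriginalHint = deleteOriginalHint;
    }

    @Override
    public void onResponse(String endpoint) {
        countDown();
    }

    @Override
    public void onFailure(String endpoint) {
        // Store the replacement hint before counting down, so the original
        // is never deleted while the replacement is still unwritten.
        writeHint.accept(endpoint);
        countDown();
    }

    private void countDown() {
        if (outstanding.decrementAndGet() == 0) {
            deleteOriginalHint.run();
        }
    }
}
```

With three endpoints where one fails, this writes one new hint for the failed 
endpoint and deletes the original once the third outcome has been recorded.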

Separately, it's not clear to me that we should stop hint replay to the 
target when one of these extra hints fails to be delivered, since they're 
unrelated. Doing so could needlessly cause hints to pass their TTL before they 
are delivered, which would be bad for consistency.


> Dealing with hints after a topology change
> ------------------------------------------
>
>                 Key: CASSANDRA-5902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5902
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Branimir Lambov
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> Hints are stored and delivered by destination node id.  This allows them to 
> survive IP changes in the target, while making "scan all the hints for a 
> given destination" an efficient operation.  However, we do not detect and 
> handle a new node assuming responsibility for the hinted row via bootstrap 
> before the hint can be delivered.
> I think we have to take a performance hit in this case -- we need to deliver 
> such a hint to *all* replicas, since we don't know which is the "new" one.  
> This happens infrequently enough, however -- requiring first the target node 
> to be down to create the hint, then the hint owner to be down long enough for 
> the target to both recover and stream to a new node -- that this should be 
> okay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
