[ 
https://issues.apache.org/jira/browse/CASSANDRA-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140895#comment-14140895
 ] 

Jonathan Ellis commented on CASSANDRA-5902:
-------------------------------------------

The hint contract is that repair should be unnecessary for downtime lasting 
less than the hint window.  So, writing with ONE and calling it good isn't 
enough.

I do think that ALL is good enough for 99% of scenarios in this already small 
corner case, but Benedict's suggestion of writing new hints for any replicas 
that are down shouldn't be much harder.  (Note that we should check for 
liveness up front and short circuit the timeout if FD already knows it's down.)

> Dealing with hints after a topology change
> ------------------------------------------
>
>                 Key: CASSANDRA-5902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5902
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Branimir Lambov
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> Hints are stored and delivered by destination node id.  This allows them to 
> survive IP changes in the target, while making "scan all the hints for a 
> given destination" an efficient operation.  However, we do not detect and 
> handle new node assuming responsibility for the hinted row via bootstrap 
> before it can be delivered.
> I think we have to take a performance hit in this case -- we need to deliver 
> such a hint to *all* replicas, since we don't know which is the "new" one.  
> This happens infrequently enough, however -- requiring first the target node 
> to be down to create the hint, then the hint owner to be down long enough for 
> the target to both recover and stream to a new node -- that this should be 
> okay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to