[ 
https://issues.apache.org/jira/browse/CASSANDRA-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-3546:
----------------------------------------

    Attachment: 3546.patch

Oops, that'll teach me to not even compile before submitting. Patch updated.
                
> Hinted handoffs aren't delivered if/when HintedHandOffManager ends up in 
> invalid state.
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.3
>            Reporter: Fredrik L Stigbäck
>            Assignee: Sylvain Lebresne
>             Fix For: 1.0.6
>
>         Attachments: 3546.patch
>
>
> Running Cassandra 1.0.3.
> I've done some testing with 2 nodes (node A, node B), replication factor 2.
> I take node A down, write some data to node B, and then bring node A back up.
> Sometimes hints aren't delivered when node A comes up.
> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and 
> sometimes node B ends up in a strange state in method 
> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress 
> to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries 
> already has node A in its Set and therefore no hints will ever be delivered 
> to node A.
> The only reason for this that I can see is that in 
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
>  endpoint) the hintStore.isEmpty() check returns true, so the method returns 
> early and the endpoint (node A) isn't removed from 
> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. After that, no 
> hints will be delivered to node A again until node B is restarted.
> Under what conditions will hintStore.isEmpty() return true?
> Shouldn't the hintStore.isEmpty() check be inside the try {} finally{} 
> clause, removing the endpoint from queuedDeliveries in the finally block?
> {code}
> public void deliverHints(final InetAddress to)
> {
>     logger_.debug("deliverHints to {}", to);
>     if (!queuedDeliveries.add(to))
>         return;
>     .......
> }
> {code}
> {code}
> private void deliverHintsToEndpoint(InetAddress endpoint)
>     throws IOException, DigestMismatchException, InvalidRequestException,
>            TimeoutException, InterruptedException
> {
>      ColumnFamilyStore hintStore =
>          Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>      if (hintStore.isEmpty())
>          return; // nothing to do, don't confuse users by logging a no-op handoff
>      try
>      {
>          ......
>      }
>      finally
>      {
>          queuedDeliveries.remove(endpoint);
>      }
> }
> {code} 
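
For illustration only, here is a minimal standalone sketch of the state leak described above. It is not the attached patch and does not use Cassandra classes: the class name HintDeliverySketch, the guardInsideTry flag, the String endpoints and the boolean stand-in for hintStore.isEmpty() are all assumptions made for the example, and delivery runs synchronously here rather than on an executor. It only shows that an early return placed before the try block skips the finally cleanup, while the same check placed inside the try block does not.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the queuing logic in HintedHandOffManager.
final class HintDeliverySketch {
    // Plays the role of queuedDeliveries: endpoints with a delivery in progress.
    private final Set<String> queuedDeliveries = ConcurrentHashMap.newKeySet();
    // true  = emptiness check inside try/finally (the shape suggested in the report)
    // false = emptiness check before the try block (the shape quoted in the report)
    private final boolean guardInsideTry;
    // Stand-in for hintStore.isEmpty(); assumed settable for the demo.
    private boolean hintStoreEmpty = true;
    private int delivered = 0;

    HintDeliverySketch(boolean guardInsideTry) {
        this.guardInsideTry = guardInsideTry;
    }

    void setHintStoreEmpty(boolean empty) { this.hintStoreEmpty = empty; }
    int delivered() { return delivered; }

    // Returns false when the endpoint is already queued and the request is dropped.
    boolean deliverHints(String to) {
        if (!queuedDeliveries.add(to))
            return false;
        deliverHintsToEndpoint(to);
        return true;
    }

    private void deliverHintsToEndpoint(String endpoint) {
        if (!guardInsideTry && hintStoreEmpty)
            return; // early return skips the finally block: endpoint stays queued
        try {
            if (hintStoreEmpty)
                return; // nothing to do, but the finally block still runs
            delivered++; // placeholder for the real delivery work
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }

    public static void main(String[] args) {
        for (boolean guardInsideTry : new boolean[] { false, true }) {
            HintDeliverySketch m = new HintDeliverySketch(guardInsideTry);
            m.deliverHints("nodeA");       // hint store empty at this point
            m.setHintStoreEmpty(false);    // hints arrive later
            boolean accepted = m.deliverHints("nodeA");
            System.out.println("guardInsideTry=" + guardInsideTry
                    + " accepted=" + accepted + " delivered=" + m.delivered());
        }
    }
}
```

With the check outside the try block, the second deliverHints call for the same endpoint is rejected because the endpoint was never removed from the set. With the check inside, the endpoint is removed on every exit path and the later call goes through.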


