[ https://issues.apache.org/jira/browse/CASSANDRA-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160851#comment-13160851 ]
Fredrik L Stigbäck commented on CASSANDRA-3546:
-----------------------------------------------

The patch is broken: hintStore must be declared outside the try {} finally {} clause.

> Hinted handoffs isn't delivered if/when HintedHandOffManager ends up in invalid state.
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.3
>            Reporter: Fredrik L Stigbäck
>            Assignee: Sylvain Lebresne
>             Fix For: 1.0.6
>
>         Attachments: 3546.patch
>
>
> Running Cassandra 1.0.3.
> I've done some testing with 2 nodes (node A, node B), replication factor 2.
> I take node A down, write some data to node B and then bring node A back up.
> Sometimes hints aren't delivered when node A comes up.
> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager, and
> sometimes node B ends up in a strange state in the method
> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress to),
> where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries already has
> node A in its Set, and therefore no hints will ever be delivered to node A.
> The only reason for this that I can see is that in
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress endpoint)
> the hintStore.isEmpty() check returns true and the method returns before reaching
> the try block, so the endpoint (node A) isn't removed from
> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. After that, no hints
> will be delivered to node A until node B is restarted.
> Under what conditions will hintStore.isEmpty() return true?
> Shouldn't the hintStore.isEmpty() check be inside the try {} finally {}
> clause, so that the endpoint is removed from queuedDeliveries in the finally block?
> {code}
>     public void deliverHints(final InetAddress to)
>     {
>         logger_.debug("deliverHints to {}", to);
>         if (!queuedDeliveries.add(to))
>             return;
>         .......
>     }
> {code}
> {code}
>     private void deliverHintsToEndpoint(InetAddress endpoint) throws IOException, DigestMismatchException, InvalidRequestException, TimeoutException, InterruptedException
>     {
>         ColumnFamilyStore hintStore = Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>         if (hintStore.isEmpty())
>             return; // nothing to do, don't confuse users by logging a no-op handoff
>         try
>         {
>             ......
>         }
>         finally
>         {
>             queuedDeliveries.remove(endpoint);
>         }
>     }
> {code}
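To illustrate the change the report asks about, here is a minimal, self-contained sketch of the pattern with the emptiness check moved inside the try block. It is not the attached 3546.patch and not the real HintedHandOffManager: the class HintDeliverySketch, the String endpoints, the BooleanSupplier standing in for hintStore.isEmpty(), the deliveries counter, and the synchronous call from deliverHints are all assumptions made so the example can run on its own.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for HintedHandOffManager; names and types are illustrative only.
class HintDeliverySketch
{
    // Endpoints that currently have a delivery queued or in progress.
    private final Set<String> queuedDeliveries = ConcurrentHashMap.newKeySet();

    // Stand-in for hintStore.isEmpty(); held outside the try block so it stays in scope.
    private final BooleanSupplier hintStoreIsEmpty;

    // Counts deliveries that got past the emptiness check (for demonstration only).
    private int deliveries = 0;

    HintDeliverySketch(BooleanSupplier hintStoreIsEmpty)
    {
        this.hintStoreIsEmpty = hintStoreIsEmpty;
    }

    // Returns false if a delivery to this endpoint is already queued.
    boolean deliverHints(String to)
    {
        if (!queuedDeliveries.add(to))
            return false;
        // Assumption: run the delivery synchronously to keep the sketch small.
        deliverHintsToEndpoint(to);
        return true;
    }

    private void deliverHintsToEndpoint(String endpoint)
    {
        try
        {
            if (hintStoreIsEmpty.getAsBoolean())
                return; // early return still passes through the finally block
            deliveries++; // placeholder for the actual hint delivery
        }
        finally
        {
            // Runs on every exit path, so the endpoint can always be queued again.
            queuedDeliveries.remove(endpoint);
        }
    }

    boolean isQueued(String endpoint)
    {
        return queuedDeliveries.contains(endpoint);
    }

    int deliveries()
    {
        return deliveries;
    }
}
```

With the check inside the try, an empty hint store no longer leaves the endpoint stuck in queuedDeliveries, so a later deliverHints call for the same endpoint is accepted instead of returning immediately.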