[ https://issues.apache.org/jira/browse/CASSANDRA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Bailey updated CASSANDRA-1216:
-----------------------------------
    Attachment: 0001-Modify-removeToken-to-be-similar-to-decommission.patch
                0002-Fixes-to-old-tests.patch
                0003-Additional-unit-tests-for-removeToken.patch

Some fixes and tests added. There is one thing that still needs to be fixed.

* Currently the call to removeToken blocks either:
** until all nodes confirm that they have replicated the data for the dead node,
** or until a timeout is reached.
* I'm not sure what the timeout for this should be. Additionally, when nodes throughout the ring attempt to replicate data, there should be a similar timeout before they give up on a source and retry.
* Also, clients may time out before the timeout is even reached or before all the data is replicated. I'm not sure how the user will be able to determine whether the remove finished correctly or whether repair should be run.

> removetoken drops node from ring before re-replicating its data is finished
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1216
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Nick Bailey
>             Fix For: 0.7.0
>
>         Attachments: 0001-Modify-removeToken-to-be-similar-to-decommission.patch,
>                      0002-Fixes-to-old-tests.patch,
>                      0003-Additional-unit-tests-for-removeToken.patch
>
>
> this means that if something goes wrong during the re-replication (e.g. a
> source node is restarted) there is (a) no indication that anything has gone
> wrong and (b) no way to restart the process (other than the Big Hammer of
> running repair)
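
For illustration only, the blocking behaviour described in the comment (wait until every node has confirmed, or give up when a timeout is reached) could be sketched roughly as below. This is a minimal sketch, not the code in the attached patches: the names `RemoveTokenWaitSketch`, `ReplicationTracker`, `confirm`, `awaitAll` and `outstanding`, the node names, and the 100 ms timeout are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class RemoveTokenWaitSketch {

    /** Hypothetical helper: tracks which nodes still owe a replication confirmation. */
    static final class ReplicationTracker {
        private final Set<String> pending = ConcurrentHashMap.newKeySet();
        private final CountDownLatch latch;

        ReplicationTracker(Set<String> expectedNodes) {
            pending.addAll(expectedNodes);
            latch = new CountDownLatch(pending.size());
        }

        /** Called when a node reports that it has finished re-replicating. */
        void confirm(String node) {
            // Count each node at most once, even if it confirms twice.
            if (pending.remove(node)) {
                latch.countDown();
            }
        }

        /** Blocks until all nodes have confirmed (true) or the timeout elapses (false). */
        boolean awaitAll(long timeout, TimeUnit unit) throws InterruptedException {
            return latch.await(timeout, unit);
        }

        /** Snapshot of the nodes that have not confirmed yet. */
        Set<String> outstanding() {
            return new HashSet<>(pending);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReplicationTracker tracker =
                new ReplicationTracker(Set.of("node-a", "node-b", "node-c"));
        tracker.confirm("node-a");
        tracker.confirm("node-b");

        // Assumed timeout value; the comment leaves the real value open.
        boolean finished = tracker.awaitAll(100, TimeUnit.MILLISECONDS);
        if (finished) {
            System.out.println("remove completed");
        } else {
            System.out.println("timed out; still waiting on " + tracker.outstanding());
        }
    }
}
```

Reporting the set of nodes that never confirmed is one possible way to let the caller decide whether the remove finished or whether repair should be run; it does not address clients that time out before the call returns.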