[ https://issues.apache.org/jira/browse/CASSANDRA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902523#action_12902523 ]
Nick Bailey commented on CASSANDRA-1216:
----------------------------------------

I believe the only consequence of calling removeToken on another node when the coordinator goes down would be that the entire operation is repeated, so any data that was transferred before would be transferred again. I think this is the right behavior, since there is no way of knowing what was transferred before the coordinator went down.

It might be useful to add a 'force' option, though. If the coordinator goes down and the token gets stuck in a REMOVING state, you may want to force removal rather than redo the entire operation.

It should be possible to remove the timeout so that removeToken blocks until the transfer is completely finished. The code for streaming in the remote data blocks until all streams are complete, and the code for sending a confirmation to the coordinator keeps retrying until the confirmation is received or the coordinator dies. I think this would work if a check were added so that you can only call removeToken a second time if the coordinator is down. It wouldn't handle two calls that occurred before the state made its way through gossip, though.

> removetoken drops node from ring before re-replicating its data is finished
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1216
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7 beta 1
>            Reporter: Jonathan Ellis
>            Assignee: Nick Bailey
>             Fix For: 0.7 beta 2
>
>         Attachments: 0001-Add-callbacks-to-streaming.patch,
>                      0002-Modify-removeToken-to-be-similar-to-decommission.patch,
>                      0003-Fixes-to-old-tests.patch,
>                      0004-Additional-tests-for-removeToken.patch
>
>
> this means that if something goes wrong during the re-replication (e.g. a
> source node is restarted) there is (a) no indication that anything has gone
> wrong and (b) no way to restart the process (other than the Big Hammer of
> running repair)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
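
For illustration only, the check proposed in the comment above (allow a second removeToken only when the original coordinator is down, with an optional 'force') might be sketched as follows. This is a minimal in-memory model, not the code in the attached patches: the class RemoveTokenSketch, the Decision enum, and the markAlive/markDown helpers are all hypothetical names, and the rule that 'force' only applies once the coordinator is down is an assumption drawn from the comment's wording. Because it keeps state in a single local map, it does not model the race the comment mentions, where two calls arrive before the REMOVING state has propagated through gossip.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the proposed check; none of these names come from Cassandra.
class RemoveTokenSketch {
    enum Decision { START, RESTART, FORCE_REMOVE, REJECT }

    // token -> coordinator of the removal in progress (stands in for the REMOVING state)
    private final Map<String, String> removing = new HashMap<>();
    // nodes currently believed to be up (stands in for failure detection)
    private final Set<String> liveNodes = new HashSet<>();

    void markAlive(String node) { liveNodes.add(node); }
    void markDown(String node) { liveNodes.remove(node); }

    Decision removeToken(String token, String caller, boolean force) {
        String coordinator = removing.get(token);
        if (coordinator == null) {
            // No removal in progress: the caller becomes the coordinator.
            removing.put(token, caller);
            return Decision.START;
        }
        if (liveNodes.contains(coordinator)) {
            // The original coordinator is still up, so a second call is refused.
            return Decision.REJECT;
        }
        if (force) {
            // Assumption: 'force' drops the stuck token without re-streaming any data.
            removing.remove(token);
            return Decision.FORCE_REMOVE;
        }
        // Coordinator is down: repeat the whole operation with the caller as coordinator.
        removing.put(token, caller);
        return Decision.RESTART;
    }

    public static void main(String[] args) {
        RemoveTokenSketch ring = new RemoveTokenSketch();
        ring.markAlive("A");
        ring.markAlive("B");
        System.out.println(ring.removeToken("t1", "A", false)); // first call
        System.out.println(ring.removeToken("t1", "B", false)); // A still up
        ring.markDown("A");
        System.out.println(ring.removeToken("t1", "B", false)); // A down
    }
}
```

In this sketch a restart simply replaces the recorded coordinator, which matches the point that the whole transfer is repeated because nothing is known about what was already streamed.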