[jira] [Comment Edited] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address

2013-10-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798428#comment-13798428
 ] 

Brandon Williams edited comment on CASSANDRA-5916 at 10/17/13 9:13 PM:
---

v4 fixes the NPE and throws an exception when autobootstrap is disabled.  The 
second issue wasn't caused by replace_address, but by checks in sendGossip.  v4 
just manually sends the message to all seeds.  Depending on how many seeds you 
have, that may also fix the last issue (if the node being replaced is the only 
seed, obviously that can't work).
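
To illustrate the seed caveat above, here is a minimal sketch, not the code in 
the v4 patch.  It assumes seeds are plain address strings; the class and 
method names (SeedBroadcastSketch, seedTargets) are hypothetical and do not 
exist in Cassandra.  It only shows why a replacement that reuses the address 
of the sole seed ends up with nobody to send to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Cassandra.
final class SeedBroadcastSketch {

    // Returns every configured seed except the local node itself.
    // Assumption: a node never sends the message to its own address.
    static List<String> seedTargets(Set<String> seeds, String localAddress) {
        List<String> targets = new ArrayList<>();
        for (String seed : seeds) {
            if (!seed.equals(localAddress)) {
                targets.add(seed);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Two seeds: the replacing node (reusing 10.0.0.1) can still reach 10.0.0.2.
        Set<String> twoSeeds = new LinkedHashSet<>(Arrays.asList("10.0.0.1", "10.0.0.2"));
        System.out.println(seedTargets(twoSeeds, "10.0.0.1"));

        // The node being replaced was the only seed: there is no one left to notify.
        Set<String> onlySeed = new LinkedHashSet<>(Arrays.asList("10.0.0.1"));
        System.out.println(seedTargets(onlySeed, "10.0.0.1"));
    }
}
```

With more than one seed the target list is non-empty, which is the case where 
sending to all seeds could help; with a single seed that is also the node 
being replaced, the list is empty.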


was (Author: brandon.williams):
v4 fixes the NP3 and throws when autobootstrap is disabled.  The second issue 
wasn't because of the replace_address, but because of checks in sendGossip.  v4 
just manually sends the message to all seeds.  Depending on how many seeds you 
had, that may also fix the last issue (if the node being replaced is the only 
seed, obviously that can't work.)

 gossip and tokenMetadata get hostId out of sync on failed replace_node with 
 the same IP address
 ---

         Key: CASSANDRA-5916
         URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
     Project: Cassandra
  Issue Type: Bug
    Reporter: Brandon Williams
    Assignee: Brandon Williams
     Fix For: 1.2.12

 Attachments: 5916.txt, 5916-v2.txt, 5916-v3.txt, 5916-v4.txt


 If you try to replace_node an existing, live hostId, it will error out.  
 However, if you're using an existing IP to do this (as in, you chose the wrong 
 uuid to replace by accident), then the newly generated hostId wipes out the 
 old one in TMD, and when you then try to replace it, replace_node will 
 complain that it does not exist.  gossipinfo still shows the old hostId, but 
 now you can't replace it either.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address

2013-10-07 Thread Ravi Prasad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788570#comment-13788570
 ] 

Ravi Prasad edited comment on CASSANDRA-5916 at 10/7/13 9:39 PM:
-

bq. That is true regardless of shadow mode though, since hibernate is a dead 
state and the node doesn't go live to reset the hint timer until the replace 
has completed.

My understanding is that, due to the generation change of the replacing node, 
gossiper.handleMajorStateChange marks the node as dead, since hibernate is one 
of the DEAD_STATES.  So the other nodes mark the replacing node as dead before 
the token bootstrap starts, and hence should be storing hints for the 
replacing node from that point.  Am I reading it wrong?
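
The reading above can be sketched roughly as follows.  This is a simplified 
model, not Cassandra's Gossiper code: the class and method names 
(DeadStateSketch, aliveAfterMajorStateChange, shouldStoreHint) are 
hypothetical, statuses are reduced to bare strings, and the set lists only 
hibernate because that is the only member of DEAD_STATES confirmed in this 
thread.  The hint rule (store hints for a peer considered dead) is likewise 
an assumption made for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; it does not reproduce Cassandra's Gossiper.
final class DeadStateSketch {

    // Only "hibernate" is confirmed by the discussion; other members are omitted.
    static final Set<String> DEAD_STATES =
            Collections.unmodifiableSet(new HashSet<>(Collections.singletonList("hibernate")));

    static boolean isDeadState(String status) {
        return DEAD_STATES.contains(status);
    }

    // Models the outcome of a major state change (new generation seen for a peer):
    // the peer is treated as alive unless its advertised status is a dead state.
    static boolean aliveAfterMajorStateChange(String status) {
        return !isDeadState(status);
    }

    // Assumption for illustration: hints are stored for peers considered dead.
    static boolean shouldStoreHint(boolean targetAlive) {
        return !targetAlive;
    }

    public static void main(String[] args) {
        boolean replacingNodeAlive = aliveAfterMajorStateChange("hibernate");
        System.out.println("alive=" + replacingNodeAlive
                + " storeHint=" + shouldStoreHint(replacingNodeAlive));
    }
}
```

Under these assumptions, a replacing node that advertises hibernate is 
treated as dead from the moment its new generation is seen, so hints would 
accumulate for it from that point.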


was (Author: ravilr):
That is true regardless of shadow mode though, since hibernate is a dead state 
and the node doesn't go live to reset the hint timer  until the replace has 
completed.

my understanding is due to the generation change of the replacing node, 
gossiper.handleMajorStateChange marks the node as dead, as hibernate is one of 
the DEAD_STATES. So, the other nodes marks the replacing node as dead before 
the token bootstrap starts, hence should be storing hints to the replacing node 
from that point.

 gossip and tokenMetadata get hostId out of sync on failed replace_node with 
 the same IP address
 ---

         Key: CASSANDRA-5916
         URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
     Project: Cassandra
  Issue Type: Bug
    Reporter: Brandon Williams
    Assignee: Brandon Williams
     Fix For: 1.2.11

 Attachments: 5916.txt







[jira] [Comment Edited] (CASSANDRA-5916) gossip and tokenMetadata get hostId out of sync on failed replace_node with the same IP address

2013-08-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747853#comment-13747853
 ] 

Brandon Williams edited comment on CASSANDRA-5916 at 8/22/13 8:22 PM:
--

This isn't so much a problem with retrying the replace as it is with using the 
same IP address (which won't work at all currently).  The reason is that, by 
using the same IP address, the replacing node itself changes the HOST_ID and 
then can't find the old one.  It's not as simple as just not advertising a new 
HOST_ID either, since by not having one but modifying STATUS we wipe out any 
existing HOST_ID as well.
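
Both failure modes described above can be shown with a toy model.  This is a 
sketch under stated assumptions, not Cassandra's Gossiper or TokenMetadata: 
it assumes per-endpoint state is a map keyed by IP address and that a new 
generation replaces the previous entry wholesale.  The class and method names 
(HostIdOverwriteSketch, applyNewGeneration, hostIdFor, knowsHostId) are 
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; a toy model of state keyed by IP address.
final class HostIdOverwriteSketch {

    // endpoint IP -> application states such as "STATUS" and "HOST_ID"
    private final Map<String, Map<String, String>> endpointStates = new HashMap<>();

    // Assumption: a new generation for an IP replaces the whole previous entry.
    void applyNewGeneration(String ip, Map<String, String> states) {
        endpointStates.put(ip, new HashMap<>(states));
    }

    // Returns the HOST_ID currently recorded for the IP, or null if none.
    String hostIdFor(String ip) {
        Map<String, String> states = endpointStates.get(ip);
        return states == null ? null : states.get("HOST_ID");
    }

    // True if any endpoint still advertises the given host id.
    boolean knowsHostId(String hostId) {
        for (Map<String, String> states : endpointStates.values()) {
            if (hostId.equals(states.get("HOST_ID"))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        HostIdOverwriteSketch model = new HostIdOverwriteSketch();
        Map<String, String> original = new HashMap<>();
        original.put("STATUS", "NORMAL");
        original.put("HOST_ID", "old-host-id");
        model.applyNewGeneration("10.0.0.1", original);

        // Case 1: the replacing node reuses the IP and advertises a fresh HOST_ID.
        Map<String, String> withNewId = new HashMap<>();
        withNewId.put("STATUS", "hibernate");
        withNewId.put("HOST_ID", "new-host-id");
        model.applyNewGeneration("10.0.0.1", withNewId);
        System.out.println("old id still known: " + model.knowsHostId("old-host-id"));

        // Case 2: the replacing node advertises STATUS only, with no HOST_ID.
        Map<String, String> statusOnly = new HashMap<>();
        statusOnly.put("STATUS", "hibernate");
        model.applyNewGeneration("10.0.0.1", statusOnly);
        System.out.println("host id for IP: " + model.hostIdFor("10.0.0.1"));
    }
}
```

In this model the old host id is unreachable in either case, which matches 
the observation that simply withholding a new HOST_ID does not preserve the 
old one.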

  was (Author: brandon.williams):
This isn't so much a problem with retrying the replace, as it is with the 
same IP address (which won't at all currently.) The reason for this is that by 
using the same IP address, the replacing node itself changes the HOST_ID, and 
then can't find the old one.  It's not just as simple as not advertising a new 
HOST_ID either, by not having one but modifying STATUS we wipe out any existing 
HOST_ID as well.
  
 gossip and tokenMetadata get hostId out of sync on failed replace_node with 
 the same IP address
 ---

         Key: CASSANDRA-5916
         URL: https://issues.apache.org/jira/browse/CASSANDRA-5916
     Project: Cassandra
  Issue Type: Bug
    Reporter: Brandon Williams
    Assignee: Brandon Williams
     Fix For: 1.2.9



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira