[ 
https://issues.apache.org/jira/browse/CASSANDRA-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401471#comment-13401471
 ] 

Brandon Williams commented on CASSANDRA-4347:
---------------------------------------------

I can reproduce this.  The problem seems to be that the new node knows to evict 
the node it has replaced, but while the rest of the cluster recognizes the IP 
change, it fails the fat client expiration check and never removes the old IP 
from gossip.  Eventually the new node's quarantine period expires, and it sees 
the old node again via gossip, causing the looping (but harmless) messages.

I suspect there is another tricky problem with the pernicious hasToken that we 
removed in CASSANDRA-3747.  This won't reproduce in 1.1, and without an 
imminent (or likely) 1.0.11 release, I'm hesitant to risk breaking anything 
else here while there is a workaround available.
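
The loop described above can be sketched as a toy model. This is not Cassandra's Gossiper code: NodeView, simulate, and the one-second gossip step are hypothetical names and simplifications. The 30000 ms silence timeout is taken from the FatClient log line in the report, and the 60000 ms quarantine is an assumption inferred from the gap between "removing from gossip" and the next "is now part of the cluster" in the same log.

```python
from dataclasses import dataclass, field

OLD_IP = "10.12.9.157"
FAT_CLIENT_TIMEOUT_MS = 30_000  # from "silent for 30000ms" in the report
QUARANTINE_MS = 60_000          # assumption: inferred from the log timestamps


@dataclass
class NodeView:
    """One node's local (hypothetical) view of gossip endpoints."""
    name: str
    known: set = field(default_factory=set)
    quarantined_until: dict = field(default_factory=dict)  # endpoint -> ms

    def evict(self, endpoint, now_ms):
        # Drop the endpoint and refuse to relearn it until quarantine ends.
        self.known.discard(endpoint)
        self.quarantined_until[endpoint] = now_ms + QUARANTINE_MS

    def receive_gossip(self, endpoints, now_ms):
        # Return the endpoints that were newly (re)learned from a peer.
        learned = []
        for ep in endpoints:
            if ep in self.known:
                continue
            if now_ms < self.quarantined_until.get(ep, 0):
                continue
            self.known.add(ep)
            learned.append(ep)
        return learned


def simulate(duration_ms, step_ms=1_000, peer_expires_old_ip=False):
    """Return the times (ms) at which the moved node relearns the old IP."""
    # The peer stands in for "the rest of the cluster". In the buggy case it
    # never removes the old IP from its gossip state.
    peer = NodeView("peer", known=set() if peer_expires_old_ip else {OLD_IP})
    moved = NodeView("moved")
    moved.evict(OLD_IP, 0)  # the new node evicts the address it replaced

    relearned_at = []
    learned_time = None
    for now in range(0, duration_ms + 1, step_ms):
        if moved.receive_gossip(peer.known, now):
            relearned_at.append(now)
            learned_time = now
        elif learned_time is not None and now - learned_time >= FAT_CLIENT_TIMEOUT_MS:
            # The old IP stayed silent, so it is removed again.
            moved.evict(OLD_IP, now)
            learned_time = None
    return relearned_at


if __name__ == "__main__":
    print(simulate(240_000))
    print(simulate(240_000, peer_expires_old_ip=True))
```

In this model the old IP comes back every 90 seconds for as long as the peer keeps gossiping it, which is close to the roughly 91 second spacing in the reported log. If the peer drops the old IP, the moved node never relearns it.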
                
> IP change of node requires assassinate to really remove old IP
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-4347
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4347
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.10
>         Environment: RHEL6, 64bit
>            Reporter: Karl Mueller
>            Assignee: Brandon Williams
>            Priority: Minor
>         Attachments: LocationInfo-hd-279-Data.db, 
> dev-cass-post-assassinate-gossipinfo.txt, 
> kaos-cass00-gossipinfo-postmove.txt, kaos-cass03-gossipinfo-postmove.txt
>
>
> When changing the IP addresses of nodes one by one, each node successfully 
> moves itself and its token.  Everything works properly.
> However, the node which had its IP changed (but NOT other nodes in the ring) 
> continues to have some type of state associated with the old IP and produces 
> log messages like this:
>  INFO [GossipStage:1] 2012-06-15 15:25:01,490 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
>  INFO [GossipStage:1] 2012-06-15 15:25:01,490 Gossiper.java (line 804) InetAddress /10.12.9.157 is now UP
>  INFO [GossipStage:1] 2012-06-15 15:25:01,491 StorageService.java (line 1017) Nodes /10.12.9.157 and dev-cass01.sv.walmartlabs.com/10.93.15.11 have the same token 113427455640312821154458202477256070484.  Ignoring /10.12.9.157
>  INFO [GossipTasks:1] 2012-06-15 15:25:11,373 Gossiper.java (line 818) InetAddress /10.12.9.157 is now dead.
>  INFO [GossipTasks:1] 2012-06-15 15:25:32,380 Gossiper.java (line 632) FatClient /10.12.9.157 has been silent for 30000ms, removing from gossip
>  INFO [GossipStage:1] 2012-06-15 15:26:32,490 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
>  INFO [GossipStage:1] 2012-06-15 15:26:32,491 Gossiper.java (line 804) InetAddress /10.12.9.157 is now UP
>  INFO [GossipStage:1] 2012-06-15 15:26:32,491 StorageService.java (line 1017) Nodes /10.12.9.157 and dev-cass01.sv.walmartlabs.com/10.93.15.11 have the same token 113427455640312821154458202477256070484.  Ignoring /10.12.9.157
>  INFO [GossipTasks:1] 2012-06-15 15:26:42,402 Gossiper.java (line 818) InetAddress /10.12.9.157 is now dead.
>  INFO [GossipTasks:1] 2012-06-15 15:27:03,410 Gossiper.java (line 632) FatClient /10.12.9.157 has been silent for 30000ms, removing from gossip
>  INFO [GossipStage:1] 2012-06-15 15:28:04,533 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
> Other nodes do NOT have the old IP showing up in logs.  It's only the node 
> that moved.
> The old IP doesn't show up in ring anywhere or in any other fashion.  The 
> cluster seems to be fully operational, so I think it's just a cleanup issue.
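
The timing of the loop in the log above can be checked with a short script. This is only a sketch: event_times and gaps_seconds are hypothetical helper names, and the LOG string holds five of the reported lines with the email line wrapping removed.

```python
import re
from datetime import datetime

# Five lines copied from the report, unwrapped.
LOG = """\
INFO [GossipStage:1] 2012-06-15 15:25:01,490 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
INFO [GossipTasks:1] 2012-06-15 15:25:32,380 Gossiper.java (line 632) FatClient /10.12.9.157 has been silent for 30000ms, removing from gossip
INFO [GossipStage:1] 2012-06-15 15:26:32,490 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
INFO [GossipTasks:1] 2012-06-15 15:27:03,410 Gossiper.java (line 632) FatClient /10.12.9.157 has been silent for 30000ms, removing from gossip
INFO [GossipStage:1] 2012-06-15 15:28:04,533 Gossiper.java (line 838) Node /10.12.9.157 is now part of the cluster
"""

STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def event_times(log, marker):
    """Return the timestamps of all log lines that contain `marker`."""
    times = []
    for line in log.splitlines():
        match = STAMP.search(line)
        if marker in line and match:
            times.append(datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f"))
    return times


def gaps_seconds(times):
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


joined = event_times(LOG, "is now part of the cluster")
removed = event_times(LOG, "removing from gossip")

# Time between successive reappearances of the old IP.
cycle = gaps_seconds(joined)
# Time from each removal to the next reappearance.
quiet = [(j - r).total_seconds() for r, j in zip(removed, joined[1:])]

if __name__ == "__main__":
    print("cycle length (s):", cycle)
    print("removal to reappearance (s):", quiet)
```

With these lines the old IP reappears about every 91 to 92 seconds, and each reappearance comes about 60 to 61 seconds after the preceding removal.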

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
