Hi Alex,

We wrote a blog post on this topic late last year: http://thelastpickle.com/blog/2018/09/18/assassinate.html.
In short, you will need to run the assassinate command on each node simultaneously a number of times in quick succession. This will generate a number of messages requesting that all nodes completely forget there used to be an entry within the gossip state for the given IP address.

Regards,
Anthony

On Thu, 4 Apr 2019 at 03:32, Alex <m...@aca-o.com> wrote:

> Same result it seems:
> Welcome to JMX terminal. Type "help" for available commands.
> $>open localhost:7199
> #Connection to localhost:7199 is opened
> $>bean org.apache.cassandra.net:type=Gossiper
> #bean is set to org.apache.cassandra.net:type=Gossiper
> $>run unsafeAssassinateEndpoint 192.168.1.18
> #calling operation unsafeAssassinateEndpoint of mbean org.apache.cassandra.net:type=Gossiper
> #RuntimeMBeanException: java.lang.NullPointerException
>
>
> There is not much more to see in the log files:
> WARN [RMI TCP Connection(10)-127.0.0.1] 2019-04-03 16:25:13,626 Gossiper.java:575 - Assassinating /192.168.1.18 via gossip
> INFO [RMI TCP Connection(10)-127.0.0.1] 2019-04-03 16:25:13,627 Gossiper.java:585 - Sleeping for 30000ms to ensure /192.168.1.18 does not change
> INFO [RMI TCP Connection(10)-127.0.0.1] 2019-04-03 16:25:43,628 Gossiper.java:1029 - InetAddress /192.168.1.18 is now DOWN
> INFO [RMI TCP Connection(10)-127.0.0.1] 2019-04-03 16:25:43,631 StorageService.java:2324 - Removing tokens [..] for /192.168.1.18
>
>
>
>
> On 03.04.2019 17:10, Nick Hatfield wrote:
> > Run assassinate the old way. It works very well...
> >
> > wget -q -O jmxterm.jar http://downloads.sourceforge.net/cyclops-group/jmxterm-1.0-alpha-4-uber.jar
> >
> > java -jar ./jmxterm.jar
> >
> > $>open localhost:7199
> >
> > $>bean org.apache.cassandra.net:type=Gossiper
> >
> > $>run unsafeAssassinateEndpoint 192.168.1.18
> >
> > $>quit
> >
> >
> > Happy deleting
> >
> > -----Original Message-----
> > From: Alex [mailto:m...@aca-o.com]
> > Sent: Wednesday, April 03, 2019 10:42 AM
> > To: user@cassandra.apache.org
> > Subject: Assassinate fails
> >
> > Hello,
> >
> > Short story:
> > - I had to replace a dead node in my cluster
> > - 1 week after, dead node is still seen as DN by 3 out of 5 nodes
> > - dead node has null host_id
> > - assassinate on dead node fails with error
> >
> > How can I get rid of this dead node ?
> >
> >
> > Long story:
> > I had a 3 nodes cluster (Cassandra 3.9) ; one node went dead. I built a new node from scratch and "replaced" the dead node using the information from this page https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsReplaceNode.html .
> > It looked like the replacement went ok.
> >
> > I added two more nodes to strengthen the cluster.
> >
> > A few days have passed and the dead node is still visible and marked as "down" on 3 of 5 nodes in nodetool status:
> >
> > --  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
> > UN  192.168.1.9   16 GiB     256     35.0%             76223d4c-9d9f-417f-be27-cebb791cddcc  rack1
> > UN  192.168.1.12  16.09 GiB  256     34.0%             719601e2-54a6-440e-a379-c9cf2dc20564  rack1
> > UN  192.168.1.14  14.16 GiB  256     32.6%             d8017a03-7e4e-47b7-89b9-cd9ec472d74f  rack1
> > UN  192.168.1.17  15.4 GiB   256     34.1%             fa238b21-1db1-47dc-bfb7-beedc6c9967a  rack1
> > DN  192.168.1.18  24.3 GiB   256     33.7%             null                                  rack1
> > UN  192.168.1.22  19.06 GiB  256     30.7%             09d24557-4e98-44c3-8c9d-53c4c31066e1  rack1
> >
> > Its host ID is null, so I cannot use nodetool removenode.
> > Moreover nodetool assassinate 192.168.1.18 fails with:
> >
> > error: null
> > -- StackTrace --
> > java.lang.NullPointerException
> >
> > And in system.log:
> >
> > INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:39:38,595 Gossiper.java:585 - Sleeping for 30000ms to ensure /192.168.1.18 does not change
> > INFO [CompactionExecutor:547] 2019-03-27 17:39:38,669 AutoSavingCache.java:393 - Saved KeyCache (27316 items) in 163 ms
> > INFO [IndexSummaryManager:1] 2019-03-27 17:40:03,620 IndexSummaryRedistribution.java:75 - Redistributing index summaries
> > INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,597 Gossiper.java:1029 - InetAddress /192.168.1.18 is now DOWN
> > INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,599 StorageService.java:2324 - Removing tokens [-1061369577393671924,...]
> > ERROR [GossipStage:1] 2019-03-27 17:40:08,600 CassandraDaemon.java:226 - Exception in thread Thread[GossipStage:1,5,main]
> > java.lang.NullPointerException: null
> >
> >
> > In system.peers, the dead node shows up and has the same ID as the replacing node:
> >
> > cqlsh> select peer, host_id from system.peers;
> >
> >  peer         | host_id
> > --------------+--------------------------------------
> >  192.168.1.18 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
> >  192.168.1.22 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
> >  192.168.1.9  | 76223d4c-9d9f-417f-be27-cebb791cddcc
> >  192.168.1.14 | d8017a03-7e4e-47b7-89b9-cd9ec472d74f
> >  192.168.1.12 | 719601e2-54a6-440e-a379-c9cf2dc20564
> >
> > Dead node and replacing node have different tokens in system.peers.
> >
> > I should add that I also tried decommission on a node that still has 192.168.1.18 in its peers - it is still marked as "leaving" 5 days later. Nothing in nodetool netstats or nodetool compactionstats.
> >
> >
> > Thank you for taking the time to read this. Hope you can help.
> >
> > Alex
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
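
For reference, the advice at the top of this thread (run assassinate on every live node at the same time, several times in quick succession) could be scripted roughly as follows. This is only a sketch in Python and is not taken from the blog post: the helper names build_command and assassinate_everywhere, the number of rounds and the pause between them are all hypothetical. It also assumes that "nodetool -h <node>" can reach each node's JMX port remotely; Cassandra usually restricts JMX to localhost by default, in which case the same command would have to be run on each node over ssh instead.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def build_command(node, dead_ip):
    """Return the nodetool invocation that assassinates dead_ip via node."""
    # Assumption: remote JMX is reachable, so that "-h node" works.
    return ["nodetool", "-h", node, "assassinate", dead_ip]


def assassinate_everywhere(nodes, dead_ip, rounds=3, pause=1.0,
                           runner=subprocess.run, sleep=time.sleep):
    """Start assassinate on every node at once, repeated for several rounds.

    Rounds are started `pause` seconds apart without waiting for earlier
    calls to return (each call sleeps for 30000 ms, per the log in this thread).
    """
    futures = []
    workers = max(1, len(nodes) * rounds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for round_no in range(rounds):
            # Fire the command at all nodes in parallel for this round.
            for node in nodes:
                futures.append(pool.submit(runner, build_command(node, dead_ip)))
            if round_no < rounds - 1:
                sleep(pause)
    # Leaving the "with" block waits for every call to finish.
    return [f.result() for f in futures]


# Example (not executed here), using the addresses from this thread:
# live = ["192.168.1.9", "192.168.1.12", "192.168.1.14",
#         "192.168.1.17", "192.168.1.22"]
# assassinate_everywhere(live, "192.168.1.18")
```

Note that subprocess.run does not raise on a non-zero exit code, so the returned results would still need to be checked for failures such as the NullPointerException reported above.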