Gossiper ConcurrentModificationException after Decommissioning
--------------------------------------------------------------

                 Key: CASSANDRA-1494
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1494
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.6.5
         Environment: Linux 2.6.33.8-149.fc13.x86_64 #1 SMP Tue Aug 17 22:53:15 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Dan Retzlaff
            Priority: Critical


After decommissioning 192.168.2.147, the Gossiper threw a 
ConcurrentModificationException on 192.168.2.55. After that, 192.168.2.55 
repeatedly marked 192.168.2.148 and 192.168.2.149 as UP and then DOWN. 
Eventually this left so many internode (storage port) TCP connections in 
CLOSE_WAIT that other nodes started failing with "too many open files" 
exceptions.

 INFO [Timer-0] 2010-09-08 17:00:02,398 Gossiper.java (line 402) FatClient /192.168.2.147 has been silent for 3600000ms, removing from gossip
ERROR [Timer-0] 2010-09-08 17:00:02,418 Gossiper.java (line 99) Gossip error
java.util.ConcurrentModificationException
        at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:383)
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:93)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
 INFO [Timer-0] 2010-09-08 17:00:12,398 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
 INFO [Timer-0] 2010-09-08 17:00:14,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
 INFO [GMFD:1] 2010-09-08 17:00:19,400 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,400 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.149
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,401 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.149
 INFO [Timer-0] 2010-09-08 17:00:20,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
 INFO [GMFD:1] 2010-09-08 17:00:43,409 Gossiper.java (line 578) InetAddress /192.168.2.148 is now UP
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,409 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.148
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,410 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.148
 INFO [Timer-0] 2010-09-08 17:00:44,404 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
 INFO [GMFD:1] 2010-09-08 17:01:18,415 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP

(UP/DOWN cycle repeats until the target node *really* goes DOWN due to too many 
TCP sockets in CLOSE_WAIT.)
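
The stack trace is consistent with an entry being removed from a Hashtable 
while doStatusCheck is still enumerating it: the "removing from gossip" message 
and the exception are both logged from Timer-0, about 20ms apart. The following 
is only a minimal sketch of that failure mode and of one possible way around it 
(iterating over a snapshot of the keys). It is not the actual Gossiper code; the 
class name, method names, and the table's key/value types are all hypothetical.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

public class GossipIterationSketch {

    // Hypothetical stand-in for the gossiper's endpoint table.
    // The real key/value types are not shown in the report.
    static Map<String, Long> newEndpointTable() {
        Map<String, Long> table = new Hashtable<>();
        table.put("192.168.2.147", 0L);
        table.put("192.168.2.148", 0L);
        table.put("192.168.2.149", 0L);
        return table;
    }

    // Removes the silent endpoint while the live key set is being iterated.
    // The removal happens on the first pass so the outcome does not depend
    // on Hashtable's iteration order. Returns true if the iterator failed.
    static boolean unsafeStatusCheck(Map<String, Long> endpoints, String silent) {
        boolean removed = false;
        try {
            for (String endpoint : endpoints.keySet()) {
                if (!removed) {
                    endpoints.remove(silent); // structural change mid-iteration
                    removed = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a copy of the keys, so removing from the table is safe.
    // Returns the number of endpoints left in the table.
    static int snapshotStatusCheck(Map<String, Long> endpoints, String silent) {
        List<String> snapshot = new ArrayList<>(endpoints.keySet());
        for (String endpoint : snapshot) {
            if (endpoint.equals(silent)) {
                endpoints.remove(endpoint);
            }
        }
        return endpoints.size();
    }

    public static void main(String[] args) {
        System.out.println("unsafe iteration failed: "
                + unsafeStatusCheck(newEndpointTable(), "192.168.2.147"));
        System.out.println("endpoints left after snapshot iteration: "
                + snapshotStatusCheck(newEndpointTable(), "192.168.2.147"));
    }
}
```

Copying the keys costs an allocation per status check, which may be acceptable 
for a table the size of a cluster's endpoint list. Whether this matches the 
real cause in Gossiper.java would need to be confirmed against the source.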

