cluster split due to schema disagreement
----------------------------------------
Key: CASSANDRA-3463
URL: https://issues.apache.org/jira/browse/CASSANDRA-3463
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.7
Reporter: Radim Kolar

I found an interesting situation in a 2-node cluster. The replication factor is 1.

Gossip (nodetool ring) on both nodes reports that both nodes are up:

Address         DC          Rack        Status State   Load            Owns    Token
                                                                               99070591730234615865843651857942052864
****.104.18     datacenter1 rack1       Up     Normal  19.36 GB        41.77%  0
****.99.40      datacenter1 rack1       Up     Normal  26.24 GB        58.23%

One node works fine, while the second one thinks the other node is down, even though its own gossip correctly recognizes the other node as up. The problem is in schema agreement, but I don't know whether the logs contain enough information to discover why the nodes could not reach schema agreement.

[default@test] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
        UNREACHABLE: [****.99.40]

INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:07,336 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:11,544 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:12,341 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:16,644 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:17,347 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:31,944 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:32,362 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:37,044 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
ERROR [HintedHandoff:6] 2011-11-06 18:50:42,010 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
    at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
    at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
ERROR [HintedHandoff:6] 2011-11-06 18:50:42,028 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
    at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
    at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
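
The Gossiper lines quoted in this report show the remote endpoint alternating between "dead" and "UP" every few seconds. A minimal sketch for counting those transitions per endpoint in a larger log follows. It assumes only the line format visible in the excerpts here; the function name count_flaps and the sample list are hypothetical and not part of Cassandra or its tooling.

```python
import re
from collections import Counter

# Matches the Gossiper state-change lines in the form quoted in this report,
# e.g. "... InetAddress /*****99.40 is now dead." (format assumed from the excerpt).
STATE_RE = re.compile(r"InetAddress (\S+) is now (UP|dead)\b")


def count_flaps(lines):
    """Return {endpoint: Counter({'UP': n, 'dead': m})} for the given log lines.

    Hypothetical helper for illustration only.
    """
    counts = {}
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            endpoint, state = match.groups()
            counts.setdefault(endpoint, Counter())[state] += 1
    return counts


# Three lines copied from the log excerpt in this report.
sample = [
    "INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.",
    "INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP",
    "INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.",
]

for endpoint, states in count_flaps(sample).items():
    print(endpoint, dict(states))
```

Running the same function over the full system.log of each node would show whether the flapping is one-sided, which is what the describe cluster output suggests.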