[ https://issues.apache.org/jira/browse/CASSANDRA-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145524#comment-13145524 ]
Radim Kolar commented on CASSANDRA-3463:
----------------------------------------

Restarting the failed node fixed this.

> cluster split due to schema disagreement
> ----------------------------------------
>
>                 Key: CASSANDRA-3463
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3463
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.7
>            Reporter: Radim Kolar
>
> I found an interesting situation in a 2-node cluster. The replication factor is 1.
> Gossip (nodetool ring) on both nodes reports that both nodes are up.
>
> Address        DC           Rack   Status  State   Load      Owns    Token
>                                                                      99070591730234615865843651857942052864
> ****.104.18    datacenter1  rack1  Up      Normal  19.36 GB  41.77%  0
> ****.99.40     datacenter1  rack1  Up      Normal  26.24 GB  58.23%
>
> One node works fine, while the second thinks that the other node is down even
> though its gossip correctly recognizes the other node as up. The problem is in
> schema agreement, but I don't know whether the logs contain enough information
> to discover why the nodes could not reach schema agreement.
>
> [default@test] describe cluster;
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>         9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
>         UNREACHABLE: [****.99.40]
>
>  INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:07,336 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:11,544 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:12,341 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:16,644 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:17,347 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:31,944 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:32,362 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:37,044 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,010 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,028 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
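
The `describe cluster` output quoted in the report shows a single schema version plus an UNREACHABLE entry, which is the symptom of the disagreement. A minimal sketch of how that output could be checked automatically is given here. It assumes the output has the same shape as the text quoted in this report; the helper names `parse_schema_versions` and `schema_agreed` are hypothetical and are not part of Cassandra or its CLI.

```python
import re

# Hypothetical helpers, not part of Cassandra. They assume the
# "Schema versions:" section looks like the output quoted in this report:
# one "<uuid>: [hosts]" or "UNREACHABLE: [hosts]" entry per line.
ENTRY = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*?)\]\s*$")


def parse_schema_versions(text):
    """Map each schema version (or 'UNREACHABLE') to its list of hosts."""
    versions = {}
    in_section = False
    for line in text.splitlines():
        if line.strip() == "Schema versions:":
            in_section = True
            continue
        if not in_section:
            continue
        match = ENTRY.match(line)
        if match:
            hosts = [h.strip() for h in match.group(2).split(",") if h.strip()]
            versions.setdefault(match.group(1), []).extend(hosts)
    return versions


def schema_agreed(versions):
    """True only if exactly one version is reported and no host is unreachable."""
    real = [v for v in versions if v != "UNREACHABLE"]
    return len(real) == 1 and not versions.get("UNREACHABLE")


# Sample input copied from the report (addresses are masked there as well).
SAMPLE = """Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
        UNREACHABLE: [****.99.40]
"""

if __name__ == "__main__":
    versions = parse_schema_versions(SAMPLE)
    for version, hosts in versions.items():
        print(version, hosts)
    print("schema agreed:", schema_agreed(versions))
```

With the sample from the report, the check reports no agreement because one host is listed as UNREACHABLE, even though only one schema version is visible.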