[ 
https://issues.apache.org/jira/browse/CASSANDRA-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145368#comment-13145368
 ] 

Radim Kolar commented on CASSANDRA-3463:
----------------------------------------

The situation has now changed a bit:

 INFO [GossipTasks:1] 2011-11-07 13:32:54,596 Gossiper.java (line 716) InetAddress /****.99.40 is now dead.
 INFO [GossipStage:1] 2011-11-07 13:32:55,163 Gossiper.java (line 702) InetAddress /***.99.40 is now UP
 INFO [HintedHandoff:7] 2011-11-07 13:33:25,046 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /***.99.40
 INFO [HintedHandoff:7] 2011-11-07 13:33:45,090 HintedHandOffManager.java (line 357) Could not complete hinted handoff to /***.99.40
 INFO [HintedHandoff:7] 2011-11-07 13:33:45,090 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /****.99.40

However, one node still thinks that 99.40 is unreachable, even though it is
up and the last hinted handoff delivery attempt reported no schema
disagreement.
                
> cluster split due to schema disagreement
> ----------------------------------------
>
>                 Key: CASSANDRA-3463
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3463
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.7
>            Reporter: Radim Kolar
>
> I found an interesting situation in a 2-node cluster. The replication
> factor is 1. Gossip (nodetool ring) on both nodes reports that both nodes
> are up.
> Address         DC          Rack        Status State   Load            Owns    Token
>                                                                                99070591730234615865843651857942052864
> ****.104.18     datacenter1 rack1       Up     Normal  19.36 GB        41.77%  0
> ****.99.40      datacenter1 rack1       Up     Normal  26.24 GB        58.23%
> One node works fine, while the second thinks that the other node is down,
> even though its own gossip correctly recognizes the other node as up. The
> problem is in schema agreement, but I don't know whether the logs contain
> enough information to discover why the nodes could not reach schema
> agreement.
> [default@test] describe cluster;
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>         9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
>         UNREACHABLE: [****.99.40]
>  INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:07,336 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:11,544 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:12,341 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:16,644 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:17,347 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:31,944 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:32,362 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:37,044 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,010 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,028 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
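
For readers unfamiliar with the failure in the trace above, here is a minimal
sketch of a poll-until-agreement loop with a timeout. It is NOT the actual
Cassandra code: the class name, the method signature, the supplier parameters,
the poll interval, and the placeholder endpoint /10.0.0.2 are all assumptions
made for illustration. Only the shape of the error message and the schema
version string are taken from the report.

```java
import java.util.function.Supplier;

public class SchemaAgreementSketch {
    /**
     * Polls until both suppliers report the same schema version, or throws
     * once timeoutMs has elapsed. Hypothetical helper, not Cassandra code.
     */
    public static void waitForAgreement(Supplier<String> localVersion,
                                        Supplier<String> remoteVersion,
                                        String endpoint,
                                        long timeoutMs,
                                        long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            String local = localVersion.get();
            String remote = remoteVersion.get();
            // A null remote version models an endpoint whose schema is unknown.
            if (local != null && local.equals(remote)) {
                return;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new RuntimeException("Could not reach schema agreement with "
                        + endpoint + " in " + timeoutMs + "ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted while waiting for " + endpoint, e);
            }
        }
    }

    public static void main(String[] args) {
        String v = "9f2b5be0-06e2-11e1-0000-d14dd490cdf6";
        // Both sides report the same version: returns on the first check.
        waitForAgreement(() -> v, () -> v, "/10.0.0.2", 200, 10);
        try {
            // The remote side never reports a version: the wait times out.
            waitForAgreement(() -> v, () -> null, "/10.0.0.2", 200, 10);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch, a peer that gossip considers up but whose schema version never
matches the local one makes the wait run to its deadline and fail, which is
consistent with the symptom reported here.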


        
