[jira] [Comment Edited] (CASSANDRA-13042) The two cassandra nodes suddenly encounter hints each other and failed replaying.

2017-02-27 Thread Greg Doermann (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886128#comment-15886128
 ] 

Greg Doermann edited comment on CASSANDRA-13042 at 2/27/17 5:06 PM:


We are seeing a constant loop of restarts on the affected servers.  Each node 
tries to deliver its hints to the other server and ends up restarting when all 
of them fail:

{code}
INFO  [HintedHandoff:5] 2017-02-27 16:52:13,959 HintedHandOffManager.java:367 - Started hinted handoff for host: ---000- with IP: /1.1.1.1
ERROR [HintedHandoff:10] 2017-02-27 16:52:13,961 CassandraDaemon.java:185 - Exception in thread Thread[HintedHandoff:10,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.net.WriteCallbackInfo.<init>(WriteCallbackInfo.java:49) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:639) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:703) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:474) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:354) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:93) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:566) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_74]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_74]
INFO  [HintedHandoff:5] 2017-02-27 16:52:19,465 HintedHandOffManager.java:486 - Failed replaying hints to /1.1.1.1; aborting (523 delivered), error : Operation timed out - received only 0 responses.
{code}
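
For anyone triaging the same symptom, here is a minimal sketch for pulling the 
"Failed replaying hints" summaries out of a system log so the affected 
endpoints and delivered counts can be compared across nodes. The function name 
`summarize_failed_replays` and the regular expression are my own assumptions 
based only on the log line quoted above; they are not part of Cassandra or 
nodetool, and other versions may word the message differently.

```python
import re

# Assumed shape of the abort message, taken from the log excerpt above.
FAILED_REPLAY = re.compile(
    r"Failed replaying hints to /(?P<endpoint>[^;\s]+); "
    r"aborting \((?P<delivered>\d+) delivered\), "
    r"error : (?P<error>.+?)"
    r"(?= (?:INFO|WARN|ERROR|DEBUG|TRACE) \[|$)"
)


def summarize_failed_replays(log_text):
    """Return (endpoint, delivered_count, error) tuples found in log_text."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        (m.group("endpoint"), int(m.group("delivered")), m.group("error").strip())
        for m in FAILED_REPLAY.finditer(flat)
    ]


# Hypothetical usage with the entry quoted in this comment.
sample = """INFO  [HintedHandoff:5] 2017-02-27 16:52:19,465 HintedHandOffManager.java:486 -
Failed replaying hints to /1.1.1.1; aborting (523 delivered), error : Operation
timed out - received only 0 responses.
"""

for endpoint, delivered, error in summarize_failed_replays(sample):
    print(endpoint, delivered, error)
```

Whitespace is collapsed before matching because archived or emailed logs are 
often hard-wrapped, as they are in this thread.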





> The two cassandra nodes suddenly encounter hints each other and failed 
> replaying.
> -
>
> Key: CASSANDRA-13042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13042
> Project: Cassandra
> Issue Type: Bug
> Reporter: YheonHo.Choi
> Priority: Critical
> Attachments: out_2.2.2.1.txt, out_2.2.2.2.txt
>
>
> Although there were no changes to Cassandra, two nodes suddenly began 
> accumulating hints for each other and failed to replay them.
> Commands such as disablethrift and disablegossip did not solve the problem; 
> the only fix was a restart.
> When we check the cluster status, all nodes look UN, but describecluster 
> shows them as unreachable to each other.
> H

[jira] [Comment Edited] (CASSANDRA-13042) The two cassandra nodes suddenly encounter hints each other and failed replaying.

2017-02-27 Thread Greg Doermann (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886128#comment-15886128
 ] 

Greg Doermann edited comment on CASSANDRA-13042 at 2/27/17 5:05 PM:


We are seeing a constant loop of restarts on the affected servers.  Each node 
tries to deliver its hints to the other server and ends up restarting when all 
of them fail:

{quote}
INFO  [HintedHandoff:5] 2017-02-27 16:52:13,959 HintedHandOffManager.java:367 - Started hinted handoff for host: ---000- with IP: /1.1.1.1
ERROR [HintedHandoff:10] 2017-02-27 16:52:13,961 CassandraDaemon.java:185 - Exception in thread Thread[HintedHandoff:10,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.net.WriteCallbackInfo.<init>(WriteCallbackInfo.java:49) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:639) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:703) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:474) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:354) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:93) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:566) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_74]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_74]
INFO  [HintedHandoff:5] 2017-02-27 16:52:19,465 HintedHandOffManager.java:486 - Failed replaying hints to /1.1.1.1; aborting (523 delivered), error : Operation timed out - received only 0 responses.
{quote}





> The two cassandra nodes suddenly encounter hints each other and failed 
> replaying.
> -
>
> Key: CASSANDRA-13042
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13042
> Project: Cassandra
> Issue Type: Bug
> Reporter: YheonHo.Choi
> Priority: Critical
> Attachments: out_2.2.2.1.txt, out_2.2.2.2.txt
>
>
> Although there were no changes to Cassandra, two nodes suddenly began 
> accumulating hints for each other and failed to replay them.
> Commands such as disablethrift and disablegossip did not solve the problem; 
> the only fix was a restart.
> When we check the cluster status, all nodes look UN, but describecluster 
> shows them as unreachable to each other.
> Here's t