[ https://issues.apache.org/jira/browse/CASSANDRA-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999379#comment-14999379 ]
Ariel Weisberg commented on CASSANDRA-10477:
--------------------------------------------

This assertion makes it look like the node either thinks its broadcast address is that of another node, or is connecting to itself and therefore submitting hints to itself, which is what the assertion checks for. Neither condition should occur.

If you could also get me the output of "netstat -tlnp" along with nodetool status, ring, and netstats when the problem occurs, that would be helpful. You could do it now just to see if something shows up, but definitely do it when the problem occurs.

> java.lang.AssertionError in StorageProxy.submitHint
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10477
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS 6, Oracle JVM 1.8.45
>            Reporter: Severin Leonhardt
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
>
>
> A few days after updating from 2.0.15 to 2.1.9 we have the following log
> entry on 2 of 5 machines:
> {noformat}
> ERROR [EXPIRING-MAP-REAPER:1] 2015-10-07 17:01:08,041 CassandraDaemon.java:223 - Exception in thread Thread[EXPIRING-MAP-REAPER:1,5,main]
> java.lang.AssertionError: /192.168.11.88
>         at org.apache.cassandra.service.StorageProxy.submitHint(StorageProxy.java:949) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:383) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.net.MessagingService$5.apply(MessagingService.java:363) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.utils.ExpiringMap$1.run(ExpiringMap.java:98) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {noformat}
> 192.168.11.88 is the broadcast address of the local machine.
> When this is logged, the read request latency of the whole cluster becomes
> very bad, rising from 6 ms/op to more than 100 ms/op according to OpsCenter.
> Clients get a lot of timeouts. We need to restart the affected Cassandra node
> to get back to normal read latencies. Write latency does not seem to be
> affected.
> Disabling hinted handoff using {{nodetool disablehandoff}} only prevents the
> assertion from being logged. At some point the read latency becomes bad again.
> Restarting the node where hinted handoff was disabled brings the read
> latency back to normal.
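
For readers following the thread, here is a minimal sketch of the kind of check the comment describes: a hint must never be targeted at the node's own broadcast address. This is not Cassandra's actual code. The class `HintTargetCheck` and its methods `ip`, `isValidHintTarget`, and `submitHint` are hypothetical names chosen for illustration, and the sketch throws the error explicitly so it behaves the same whether or not JVM assertions are enabled.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration only; not the real StorageProxy implementation.
class HintTargetCheck {
    private final InetAddress broadcastAddress;

    HintTargetCheck(InetAddress broadcastAddress) {
        this.broadcastAddress = broadcastAddress;
    }

    // Parses an IP literal; no DNS lookup is performed for literal addresses.
    static InetAddress ip(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    // A hint target is valid only if it is some node other than this one.
    boolean isValidHintTarget(InetAddress target) {
        return !target.equals(broadcastAddress);
    }

    // Rejects hints addressed to the local node, using the address as the message.
    void submitHint(InetAddress target) {
        if (!isValidHintTarget(target)) {
            throw new AssertionError(target.toString());
        }
        // Storing the hint for later delivery would happen here.
    }

    public static void main(String[] args) {
        HintTargetCheck check = new HintTargetCheck(ip("192.168.11.88"));
        check.submitHint(ip("192.168.11.89")); // a different node: accepted
        try {
            check.submitHint(ip("192.168.11.88")); // the local node: rejected
        } catch (AssertionError e) {
            System.out.println("AssertionError: " + e.getMessage());
        }
    }
}
```

In this sketch the error message is the textual form of the rejected address, which for an IP literal has an empty host name followed by a slash and the address. That is consistent with the `/192.168.11.88` shown in the reported log entry, where the address is the local machine's own broadcast address.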