[ 
https://issues.apache.org/jira/browse/CASSANDRA-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6948:
--------------------------------

    Description: 
Since ExpiringMap.shutdown() shuts down the static executor service, it cannot 
be restarted (and in fact reset() makes no attempt to do so). As such callbacks 
that receive no response are never removed from the map, and eventually either 
than server will run out of memory or will loop around the integer space and 
start reusing messageids that have not been expired, causing assertions to be 
thrown and messages to fail to be sent. It appears that this situation only 
arises on bootstrap or node replacement, as MessagingService is shutdown before 
being attached to the listen address.

This can cause the following errors to begin occurring in the log:

ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 
ErrorMessage.java (line 222) Unexpected exception during request
java.lang.AssertionError: Callback already exists for id -1665979622! 
(CallbackInfo(target=/10.106.160.84, 
callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, 
serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
        at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
        at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
        at 
org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
        at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
        at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
        at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
        at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
        at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
        at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
        at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
        at 
org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
        at 
org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
        at 
org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 
CassandraDaemon.java (line 196) Exception in thread 
Thread[ReplicateOnWriteStage:102766,5,main]
java.lang.AssertionError: Callback already exists for id -1665979620! 
(CallbackInfo(target=/10.106.160.84, 
callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, 
serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
        at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
        at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
        at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
        at 
org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
        at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

  was:
We are using the datastax driver and have seen the following exceptions appear 
periodically in our cluster in "groups".  The client will gradually start 
sending less write requests to the nodes where this occurs the most often.  
This may be related to CASSANDRA-6476.  Exception from logs:

ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 
ErrorMessage.java (line 222) Unexpected exception during request
java.lang.AssertionError: Callback already exists for id -1665979622! 
(CallbackInfo(target=/10.106.160.84, 
callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, 
serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
        at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
        at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
        at 
org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
        at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
        at 
org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
        at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
        at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
        at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
        at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
        at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
        at 
org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
        at 
org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
        at 
org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 
CassandraDaemon.java (line 196) Exception in thread 
Thread[ReplicateOnWriteStage:102766,5,main]
java.lang.AssertionError: Callback already exists for id -1665979620! 
(CallbackInfo(target=/10.106.160.84, 
callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, 
serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
        at 
org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
        at 
org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
        at 
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
        at 
org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
        at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)


> After Bootstrap or Replace node startup, EXPIRING_MAP_REAPER is shutdown and 
> cannot be restarted, causing callbacks to collect indefinitely
> -------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6948
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6948
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Keith Wright
>            Assignee: Benedict
>         Attachments: 6948.debug.txt, Screen Shot 2014-03-28 at 11.27.56 
> AM.png, Screen Shot 2014-03-28 at 11.29.24 AM.png, cassandra.log.min, 
> cassandra.yaml, logs.old.tar.gz, logs.tar.gz, system.log.1.gz, system.log.gz
>
>
> Since ExpiringMap.shutdown() shuts down the static executor service, it 
> cannot be restarted (and in fact reset() makes no attempt to do so). As such 
> callbacks that receive no response are never removed from the map, and 
> eventually either than server will run out of memory or will loop around the 
> integer space and start reusing messageids that have not been expired, 
> causing assertions to be thrown and messages to fail to be sent. It appears 
> that this situation only arises on bootstrap or node replacement, as 
> MessagingService is shutdown before being attached to the listen address.
> This can cause the following errors to begin occurring in the log:
> ERROR [Native-Transport-Requests:7636] 2014-03-28 13:32:10,638 
> ErrorMessage.java (line 222) Unexpected exception during request
> java.lang.AssertionError: Callback already exists for id -1665979622! 
> (CallbackInfo(target=/10.106.160.84, 
> callback=org.apache.cassandra.service.WriteResponseHandler@5d36d8ea, 
> serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
>       at 
> org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
>       at 
> org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
>       at 
> org.apache.cassandra.service.StorageProxy.mutateCounter(StorageProxy.java:984)
>       at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:449)
>       at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:524)
>       at 
> org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:521)
>       at 
> org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:505)
>       at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188)
>       at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358)
>       at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131)
>       at 
> org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
>       at 
> org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
>       at 
> org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> ERROR [ReplicateOnWriteStage:102766] 2014-03-28 13:32:10,638 
> CassandraDaemon.java (line 196) Exception in thread 
> Thread[ReplicateOnWriteStage:102766,5,main]
> java.lang.AssertionError: Callback already exists for id -1665979620! 
> (CallbackInfo(target=/10.106.160.84, 
> callback=org.apache.cassandra.service.WriteResponseHandler@3bdb1a75, 
> serializer=org.apache.cassandra.db.WriteResponse$WriteResponseSerializer@6ed37f0b))
>       at 
> org.apache.cassandra.net.MessagingService.addCallback(MessagingService.java:549)
>       at 
> org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:601)
>       at 
> org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:806)
>       at 
> org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1074)
>       at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1896)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to