Gary, This occurred over the weekend again. Given we didn't have customers at desks there was more time to gain logs.
Netstat shows an awful lot of: tcp 0 0 127.0.0.1:61612 127.0.0.1:58428 SYN_RECV tcp 0 0 127.0.0.1:61612 127.0.0.1:57881 SYN_RECV tcp 0 0 127.0.0.1:61612 127.0.0.1:56600 SYN_RECV tcp 0 0 127.0.0.1:61612 127.0.0.1:59161 SYN_RECV tcp 0 40 127.0.0.1:58048 127.0.0.1:61612 FIN_WAIT1 tcp 0 40 127.0.0.1:57805 127.0.0.1:61612 FIN_WAIT1 tcp 0 40 127.0.0.1:58907 127.0.0.1:61612 FIN_WAIT1 tcp 0 40 127.0.0.1:58964 127.0.0.1:61612 FIN_WAIT1 61612 is our STOMP connector, which at this point is no longer accepting connections yet the web console is alive and kicking. All our current clients use STOMP. According to the logs several hours beforehand the following occurred: 2011-07-10 00:59:36,673 | INFO | Outbound.Account.22373 Inactive for longer than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler 2011-07-10 01:00:01,590 | INFO | Outbound.Account.20821 Inactive for longer than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler 2011-07-10 01:00:01,687 | INFO | Outbound.Account.22686 Inactive for longer than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] Scheduler 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///127.0.0.1:53766 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///127.0.0.1:53768 2011-07-10 01:00:14,411 | INFO | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///127.0.0.1:53769 Many more of these are logged then: 2011-07-10 01:00:59,451 | WARN | Async error occurred: java.lang.OutOfMemoryError: Java heap space | org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: tcp:///127.0.0.1:54121 java.lang.OutOfMemoryError: Java heap space 2011-07-10 01:00:59,451 | WARN | Exception occurred processing: null: java.lang.OutOfMemoryError: Java heap space | org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection Dispatcher: /127.0.0.1:54121 2011-07-10 01:00:59,451 | WARN | Async error occurred: java.lang.IllegalStateException: Cannot remove a consumer that had not been registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 | org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport: tcp:///127.0.0.1:54121 java.lang.IllegalStateException: Cannot remove a consumer that had not been registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 at org.apache.activemq.broker.TransportConnection.processRemoveConsumer(TransportConnection.java:571) at org.apache.activemq.command.RemoveInfo.visit(RemoveInfo.java:76) at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:306) at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:179) at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69) at org.apache.activemq.transport.stomp.StompTransportFilter.sendToActiveMQ(StompTransportFilter.java:81) at org.apache.activemq.transport.stomp.ProtocolConverter.sendToActiveMQ(ProtocolConverter.java:140) at org.apache.activemq.transport.stomp.ProtocolConverter.onStompUnsubscribe(ProtocolConverter.java:468) at org.apache.activemq.transport.stomp.ProtocolConverter.onStompCommand(ProtocolConverter.java:190) at org.apache.activemq.transport.stomp.StompTransportFilter.onCommand(StompTransportFilter.java:70) at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:220) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) at java.lang.Thread.run(Thread.java:636) 2011-07-10 01:00:59,451 | WARN | Exception occurred processing: null: java.lang.IllegalStateException: Cannot remove a consumer that had not been registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 | org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection Dispatcher: /127.0.0.1:54121 2011-07-10 01:00:59,453 | INFO | Transport failed: java.net.SocketException: Broken pipe | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler Repeatedly. Then: 2011-07-10 01:08:38,431 | WARN | Exception occurred processing: null: java.lang.IllegalStateException: Timer already cancelled. | org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection Dispatcher: /127.0.0.1:53858 2011-07-10 01:08:38,432 | INFO | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///127.0.0.1:54143 2011-07-10 01:08:38,432 | INFO | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///127.0.0.1:54139 Repeatedly, then my ssl:// bridge to our hub machine restarted itself. At this point STOMP is essentially dead. I seem to remember the last time that the logs mentioned this stuff occuring around 0100. May be co-incidence of course. Unfortunately I have no idea what could be causing this - the only two recent changes where: 1. Skip DLQ for expiring non-persistent messages 2. Enable GC for queues. Given the severity I have disabled the GC based on your thoughts last time - over the weekend we would expect the GC to kick in more then "usual". Would appreciate any further insight as to whether this deserved a new issue on it's own. James On 4 July 2011 12:31, Gary Tully <gary.tu...@gmail.com> wrote: > It is just a hunch based on the root cause of that issue, a removed > destination reference remaining in the connection contex. > > Feel free to open a separate issue, to be convinced we need a test case. > > On 4 July 2011 11:02, James Green <james.mk.gr...@gmail.com> wrote: > > How convinced are you? > > > > I'm minded to file this as a seperate bug given the heap exceptions being > > thrown and the Stomp connectivity that dies completely. What is not > > explained in the bug report you list are the "other effects" of the bug > > Timothy has fixed and whether there may be any overlap in the behaviours > > being observed. > > > > James > > > > On 4 July 2011 10:44, Gary Tully <gary.tu...@gmail.com> wrote: > > > >> this looks like a bug, but I think it may be an instance of > >> https://issues.apache.org/jira/browse/AMQ-2524 which has just been > >> resolved so it is available in the most current 5.6-SNAPSHOT > >> > >> On 4 July 2011 09:31, James Green <james.mk.gr...@gmail.com> wrote: > >> > Getting these: > >> > > >> > 2011-07-03 07:41:54,330 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:53546 > >> > 2011-07-03 07:49:05,893 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:42055 > >> > 2011-07-03 07:49:05,894 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:42045 > >> > 2011-07-03 08:06:55,985 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:43304 > >> > 2011-07-03 08:06:55,987 | INFO | Outbound.Account.22556 Inactive for > >> longer > >> > than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue > | > >> > ActiveMQ Broker[zorin] Scheduler > >> > 2011-07-03 08:06:59,566 | WARN | Async error occurred: > >> > java.lang.OutOfMemoryError: Java heap space | > >> > org.apache.activemq.broker.TransportConnection.Service | ActiveMQ > >> Transport: > >> > tcp:///127.0.0.1:42987 > >> > 2011-07-03 08:06:59,566 | INFO | Outbound.Account.21011 Inactive for > >> longer > >> > than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue > | > >> > ActiveMQ Broker[zorin] Scheduler > >> > 2011-07-03 08:06:59,567 | WARN | Exception occurred processing: > >> > null: java.lang.OutOfMemoryError: Java heap space | > >> > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ > >> Connection > >> > Dispatcher: /127.0.0.1:42987 > >> > > >> > 2011-07-03 09:00:51,867 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57612 > >> > 2011-07-03 09:00:51,868 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57634 > >> > 2011-07-03 09:00:51,868 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57610 > >> > 2011-07-03 09:00:54,148 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57643 > >> > 2011-07-03 09:00:56,406 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57644 > >> > 2011-07-03 09:00:57,535 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57646 > >> > 2011-07-03 09:00:57,536 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57659 > >> > 2011-07-03 09:00:57,536 | INFO | Transport failed: > java.io.EOFException > >> | > >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ > >> > Transport: tcp:///127.0.0.1:57676 > >> > 2011-07-03 13:32:36,995 | INFO | Outbound.Account.200237 Inactive for > >> > longer than 60000 ms - removing ... | > >> > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin] > >> Scheduler > >> > 2011-07-03 23:58:46,994 | INFO | account_events Inactive for longer > than > >> > 60000 ms - removing ... | org.apache.activemq.broker.region.Queue | > >> ActiveMQ > >> > Broker[zorin] Scheduler > >> > > >> > The web console responds fine, the Stomp connector throws the > exception. > >> The > >> > AMQ config shows -Xms512M -Xmx512M > >> > -Dorg.apache.activemq.UseDedicatedTaskRunner=true > >> > > >> > This system has been running without incident for several months. Two > >> weeks > >> > ago I changed the config to enable gcInactiveDestinations="true" on > >> queues. > >> > Since then we've had two Stomp "crashes" inside one week. > >> > > >> > Does this GC process require much more RAM or something? Or am I > likely > >> > looking at a bug? > >> > > >> > Thanks, > >> > > >> > James > >> > > >> > >> > >> > >> -- > >> http://fusesource.com > >> http://blog.garytully.com > >> > > > > > > -- > http://fusesource.com > http://blog.garytully.com >