Gary,

This occurred over the weekend again. Given we didn't have customers at
desks there was more time to gain logs.

Netstat shows an awful lot of:

tcp        0      0 127.0.0.1:61612         127.0.0.1:58428         SYN_RECV
tcp        0      0 127.0.0.1:61612         127.0.0.1:57881         SYN_RECV
tcp        0      0 127.0.0.1:61612         127.0.0.1:56600         SYN_RECV
tcp        0      0 127.0.0.1:61612         127.0.0.1:59161         SYN_RECV
tcp        0     40 127.0.0.1:58048         127.0.0.1:61612
FIN_WAIT1
tcp        0     40 127.0.0.1:57805         127.0.0.1:61612
FIN_WAIT1
tcp        0     40 127.0.0.1:58907         127.0.0.1:61612
FIN_WAIT1
tcp        0     40 127.0.0.1:58964         127.0.0.1:61612
FIN_WAIT1

61612 is our STOMP connector, which at this point is no longer accepting
connections yet the web console is alive and kicking. All our current
clients use STOMP.

According to the logs several hours beforehand the following occurred:

2011-07-10 00:59:36,673 | INFO  | Outbound.Account.22373 Inactive for longer
than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue |
ActiveMQ Broker[zorin] Scheduler
2011-07-10 01:00:01,590 | INFO  | Outbound.Account.20821 Inactive for longer
than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue |
ActiveMQ Broker[zorin] Scheduler
2011-07-10 01:00:01,687 | INFO  | Outbound.Account.22686 Inactive for longer
than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue |
ActiveMQ Broker[zorin] Scheduler
2011-07-10 01:00:14,411 | INFO  | Transport failed: java.io.EOFException |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///127.0.0.1:53766
2011-07-10 01:00:14,411 | INFO  | Transport failed: java.io.EOFException |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///127.0.0.1:53768
2011-07-10 01:00:14,411 | INFO  | Transport failed: java.io.EOFException |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///127.0.0.1:53769

Many more of these are logged then:

2011-07-10 01:00:59,451 | WARN  | Async error occurred:
java.lang.OutOfMemoryError: Java heap space |
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport:
tcp:///127.0.0.1:54121
java.lang.OutOfMemoryError: Java heap space
2011-07-10 01:00:59,451 | WARN  | Exception occurred processing:
null: java.lang.OutOfMemoryError: Java heap space |
org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection
Dispatcher: /127.0.0.1:54121
2011-07-10 01:00:59,451 | WARN  | Async error occurred:
java.lang.IllegalStateException: Cannot remove a consumer that had not been
registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 |
org.apache.activemq.broker.TransportConnection.Service | ActiveMQ Transport:
tcp:///127.0.0.1:54121
java.lang.IllegalStateException: Cannot remove a consumer that had not been
registered: ID:quarrel-46206-1309767925003-2:522192:-1:1
    at
org.apache.activemq.broker.TransportConnection.processRemoveConsumer(TransportConnection.java:571)
    at org.apache.activemq.command.RemoveInfo.visit(RemoveInfo.java:76)
    at
org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:306)
    at
org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:179)
    at
org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69)
    at
org.apache.activemq.transport.stomp.StompTransportFilter.sendToActiveMQ(StompTransportFilter.java:81)
    at
org.apache.activemq.transport.stomp.ProtocolConverter.sendToActiveMQ(ProtocolConverter.java:140)
    at
org.apache.activemq.transport.stomp.ProtocolConverter.onStompUnsubscribe(ProtocolConverter.java:468)
    at
org.apache.activemq.transport.stomp.ProtocolConverter.onStompCommand(ProtocolConverter.java:190)
    at
org.apache.activemq.transport.stomp.StompTransportFilter.onCommand(StompTransportFilter.java:70)
    at
org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
    at
org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:220)
    at
org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
    at java.lang.Thread.run(Thread.java:636)
2011-07-10 01:00:59,451 | WARN  | Exception occurred processing:
null: java.lang.IllegalStateException: Cannot remove a consumer that had not
been registered: ID:quarrel-46206-1309767925003-2:522192:-1:1 |
org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection
Dispatcher: /127.0.0.1:54121
2011-07-10 01:00:59,453 | INFO  | Transport failed:
java.net.SocketException: Broken pipe |
org.apache.activemq.broker.TransportConnection.Transport | Async Exception
Handler

Repeatedly. Then:

2011-07-10 01:08:38,431 | WARN  | Exception occurred processing:
null: java.lang.IllegalStateException: Timer already cancelled. |
org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ Connection
Dispatcher: /127.0.0.1:53858
2011-07-10 01:08:38,432 | INFO  | Transport failed: java.io.EOFException |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///127.0.0.1:54143
2011-07-10 01:08:38,432 | INFO  | Transport failed: java.io.EOFException |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///127.0.0.1:54139

Repeatedly, then my ssl:// bridge to our hub machine restarted itself. At
this point STOMP is essentially dead.

I seem to remember the last time that the logs mentioned this stuff occuring
around 0100. May be co-incidence of course. Unfortunately I have no idea
what could be causing this - the only two recent changes where:

1. Skip DLQ for expiring non-persistent messages
2. Enable GC for queues.

Given the severity I have disabled the GC based on your thoughts last time -
over the weekend we would expect the GC to kick in more then "usual". Would
appreciate any further insight as to whether this deserved a new issue on
it's own.

James

On 4 July 2011 12:31, Gary Tully <gary.tu...@gmail.com> wrote:

> It is just a hunch based on the root cause of that issue, a removed
> destination reference remaining in the connection contex.
>
> Feel free to open a separate issue, to be convinced we need a test case.
>
> On 4 July 2011 11:02, James Green <james.mk.gr...@gmail.com> wrote:
> > How convinced are you?
> >
> > I'm minded to file this as a seperate bug given the heap exceptions being
> > thrown and the Stomp connectivity that dies completely. What is not
> > explained in the bug report you list are the "other effects" of the bug
> > Timothy has fixed and whether there may be any overlap in the behaviours
> > being observed.
> >
> > James
> >
> > On 4 July 2011 10:44, Gary Tully <gary.tu...@gmail.com> wrote:
> >
> >> this looks like a bug, but I think it may be an instance of
> >> https://issues.apache.org/jira/browse/AMQ-2524 which has just been
> >> resolved so it is available in the most current 5.6-SNAPSHOT
> >>
> >> On 4 July 2011 09:31, James Green <james.mk.gr...@gmail.com> wrote:
> >> > Getting these:
> >> >
> >> > 2011-07-03 07:41:54,330 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:53546
> >> > 2011-07-03 07:49:05,893 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:42055
> >> > 2011-07-03 07:49:05,894 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:42045
> >> > 2011-07-03 08:06:55,985 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:43304
> >> > 2011-07-03 08:06:55,987 | INFO  | Outbound.Account.22556 Inactive for
> >> longer
> >> > than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue
> |
> >> > ActiveMQ Broker[zorin] Scheduler
> >> > 2011-07-03 08:06:59,566 | WARN  | Async error occurred:
> >> > java.lang.OutOfMemoryError: Java heap space |
> >> > org.apache.activemq.broker.TransportConnection.Service | ActiveMQ
> >> Transport:
> >> > tcp:///127.0.0.1:42987
> >> > 2011-07-03 08:06:59,566 | INFO  | Outbound.Account.21011 Inactive for
> >> longer
> >> > than 60000 ms - removing ... | org.apache.activemq.broker.region.Queue
> |
> >> > ActiveMQ Broker[zorin] Scheduler
> >> > 2011-07-03 08:06:59,567 | WARN  | Exception occurred processing:
> >> > null: java.lang.OutOfMemoryError: Java heap space |
> >> > org.apache.activemq.transport.stomp.ProtocolConverter | ActiveMQ
> >> Connection
> >> > Dispatcher: /127.0.0.1:42987
> >> >
> >> > 2011-07-03 09:00:51,867 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57612
> >> > 2011-07-03 09:00:51,868 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57634
> >> > 2011-07-03 09:00:51,868 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57610
> >> > 2011-07-03 09:00:54,148 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57643
> >> > 2011-07-03 09:00:56,406 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57644
> >> > 2011-07-03 09:00:57,535 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57646
> >> > 2011-07-03 09:00:57,536 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57659
> >> > 2011-07-03 09:00:57,536 | INFO  | Transport failed:
> java.io.EOFException
> >> |
> >> > org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> >> > Transport: tcp:///127.0.0.1:57676
> >> > 2011-07-03 13:32:36,995 | INFO  | Outbound.Account.200237 Inactive for
> >> > longer than 60000 ms - removing ... |
> >> > org.apache.activemq.broker.region.Queue | ActiveMQ Broker[zorin]
> >> Scheduler
> >> > 2011-07-03 23:58:46,994 | INFO  | account_events Inactive for longer
> than
> >> > 60000 ms - removing ... | org.apache.activemq.broker.region.Queue |
> >> ActiveMQ
> >> > Broker[zorin] Scheduler
> >> >
> >> > The web console responds fine, the Stomp connector throws the
> exception.
> >> The
> >> > AMQ config shows -Xms512M -Xmx512M
> >> > -Dorg.apache.activemq.UseDedicatedTaskRunner=true
> >> >
> >> > This system has been running without incident for several months. Two
> >> weeks
> >> > ago I changed the config to enable  gcInactiveDestinations="true" on
> >> queues.
> >> > Since then we've had two Stomp "crashes" inside one week.
> >> >
> >> > Does this GC process require much more RAM or something? Or am I
> likely
> >> > looking at a bug?
> >> >
> >> > Thanks,
> >> >
> >> > James
> >> >
> >>
> >>
> >>
> >> --
> >> http://fusesource.com
> >> http://blog.garytully.com
> >>
> >
>
>
>
> --
> http://fusesource.com
> http://blog.garytully.com
>

Reply via email to