Odd. Yes, you would expect the breakpoint to be hit in
org.apache.activemq.broker.TransportConnection.TransportConnection(...).new DefaultTransportListener() {...}.onException(IOException).
Does the connection still appear active with netstat, or does the consumer still appear in the console?

On 31 May 2010 10:14, <daniel.stu...@attensity.com> wrote:

> Hi,
>
> I changed the implementation from receiveNoWait() to receive(10000) but it
> did not change anything in the behavior.
>
> After the client crashes I can still see the delivered message in the Queue
> (using browse) but no receive() call can get this message again; it seems to
> be stuck in the queue.
>
> I set breakpoints (over 20) on all onException() methods of implementations
> of TransportListener in ActiveMQ. No breakpoint is triggered when the
> client crashes. However, if I set up a TransportListener in my JUnit test
> (in method testSendReceiveOnCrash()) then onException() is triggered there
> with the following exception:
>
> java.io.EOFException Client transport error:
>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>     at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>     at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:211)
>     at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:203)
>     at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:186)
>     at java.lang.Thread.run(Thread.java:619)
>
> Shouldn't this exception end up somewhere in the ActiveMQ server code?
>
> Bye,
> Daniel
>
> -----Original Message-----
> From: Gary Tully [mailto:gary.tu...@gmail.com]
> Sent: Friday, 28 May 2010 17:25
> To: users@activemq.apache.org
> Subject: Re: Failover Question
>
> I just had a cursory look at the code and I think the receiveNoWait() call
> may be part of the problem.
>
> receiveNoWait does not work well with activemq just after a consumer has
> been created. It can take some time for the consumer to register and
> dispatch to occur and it occurs async to the receiveNoWait call.
> Either use a small timeout receive(1000) or loop while receivedMessage ==
> null for a few iterations.
>
> On 27 May 2010 17:15, <daniel.stu...@attensity.com> wrote:
>
> > Hi ActiveMQ Team,
> >
> > in the eclipse open source project SMILA we use ActiveMQ (version 5.3.2)
> > to implement a producer/consumer pattern with JMS. The basic setup is as
> > follows:
> >
> > - the software runs in a cluster of machines (usually between 4 and 16)
> > - we use the Pure Master/Slave configuration for Queue failover
> > - a producer creates a large data chunk in a data repository
> >   and creates a JMS message containing the Id of the created chunk of data
> > - a consumer receives a JMS message and processes the data
> >   chunk with the given Id. Some consumers also function as producers as
> >   they create a new data chunk and another JMS message
> > - all machines in the cluster work as producers and consumers
> >
> > In general this works fine, but we have problems on a machine failure.
> > For simplicity assume that one machine (except for the Master or Slave)
> > has a hardware failure and crashes. Also assume that this machine was
> > currently processing a received JMS message. The Session from which the
> > message was received was not committed yet, as the session is only
> > committed if the processing of the data was successful. Otherwise it is
> > rolled back.
> >
> > Now as the machine crashes the session is neither committed nor rolled
> > back. How can we assure that any messages that were delivered but not
> > committed or rolled back are redelivered or put into the DLQ?
> >
> > Our first assumption was that if the connection of a session drops all
> > not committed messages of that session are automatically redelivered.
> > Unfortunately this was not the case. Does this only work in certain
> > scenarios with specific settings?
> >
> > The second idea was to set TTL for each message, so that when TTL is
> > reached the message goes into the DLQ and can be consumed there (e.g. by
> > another consumer that creates a copy of the message in the actual
> > queue). This would automatically cover the machine crash described
> > above, as sending no commit or rollback eventually leads to reaching the
> > set TTL of the message. However during tests we had strange behavior for
> > messages that were processed by the crashing machine:
> >
> > - some messages were handled correctly (they were moved to the DLQ)
> > - other messages simply disappeared; in the JMX console these
> >   messages were shown as dequeued, which should only be the case if the
> >   session was committed. There were no exceptions in the log files.
> >
> > Is there anything that has to be addressed, either in the configuration
> > or our code, for this to work correctly?
> >
> > Besides this, TTL has a drawback, as it is set when the message is
> > created. The processing of our data takes quite a while and we also have
> > to assure the processing in a certain time frame. Producers are
> > generally faster than Consumers, so the number of enqueued messages
> > increases. So by setting TTL we cannot assure that a message is consumed
> > in a certain time frame but only that it is available for the set time.
> > Are there any mechanisms that would allow us to set a "processing
> > timeout" or "commit timeout" by which a message must be committed or else
> > it is sent to the DLQ?
> >
> > BTW, what about the parameter maxInactivityDuration? Does it have any
> > effect on opened sessions/transactions? We also set this but it did not
> > seem to have any effect.
> >
> > Some information on our environment:
> >
> > - ActiveMQ 5.3.2
> > - JDK 1.6.0_20
> > - Equinox OSGi container (eclipse 3.5)
> > - Linux Open Suse 11.1
> > - Connection-URL:
> >   failover://(tcp://masterhost:61616,tcp://slavehost:61616)?randomize=false
> >
> > It would be great if you could share your thoughts on this issue.
> >
> > Bye,
> >
> > Daniel
>
> --
> http://blog.garytully.com
>
> Open Source Integration
> http://fusesource.com

--
http://blog.garytully.com

Open Source Integration
http://fusesource.com
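
For reference, the suggestion quoted above (use a short receive timeout and loop for a few iterations while the received message is null, rather than calling receiveNoWait() right after creating the consumer) could be sketched roughly as below. This is only an illustration under assumptions: ReceiveRetry, TimedReceiver and receiveWithRetry are hypothetical names, not part of ActiveMQ or JMS. TimedReceiver simply mirrors the shape of javax.jms.MessageConsumer.receive(long), so that the helper compiles without any JMS library on the classpath.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of ActiveMQ or JMS): polls a receiver with a
 * bounded timeout for a limited number of attempts instead of relying on a
 * single non-blocking receive.
 */
final class ReceiveRetry {

    /** Assumed stand-in with the same shape as MessageConsumer.receive(long). */
    @FunctionalInterface
    interface TimedReceiver<T> {
        T receive(long timeoutMillis) throws Exception;
    }

    private ReceiveRetry() {
        // static helper only
    }

    /**
     * Calls receiver.receive(timeoutMillis) up to maxAttempts times and
     * returns the first non-null result, or null if every attempt timed out.
     */
    static <T> T receiveWithRetry(TimedReceiver<T> receiver,
                                  int maxAttempts,
                                  long timeoutMillis) throws Exception {
        Objects.requireNonNull(receiver, "receiver");
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // In JMS, receive(0) blocks indefinitely, so insist on a positive timeout.
        if (timeoutMillis < 1) {
            throw new IllegalArgumentException("timeoutMillis must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T message = receiver.receive(timeoutMillis);
            if (message != null) {
                return message; // dispatch has caught up; hand the message back
            }
        }
        return null; // nothing arrived within maxAttempts * timeoutMillis
    }
}
```

With a real consumer this would presumably be called as ReceiveRetry.receiveWithRetry(consumer::receive, 3, 1000). Note that this only addresses the timing issue around consumer registration and dispatch; it does not by itself explain why the uncommitted message is not redelivered after the client crash.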