We are using ActiveMQ version 5.9.0 and using the Failover Transport with a single host and suspect that a JMS connection failure (transportInterrupted)/re-connection (transportResumed) is sometimes causing a thread which is dispatching a message to the Broker to become stuck/hung. Once this occurs no messages can be consumed or dispatched from that connection. This is a relatively rare occurrence in production but it does happen in this one environment from time to time. The client host where this occurs is in Hong Kong and the connection is a bit unreliable and does cause broker disconnects a few times a day. At the end of this message are two examples of thread dumps for two different scenarios, but with the same end result that we are no longer able to dispatch messages to the broker on a particular connection. In “Example 1”, we have a case where the ActiveMQ Task-25 thread appears to be stuck and blocking our EventDispatcher thread with the dispatcher thread marked as being in a BLOCKED state. Note how the task thread appears to be inside the FailoverTransport.doReconnect function in both examples:
In “Example 2” we have a case where the ActiveMQ Task-22 thread appears to be stuck and blocking other threads from sending, though in this example the other threads are not reported as being in a BLOCKED state. “org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176) org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304) org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84) org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74) org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)” >From example 1: “org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176) org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304) org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84) org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74) org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)” Example #2 looks to be very similar to the post from 02/14 08:46am in the following thread http://activemq.2283324.n4.nabble.com/Hung-Producer-td4678060.html but I did not find any follow ups to this post in JIRA. In both cases we saw numerous JMS Transport interruptions/reconnections messages throughout the day (via TransportListener) logged though we do not know the exact relation of timing for these errors occurring vs. when the threads become stuck. We had to restart the client service to resume operation for both cases. When this issue occurs it appears to only affect one of the several connections that we maintain to the broker from that client. Our other connections from the same client are still able to send and receive messages as normal. Also noteworthy is that in this environment we are using SSL transport as the connector using synchronous sends. The connection URL looks like: failover:(ssl://hostName:61617?socketBufferSize=131072)?jms.useCompression=true&jms.alwaysSyncSend=true&jms.prefetchPolicy.all=30 We plan to retest in production with the latest 5.13.3 Broker some in the future. Example 1 ==================================================================== 3030 EventDispatcher-PL BLOCKED on java.lang.Object@149f2b3 owned by: ActiveMQ Task-25 Total blocked: 8 Total waited: 280178 Stack trace: org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:556) org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:81) org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:86) org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1394) org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1333) org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1811) org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:289) org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:224) org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:300) com.ci.pl.agent.core.EventDispatcher$BrokerDispatcher.dispatch(EventDispatcher.java:550) com.ci.pl.agent.core.EventDispatcher$DispatcherContext.dispatch(EventDispatcher.java:337) com.ci.pl.agent.core.EventDispatcher$DispatchWorker.run(EventDispatcher.java:279) java.lang.Thread.run(Unknown Source) 139576139576 ActiveMQ Task-25 RUNNABLE Total blocked: 0 Total waited: 1 Stack trace: java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.socketRead(Unknown Source) java.net.SocketInputStream.read(Unknown Source) java.net.SocketInputStream.read(Unknown Source) sun.security.ssl.InputRecord.readFully(Unknown Source) sun.security.ssl.InputRecord.read(Unknown Source) sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source) sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source) sun.security.ssl.SSLSocketImpl.writeRecord(Unknown Source) sun.security.ssl.AppOutputStream.write(Unknown Source) org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115) java.io.DataOutputStream.flush(Unknown Source) org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176) org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304) org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84) org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74) org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002) org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136) org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) java.lang.Thread.run(Unknown Source) Example 2 ===================================================================== 3030 EventDispatcher-PL WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@cdb4ee Total blocked: 33 Total waited: 110108 Stack trace: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(Unknown Source) java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source) java.util.concurrent.ArrayBlockingQueue.take(Unknown Source) org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:40) org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:87) org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1394) org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1333) org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1811) org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:289) org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:224) org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:300) com.ci.pl.agent.core.EventDispatcher$BrokerDispatcher.dispatch(EventDispatcher.java:550) com.ci.pl.agent.core.EventDispatcher$DispatcherContext.dispatch(EventDispatcher.java:337) com.ci.pl.agent.core.EventDispatcher$DispatchWorker.run(EventDispatcher.java:279) java.lang.Thread.run(Unknown Source) 5252 CloseAccessedFile-107-PL WAITING on java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5 Total blocked: 311 Total waited: 78110 Stack trace: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source) java.util.concurrent.SynchronousQueue.put(Unknown Source) com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148) com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327) com.ci.pl.instruction.CloseAccessedFile.perform(CloseAccessedFile.java:372) com.ci.pl.instruction.Command.run(Command.java:137) com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698) java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) java.util.concurrent.FutureTask.run(Unknown Source) java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) java.lang.Thread.run(Unknown Source) 130130 RebuildFile-107-PL WAITING on java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5 Total blocked: 272 Total waited: 20511 Stack trace: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source) java.util.concurrent.SynchronousQueue.put(Unknown Source) com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148) com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327) com.ci.pl.instruction.RebuildFile.perform(RebuildFile.java:441) com.ci.pl.instruction.Command.run(Command.java:137) com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698) java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) java.util.concurrent.FutureTask.run(Unknown Source) java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) java.lang.Thread.run(Unknown Source) 143143 TransferAccessedFile-107-PL WAITING on java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5 Total blocked: 208 Total waited: 20462 Stack trace: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown Source) java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source) java.util.concurrent.SynchronousQueue.put(Unknown Source) com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148) com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327) com.ci.pl.instruction.TransferAccessedFile.perform(TransferAccessedFile.java:652) com.ci.pl.instruction.Command.run(Command.java:137) com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698) java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) java.util.concurrent.FutureTask.run(Unknown Source) java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) java.lang.Thread.run(Unknown Source) 7313373133 ActiveMQ Task-22 RUNNABLE Total blocked: 0 Total waited: 1 Stack trace: java.net.SocketInputStream.socketRead0(Native Method) java.net.SocketInputStream.socketRead(Unknown Source) java.net.SocketInputStream.read(Unknown Source) java.net.SocketInputStream.read(Unknown Source) sun.security.ssl.InputRecord.readFully(Unknown Source) sun.security.ssl.InputRecord.read(Unknown Source) sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source) sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source) sun.security.ssl.SSLSocketImpl.writeRecord(Unknown Source) sun.security.ssl.AppOutputStream.write(Unknown Source) org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115) java.io.DataOutputStream.flush(Unknown Source) org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176) org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304) org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168) org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84) org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74) org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002) org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136) org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) java.lang.Thread.run(Unknown Source) Any help or suggestions as to what may be happening would be appreciated? Thanks, Rob -- View this message in context: http://activemq.2283324.n4.nabble.com/Connection-sometimes-hang-dispatching-message-when-interrupted-failover-transport-resumes-tp4711802.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.