I tried the three patches Matteo Rulli provided in AMQ-5260. They seem to reduce the chance of running into the deadlock, but they do not eliminate it completely: I can still see FutureResponse.getResult waiting forever. I therefore added a fourth patch that makes getResult time out after 30 seconds and throw an exception.
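To illustrate the idea behind the timeout patch, here is a minimal sketch, not the actual patch and not the real ActiveMQ class. It assumes the response is handed over through a one-element blocking queue (the stack trace shows ArrayBlockingQueue.take under getResult) and swaps the unbounded take() for a bounded poll(). The class name TimedResponseSlot, its constructor, and the choice of IOException are assumptions made for the example.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a response future; NOT the ActiveMQ FutureResponse class.
class TimedResponseSlot<T> {
    // One-element queue used as the hand-over slot between the two threads.
    private final BlockingQueue<T> slot = new ArrayBlockingQueue<>(1);
    private final long timeoutMillis;

    TimedResponseSlot(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called by the thread that receives the response.
    void set(T response) {
        slot.offer(response);
    }

    // Waits at most timeoutMillis; take() would block indefinitely here.
    T getResult() throws IOException {
        try {
            T result = slot.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (result == null) {
                // poll() returns null when the timeout elapses with no element.
                throw new IOException("No response within " + timeoutMillis + " ms");
            }
            return result;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for response", e);
        }
    }
}
```

With a 30-second limit this would be constructed as new TimedResponseSlot<>(30_000). The caller then has to handle the exception, for example by tearing down and re-establishing the bridge, which is an assumption about the surrounding code rather than something the sketch does.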
With these four patches, our network failure test passes: all connections fully recover after the network failure. Recovery takes 25-30 minutes, but that is much better than being blocked forever.

Thread dump showing the wait in FutureResponse.getResult:

Name: ActiveMQ Transport: tcp:///10.130.156.205:46138@61616
State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@482aaa1a
Total blocked: 5  Total waited: 6

Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:40)
org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:87)
org.apache.activemq.network.DemandForwardingBridgeSupport.addSubscription(DemandForwardingBridgeSupport.java:911)
org.apache.activemq.network.DemandForwardingBridgeSupport.addConsumerInfo(DemandForwardingBridgeSupport.java:1184)
org.apache.activemq.network.DemandForwardingBridgeSupport.serviceRemoteConsumerAdvisory(DemandForwardingBridgeSup
