Hi all,

I have been having some issues, currently show-stoppers, with the use of durable subscriptions on ActiveMQ topics in our large-scale integration project.
I've been writing the subscriber in C#, but the issue also occurs with the Java implementation.

The standard approach for establishing a durable subscriber is to perform all the usual steps for setting up a subscriber, along with setting a ClientID (to provide the unique ID for the subscriber application), and calling CreateDurableConsumer on the session. On the first attempt, the subscription is established and the messages are correctly received. If the subscriber is then shut down in a controlled manner, and the connection is correctly stopped and closed, then subsequent restarts of the subscriber perform as expected. Great, all works fine.

However, under real failure scenarios (machine goes pop, or goes offline for some reason, resulting in a subscriber restart) the subscriber doesn't have a chance to correctly terminate the connection with the broker: essentially an "uncontrolled shutdown". This is where the problem arises. If the subscriber now attempts to establish a durable subscription with the same ClientID and name as before, the broker returns a 'Client XXX already connected' error and prevents the connection from being made, even though the previous client/subscriber is not actually connected, or even running. This doesn't seem to be time-bound either: even waiting for a period of time (minutes) and retrying produces the same result, so it's not the socket in TIME_WAIT state which is causing it.

After further investigation, I've discovered the following:

Using JConsole to look into the state of the broker, it seems that, following an uncontrolled client disconnect (as described above), the previously created Connection instance is still classed by the broker as being both live and connected, although it blatantly isn't connected (or even live), because the client is no longer there.
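The sequence described so far can be reduced to a toy model. The sketch below is not ActiveMQ code and makes no claim about how the broker is implemented internally; `ToyBroker`, `connect`, `close` and `crash` are hypothetical names chosen for illustration. It only assumes that the broker keeps a set of client IDs it believes are connected and removes an entry solely when the client says goodbye.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a broker that tracks connected client IDs. Hypothetical, not ActiveMQ code. */
class ToyBroker {
    // Client IDs the broker currently believes are connected.
    private final Set<String> connectedClientIds = new HashSet<>();

    /** Registers a client ID, rejecting it if the broker thinks it is already in use. */
    void connect(String clientId) {
        if (!connectedClientIds.add(clientId)) {
            throw new IllegalStateException("Client " + clientId + " already connected");
        }
    }

    /** Controlled shutdown: the client tells the broker it is leaving. */
    void close(String clientId) {
        connectedClientIds.remove(clientId);
    }

    /** Uncontrolled shutdown: the client vanishes, so the broker is never told. */
    void crash(String clientId) {
        // Intentionally empty: nothing reaches the broker, so its state is unchanged.
    }
}

class ToyBrokerDemo {
    public static void main(String[] args) {
        ToyBroker broker = new ToyBroker();

        broker.connect("subscriber-1");
        broker.close("subscriber-1");    // controlled shutdown
        broker.connect("subscriber-1");  // restart succeeds

        broker.crash("subscriber-1");    // uncontrolled shutdown
        try {
            broker.connect("subscriber-1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Client subscriber-1 already connected
        }
    }
}
```

In this model the stale entry survives for as long as the broker runs, which matches the symptom: waiting and retrying makes no difference, because nothing ever removes the entry.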
This stale connection state is persisted by the broker and never seems to be cleared until a broker restart, which is unacceptable in an enterprise-scale environment just to recover from a single subscriber failure. The broker should detect the socket disconnect from the failed subscriber and clean up the connection status itself.

Another observation is that if the connection is manually cleared using JConsole (using the relevant operation on the connection instance), the subscriber can indeed reconnect using the durable subscription.

A further observation is that this only happens if NO messages are published to the topic during the subscriber downtime. If a message is published to the topic during the downtime, the broker detects that the subscriber is no longer live and clears up the connection, so the subscriber is able to reconnect successfully. However, in production environments we cannot guarantee that a message will be sent on a topic during the subscriber downtime. Although most topics will have high utilisation, some have low throughput, so this cannot be relied upon, and the failure of a single durable subscription will result in the failure of the complete subscribing application.

It seems that all the ActiveMQ unit tests (or the ones I've looked at) that test the durability of the connection perform an orderly shutdown of the connection during the test. This results in the broker correctly cleaning up the connection status, and the remaining tests being successful.

Under other JMS implementations (namely Tibco EMS, but I've done similar in the past with JBossMQ), this doesn't happen. Many JMS resources specify that if a durable subscription is attempted and one is already established, then the existing subscription is overwritten and the new one is established. This doesn't seem to be the case with ActiveMQ; instead it throws an exception.
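One way a broker could clean up dead connections without waiting for a publish is an inactivity timeout: remember when each client was last heard from, and forget any client that has been silent for too long. The sketch below illustrates that idea only. `LivenessRegistry`, `heartbeat` and `evictIdle` are hypothetical names, the 30-second timeout is an arbitrary assumption, and none of this is taken from ActiveMQ. Time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry that forgets client IDs not heard from within a timeout. */
class LivenessRegistry {
    private final long timeoutMillis;
    // clientId -> time (ms) at which the client was last heard from.
    private final Map<String, Long> lastSeen = new HashMap<>();

    LivenessRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Registers a client ID, first dropping any entries that have gone silent. */
    void connect(String clientId, long nowMillis) {
        evictIdle(nowMillis);
        if (lastSeen.containsKey(clientId)) {
            throw new IllegalStateException("Client " + clientId + " already connected");
        }
        lastSeen.put(clientId, nowMillis);
    }

    /** Records that a known client has just been heard from. */
    void heartbeat(String clientId, long nowMillis) {
        if (lastSeen.containsKey(clientId)) {
            lastSeen.put(clientId, nowMillis);
        }
    }

    /** Removes clients silent for longer than the timeout; returns how many were removed. */
    int evictIdle(long nowMillis) {
        int before = lastSeen.size();
        lastSeen.values().removeIf(seen -> nowMillis - seen > timeoutMillis);
        return before - lastSeen.size();
    }
}

class LivenessRegistryDemo {
    public static void main(String[] args) {
        LivenessRegistry registry = new LivenessRegistry(30_000);

        registry.connect("subscriber-1", 0);
        registry.heartbeat("subscriber-1", 10_000);
        // The subscriber crashes here, so no further heartbeats arrive.

        try {
            registry.connect("subscriber-1", 20_000); // only 10 s of silence
        } catch (IllegalStateException e) {
            System.out.println("rejected at t=20000: " + e.getMessage());
        }

        registry.connect("subscriber-1", 45_000); // 35 s of silence, stale entry evicted
        System.out.println("accepted at t=45000");
    }
}
```

A real broker would presumably run the eviction from a timer rather than only on connect, and would need some form of keep-alive traffic so that idle but healthy subscribers on low-throughput topics are not evicted by mistake.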
My main questions to the ActiveMQ forum are:

1) Is there a workaround for this to allow subsequent durable subscriptions to work following an "uncontrolled" subscriber shutdown?
2) Does ActiveMQ have a configuration parameter to allow subsequent durable subscriptions to overwrite existing ones (even if the existing ones are actually dead connections)?
3) Is there anything within ActiveMQ which can periodically test the connections in the broker to see if they are still live and, if not, clean them up to overcome this problem?
4) Has anybody else experienced this issue, in a production-quality environment or otherwise? I've seen many posts to do with 'Client XXX already connected', but nothing which resolves the issue other than 'fixed in the 4.1…. Release'. We are using 4.1.1, so we should see the fix; this sounds like another issue which has slipped through the net.

Any feedback on this would be much appreciated.

Kind regards
Simon Vicary
Integration and Technical Delivery Lead.

--
View this message in context: http://www.nabble.com/ActiveMQ-and-Durable-Topic-subscriptions-after-subscriber-is-uncleanly-terminated-tf4102045s2354.html#a11665143
Sent from the ActiveMQ - User mailing list archive at Nabble.com.