[ https://issues.apache.org/activemq/browse/AMQ-443?page=all ]
Hiram Chirino resolved AMQ-443:
-------------------------------
Fix Version: 4.0
Resolution: Fixed
Version 4.0 implements a more robust keepalive solution. KeepAlive packets are
only sent when the transport has been idle. Also, a transport that is
performing a blocking operation is not considered idle.
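
The idle rule above can be sketched as follows. This is only a minimal
illustration, not the code that went into 4.0: the class IdleKeepAlive and all
of its method names are hypothetical, and the caller passes in the current
time so that the logic can be exercised without real sleeps.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of idle-based keepalive; not the ActiveMQ 4.0 code. */
class IdleKeepAlive {
    private final long idleTimeoutMillis;
    // Time of the most recent traffic on the transport.
    private final AtomicLong lastActivity;
    // Number of blocking operations currently in progress on the transport.
    private final AtomicInteger inFlight = new AtomicInteger();

    IdleKeepAlive(long idleTimeoutMillis, long nowMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.lastActivity = new AtomicLong(nowMillis);
    }

    /** Record traffic in either direction. */
    void onActivity(long nowMillis) {
        lastActivity.set(nowMillis);
    }

    /** Call before a blocking operation (for example a slow write). */
    void beginBlockingOperation() {
        inFlight.incrementAndGet();
    }

    /** Call when the blocking operation returns; this counts as activity. */
    void endBlockingOperation(long nowMillis) {
        inFlight.decrementAndGet();
        lastActivity.set(nowMillis);
    }

    /** A keepalive is due only if nothing is blocked and the transport is idle. */
    boolean shouldSendKeepAlive(long nowMillis) {
        if (inFlight.get() > 0) {
            return false; // a blocked transport is busy, not idle
        }
        return nowMillis - lastActivity.get() >= idleTimeoutMillis;
    }
}
```

A monitor thread would call shouldSendKeepAlive(System.currentTimeMillis())
periodically and write a KeepAlive packet only when it returns true.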
> ReliableTransport / KeepAlive algorithm does not work properly.
> ---------------------------------------------------------------
>
> Key: AMQ-443
> URL: https://issues.apache.org/activemq/browse/AMQ-443
> Project: ActiveMQ
> Type: Bug
> Components: Transport, Broker
> Versions: 3.2, 3.2.1
> Environment: Solaris 8 / 10. JDK 1.5
> Reporter: Kevin Yaussy
> Fix For: 4.0
> Attachments: KeepAliveDaemon.java, ReliableTransportChannel.java
>
>
> The current implementation of KeepAliveDaemon.java will sometimes force
> disconnections on well-behaved connections. The problem may arise if there
> is a connection which goes away, and the KeepAlive send to that channel
> blocks while attempting to reconnect. If this reconnection takes a while,
> then other channels that were responding fine may get their connections
> broken. This happens due to the following code in KeepAliveDaemon.java:
>     if ((channel.getLastReceiptTimestamp() +
>             channel.getKeepAliveTimeout() * 2) < System.currentTimeMillis()) {
> or
>     } else if ((channel.getLastReceiptTimestamp() +
>             channel.getKeepAliveTimeout()) < System.currentTimeMillis()) {
> The fact that the receipt timestamp is checked against
> System.currentTimeMillis() causes the code to break otherwise good
> connections. If a KeepAlive send (in examineChannel) for a broken channel
> takes longer than some good channel's KeepAliveTimeout, then the good
> connection gets broken.
> This can, in turn, cause some pretty bad behavior in the Broker. While
> testing and diagnosing this problem, I could get some brokers in a network of
> brokers stuck. The sequence of events during recovery, which gets interrupted
> when the connections are closed, would sometimes lead to the broker hanging
> while waiting for a receipt, such as during an addConsumer (which eventually
> calls syncSendWithReceipt).
> I have redone the logic in KeepAliveDaemon.java (which required a small
> change to ReliableTransportChannel as well). This now seems to work.
> I'm a bit concerned about the blocking calls, though. This may be a
> different issue / bug. I thought it looked like there was a mechanism to
> cancel outstanding receipt waiters, but every once in a while that
> mechanism would not get called. This results in the broker basically getting
> stuck, and it never really recovers.
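
The failure described in the report, where a slow KeepAlive send on one
channel makes the timestamp check expire a healthy channel, can be shown with
a simulated clock. The sketch below is not the attached patch: ChannelClock
and its fields are hypothetical, and the second check is only one possible
alternative, which judges a channel solely on a keepalive that was actually
sent to it.

```java
/** Hypothetical per-channel bookkeeping; names are illustrative only. */
class ChannelClock {
    final long keepAliveTimeout;
    long lastReceipt;            // when data last arrived on the channel
    long lastKeepAliveSent = -1; // -1 means no keepalive has been sent yet

    ChannelClock(long keepAliveTimeout, long lastReceipt) {
        this.keepAliveTimeout = keepAliveTimeout;
        this.lastReceipt = lastReceipt;
    }

    /** The check quoted in the report: last receipt versus the current time. */
    boolean expiredByWallClock(long now) {
        return lastReceipt + keepAliveTimeout * 2 < now;
    }

    /**
     * Alternative (an assumption, not the reporter's fix): expire only when a
     * keepalive sent to this channel has gone unanswered for a full timeout.
     */
    boolean expiredByUnansweredKeepAlive(long now) {
        return lastKeepAliveSent > lastReceipt
                && lastKeepAliveSent + keepAliveTimeout < now;
    }
}
```

With a timeout of 1000 ms and a last receipt at t=0, a daemon that was blocked
on another channel until t=6000 would close the good channel under the first
check, although no keepalive was ever sent to it. Under the second check the
channel survives until a keepalive sent to it goes unanswered.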