[ https://issues.apache.org/jira/browse/AMQ-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Timothy Bish resolved AMQ-3719.
-------------------------------
Resolution: Fixed
Assignee: Timothy Bish
Fixed on trunk, thanks for doing the leg work on this one.
> Tracked command IOException causes FailoverTransport to hang until failure
> occurs for untracked command
> -------------------------------------------------------------------------------------------------------
>
> Key: AMQ-3719
> URL: https://issues.apache.org/jira/browse/AMQ-3719
> Project: ActiveMQ
> Issue Type: Bug
> Components: Transport
> Environment: Intel(R) Core(TM) i5 CPU M 540 @2.53GHz
> 8 GB, 64-bit
> Reporter: Martin Serrano
> Assignee: Timothy Bish
> Priority: Critical
> Fix For: 5.6.0
>
> Attachments: amq-3719.patch
>
>
> I have only encountered this failure when the broker is experiencing heavy
> load and a new connection attempt is made.
> * The FailoverTransport tracks commands that have been issued so that it can
> restore the state upon a failure/reconnect event.
> * If an IOException occurs when sending a tracked command, the oneway()
> method returns, assuming that the IOException is indicative of a transport
> failure and will result in a failure/reconnect event.
> * Some IOExceptions (like WireFormatNegotiation timeouts) are not always
> indicative of transport failure, however. In this case, since no subsequent
> failure/reconnect event occurs, the command will never be resent. If this is
> a synchronous command (like that generated by starting a connection), the
> calling thread will hang.
> Incidentally, my reading of the code is that only non-tracked commands can
> generate the IOException that triggers the handleTransportFailure command.
> Is that what we really want?
> My belief is that the IOExceptions should always result in the triggering of
> the handleTransportFailure, regardless of origin.
> I will attach a unit test and fix shortly. The test will often fail (i.e.
> hang) without the fix, but not always, since I use the
> wireFormat.maxInactivityDurationInitalDelay=1 option to trigger the behavior.
> If the system runs fast enough, it sometimes will not hit the timeout. I
> wasn't sure exactly how such a test should be written. The test will fail if
> the connection does not succeed within 60s.
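
For readers skimming the thread, the failure mode described in the report can be sketched with a small toy model. This is not the code in amq-3719.patch and not the real FailoverTransport: the names FailoverSketch, Wire, FlakyWire, ToyFailover and the alwaysHandleFailure flag are all hypothetical, and the "reconnect" is reduced to replaying tracked commands once. It only illustrates why returning early on an IOException for a tracked command can leave that command undelivered.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class FailoverSketch {

    /** Hypothetical stand-in for the wrapped transport. */
    interface Wire {
        void send(String command) throws IOException;
    }

    /** Fails the first send (think: a negotiation timeout), then works. */
    static class FlakyWire implements Wire {
        final List<String> delivered = new ArrayList<>();
        private boolean failedOnce = false;

        @Override
        public void send(String command) throws IOException {
            if (!failedOnce) {
                failedOnce = true;
                throw new IOException("simulated negotiation timeout");
            }
            delivered.add(command);
        }
    }

    /** Toy failover layer; an assumption-laden model, not the ActiveMQ class. */
    static class ToyFailover {
        private final Wire wire;
        private final boolean alwaysHandleFailure;
        private final List<String> tracked = new ArrayList<>();
        int failuresHandled = 0;

        ToyFailover(Wire wire, boolean alwaysHandleFailure) {
            this.wire = wire;
            this.alwaysHandleFailure = alwaysHandleFailure;
        }

        void oneway(String command, boolean isTracked) {
            if (isTracked) {
                tracked.add(command); // remembered so it can be replayed later
            }
            try {
                wire.send(command);
            } catch (IOException e) {
                if (isTracked && !alwaysHandleFailure) {
                    // Reported behaviour: assume a reconnect will follow
                    // and replay the command. If none follows, it is lost.
                    return;
                }
                // Behaviour the reporter argues for: always handle the failure.
                handleTransportFailure();
            }
        }

        private void handleTransportFailure() {
            failuresHandled++;
            // Toy "reconnect": replay every tracked command once.
            for (String command : tracked) {
                try {
                    wire.send(command);
                } catch (IOException ignored) {
                    // A real implementation would retry; the toy gives up.
                }
            }
        }
    }

    /** Sends one tracked command and returns what reached the wire. */
    static List<String> run(boolean alwaysHandleFailure) {
        FlakyWire wire = new FlakyWire();
        ToyFailover failover = new ToyFailover(wire, alwaysHandleFailure);
        failover.oneway("ConnectionInfo", true);
        return wire.delivered;
    }

    public static void main(String[] args) {
        System.out.println("reported behaviour: " + run(false)); // prints []
        System.out.println("proposed behaviour: " + run(true));  // prints [ConnectionInfo]
    }
}
```

In the first run the tracked command never reaches the wire, which corresponds to the hang a synchronous caller would see. In the second run the failure handler replays it.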