Deadlock while sending a message after failover within a consumer
-----------------------------------------------------------------
Key: AMQNET-289
URL: https://issues.apache.org/activemq/browse/AMQNET-289
Project: ActiveMQ .Net
Issue Type: Bug
Components: ActiveMQ
Affects Versions: 1.4.1
Environment: Windows 7 64 bits
Reporter: Morgan Martinet
Assignee: Jim Gomes
Priority: Critical
Scenario:
- I have one producer that sends a request (with a temporary queue specified in
the Reply-To attribute) to a consumer running in a separate process.
- both the producer and the consumer use the following connection string:
failover:(tcp://localhost:61616)?timeout=3000
- the consumer, when processing the request, waits 10 seconds then sends a
response back, using the Reply-To attribute.
- immediately after the request has been sent, while the consumer is waiting
for 10 seconds, I restart the ActiveMQ broker.
- once the consumer wakes up and tries to send its reply, it deadlocks
because of the failover.
We have managed to identify the resources that deadlock:
Thread1 - lock(reconnectMutex)
(c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Transport\Failover\FailoverTransport.cs:
line 366)
Thread1 - wait on lock(this.consumers.SyncRoot)
(c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Session.cs: line 830)
Thread2 - lock(this.consumers.SyncRoot)
(c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\SessionExecutor.cs: line 147)
Thread2 - wait on lock(reconnectMutex)
(c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Transport\Failover\FailoverTransport.cs:
line 531)
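The four lines above describe a classic lock-order inversion: each thread holds one lock and waits for the other. Here is a minimal, self-contained sketch of that pattern. It is not the NMS code: the two objects are hypothetical stand-ins for reconnectMutex and consumers.SyncRoot, the class and method names are made up for illustration, and Monitor.TryEnter with a timeout is used so the demo reports the blockage instead of hanging forever.

```csharp
using System;
using System.Threading;

public static class LockOrderDemo
{
    // Hypothetical stand-ins for the two locks named in the report.
    private static readonly object reconnectMutex = new object();
    private static readonly object consumersSyncRoot = new object();

    // Holds 'first', waits until the other thread holds its own first lock,
    // then tries 'second' with a timeout instead of blocking forever.
    private static bool HoldFirstThenTrySecond(object first, object second,
                                               Barrier barrier, int timeoutMs)
    {
        lock(first)
        {
            barrier.SignalAndWait(); // both threads now hold their first lock
            bool acquired = Monitor.TryEnter(second, timeoutMs);
            if(acquired)
            {
                Monitor.Exit(second);
            }
            barrier.SignalAndWait(); // keep 'first' held until both attempts are over
            return acquired;
        }
    }

    // Returns how many of the two threads managed to take their second lock.
    public static int Run(int timeoutMs)
    {
        int acquiredCount = 0;
        using(Barrier barrier = new Barrier(2))
        {
            // Same order as Thread1 in the report: reconnectMutex, then SyncRoot.
            Thread thread1 = new Thread(() =>
            {
                if(HoldFirstThenTrySecond(reconnectMutex, consumersSyncRoot, barrier, timeoutMs))
                {
                    Interlocked.Increment(ref acquiredCount);
                }
            });
            // Same order as Thread2 in the report: SyncRoot, then reconnectMutex.
            Thread thread2 = new Thread(() =>
            {
                if(HoldFirstThenTrySecond(consumersSyncRoot, reconnectMutex, barrier, timeoutMs))
                {
                    Interlocked.Increment(ref acquiredCount);
                }
            });
            thread1.Start();
            thread2.Start();
            thread1.Join();
            thread2.Join();
        }
        return acquiredCount;
    }

    public static void Main()
    {
        Console.WriteLine("Threads that got their second lock: {0}", LockOrderDemo.Run(200));
    }
}
```

In this sketch neither thread can take its second lock while the other still holds it, so Run(200) returns 0. With plain lock statements instead of TryEnter, both threads would wait forever, which matches what we observe in the consumer.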
Patch:
I managed to find a simple fix for this, by moving the consumer dispatch out of
the this.consumers.SyncRoot lock in SessionExecutor.cs:
{{
public void Dispatch(MessageDispatch dispatch)
{
    try
    {
        MessageConsumer consumer = null;

        lock(this.consumers.SyncRoot)
        {
            if(this.consumers.Contains(dispatch.ConsumerId))
            {
                consumer = this.consumers[dispatch.ConsumerId] as MessageConsumer;
            }
            // Note that consumer.Dispatch(...) was moved below, outside of the lock.
        }

        // If the consumer is not available, just ignore the message.
        // Otherwise, dispatch the message to the consumer.
        if(consumer != null)
        {
            consumer.Dispatch(dispatch);
        }
    }
    catch(Exception ex)
    {
        Tracer.DebugFormat("Caught Exception While Dispatching: {0}", ex.Message);
    }
}
}}
Note that I ran the unit tests before applying my patch and got 3 failures, and
I got the same failures with the patch applied. So I hope it didn't break
anything, but I'll let you find the best solution...