Potential deadlocks during failover
-----------------------------------

                 Key: QPID-144
                 URL: http://issues.apache.org/jira/browse/QPID-144
             Project: Qpid
          Issue Type: Bug
          Components: Dot Net Client, Java Client
            Reporter: Steven Shaw


Public client API methods need to be "failover safe". Any method that blocks 
for a response frame should be wrapped in a FailoverSupport, which automates 
retrying the operation after catching a FailoverException (a 
RuntimeException).
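
To make the retry idea concrete, here is a minimal sketch in Java. It is not 
the actual FailoverSupport class: the nested FailoverException, the execute 
helper and the maxAttempts bound are all hypothetical stand-ins chosen for 
illustration, and the real class may well retry without a bound.

```java
import java.util.function.Supplier;

final class FailoverRetryDemo {

    // Hypothetical stand-in for the client's FailoverException (a RuntimeException).
    static class FailoverException extends RuntimeException {
        FailoverException(String message) {
            super(message);
        }
    }

    // Hypothetical helper: runs the operation and retries it whenever a
    // FailoverException is caught, up to maxAttempts (an assumption of this sketch).
    static <T> T execute(int maxAttempts, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        FailoverException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (FailoverException e) {
                last = e; // failover interrupted the call; try again
            }
        }
        throw last; // every attempt was interrupted by failover
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated blocking operation that is interrupted by failover twice.
        String result = execute(5, () -> {
            if (++calls[0] < 3) {
                throw new FailoverException("failing over");
            }
            return "session-created";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The caller only sees the final result or, if every attempt fails, the last 
FailoverException.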

Methods that block waiting for a response frame are now easier to identify 
because they all call AMQProtocolHandler.syncWrite() (SyncWrite in the .NET 
client).

Currently the only methods employing FailoverSupport are 
AMQConnection.createSession, AMQSession.createConsumerImpl and 
createProducerImpl.

AMQConnection.createSession has 3 calls to syncWrite so certainly needs to be 
wrapped in FailoverSupport. No problem there.

Neither AMQSession.createConsumerImpl nor createProducerImpl calls syncWrite. 
Unless they block in some other important way, they don't really need to be 
wrapped in a FailoverSupport. Wrapping them does no harm, however.

The following methods use syncWrite() but are not wrapped in a FailoverSupport:
  AMQSession's commit(), rollback(), close()
  AMQConnection.close() via AMQProtocolHandler.closeConnection()
  BasicMessageConsumer.close()
These need to be protected/wrapped in a FailoverSupport. Note that commit() and 
rollback() are not currently protected by a lock on failoverMutex either.
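
One possible shape for protecting the methods listed above is sketched below 
in Java, assuming the retry is performed while holding failoverMutex. 
SketchSession, its syncWrite stub and retryUnderMutex are hypothetical names 
used only for illustration; they are not the real AMQSession code, and the 
stub merely simulates a write that is interrupted by failover.

```java
final class FailoverSafeCommitDemo {

    // Hypothetical stand-in for the client's FailoverException.
    static class FailoverException extends RuntimeException {
        FailoverException(String message) {
            super(message);
        }
    }

    // Hypothetical session used only for this sketch; not the real AMQSession.
    static class SketchSession {
        private final Object failoverMutex = new Object();
        private int failoversToSimulate;
        int syncWriteCalls = 0;

        SketchSession(int failoversToSimulate) {
            this.failoversToSimulate = failoversToSimulate;
        }

        // Stub for a blocking write that waits for a response frame.
        private void syncWrite(String frame) {
            syncWriteCalls++;
            if (failoversToSimulate > 0) {
                failoversToSimulate--;
                throw new FailoverException("failover while waiting for reply to " + frame);
            }
        }

        void commit() {
            retryUnderMutex(() -> syncWrite("commit"));
        }

        void rollback() {
            retryUnderMutex(() -> syncWrite("rollback"));
        }

        // Holds failoverMutex for each attempt and retries when failover interrupts it.
        private void retryUnderMutex(Runnable operation) {
            while (true) {
                synchronized (failoverMutex) {
                    try {
                        operation.run();
                        return;
                    } catch (FailoverException e) {
                        // Failover interrupted the blocking call; loop and retry.
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        SketchSession session = new SketchSession(2);
        session.commit();
        System.out.println("commit needed " + session.syncWriteCalls + " syncWrite calls");
    }
}
```

With this shape the FailoverException never reaches the caller of commit() or 
rollback().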

StateManager.attainState is perhaps the only other method that blocks for "a 
response frame"; in this case it waits for a series of response frames that 
result in the state changing. The only use of attainState is in 
AMQConnection.makeBrokerConnection. It appears to need wrapping in a 
FailoverSupport, as otherwise the FailoverException will escape. Since this 
would be failing over during connection, some care may be required. Note that 
makeBrokerConnection is used at 3 different sites.

In addition, sendAcknowledgement appears to need to lock the failoverMutex.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
