> On May 13, 2013, 3:22 p.m., Robbie Gemmell wrote:
> > http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/BasicMessageConsumer.java,
> >  lines 744-745
> > <https://reviews.apache.org/r/10738/diff/2/?file=288993#file288993line744>
> >
> >     What impact does dropping the message on the floor here have on the 
> > client?
> >     
> >     Earlier points in the call hierarchy seem to make effort to do other 
> > things with the message when detecting session/consumer close, so is there 
> > any impact from not doing so? E.g a message getting stuck acquired for a 
> > now-closed consumer?
> >     
> >     Does any particular attention need paid to the overridden 0-10 specific 
> > version of this method?
> 
> rajith attapattu wrote:
>     The message will be dropped if a consumer (or session is closed or 
> closing).
>     When a consumer is closed, any messages acquired but not acknowledged 
> should be be made available to another consumer by the broker.
>     These messages will be marked redelivered.
>     Is this the same situation for 0-8/0-9 ?
>     
>     The situation is the same as a consumer closing with a bunch of unacked 
> messages when in CLIENT_ACK mode.
>     Alternatively we could reject the message (release in 0-10 terms). But I 
> don't think this is required, given that we will be closing the consumer 
> anyways.
>     
>     
>     >Earlier points in the call hierarchy seem to make effort to do other 
> things with the message when detecting session/consumer close
>     By "other things" are you referring to a reject? In 0-10 AFIK you don't 
> need to do it. 
>     The situation is the same as a consumer closing with a bunch of unacked 
> messages when in CLIENT_ACK mode.
>     
>     >Does any particular attention need paid to the overridden 0-10 specific 
> version of this method?
>     IMO adding "if (!(isClosed() || isClosing()))" is required to prevent the 
> 0-10 specific method from issuing credit (and receiving more messages, that 
> will eventually get released).
>     There is a chance where the consumer could be marked closed, but not yet 
> reached the point of sending a cancel, by the time messageFlow() is called.
>     The above check should prevent it.
> 
> Robbie Gemmell wrote:
>     "Did you mean race condition (instead of deadlock)?"
>     
>     I really meant deadlock. One of the uses of _messageDeliveryLock being 
> removed was added as part of the fix for a deadlock 
> (https://issues.apache.org/jira/browse/QPID-3911) and so I wonder what effect 
> removing it again will have. It may be the other changes mean it isn't a 
> problem, but I think it needs to be closely considered. 
>     
>     "The message will be dropped if a consumer (or session is closed or 
> closing).
>     When a consumer is closed, any messages acquired but not acknowledged 
> should be be made available to another consumer by the broker.
>     These messages will be marked redelivered.
>     Is this the same situation for 0-8/0-9 ?"
>     
>     That isn't actually the case, transferred messages are the responsibility 
> of the session and remain acquired after consumer close until such point as 
> the session explicitly does something with them. I believe this is true of 
> 0-8/9/91 as well, and probably explains the origin of the reject/release code 
> I referred to earlier in the call tree than the changes would now allow the 
> message to be silently dropped.
>     
>     From the 0-10 spec: "Canceling a subscription MUST NOT affect pending 
> transfers. A transfer made prior to canceling transfers to the destination 
> MUST be able to be accepted, released, acquired, or rejected after the 
> subscription is canceled."
> 
> rajith attapattu wrote:
>     Indeed, Gordon mentioned that to me and the 2nd patch (the one before I 
> did today) takes care of rejecting messages from the consumer. We don't need 
> to do the same when the session is closing, as when the session ends, the any 
> unacked messages are put back on the queue.

As for QPID-3911,
There is a deadlock, albeit a bit different (involving the same locks) from 
QPID-3911, that do happen in similar circumstances.
However this deadlock appears to manifest with or without this patch, which 
leads me to believe that _messageDeliveryLock is not the right solution for 
QPID-3911.
Sadly the solution for QPID-3911 made it worse as there are at least two 
distinct cases of deadlocks involving _messageDeliveryLock.
1. Btw _lock and _messageDeliveryLock
2. Btw _messageDeliveryLock and _failoverMutex.

We definitely need to find a solution for the deadlocks (at least 3 cases) btw 
failoverMutex and _lock (in AMQSession), which seems to be the root of all evil 
:)
We might have to drop one lock (most likely _lock) and see if we can provide an 
alternative strategy to guarantee correct behaviour.


- rajith


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10738/#review20483
-----------------------------------------------------------


On May 21, 2013, 2:48 p.m., rajith attapattu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10738/
> -----------------------------------------------------------
> 
> (Updated May 21, 2013, 2:48 p.m.)
> 
> 
> Review request for qpid, Robbie Gemmell, Weston Price, and Rob Godfrey.
> 
> 
> Description
> -------
> 
> There are at least 3 cases where the deadlock btw _messageDeliveryLock and 
> _failoverMutex surfaces. Among them sending a message inside onMessage() and 
> the session being closed due to an error (causing the deadlock) seems to come 
> up a lot in production environments. There is also a deadlock btw 
> _messageDeliveryLock and _lock (AMQSession.java) which happens less 
> frequently.
> The messageDeliveryLock is used to ensure that we don't close the session in 
> the middle of a message delivery. In order to do this we hold the lock across 
> onMessage().
> This causes several issues in addition to the potential to deadlock. If an 
> onMessage call takes longer/wedged then you cannot close the session or 
> failover will not happen until it returns as the same thread is holding the 
> failoverMutex. 
> 
> Based on an idea from Rafi, I have come up with a solution to get rid of 
> _messageDeliveryLock and instead use an alternative strategy to achieve 
> similar functionality.
> In order to ensure that close() doesn't proceed until the message deliveries 
> currently in progress completes, an atomic counter is used to keep track of 
> message deliveries in progress.
> The close() will wait until the count falls to zero before proceeding. No new 
> deliveries will be initiated bcos the close method will mark the session as 
> closed.
> The wait has a timeout to ensure that a longer running or wedged onMessage() 
> will not hold up session close.
> There is a slim chance that before a session being marked as closed a message 
> delivery could be initiated, but not yet gotten to the point of updating the 
> counter, hence waitForMsgDeliveryToFinish() will see it as zero and proceed 
> with close. But in comparison to the issues with _messageDeliveryLock, I 
> believe it's acceptable.
> 
> There is an issue if MessageConsumer close is called outside of Session 
> close. This can be solved in a similar manner. I will wait until the current 
> review is complete and then post the solution for the MessageConsumer close.
> I will commit them both together.
> 
> 
> This addresses bug QPID-4574.
>     https://issues.apache.org/jira/browse/QPID-4574
> 
> 
> Diffs
> -----
> 
>   
> http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/AMQConnection.java
>  1484812 
>   
> http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/AMQSession.java
>  1484812 
>   
> http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/BasicMessageConsumer.java
>  1484812 
>   
> http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/BasicMessageConsumer_0_10.java
>  1484812 
>   
> http://svn.apache.org/repos/asf/qpid/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/util/ConditionManager.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10738/diff/
> 
> 
> Testing
> -------
> 
> Java test suite, tests from customers and QE around the deadlock situation.
> 
> 
> Thanks,
> 
> rajith attapattu
> 
>

Reply via email to