Difficult recovery on broker death
----------------------------------

                 Key: QPID-3757
                 URL: https://issues.apache.org/jira/browse/QPID-3757
             Project: Qpid
          Issue Type: Bug
          Components: C++ Client
    Affects Versions: 0.12
         Environment: RHEL 4.7 and RHEL 6.2
            Reporter: Rob Springer
            Priority: Minor


With the old API (if that API is completely deprecated, this bug may be 
invalid), Subscription and LocalQueue variables cannot be recovered after the 
broker dies unless a precise workaround procedure is followed.

The problem is:
   If the broker dies and is respawned, and one reconnects to the new broker 
while re-using the old Session instead of creating a new one, bad things 
happen (since Session doesn't yet support resume(), I assume that's expected 
behavior).
   If, however, one creates a new Session, a new SubscriptionManager, and new 
Subscription objects, an assertion failure is generated (backtrace attached).
   After reading the backtrace, I believe the following is happening:
1) In recovery, we attempt to assign a new Subscription to the previous 
Subscription variable (i.e., "sub = subMgr->subscribe()")
2) That causes the refcount for the old Subscription to fall to 0, causing it 
to be cleaned up.
3) As part of that cleanup, the associated SubscriptionImpl object goes to 
destroy its (std::auto_ptr<ScopedDivert>) demuxRule member.
4) That demuxRule member holds a reference to a Demux object, demuxer, which 
exists inside the Session object. Since the Session object has been 
re-created, that old reference is invalid, and using it results in the 
assertion.
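
The four steps above can be reproduced in a small stand-alone model. This is 
only a sketch: Demux, Session, ScopedDivert, SubscriptionImpl and Subscription 
below are hypothetical stand-ins that mimic the ownership shape described in 
this report, not the qpid::client classes, and recoverByReassigning() is an 
invented name. The model also assumes a std::weak_ptr where the description 
above implies a plain reference, so that the stale access is detected and 
logged instead of tripping an assertion.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy model only; none of these types are the real Qpid classes.
struct Demux { int rules = 0; };  // counts active divert rules

struct Session {
    std::shared_ptr<Demux> demux = std::make_shared<Demux>();
};

// Registers a rule on construction and tries to remove it on destruction.
class ScopedDivert {
public:
    ScopedDivert(const std::shared_ptr<Demux>& demux, std::vector<std::string>& log)
        : demux_(demux), log_(log) { ++demux->rules; }
    ~ScopedDivert() {
        if (std::shared_ptr<Demux> d = demux_.lock()) {
            --d->rules;
            log_.push_back("rule removed");
        } else {
            // Step 4: the Demux died together with the old Session.
            log_.push_back("demux already gone");
        }
    }
private:
    std::weak_ptr<Demux> demux_;    // assumption: weak_ptr instead of a raw reference
    std::vector<std::string>& log_;
};

struct SubscriptionImpl {
    std::unique_ptr<ScopedDivert> demuxRule;  // step 3: destroyed with the impl
    SubscriptionImpl(Session& s, std::vector<std::string>& log)
        : demuxRule(new ScopedDivert(s.demux, log)) {}
};

// Reference-counted handle, standing in for the Subscription variable.
typedef std::shared_ptr<SubscriptionImpl> Subscription;

// Hypothetical recovery routine that follows the failing order of operations.
std::vector<std::string> recoverByReassigning() {
    std::vector<std::string> log;
    {
        std::unique_ptr<Session> session(new Session());
        Subscription sub = std::make_shared<SubscriptionImpl>(*session, log);

        // Broker death: the Session is re-created, destroying the old Demux.
        session.reset(new Session());

        // Steps 1 and 2: assigning over the old handle drops its refcount to 0,
        // so the old impl is cleaned up after its Demux is already gone.
        sub = std::make_shared<SubscriptionImpl>(*session, log);
        assert(session->demux->rules == 1);
    }
    return log;
}
```

In this model the first log entry comes from the old subscription being 
cleaned up during the assignment, and the second from the new subscription 
going out of scope while its Session is still alive.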

Thus, we have a vicious circle: we need to create a new Session object to be 
able to proceed, but doing so leaves us unable to re-use the existing 
Subscription variables.

Gordon proposed a workaround which does solve the problem for me in practice: 
assign "null" Subscription and LocalQueue objects to those variables before 
re-creating the Session object. Unfortunately, this won't be obvious to new 
users, so anyone still using the old API is likely to run into the problem.
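
The ordering that the workaround relies on can be sketched with a toy model. 
Everything here is an assumption made for illustration: Demux, Session, 
SubscriptionImpl, Subscription and Client are hypothetical stand-ins rather 
than the qpid::client classes, and a counter replaces the assertion so that 
both orderings can be compared safely. Only a Subscription is modelled; a 
LocalQueue variable would be cleared at the same point.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins, not the real Qpid types.
struct Demux { int rules = 0; };
struct Session { std::shared_ptr<Demux> demux = std::make_shared<Demux>(); };

struct SubscriptionImpl {
    std::weak_ptr<Demux> demux;  // assumption: weak_ptr so staleness is detectable
    int* staleTeardowns;         // incremented when torn down after its Demux died
    SubscriptionImpl(Session& s, int* stale)
        : demux(s.demux), staleTeardowns(stale) { ++s.demux->rules; }
    ~SubscriptionImpl() {
        if (std::shared_ptr<Demux> d = demux.lock()) --d->rules;
        else ++*staleTeardowns;
    }
};
typedef std::shared_ptr<SubscriptionImpl> Subscription;

struct Client {
    int staleTeardowns = 0;
    std::unique_ptr<Session> session;
    Subscription sub;  // declared after session, so it is destroyed first

    Client() { connect(); }

    // (Re)creates the Session, then subscribes on it.
    void connect() {
        session.reset(new Session());
        sub = std::make_shared<SubscriptionImpl>(*session, &staleTeardowns);
        assert(session->demux->rules == 1);
    }

    // Workaround order: assign a "null" handle while the old Session is
    // still alive, and only then re-create the Session.
    void recoverWithWorkaround() {
        sub = Subscription();
        connect();
    }

    // Problematic order, for comparison: the old handle outlives its Session.
    void recoverNaively() { connect(); }
};

// Returns how many subscriptions were torn down after their Demux was gone.
int staleTeardownsAfterRecovery(bool useWorkaround) {
    Client c;
    if (useWorkaround) c.recoverWithWorkaround();
    else c.recoverNaively();
    return c.staleTeardowns;
}
```

Calling staleTeardownsAfterRecovery(true) reports no stale teardown in this 
model, while staleTeardownsAfterRecovery(false) reports one, which corresponds 
to the point where the real client hits its assertion.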

I'll attach an example showing the problem and the fix as well as snippets from 
my backtrace shortly.
