Difficult recovery on broker death
----------------------------------
Key: QPID-3757
URL: https://issues.apache.org/jira/browse/QPID-3757
Project: Qpid
Issue Type: Bug
Components: C++ Client
Affects Versions: 0.12
Environment: RHEL 4.7 and RHEL 6.2
Reporter: Rob Springer
Priority: Minor
When using the old API (which might make this bug invalid, if the old API
is completely deprecated), it is not possible to reuse Subscription and
LocalQueue variables after the broker dies unless you follow a precise
workaround procedure.
The problem is:
If the broker dies and is then respawned, and one reconnects to the new
broker without creating a new Session (i.e., reuses the old one), bad things
happen. Since Session doesn't yet support resume(), I assume that's expected
behavior.
If, however, one creates a new Session, a new SubscriptionManager, and new
Subscription objects, an assertion failure is generated (backtrace
attached).
After reading the backtrace, I believe the following is happening:
1) In recovery, we assign a new Subscription to the previous Subscription
variable (i.e., "sub = subMgr->subscribe()").
2) That causes the refcount for the old Subscription to fall to 0, so it is
cleaned up.
3) As part of that cleanup, the associated SubscriptionImpl object destroys
its demuxRule member (a std::auto_ptr<ScopedDivert>).
4) That demuxRule member holds a reference to a Demux object, demuxer, which
lives inside the Session object. Since the Session object has been
re-created, that reference is no longer valid, and using it results in the
assertion.
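
The four steps above can be sketched with a small self-contained model. This
is only an illustration under stated assumptions: the classes below are
hypothetical stand-ins that borrow the names from the backtrace, not the real
qpid::client classes. The model uses std::weak_ptr so that the stale reference
can be detected and counted instead of crashing, and std::unique_ptr in place
of std::auto_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration; NOT the real qpid::client classes.
struct Demux { int rules = 0; };

// Owns the demuxer, as the report describes for the real Session.
struct Session { std::shared_ptr<Demux> demuxer = std::make_shared<Demux>(); };

// Counts how each rule teardown went.
struct Events { int clean = 0; int dangling = 0; };

// Models ScopedDivert: adds a rule on construction, removes it on destruction.
class ScopedDivert {
public:
    ScopedDivert(const std::shared_ptr<Demux>& d, Events& ev)
        : demuxer_(d), events_(ev) {
        assert(d);
        ++d->rules;
    }
    ~ScopedDivert() {
        if (std::shared_ptr<Demux> d = demuxer_.lock()) {
            --d->rules;
            ++events_.clean;
        } else {
            // Assumption: this is the point where the real code, holding a
            // plain reference, would hit the assertion from the report.
            ++events_.dangling;
        }
    }
private:
    std::weak_ptr<Demux> demuxer_;  // weak so the model can detect staleness
    Events& events_;
};

struct SubscriptionImpl {
    std::unique_ptr<ScopedDivert> demuxRule;  // std::auto_ptr in the report
    SubscriptionImpl(Session& s, Events& ev)
        : demuxRule(new ScopedDivert(s.demuxer, ev)) {}
};

// Refcounted handle: the last copy going away destroys the impl.
using Subscription = std::shared_ptr<SubscriptionImpl>;

Events reassignAfterNewSession() {
    Events ev;
    {
        std::unique_ptr<Session> session(new Session);
        Subscription sub = std::make_shared<SubscriptionImpl>(*session, ev);

        // Broker died: the old Session (and its Demux) is replaced.
        session.reset(new Session);

        // Steps 1-4: assignment drops the old impl, whose demuxRule now
        // refers to a demuxer that no longer exists.
        sub = std::make_shared<SubscriptionImpl>(*session, ev);
    }
    return ev;
}
```

In this model, reassignAfterNewSession() records one teardown against a
destroyed demuxer (the reassignment) and one clean teardown (the new
subscription going out of scope while its Session is still alive).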
Thus, we have a vicious circle: we need to create a new Session object to be
able to proceed, but doing so leaves us unable to reuse our Subscription
variables.
Gordon proposed a workaround that does solve the problem for me in practice:
assign "null" Subscription and LocalQueue objects to those variables before
re-creating the Session object. Unfortunately, this won't be obvious to new
users, so anyone still using the old API is likely to run into the problem.
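
The ordering that the workaround relies on can be sketched as follows. This is
again a hypothetical model rather than the real qpid::client API: a single
HandleImpl class stands in for the implementation behind both Subscription and
LocalQueue (treating them alike is an assumption of the sketch), and a
default-constructed Handle plays the role of the "null" object.

```cpp
#include <cassert>
#include <memory>

// Hypothetical model for illustration; NOT the real qpid::client API.
struct Demux { int rules = 0; };
struct Session { std::shared_ptr<Demux> demuxer = std::make_shared<Demux>(); };

// Counts teardowns that found a live demuxer versus a destroyed one.
struct Stats { int clean = 0; int stale = 0; };

// Stands in for the impl behind a Subscription or LocalQueue handle.
class HandleImpl {
public:
    HandleImpl(Session& s, Stats& st) : demuxer_(s.demuxer), stats_(st) {
        assert(s.demuxer);
        ++s.demuxer->rules;
    }
    ~HandleImpl() {
        if (std::shared_ptr<Demux> d = demuxer_.lock()) {
            --d->rules;
            ++stats_.clean;
        } else {
            ++stats_.stale;  // would correspond to the assertion failure
        }
    }
private:
    std::weak_ptr<Demux> demuxer_;
    Stats& stats_;
};

// A default-constructed Handle is the "null" handle of the workaround.
using Handle = std::shared_ptr<HandleImpl>;

Stats recoverWithWorkaround() {
    Stats st;
    {
        std::unique_ptr<Session> session(new Session);
        Handle sub = std::make_shared<HandleImpl>(*session, st);
        Handle queue = std::make_shared<HandleImpl>(*session, st);

        // Broker died. Step 1: assign null handles while the old Session
        // still exists, so the old impls tear down against a live demuxer.
        sub = Handle();
        queue = Handle();

        // Step 2: only now replace the Session.
        session.reset(new Session);

        // Step 3: subscribe again on the new Session.
        sub = std::make_shared<HandleImpl>(*session, st);
        queue = std::make_shared<HandleImpl>(*session, st);
    }
    return st;
}
```

With this ordering every teardown in the model finds a live demuxer, so
recoverWithWorkaround() reports no stale teardowns.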
I'll shortly attach an example showing the problem and the fix, as well as
snippets from my backtrace.