[
https://issues.apache.org/jira/browse/QPID-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801815#action_12801815
]
Alan Conway commented on QPID-2341:
-----------------------------------
We would only be marking state that is actually replicated by the cluster.
The changes would be to add a base class, and add a 1-line function call to
some member functions. In principle we only need to mark the "innermost"
modifiers, e.g.
Foo {
f() { if (...) g() ... } // modifies state by calling g()
g() { assertReplicationSafe(); someMemberVar =x } // modifies state directly
}
I haven't done the analysis to determine just how many assertions that would be.
> On a subsiduary note - how is someone to know if a piece of state they just
> added is replication safe?
I'm hoping this proposal will make that easier: The presence of Replicated base
class and assertReplicationSafe() in code act as notices that your working in
an area that is replicated by the cluster.
> Annotate replicated broker classes with assertions.
> ---------------------------------------------------
>
> Key: QPID-2341
> URL: https://issues.apache.org/jira/browse/QPID-2341
> Project: Qpid
> Issue Type: Improvement
> Components: C++ Broker
> Affects Versions: 0.6
> Reporter: Alan Conway
> Assignee: Alan Conway
>
> A clustered broker maintains consistency of replicated objects by only
> modifying them in a "replication safe" thread context: while receiving an
> update or dispatching cluster events.
> A repeated source of cluster bugs is broker code that unwittingly modifies
> replicated objects in an unsafe context such as a timer thread. These bugs
> are intermittent race conditions that are hard to track down.
> Proposal: annotate broker code with assertions to identify code that modifies
> replicated state and log/abort if such code is called in an unsafe context:
> // New class:
> namespace broker {
> class Replicated {
> protected:
> void assertReplicationSafe();
> }
> // Existing classes
> class Queue : public Replicated { // Mark Queue as state that may be
> replicated.
> void someQueueModifier() {
> assertReplicationSafe(); // This function should only be called in
> replication-safe context.
> }
> The assertion is cheap: just testing a thread-local boolean value. In a
> non-clustered broker it does nothing.
> This technique has already proven valuable in debugging a recent bug, putting
> the assertions permanently in the code should speed debugging of future bugs.
> This would be the beginning of a formal contract between the broker code and
> the cluster that should make things more maintainable in the long run.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project: http://qpid.apache.org
Use/Interact: mailto:[email protected]