[ https://issues.apache.org/jira/browse/QPID-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047244#comment-13047244 ]
jirapos...@reviews.apache.org commented on QPID-3079: ----------------------------------------------------- ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/860/#review801 ----------------------------------------------------------- /branches/qpid-3079/qpid/cpp/src/qpid/broker/PersistableQueue.h <https://reviews.apache.org/r/860/#comment1747> Ok, discussed this with Kim - his understanding of this pointer is that it is owned by the persistent queue, and the queue should be responsible for calling delete on it. If that is the case, then there is a bug in that the store is referencing this context after the queue is destroyed. I will work with Kim on this to better understand the root cause and fix it properly. /branches/qpid-3079/qpid/cpp/src/qpid/broker/Queue.cpp <https://reviews.apache.org/r/860/#comment1749> "All you need to know is when they all complete" - not true: we need to know when _each_ dequeue completes. This is not like the enqueue case - we need to known when each dequeue is complete as the Message.Accept may be only interested in that one particular dequeue event. Which is what QPID-3079 is specifically trying to solve. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SemanticState.cpp <https://reviews.apache.org/r/860/#comment1748> Because of this: https://bugzilla.redhat.com/show_bug.cgi?id=698721 Store will not complete the dequeue for up to one second *if* it doesn't have enough work to perform immediately. Broker needs to force fast completion on Execution.Sync - it does this by issuing a flush() on the command. And that's why a simple counter won't work here - we need to track the queues that are in the process of dequeuing specifically in case a flush is called. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SemanticState.cpp <https://reviews.apache.org/r/860/#comment1750> Which is what I'm trying to do - at least at the SemanticState level (with the message accept context). Its just that the message.accept context has to be able to track N outstanding dequeues in order to know when it is complete (could be a counter), or if the SessionState flushes the command (counter not sufficient - it needs list of messages or queues to flush). Let me step back a bit - there are three layers of abstraction introduced by this patch: 1) SessionState/SessionContext needs to allow for async commands with an abstract api (i.e. - SessionState doesn't care about the command specifics, it just needs to know when it is complete, and be able to flush the command if needed) See changes in SessionContext. 2) SemanticState uses one of these abstract "async command contexts" for the Message.Accept command. This context must monitor a _set_ of outstanding dequeues in order to 1) know when the command is complete, and 2) which queues/messages need flushing should SessionState requires a flush. 3) The Queue needs to provide a callback when a particular message is dequeued (not when all dequeues complete for a given message). The callback needs to identify the queue and the particular message that has completed dequeue. #3 could be tracked by the message instead, but we still need to know when -each- dequeue completes, so then the message would have to track each of its concurrent dequeue's state (and target queue). I think that's exactly what I've got Queue doing now anyways, and I think representing that in the Queue is more "natural" - but that's more my artistic opinion :) - Kenneth On 2011-06-08 20:31:05, Kenneth Giusti wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/860/ bq. ----------------------------------------------------------- bq. bq. (Updated 2011-06-08 20:31:05) bq. bq. bq. Review request for qpid, Alan Conway, Gordon Sim, and Kim van der Riet. bq. bq. bq. Summary bq. ------- bq. bq. Modifies the broker's handling of Message.Accept to hold off the completion of the command until all messages related to the accept have completed dequeue. This particularly applies to persistent messages, as the store::dequeue() operation must complete before the message is considered fully dequeued. bq. bq. Note this bugfix requires some changes to the broker's store module interface: previously, the store only identified the message when a dequeue was completed. This is not enough information - the queue from which is was removed must also be identified (the message may be in the process of being dequeued on several queues at once). bq. bq. bq. This addresses bug qpid-3079. bq. https://issues.apache.org/jira/browse/qpid-3079 bq. bq. bq. Diffs bq. ----- bq. bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/AsyncCompletion.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/DeliveryRecord.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/DeliveryRecord.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/Message.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/Message.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/MessageStore.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/MessageStoreModule.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/MessageStoreModule.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/NullMessageStore.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/NullMessageStore.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/PersistableMessage.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/PersistableMessage.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/PersistableQueue.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/Queue.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/Queue.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/RecoverableQueue.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/RecoveryManagerImpl.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SemanticState.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SessionContext.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SessionState.h 1124895 bq. /branches/qpid-3079/qpid/cpp/src/qpid/broker/SessionState.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/tests/AsyncCompletion.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/tests/QueueTest.cpp 1124895 bq. /branches/qpid-3079/qpid/cpp/src/tests/TestMessageStore.h 1124895 bq. bq. Diff: https://reviews.apache.org/r/860/diff bq. bq. bq. Testing bq. ------- bq. bq. broker unit tests, store unit tests (modified jboss store). Still needs to be vetted on non-linux, and have latest trunk merged in. bq. bq. bq. Thanks, bq. bq. Kenneth bq. bq. > message.accept command should be completed on a per-dequeue basis > ----------------------------------------------------------------- > > Key: QPID-3079 > URL: https://issues.apache.org/jira/browse/QPID-3079 > Project: Qpid > Issue Type: Bug > Components: C++ Broker > Affects Versions: 0.8, 0.9 > Reporter: Ken Giusti > Assignee: Ken Giusti > Fix For: 0.11 > > Attachments: proposal.txt > > > ** Overview > Asynchronous completion means that command execution is initiated in one > thread > (a client connection thread) and completed in a different thread. > When the async store is loaded, message.transfer commands are > completed by a store thread that does the async write. > ** Issues with asynchronous completion code as of revision r1029686 > *** Not really asynchronous > IncompleteMessageList::process blocks the connection thread till all > outstanding async commands (messages) for the session are complete. > With the new cluster, this could deadlock since it is blocking a Poller > thread. > *** Race condition for message.transfer > > Sketch of the current code: > // Called in connection thread > PersistableMessage::enqueueAsync { ++counter; } > // Called in store thread once message is written. > PersistableMessage::enqueueComplete { if (--counter == 0) notifyCompleted(); } > The intent is that notify be called once per message, after the > message has been written to each queue it was routed to. > However of a message is routed to N queues, it's possible for > notifyCompleted to be called up to N times. The store thread could > call notifyCompleted for the first queue before the connection thread > has called enqueueAsync for the second queue, and so on. > *** No asynchronous completion for message.accept > We do not currently delay completion of message.accept until the > message is deleted from the async store. This could cause duplicate > delivery if the broker crashes after completing the message but > before it is removed from store. > There is code in PersistableMessage to maintain a counter for dequeues > analogous to to the async enqueue code but this is incorrect. > Completion of transfer is triggered when all enqueues for a message are > complete. > Completion of accept is triggered for *each* dequeue from a queue > independently. > Furthermore a single accept can reference many messages, so it can't be > associated with a message. > ** New requirements > The new cluster design will need to participate in async completion, e.g. > an accept cannot be comlpeted until the message is > - removed from store (if present) AND > - replicated to the cluster (if present) as dequeued > The new cluster also needs to asynchronously complete binding commands > (declare, bind, delete) when they are replicated to the cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- Apache Qpid - AMQP Messaging Implementation Project: http://qpid.apache.org Use/Interact: mailto:dev-subscr...@qpid.apache.org