Client does not ensure connection is closed before attempting failover
----------------------------------------------------------------------

                 Key: QPID-1949
                 URL: https://issues.apache.org/jira/browse/QPID-1949
             Project: Qpid
          Issue Type: Bug
    Affects Versions: M4, 0.5
            Reporter: Martin Ritchie


* Summary:
 * A user has reported message loss from their application. When the broker
 * is bounced, the 'lost' messages are delivered to the broker.
 *
 * Note:
 * The client was using Spring so that may influence the situation.
 *
 * Issue:
 * The log files show 7 instances of the following, which correspond to the
 * 7 missing messages.
 *
 * The client log files show:
 *
 * The broker log file shows:
 *
 *
 * The 7 missing messages have delivery tags 5-11, which indicates that they
 * are sequentially the next messages from the broker.
 *
 * The only way for the 'without a handler' log to occur is if the consumer
 * has been removed from the look up table of the dispatcher.
 * And the only way for the 'null message' log to occur on the broker is
 * if the message does not exist in the unacked-map.
 *
 * The consumer is only removed from the list during session
 * closure and failover.
 *
 * If the session was closed then the broker would requeue the unacked
 * messages so the potential exists to have an empty map but the broker
 * will not send a message out after the unacked map has been cleared.
 *
 * When failover occurs the _consumer map is cleared and the consumers are
 * resubscribed. This is done without first stopping any existing
 * dispatcher, so there exists the potential to receive a message after
 * the _consumer map has been cleared, which is how the 'without a handler'
 * log statement occurs.
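 *
 * The race described above can be sketched with a small single-threaded
 * model. This is only an illustration: DispatcherRaceSketch and its
 * methods are hypothetical names, not Qpid client classes, and keying
 * the lookup table by consumer tag is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the session/dispatcher pair; not the Qpid classes.
class DispatcherRaceSketch {
    // Stand-in for the _consumer lookup table (assumed: consumer tag -> handler name).
    final Map<Integer, String> consumers = new HashMap<>();
    final List<String> log = new ArrayList<>();
    private int nextConsumerTag = 1;

    int subscribe(String handlerName) {
        int tag = nextConsumerTag++;
        consumers.put(tag, handlerName);
        return tag;
    }

    // Mirrors the failover step: clear the table, then resubscribe,
    // without stopping the dispatcher first.
    void failoverResubscribe() {
        List<String> handlers = new ArrayList<>(consumers.values());
        consumers.clear();
        for (String handler : handlers) {
            subscribe(handler); // each consumer gets a fresh tag
        }
    }

    // What the dispatcher does with an incoming message.
    void dispatch(int consumerTag, long deliveryTag) {
        String handler = consumers.get(consumerTag);
        if (handler == null) {
            log.add("without a handler: deliveryTag=" + deliveryTag);
        } else {
            log.add("delivered to " + handler + ": deliveryTag=" + deliveryTag);
        }
    }

    public static void main(String[] args) {
        DispatcherRaceSketch d = new DispatcherRaceSketch();
        int oldTag = d.subscribe("listenerA");
        d.dispatch(oldTag, 4);   // normal delivery before failover
        d.failoverResubscribe(); // table cleared, dispatcher still running
        d.dispatch(oldTag, 5);   // message still in flight for the old subscription
        d.log.forEach(System.out::println);
    }
}
```

 * In this model the message with delivery tag 5 finds no handler because
 * the old subscription's entry was removed while the message was still
 * on its way to the dispatcher.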
 *
 * Scenario:
 *
 * Looking over logs the sequence that best fits the events is as follows:
 * - Something causes Mina to be delayed, causing the WriteTimeoutException.
 * - This exception is received by AMQProtocolHandler#exceptionCaught
 * - As the WriteTimeoutException is an IOException this will cause
 * sessionClosed to be called to start failover.
 * + This is potentially the issue here. All IOExceptions are treated
 * as connection failure events.
 * - Failover Runs
 * + Failover assumes that the previous connection has been closed.
 * + Failover binds the existing objects (AMQConnection/Session) to the
 * new connection objects.
 * - Everything is reported as being successfully failed over.
 * However, what is neglected is that the original connection has not
 * been closed.
 * + So what occurs is that the broker sends a message to the consumer on
 * the original connection, as it was not notified of the client
 * failing over.
 * As the client failover reuses the original AMQSession and Dispatcher,
 * the new messages the broker sends to the old consumer arrive at the
 * client and are processed by the same AMQSession and Dispatcher.
 * However, as the failover process cleared the _consumer map and
 * resubscribed the consumers, the Dispatcher does not recognise the
 * delivery tag and so logs the 'without a handler' message.
 * - The Dispatcher then attempts to reject the message, however,
 * + The AMQSession/Dispatcher pair have been swapped to using a new Mina
 * ProtocolSession as part of the failover process so the reject is
 * sent down the second connection. The broker receives the Reject
 * request but as the Message was sent on a different connection the
 * unacked map is empty and a 'message is null' log message is
 * produced.
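 *
 * The sequence above can be modelled roughly as follows. This is a
 * sketch under stated assumptions, not broker or client code:
 * BrokerConnection, ClientSession and the closeOldFirst flag are
 * hypothetical, and the per-connection unacked map is simplified to a
 * plain map keyed by delivery tag.

```java
import java.util.HashMap;
import java.util.Map;

class FailoverScenarioSketch {
    // Hypothetical broker-side view of one connection.
    static class BrokerConnection {
        final String name;
        final Map<Long, String> unacked = new HashMap<>();
        boolean open = true;

        BrokerConnection(String name) { this.name = name; }

        // The broker sends a message and tracks it until it is acked or rejected.
        boolean deliver(long deliveryTag, String body) {
            if (!open) {
                return false; // nothing is sent on a closed connection
            }
            unacked.put(deliveryTag, body);
            return true;
        }

        // The broker handles a Reject arriving on this connection.
        String reject(long deliveryTag) {
            String body = unacked.remove(deliveryTag);
            return body == null
                    ? name + ": message is null for tag " + deliveryTag
                    : name + ": requeued " + body;
        }
    }

    // Hypothetical client session that is rebound to a new connection on failover.
    static class ClientSession {
        BrokerConnection current;

        ClientSession(BrokerConnection initial) { current = initial; }

        void failoverTo(BrokerConnection next, boolean closeOldFirst) {
            if (closeOldFirst) {
                current.open = false; // the step the report says is missing
            }
            current = next;
        }

        // The dispatcher found no handler, so the reject goes down the current connection.
        String rejectUnknown(long deliveryTag) { return current.reject(deliveryTag); }
    }

    static String run(boolean closeOldFirst) {
        BrokerConnection oldConn = new BrokerConnection("conn-1");
        BrokerConnection newConn = new BrokerConnection("conn-2");
        ClientSession session = new ClientSession(oldConn);
        session.failoverTo(newConn, closeOldFirst);
        // The broker was never told about the failover and uses the old connection.
        if (!oldConn.deliver(5, "msg-5")) {
            return "no delivery on closed connection";
        }
        return session.rejectUnknown(5);
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

 * With closeOldFirst set to false the reject reaches the second
 * connection, whose unacked map has no entry for tag 5, which matches
 * the 'message is null' log. With it set to true the old connection
 * cannot deliver the stray message at all.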
 *
 * Test Strategy:
 *
 * This should be easy to demonstrate: send an IOException to
 * AMQProtocolHandler#exceptionCaught and then try sending a message.
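 *
 * The shape of such a test might look like the sketch below. It runs
 * against a hypothetical FakeProtocolHandler rather than the real
 * AMQProtocolHandler, so the class, its fields and the
 * closeBeforeFailover switch are all assumptions made for illustration.

```java
import java.io.IOException;

class ExceptionInjectionSketch {
    // Hypothetical stand-in for the protocol handler; not the Qpid class.
    static class FakeProtocolHandler {
        final boolean closeBeforeFailover;
        boolean oldConnectionOpen = true;
        boolean failoverStarted = false;

        FakeProtocolHandler(boolean closeBeforeFailover) {
            this.closeBeforeFailover = closeBeforeFailover;
        }

        // As described in the report: any IOException is treated as a connection failure.
        void exceptionCaught(Throwable cause) {
            if (cause instanceof IOException) {
                if (closeBeforeFailover) {
                    oldConnectionOpen = false; // assumed fix: close before failing over
                }
                failoverStarted = true;
            }
        }
    }

    // Inject an IOException, then report whether the old connection could
    // still carry a message after failover has started.
    static boolean oldConnectionLeftOpen(FakeProtocolHandler handler) {
        handler.exceptionCaught(new IOException("simulated write timeout"));
        return handler.failoverStarted && handler.oldConnectionOpen;
    }

    public static void main(String[] args) {
        System.out.println("reported behaviour leaves old connection open: "
                + oldConnectionLeftOpen(new FakeProtocolHandler(false)));
        System.out.println("close-first variant leaves old connection open: "
                + oldConnectionLeftOpen(new FakeProtocolHandler(true)));
    }
}
```

 * A real test would replace the fake with the client's handler and then
 * send a message from the broker on the original connection.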

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org
