[ https://issues.apache.org/jira/browse/QPID-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway updated QPID-2220:
------------------------------

    Description: 
If every member of a persistent cluster crashes then manual intervention is 
required to identify which store is most up-to-date, so it can be used to 
recover. We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change (cluster 
membership change). In recovery, the broker with the highest config-change 
counter has the best store. However, if the last brokers in the cluster crash 
so close together that none can record a config change, we need an additional 
tiebreaker.
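
A minimal sketch of that first selection step, assuming each broker's saved 
counter has already been read back during recovery. StoreStatus and 
bestByConfigChange are illustrative names only, not part of the Qpid code base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of what one broker saved before the crash.
struct StoreStatus {
    std::string broker;     // illustrative broker name
    uint64_t configChange;  // last config-change counter the broker saved
};

// Returns the brokers whose stores share the highest config-change counter.
// More than one entry means the counter alone cannot pick a winner.
std::vector<std::string> bestByConfigChange(const std::vector<StoreStatus>& stores) {
    assert(!stores.empty());
    uint64_t best = 0;
    for (const auto& s : stores) best = std::max(best, s.configChange);

    std::vector<std::string> result;
    for (const auto& s : stores) {
        if (s.configChange == best) result.push_back(s.broker);
    }
    return result;
}
```

When the returned list has more than one broker, the brokers crashed without 
recording a distinguishing config change and a further tiebreaker is needed.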

The store at http://qpidcomponents.org/download.html#persistence maintains a 
global Record Identifier (RID), a 64-bit value that is incremented for each 
enqueue and dequeue. If the cluster stores (config-change, RID) pairs, then in 
recovery we can use the number of store changes since the last config change 
(the store's actual RID minus the RID saved at that config change) as a 
tiebreaker.
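
The comparison could be sketched as follows. This is only an illustration 
under stated assumptions: RecoveryInfo, changesSinceConfigChange, lessRecent 
and mostRecent are hypothetical names, and the values are assumed to have been 
read from each broker's store during recovery.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-broker data gathered during recovery.
struct RecoveryInfo {
    std::string broker;          // illustrative broker name
    uint64_t configChange;       // last saved config-change counter
    uint64_t ridAtConfigChange;  // RID saved together with that counter
    uint64_t actualRid;          // RID found in the store at recovery
};

// Store changes made after the last recorded config change.
// Unsigned subtraction is modulo 2^64, so a single wrap-around of the
// RID still yields the right difference.
uint64_t changesSinceConfigChange(const RecoveryInfo& r) {
    return r.actualRid - r.ridAtConfigChange;
}

// Orders stores by config-change counter first, RID difference second.
bool lessRecent(const RecoveryInfo& a, const RecoveryInfo& b) {
    if (a.configChange != b.configChange) return a.configChange < b.configChange;
    return changesSinceConfigChange(a) < changesSinceConfigChange(b);
}

// Picks the store that appears most up to date under that ordering.
const RecoveryInfo& mostRecent(const std::vector<RecoveryInfo>& infos) {
    assert(!infos.empty());
    return *std::max_element(infos.begin(), infos.end(), lessRecent);
}
```

The RID difference is only consulted when the config-change counters are 
equal, so it acts purely as a tiebreaker.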

Proposed change to MessageStore API:
  /** Returns a monotonically increasing value reflecting the number of
   * changes to the store.
   * The value can wrap around to 0.
   * Stores need not implement this function; they can simply return 0.
   */
  uint64_t getChangeCounter();

The default implementation just returns 0, in which case the cluster must fall 
back to relying on config-change counts.
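
A rough sketch of how the default and an implementing store might look. 
MessageStoreSketch is a reduced stand-in written for this example, not the 
real MessageStore interface; CountingStore and supportsChangeCounter are 
likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reduced, hypothetical stand-in for the broker's MessageStore interface.
class MessageStoreSketch {
  public:
    virtual ~MessageStoreSketch() {}
    // Default: the store does not track changes.
    virtual uint64_t getChangeCounter() { return 0; }
};

// Hypothetical store that counts every enqueue and dequeue, like the RID.
class CountingStore : public MessageStoreSketch {
  public:
    CountingStore() : rid(0) {}
    void enqueue() { ++rid; }
    void dequeue() { ++rid; }
    uint64_t getChangeCounter() override { return rid; }
  private:
    uint64_t rid;
};

// Assumption for this sketch: a counter of 0 is treated as "not supported",
// so the caller falls back to config-change counts. A store that has made no
// changes, or has wrapped to exactly 0, is indistinguishable from that case.
bool supportsChangeCounter(MessageStoreSketch& store) {
    return store.getChangeCounter() != 0;
}
```

Giving the base class a default keeps existing stores source-compatible; 
only stores that can offer a counter need to override the function.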

  was:
If every member of a persistent cluster crashes then manual intervention is 
required to identify which store is most up-to-date, so it can be used to 
recover.
We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change. In 
recovery, the broker with the highest config-change counter has the best store. 
However if the last brokers in the cluster crash so close together that none 
can record a config-change we need an additional decider.

The store at http://qpidcomponents.org/download.html#persistence maintains a 
global counter called the RecordIdentifier (RID) that is incremented for each 
enqueue and dequeue. If the cluster stores  (config-change,RID) pairs then in 
recovery we can use actual-RID - RID at config-change as a tiebreaker.

Is it reasonable to provide access to this counter in the generic MessageStore 
API? Stores that don't implement it can simply return 0, and the cluster must 
fall back to relying on config-change counts.


> Assist manual recovery from a complete persistent cluster crash.
> ----------------------------------------------------------------
>
>                 Key: QPID-2220
>                 URL: https://issues.apache.org/jira/browse/QPID-2220
>             Project: Qpid
>          Issue Type: Improvement
>          Components: C++ Broker
>    Affects Versions: 0.5
>            Reporter: Alan Conway
>            Assignee: Alan Conway

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org
