Igor,

Could you please elaborate - what is the whole set of information we are
going to save at checkpoint time? From what I understand this should be:
1) List of active transactions with WAL pointers of their first writes
2) List of prepared transactions with their update counters
3) Partition counter low watermark (LWM) - the smallest partition counter
before which there are no prepared transactions.
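
To make sure we mean the same thing, here is a minimal sketch of that checkpoint payload as I read it. Everything in it is an assumption for illustration: `CheckpointTxState` is a hypothetical class, not an existing Ignite type, WAL pointers are simplified to a plain `long` offset, and the LWM rule (smallest counter held by a prepared tx, or the current partition counter when nothing is prepared) is only my interpretation of item 3.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical container for the three items listed above; not an Ignite class. */
final class CheckpointTxState {
    /** Active tx id -> WAL pointer of its first write (simplified to a long offset). */
    final Map<Long, Long> activeTxFirstWrite;

    /** Prepared tx id -> update counter assigned to that tx. */
    final Map<Long, Long> preparedTxCounters;

    /** Partition counter low watermark (assumed definition, see lowWatermark). */
    final long lwm;

    CheckpointTxState(Map<Long, Long> active, Map<Long, Long> prepared, long partCounter) {
        this.activeTxFirstWrite = Collections.unmodifiableMap(new HashMap<>(active));
        this.preparedTxCounters = Collections.unmodifiableMap(new HashMap<>(prepared));
        this.lwm = lowWatermark(prepared, partCounter);
    }

    /**
     * Assumed rule: the smallest counter held by any prepared tx, or the
     * current partition counter when no tx is prepared.
     */
    static long lowWatermark(Map<Long, Long> prepared, long partCounter) {
        return prepared.values().stream()
            .mapToLong(Long::longValue)
            .min()
            .orElse(partCounter);
    }
}
```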

And then we send the supplier node a message: "Give me all updates starting
from that LWM, plus data for those transactions which were active when I
failed".
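
A rough sketch of how the demander could assemble that request is below. The names `DemandSketch`, `Demand` and `build` are hypothetical and do not correspond to the real partition demand message; WAL pointers are again simplified to `long` offsets. The tx ids are ordered by the pointer of their first write, and the list is empty after a graceful stop, as in Igor's proposal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical demander-side helper; not the real Ignite demand message. */
final class DemandSketch {
    /** "Give me all updates from fromCounter, plus data for activeTxIds". */
    static final class Demand {
        final long fromCounter;
        final List<Long> activeTxIds;

        Demand(long fromCounter, List<Long> activeTxIds) {
            this.fromCounter = fromCounter;
            this.activeTxIds = activeTxIds;
        }
    }

    /**
     * @param lwm low watermark restored from the last checkpoint
     * @param activeTxFirstWrite active tx id -> WAL offset of its first write
     * @param gracefulStop true if the node was stopped gracefully
     */
    static Demand build(long lwm, Map<Long, Long> activeTxFirstWrite, boolean gracefulStop) {
        if (gracefulStop)
            return new Demand(lwm, Collections.emptyList());

        // Order tx ids by the WAL offset of their first write, earliest first.
        List<Long> ids = new ArrayList<>(activeTxFirstWrite.keySet());
        ids.sort(Comparator.comparingLong(id -> activeTxFirstWrite.get(id)));

        return new Demand(lwm, Collections.unmodifiableList(ids));
    }
}
```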

Am I right?

On Fri, Nov 23, 2018 at 11:22 AM Seliverstov Igor <gvvinbl...@gmail.com>
wrote:

> Hi Igniters,
>
> Currently I’m working on possible approaches to implementing historical
> rebalance (delta rebalance using WAL iterator) over MVCC caches.
>
> The main difficulty is that MVCC writes changes during the tx active phase,
> while the partition update version, aka update counter, is applied on tx
> finish. This means we cannot start iterating over the WAL right from the
> pointer where the update counter was updated, but must also include the
> updates made by the transaction that updated the counter.
>
> These updates may be much earlier than the point where the update counter
> was updated, so we have to be able to identify the point where the first
> update happened.
>
> The proposed approach includes:
>
> 1) preserve list of active txs, sorted by the time of their first update
> (using WAL ptr of first WAL record in tx)
>
> 2) persist this list on each checkpoint (together with TxLog for example)
>
> 3) send the whole active tx list (transactions which were in active state at
> the time the node crashed, empty list in case of graceful node stop) as
> a part of partition demand message.
>
> 4) find a checkpoint where the earliest tx exists in persisted txs and use
> saved WAL ptr as a start point or apply current approach in case the active
> tx list (sent on previous step) is empty
>
> 5) start iteration.
>
> Your thoughts?
>
> Regards,
> Igor
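
For the supplier side, the checkpoint lookup in the quoted proposal might look roughly like the sketch below. This is only my reading of that step: `StartPointSketch` and `startPointer` are hypothetical names, each checkpoint is modelled as a plain map from tx id to the WAL offset of its first write, and an empty result means "fall back to the current counter-based approach".

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical supplier-side lookup; not an existing Ignite API. */
final class StartPointSketch {
    /**
     * @param demandedTxIds tx ids from the demand message, earliest first update first
     * @param checkpointsNewestFirst persisted active tx lists (tx id -> first WAL offset)
     * @return WAL offset to start iteration from, or empty if the caller should
     *         fall back to the current approach
     */
    static OptionalLong startPointer(List<Long> demandedTxIds,
                                     List<Map<Long, Long>> checkpointsNewestFirst) {
        if (demandedTxIds.isEmpty())
            return OptionalLong.empty();

        long earliestTx = demandedTxIds.get(0);

        // Find a checkpoint that still lists the earliest tx and use its saved pointer.
        for (Map<Long, Long> cp : checkpointsNewestFirst) {
            Long ptr = cp.get(earliestTx);

            if (ptr != null)
                return OptionalLong.of(ptr);
        }

        return OptionalLong.empty();
    }
}
```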
