[ 
https://issues.apache.org/jira/browse/IGNITE-19271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Semyon Danilov resolved IGNITE-19271.
-------------------------------------
    Fix Version/s: 3.0.0-beta2
       Resolution: Fixed

Fixed by IGNITE-19532

> Persist revision-safeTime mapping in meta-storage
> -------------------------------------------------
>
>                 Key: IGNITE-19271
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19271
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Semyon Danilov
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> IEP-98 states:
> {code:java}
> When creating a message M telling the cluster about a schema update 
> activation moment, choose the message timestamp Tm (moving safeTime forward) 
> equal to Now, but assign Tu (activation moment) contained in that M to be 
> Tm+DD {code}
> This is hard to achieve.
> h3. Problem
> We need {{Tu==Tm+DD}}. Right now, with what we have in IGNITE-19028, this is 
> not straightforward, because there are too many actors:
>  * There's a _client_ that chooses Tu, because it's the only actor that 
> can affect message content.
>  * There's a meta-storage _lease-holder_, or _leader_, that chooses 
> Tm.
>  * There's everybody else, who expects a correspondence between Tu and Tm.
> The first two actors are important because they have independent clocks but 
> must coordinate on the same event. This is impossible with the described 
> protocol.
> h3. Discussion
> Let's consider these two solutions:
>  # Client generates Tm.
>  # Meta-storage generates Tu.
> Option 1 is out of the question: at any given moment in time there must be 
> only a single node responsible for the linear order of time in messages.
> What about option 2? Since meta-storage doesn't know anything about command 
> semantics, it can't really generate any data. So this solution doesn't work 
> either.
> h3. Solution
> A combined solution could be the following:
>  * The client sends DD as part of the command (this is not a constant; users 
> _can_ configure it if they really want to).
>  * Meta-storage generates {{Tm}}.
>  * Every node, upon receiving the update, calculates {{Tu}}.
> This would work if nodes were never restarted. One problem remains to be 
> solved: recovering the values of {{Tm}} from the (old) data upon node 
> restart.
> This can be achieved by persisting safeTime along with the revision as part 
> of the metadata, which can then be retrieved through the meta-storage 
> service API.
> In other words:
> 1. Client sends
> {code:java}
> schema.latest   = 5
> schema.5.data   = ...
> schema.5.dd     = 30s{code}
> 2. Lease-holder adds meta-data to the command:
> {code:java}
> safeTime = 10:10:00
> {code}
> 3. Meta-storage listener writes the data:
> {code:java}
> revision = 33
>     schema.latest = 5
>     schema.5.data = ...
>     schema.5.dd   = 30s
> revision.33.safeTime = 10:10:00{code}
>  
> How to read {{Tu}}:
>  * read {{"schema.5.dd"}};
>  * read its revision, which is 33;
>  * read the timestamp of revision 33 via a specialized API;
>  * add the two values together.
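The read steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MetaStorageSketch`, `put`, `get`, `safeTimeOf` and `activationTime` are hypothetical names, not the Ignite meta-storage API, and `LocalTime` stands in for whatever timestamp type the real implementation uses.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the meta-storage; NOT the Ignite API. */
final class MetaStorageSketch {
    /** A stored value together with the revision that wrote it. */
    static final class Entry {
        final String value;
        final long revision;

        Entry(String value, long revision) {
            this.value = value;
            this.revision = revision;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    // Plays the role of the persisted "revision.N.safeTime" metadata.
    private final Map<Long, LocalTime> safeTimeByRevision = new HashMap<>();

    /** Writes a key at the given revision and records that revision's safeTime (Tm). */
    void put(String key, String value, long revision, LocalTime safeTime) {
        entries.put(key, new Entry(value, revision));
        safeTimeByRevision.put(revision, safeTime);
    }

    Entry get(String key) {
        return entries.get(key);
    }

    /** Assumed "specialized API": returns the safeTime persisted for a revision. */
    LocalTime safeTimeOf(long revision) {
        return safeTimeByRevision.get(revision);
    }

    /** Computes Tu = Tm + DD for a schema version, following the four steps. */
    static LocalTime activationTime(MetaStorageSketch storage, int schemaVersion) {
        Entry dd = storage.get("schema." + schemaVersion + ".dd"); // read DD
        long revision = dd.revision;                               // read its revision
        LocalTime tm = storage.safeTimeOf(revision);               // read Tm of that revision
        Duration delay = Duration.parse("PT" + dd.value);          // "30s" -> PT30s
        return tm.plus(delay);                                     // add the two values
    }
}
```

With the values from the example (revision 33, safeTime 10:10:00, DD of 30s), every node would derive the same Tu from persisted data alone, including after a restart.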
> h3. Implications and restrictions
> There's a cleanup process in the meta-storage. It will eventually remove any 
> "revision.x.safeTime" value once the corresponding revision becomes obsolete.
> However, we must preserve the timestamps of revisions that are used by 
> schemas. This can be achieved if components can reserve a revision, so that 
> meta-storage can't compact it until the reservation has been revoked.
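
One possible shape for the reservation idea is sketched below. Everything here is an assumption made for illustration: `RevisionReservations`, `reserve`, `release` and `compact` are hypothetical names, and the ticket does not say whether compaction should skip reserved revisions (as this sketch does) or stop at the oldest reserved one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of reservation-aware cleanup; NOT the Ignite API. */
final class RevisionReservations {
    // revision -> safeTime, i.e. the "revision.N.safeTime" entries.
    private final TreeMap<Long, String> safeTimeByRevision = new TreeMap<>();
    // revision -> number of components currently reserving it.
    private final Map<Long, Integer> reservationCounts = new HashMap<>();

    void record(long revision, String safeTime) {
        safeTimeByRevision.put(revision, safeTime);
    }

    /** A component (e.g. the schema manager) pins a revision it still needs. */
    void reserve(long revision) {
        reservationCounts.merge(revision, 1, Integer::sum);
    }

    /** Revokes one reservation; the revision becomes compactable at zero. */
    void release(long revision) {
        reservationCounts.computeIfPresent(revision, (rev, n) -> n > 1 ? n - 1 : null);
    }

    /** Drops safeTime entries for revisions <= upTo, except reserved ones. */
    void compact(long upTo) {
        safeTimeByRevision.headMap(upTo, true).keySet()
                .removeIf(rev -> !reservationCounts.containsKey(rev));
    }

    /** Returns the persisted safeTime, or null if it has been compacted away. */
    String safeTimeOf(long revision) {
        return safeTimeByRevision.get(revision);
    }
}
```

Counting reservations rather than storing a flag lets several components pin the same revision independently; the timestamp survives until the last of them releases it.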



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
