[ https://issues.apache.org/jira/browse/IGNITE-19271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Semyon Danilov resolved IGNITE-19271.
-------------------------------------
    Fix Version/s: 3.0.0-beta2
       Resolution: Fixed

Fixed by IGNITE-19532

> Persist revision-safeTime mapping in meta-storage
> -------------------------------------------------
>
>                 Key: IGNITE-19271
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19271
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Semyon Danilov
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> IEP-98 states:
> {code:java}
> When creating a message M telling the cluster about a schema update
> activation moment, choose the message timestamp Tm (moving safeTime forward)
> equal to Now, but assign Tu (activation moment) contained in that M to be
> Tm+DD {code}
> This is hard to achieve.
> h3. Problem
> We need {{Tu==Tm+DD}}. Right now, with what we have in IGNITE-19028, it's
> not straightforward. This is because we have too many actors:
> * There's a _client_, which chooses Tu, because it's the only actor that
> can affect message content.
> * There's a meta-storage _lease-holder_, or _leader_, which chooses Tm.
> * There's everybody else, who expects a correspondence between Tu and Tm.
> The first two actors matter because they have independent clocks but must
> agree on the same event. This is impossible with the described protocol.
> h3. Discussion
> Let's consider these two solutions:
> # Client generates Tm.
> # Meta-storage generates Tu.
> Option 1 is out of the question: at any given moment there must be only a
> single node responsible for the linear order of time in messages.
> What about option 2? Since meta-storage doesn't know anything about command
> semantics, it can't generate any data itself. So this solution doesn't work
> either.
> h3. Solution
> A combined solution could be the following:
> * Client sends DD as part of the command (this is not a constant; the user
> _can_ configure it if they really want to).
> * Meta-storage generates {{Tm}}.
> * Every node, upon receiving the update, calculates {{Tu}}.
> This would work if nodes were never restarted. One problem remains:
> recovering the values of {{Tm}} from the (old) data upon node restart.
> This can be achieved by persisting safeTime along with the revision as part
> of the metadata, which can be retrieved through the meta-storage service API.
> In other words:
> 1. Client sends
> {code:java}
> schema.latest = 5
> schema.5.data = ...
> schema.5.dd = 30s{code}
> 2. Lease-holder adds meta-data to the command:
> {code:java}
> safeTime = 10:10
> {code}
> 3. Meta-storage listener writes the data:
> {code:java}
> revision = 33
> schema.latest = 5
> schema.5.data = ...
> schema.5.dd = 30s
> revision.33.safeTime = 10:10:00{code}
>
> How to read {{Tu}}:
> * read {{schema.5.dd}};
> * read its revision, which is 33;
> * read the timestamp of revision 33 via a specialized API;
> * add the two values together.
> h3. Implications and restrictions
> There's a cleanup process in the meta-storage. It will eventually remove any
> "revision.x.safeTime" values once the corresponding revision becomes
> obsolete. However, we must preserve the timestamps of revisions that are
> used by schemas. This can be achieved if components can reserve a revision,
> so that meta-storage can't compact it until the reservation has been
> revoked.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
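
The read path listed under "How to read Tu" can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: {{SafeTimeSketch}}, {{apply}} and {{activationTime}} are hypothetical names, not the Ignite meta-storage API; {{LocalTime}} stands in for whatever timestamp type safeTime actually uses; and DD is stored as an ISO-8601 duration ({{PT30S}}) rather than {{30s}} so that {{Duration.parse}} can read it.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the revision-to-safeTime mapping. All names are hypothetical, not Ignite API. */
class SafeTimeSketch {
    /** A stored value together with the revision that last wrote it. */
    static final class Entry {
        final String value;
        final long revision;

        Entry(String value, long revision) {
            this.value = value;
            this.revision = revision;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    /** Persisted mapping: revision -> safeTime (Tm) chosen by the lease-holder. */
    private final Map<Long, LocalTime> safeTimeByRevision = new HashMap<>();
    private long revision;

    SafeTimeSketch(long initialRevision) {
        this.revision = initialRevision;
    }

    /** Applies one command: bumps the revision, writes the keys, records the command's safeTime. */
    public long apply(Map<String, String> update, LocalTime safeTime) {
        long rev = ++revision;
        update.forEach((key, value) -> entries.put(key, new Entry(value, rev)));
        safeTimeByRevision.put(rev, safeTime);
        return rev;
    }

    /** Computes Tu = Tm + DD, where Tm is the safeTime of the revision that wrote the DD key. */
    public LocalTime activationTime(int schemaVersion) {
        Entry dd = entries.get("schema." + schemaVersion + ".dd"); // read DD and its revision
        LocalTime tm = safeTimeByRevision.get(dd.revision);        // read that revision's safeTime
        return tm.plus(Duration.parse(dd.value));                  // add the two values together
    }

    public static void main(String[] args) {
        SafeTimeSketch metaStorage = new SafeTimeSketch(32); // the next command gets revision 33

        Map<String, String> command = new HashMap<>();
        command.put("schema.latest", "5");
        command.put("schema.5.data", "...");
        command.put("schema.5.dd", "PT30S");

        long rev = metaStorage.apply(command, LocalTime.of(10, 10, 0));
        System.out.println("revision = " + rev);                     // revision = 33
        System.out.println("Tu = " + metaStorage.activationTime(5)); // Tu = 10:10:30
    }
}
```

Because Tm is looked up by revision rather than recomputed, a restarted node gets the same Tu as every other node, provided the entry for revision 33 has not been compacted. That is the reason the issue asks for revision reservations.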