[jira] [Updated] (IGNITE-19047) Implement metastorage and cmg raft log re-application in async manner

Alexander Lapin (Jira) Mon, 26 Jun 2023 00:18:03 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alexander Lapin updated IGNITE-19047:
-------------------------------------
    Description: 
h3. Motivation

To prevent inconsistent reads, raft replays the local stable raft log on raft 
node restart (see org.apache.ignite.raft.jraft.core.NodeImpl#init). However, 
it turned out that this approach led to a deadlock:

1. The table restart waited for the raft log to be replayed.

2. A log record waited for the index to be ready.

3. The index manager waited for the table to be created in order to create the 
index on top of it, and that index is what step 2 waits for.
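
The circular wait above can be modelled with three futures. This is only an 
illustrative sketch: the class and field names are hypothetical and do not 
exist in Ignite, and the real components are wired together differently.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of the three-way wait; not Ignite code. */
final class StartupCycleSketch {
    final CompletableFuture<Void> tableStarted = new CompletableFuture<>();
    final CompletableFuture<Void> indexReady;
    final CompletableFuture<Void> logReplayed;

    StartupCycleSketch() {
        // Step 3: the index can be created only after the table exists.
        indexReady = tableStarted.thenRun(() -> { });
        // Step 2: log replay finishes only after the index is ready.
        logReplayed = indexReady.thenRun(() -> { });
        // Step 1: the table start finishes only after log replay, closing the cycle.
        logReplayed.thenRun(() -> tableStarted.complete(null));
    }

    /** True if at least one of the three stages has completed. */
    boolean anyDone() {
        return tableStarted.isDone() || indexReady.isDone() || logReplayed.isDone();
    }

    public static void main(String[] args) {
        StartupCycleSketch s = new StartupCycleSketch();
        System.out.println(s.anyDone()); // false: every stage waits for another one

        // Letting the table start without waiting for replay breaks the cycle.
        s.tableStarted.complete(null);
        System.out.println(s.logReplayed.isDone()); // true
    }
}
```

No stage can complete while every stage depends on another one; as soon as one 
dependency is removed, the remaining stages complete in order.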

This issue was addressed in https://issues.apache.org/jira/browse/IGNITE-18203 
and later slightly updated in 
https://issues.apache.org/jira/browse/IGNITE-19022: a logApplyComplition 
future was introduced that completes once all local stable log records have 
been applied (org.apache.ignite.raft.jraft.core.NodeImpl):

 
{code:java}
public boolean init(final NodeOptions opts) {
    ...
    // Wait committed.
    long commitIdx = logManager.getLastLogIndex();

    CompletableFuture<Long> logApplyComplition = new CompletableFuture<>();

    if (commitIdx > fsmCaller.getLastAppliedIndex()) {
        LastAppliedLogIndexListener lnsr = new LastAppliedLogIndexListener() {
            @Override
            public void onApplied(long lastAppliedLogIndex) {
                if (lastAppliedLogIndex >= commitIdx) {
                    logApplyComplition.complete(lastAppliedLogIndex);
                    ...
                }
            }
        };
    ...
}{code}
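
The completion logic of that listener can be tried in isolation. The sketch 
below is self-contained and uses hypothetical names (AppliedIndexListener, 
LogReplaySketch, replayListener) rather than the real Ignite types; it only 
shows a future being completed once the applied index reaches the commit index.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the listener used in NodeImpl#init. */
interface AppliedIndexListener {
    void onApplied(long lastAppliedLogIndex);
}

final class LogReplaySketch {
    /** Returns a listener that completes replayDone once commitIdx has been applied. */
    static AppliedIndexListener replayListener(long commitIdx, CompletableFuture<Long> replayDone) {
        return lastAppliedLogIndex -> {
            if (lastAppliedLogIndex >= commitIdx) {
                // complete() does nothing on an already completed future,
                // so later callbacks are harmless.
                replayDone.complete(lastAppliedLogIndex);
            }
        };
    }

    public static void main(String[] args) {
        CompletableFuture<Long> replayDone = new CompletableFuture<>();
        AppliedIndexListener listener = replayListener(3, replayDone);

        listener.onApplied(1);
        System.out.println(replayDone.isDone()); // false: still replaying

        listener.onApplied(3);
        System.out.println(replayDone.join()); // 3
    }
}
```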
Depending on the replication group type, logApplyComplition either blocks 
access to the data being restored until it completes, or is awaited 
synchronously on start.

For meta storage and CMG the latter is used (org.apache.ignite.internal.raft.Loza).

 
{code:java}
public CompletableFuture<RaftGroupService> startRaftGroupNode(
        RaftNodeId nodeId,
        PeersAndLearners configuration,
        RaftGroupListener lsnr,
        RaftGroupEventsListener eventsLsnr
) throws NodeStoppingException {
    ...
    // TODO: https://issues.apache.org/jira/browse/IGNITE-19047 Meta storage
    // and cmg raft log re-application in async manner
    raftServer.raftNodeReadyFuture(nodeId.groupId()).join();
    ...
} {code}
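
The difference between awaiting the future with join() and composing on it can 
be sketched as follows. All names here (AsyncStartSketch, Service, 
startBlocking, startAsync) are hypothetical and only illustrate the general 
CompletableFuture pattern, not the actual change to Loza.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncStartSketch {
    /** Hypothetical service handle; stands in for RaftGroupService. */
    static final class Service {
        final String groupId;

        Service(String groupId) {
            this.groupId = groupId;
        }
    }

    /** Blocking variant: the calling thread waits until replay has finished. */
    static CompletableFuture<Service> startBlocking(String groupId, CompletableFuture<Long> replayDone) {
        replayDone.join();
        return CompletableFuture.completedFuture(new Service(groupId));
    }

    /** Async variant: returns immediately; the result completes after replay. */
    static CompletableFuture<Service> startAsync(String groupId, CompletableFuture<Long> replayDone) {
        return replayDone.thenApply(appliedIdx -> new Service(groupId));
    }

    public static void main(String[] args) {
        CompletableFuture<Long> replayDone = new CompletableFuture<>();
        CompletableFuture<Service> service = startAsync("metastorage", replayDone);
        System.out.println(service.isDone()); // false: replay is still running

        replayDone.complete(42L);
        System.out.println(service.join().groupId); // metastorage
    }
}
```

With the async variant, a failure of the replay future propagates to the 
returned future instead of being thrown from the start method.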
For partitions, access is blocked until local recovery is finished; see the 
usages of org.apache.ignite.internal.replicator.Replica#ready for details.
h3. Definition of Done
 * Implement meta storage and CMG raft log re-application in an async manner, 
as is already done for partitions.
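
One way to picture access being gated on recovery is a store whose reads are 
chained onto a readiness future. This is a rough sketch under assumptions: 
GatedStoreSketch and its methods are hypothetical, and it does not reproduce 
how Replica#ready is actually used.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical store whose reads are deferred until local recovery is done. */
final class GatedStoreSketch {
    private final CompletableFuture<Void> ready = new CompletableFuture<>();
    private final Map<String, String> data = new ConcurrentHashMap<>();

    /** Applies a replayed log record; allowed before the store is ready. */
    void applyDuringRecovery(String key, String value) {
        data.put(key, value);
    }

    /** Marks local recovery as finished and releases pending reads. */
    void markRecovered() {
        ready.complete(null);
    }

    /** Reads a value; the returned future completes only after recovery. */
    CompletableFuture<String> get(String key) {
        return ready.thenApply(unused -> data.get(key));
    }

    public static void main(String[] args) {
        GatedStoreSketch store = new GatedStoreSketch();
        CompletableFuture<String> read = store.get("k");

        store.applyDuringRecovery("k", "v");
        System.out.println(read.isDone()); // false: recovery not finished yet

        store.markRecovered();
        System.out.println(read.join()); // v
    }
}
```

The start call never blocks in this model; only callers that need the 
recovered data wait, and they wait on a future rather than on a thread.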

 

  was:
h3. Motivation

To prevent inconsistent reads, raft replays the local stable raft log on raft 
node restart (see org.apache.ignite.raft.jraft.core.NodeImpl#init). However, 
it turned out that this approach led to a deadlock:

1. The table restart waited for the raft log to be replayed.

2. A log record waited for the index to be ready.

3. The index manager waited for the table to be created in order to create the 
index on top of it, and that index is what step 2 waits for.

This issue was addressed in https://issues.apache.org/jira/browse/IGNITE-18203 
and later slightly updated in 
https://issues.apache.org/jira/browse/IGNITE-19022: a logApplyComplition 
future was introduced that completes once all local stable log records have 
been applied (org.apache.ignite.raft.jraft.core.NodeImpl):

 
{code:java}
public boolean init(final NodeOptions opts) {
    ...
    // Wait committed.
    long commitIdx = logManager.getLastLogIndex();

    CompletableFuture<Long> logApplyComplition = new CompletableFuture<>();

    if (commitIdx > fsmCaller.getLastAppliedIndex()) {
        LastAppliedLogIndexListener lnsr = new LastAppliedLogIndexListener() {
            @Override
            public void onApplied( long lastAppliedLogIndex) {
                if (lastAppliedLogIndex >= commitIdx) {
                    logApplyComplition.complete(lastAppliedLogIndex);
                    ...
                }
            }
        };
    ...
}{code}
Depending on the replication group type, logApplyComplition either blocks 
access to the data being restored until it completes, or is awaited 
synchronously on start.

For meta storage and CMG the latter is used (org.apache.ignite.internal.raft.Loza).

 
{code:java}
public CompletableFuture<RaftGroupService> startRaftGroupNode(
        RaftNodeId nodeId,
        PeersAndLearners configuration,
        RaftGroupListener lsnr,
        RaftGroupEventsListener eventsLsnr
) throws NodeStoppingException {
    ...
    // TODO: https://issues.apache.org/jira/browse/IGNITE-19047 Meta storage
    // and cmg raft log re-application in async manner
    raftServer.raftNodeReadyFuture(nodeId.groupId()).join();
    ...
} {code}
For partitions, access is blocked until local recovery is finished; see the 
usages of org.apache.ignite.internal.replicator.Replica#ready for details.
h3. Definition of Done

 

 


> Implement metastorage and cmg raft log re-application in async manner
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-19047
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19047
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
