Re: [PR] IGNITE-26837 Introduce configuration update versioning token [ignite-3]

via GitHub Mon, 10 Nov 2025 06:48:26 -0800


sanpwc commented on code in PR #6850:
URL: https://github.com/apache/ignite-3/pull/6850#discussion_r2509287902



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -3635,6 +3673,16 @@ public void changePeersAndLearners(final Configuration 
newPeersAndLearners, long
                     return;
             }
 
+            if (this.conf.getConf().getSequenceToken() > 
newPeersAndLearners.getSequenceToken()) {
+                 LOG.info("Node {} received stale configuration for conf {}, 
existing is {}, new {}.",

Review Comment:
   According to our logging rules, we should put all params in [] at the end of 
the statement.
   Besides I'd also mention that given configuration will be ignored.



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -3635,6 +3673,16 @@ public void changePeersAndLearners(final Configuration 
newPeersAndLearners, long
                     return;
             }
 
+            if (this.conf.getConf().getSequenceToken() > 
newPeersAndLearners.getSequenceToken()) {

Review Comment:
   Why it's '>' and not '>=' ?



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -3635,6 +3673,16 @@ public void changePeersAndLearners(final Configuration 
newPeersAndLearners, long
                     return;
             }
 
+            if (this.conf.getConf().getSequenceToken() > 
newPeersAndLearners.getSequenceToken()) {
+                 LOG.info("Node {} received stale configuration for conf {}, 
existing is {}, new {}.",
+                        getNodeId(), newPeersAndLearners, 
this.conf.getConf().getSequenceToken(),  
newPeersAndLearners.getSequenceToken());
+                Status status = 
staleConfiguration(newPeersAndLearners.getSequenceToken());

Review Comment:
   Do we really need new raft error `RaftError.ESTALE `? How do you handle this 
error on the client side?



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/service/RaftGroupService.java:
##########
@@ -138,9 +142,11 @@ public interface RaftGroupService extends 
RaftCommandRunner {
      * @param peersAndLearners New peers and Learners of the Raft group.
      * @param term Current known leader term.
      *             If real raft group term will be different - configuration 
update will be skipped.
+     * @param sequenceToken Sequence token of the current change.
+     *
      * @return A future.
      */
-    CompletableFuture<Void> changePeersAndLearners(PeersAndLearners 
peersAndLearners, long term);
+    CompletableFuture<Void> changePeersAndLearners(PeersAndLearners 
peersAndLearners, long term, long sequenceToken);

Review Comment:
   I'd also mark term as deprecated for removal.



##########
modules/cluster-management/src/main/java/org/apache/ignite/internal/cluster/management/raft/CmgRaftService.java:
##########
@@ -338,10 +338,12 @@ public CompletableFuture<Void> updateLearners(long term) {
 
         if (newLearners.isEmpty()) {
             // Methods for working with learners do not support empty peer 
lists for some reason.
-            return raftService.changePeersAndLearnersAsync(newConfiguration, 
term)
+            // TODO: https://issues.apache.org/jira/browse/IGNITE-26855.
+            return raftService.changePeersAndLearnersAsync(newConfiguration, 
term,  0)

Review Comment:
   Below I did suggest/ask why it's '>' and not '>=' in 
`this.conf.getConf().getSequenceToken() > newConf.getSequenceToken()`. In case 
of proposed substitution CMG/MG reconfigurations will break.



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/entity/LogEntry.java:
##########
@@ -45,6 +45,10 @@ public class LogEntry implements Checksum {
     /** true when the log has checksum **/
     private boolean hasChecksum;
 
+    private long sequenceToken;

Review Comment:
   Do we need sequenceToken in LogEntry in order to spread information about 
new applied configuration through raft group?



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -553,16 +559,20 @@ void nextStage() {
                     LOG.info("Catch up phase to change peers was successfully 
finished "
                     + "[node={}, from peers={} to peers={}, from learners={}, 
to learners={}].",
                         this.node.getNodeId(), oldPeers, newPeers, 
oldLearners, newLearners);
-                    if (this.nchanges > 0) {
+                    if (this.nchanges > 0 || oldSequenceToken != 
sequenceToken) {

Review Comment:
   What if oldSequenceToken< sequenceToken? That seems invalid right, should we 
at least assert that?



##########
modules/partition-replicator/src/main/java/org/apache/ignite/internal/partition/replicator/network/replication/ChangePeersAndLearnersAsyncReplicaRequest.java:
##########
@@ -28,4 +28,6 @@
 public interface ChangePeersAndLearnersAsyncReplicaRequest extends 
PrimaryReplicaRequest {
     /** New peers configuration to rebalance. */
     byte[] pendingAssignments();
+
+    long sequenceToken();

Review Comment:
   I'd rather add a comment here explaining what it is, and why we need it. 



##########
modules/metastorage/src/main/java/org/apache/ignite/internal/metastorage/impl/raft/MetaStorageSnapshotStorageFactory.java:
##########
@@ -75,6 +75,8 @@ public MetaStorageSnapshotStorageFactory(KeyValueStorage 
storage) {
                 .peersList(configuration.peers())
                 .oldPeersList(configuration.oldPeers())
                 .learnersList(configuration.learners())
+                .sequenceToken(configuration.sequenceToken())
+                .oldSequenceToken(configuration.oldSequenceToken())

Review Comment:
   Here and there, why do we need oldSequenceToken?



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java:
##########
@@ -3701,11 +3768,16 @@ public Status resetPeers(final Configuration newPeers) {
                 LOG.warn("Node {} set peers need wait current conf changing.", 
getNodeId());
                 return new Status(RaftError.EBUSY, "Changing to another 
configuration");
             }
+
+            if (this.conf.getConf().getSequenceToken() > 
newPeers.getSequenceToken()) {
+                return staleConfiguration(newPeers.getSequenceToken());

Review Comment:
    I didn't get the point. Why in some cases we do log
   `LOG.info("Node {} received stale configuration for conf {}, existing is {}, 
new {}."` and in some cases don't?



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/RaftGroupConfiguration.java:
##########
@@ -40,6 +40,8 @@ public class RaftGroupConfiguration implements Serializable {
 
     private final long index;
     private final long term;
+    private final long sequenceToken;
+    private final long oldSequenceToken;

Review Comment:
   Why do we need `oldSequenceToken`?



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/conf/Configuration.java:
##########
@@ -318,4 +358,11 @@ public void diff(final Configuration rhs, final 
Configuration included, final Co
         excluded.peers = new ArrayList<>(rhs.peers);
         excluded.peers.removeAll(this.peers);
     }
+
+    /**
+     * Update the sequence token.
+     */
+    public void updateSequenceToken(long sequenceToken) {

Review Comment:
   Do we really need it? Currently it's used only within tests, right? So it's 
either @TestOnly, or even configuration should be immutable in the token 
context. 



##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaImpl.java:
##########
@@ -67,7 +68,7 @@ public class ReplicaImpl implements Replica {
 
     private final PlacementDriver placementDriver;
 
-    private final Function<ReplicationGroupId, CompletableFuture<byte[]>> 
getPendingAssignmentsSupplier;
+    private final Function<ReplicationGroupId, 
CompletableFuture<IgniteBiTuple<byte[], Long>>> getPendingAssignmentsSupplier;

Review Comment:
   IgniteBiTuple looks a bit ugly, I'd rather add some sort of wrapper.



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/entity/codec/v2/LogEntryV2CodecFactory.java:
##########
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ignite.raft.jraft.entity.codec.v2;
+
+import org.apache.ignite.raft.jraft.entity.EnumOutter.EntryType;
+import org.apache.ignite.raft.jraft.entity.codec.LogEntryCodecFactory;
+import org.apache.ignite.raft.jraft.entity.codec.LogEntryDecoder;
+import org.apache.ignite.raft.jraft.entity.codec.LogEntryEncoder;
+
+/**
+ * Log entry codec implementation V2. Extends V1 format with sequence token 
support.
+ * Data format description:
+ * <ul>
+ *     <li>Magic header, 1 byte. {@link #MAGIC}</li>
+ *     <li>Log entry type, formally a var-long, effectively 1 byte. {@link 
EntryType}</li>
+ *     <li>Log index, var-long, from 1 to 9 bytes.</li>
+ *     <li>Log term, var-long, from 1 to 9 bytes.</li>
+ *     <li>Checksum, 8 bytes, Little Endian.</li>
+ *     <li><b>Sequence token, var-long, from 1 to 9 bytes. (NEW in V2)</b></li>
+ *     <li><b>Old sequence token, var-long, from 1 to 9 bytes. (NEW in 
V2)</b></li>
+ *     <li>If type is not {@link EntryType#ENTRY_TYPE_DATA}:<ul>
+ *         <li>Number of peers, var-long. Following it is the array of 
peers:</li>
+ *         <li>Length of the peer name, 2 bytes, Little Endian.</li>
+ *         <li>ASCII characters of the peer name, according to the read 
count.</li>
+ *         <li>Peer index, var-long.</li>
+ *         <li>Peer priority + 1, var-long.</li>
+ *         <li>... same block repeats for "oldPeers", "learners" and 
"oldLearners".</li>
+ *     </ul></li>
+ *     <li>If type is not {@link EntryType#ENTRY_TYPE_CONFIGURATION}:<ul>
+ *         <li>The rest of the {@code byte[]} is the data payload.</li>
+ *     </ul></li>
+ * </ul>
+ */
+public class LogEntryV2CodecFactory implements LogEntryCodecFactory {

Review Comment:
   It's not related to AI, however what will happen if oldNode will receive and 
a LogEntry v1? Do we have a tests for that?



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/rpc/impl/cli/AddLearnersRequestProcessor.java:
##########
@@ -60,9 +61,11 @@ protected Message processRequest0(final CliRequestContext 
ctx, final AddLearners
             addingLearners.add(peer);
         }
 
+        long sequenceToken = request.sequenceToken() != null ? 
request.sequenceToken() : Configuration.NO_SEQUENCE_TOKEN;

Review Comment:
   Why do we need this code?



##########
modules/raft-api/src/test/java/org/apache/ignite/internal/raft/RaftGroupConfigurationSerializerTest.java:
##########
@@ -88,5 +93,22 @@ void v2CanBeDeserialized() {
         assertThat(restoredConfig.learners(), is(List.of("learner1", 
"learner2")));
         assertThat(restoredConfig.oldPeers(), is(List.of("old-peer1", 
"old-peer2")));
         assertThat(restoredConfig.oldLearners(), is(List.of("old-learner1", 
"old-learner2")));
+        assertThat(restoredConfig.sequenceToken(), is(0L));

Review Comment:
   Somewhere, I've asked why we do need `oldSequenceToken`. If it'll occur that 
we need one, we should also `assertThat(restoredConfig.oldSequenceToken(), 
is(0L));`. Here and there.



##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/rpc/impl/cli/ChangePeersAndLearnersRequestProcessor.java:
##########
@@ -93,6 +95,7 @@ protected Message processRequest0(final CliRequestContext 
ctx, final ChangePeers
                     .newPeersList(toStringList(conf.getPeers()))
                     .oldLearnersList(toStringList(oldLearners))
                     .newLearnersList(toStringList(conf.getLearners()))
+                    .sequenceToken(sequenceToken)

Review Comment:
   We we do have sequenceToken in response? And why only in 
ChangePeersAndLearnersResponse?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] IGNITE-26837 Introduce configuration update versioning token [ignite-3]

Reply via email to