[ https://issues.apache.org/jira/browse/IGNITE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Efremov updated IGNITE-22928:
-------------------------------------
    Ignite Flags:   (was: Docs Required,Release Notes Required)
          Labels: ignite-3  (was: )

> Fix testZoneReplicaListener
> ---------------------------
>
>                 Key: IGNITE-22928
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22928
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mikhail Efremov
>            Assignee: Mikhail Efremov
>            Priority: Major
>              Labels: ignite-3
>
> *Description*
> The test fails because \{{TestPlacementDriver}} returns only one node, which
> may not be in the replication group (at least at the start of the test) and
> thus has no replica or Raft entities. This leads to an \{{NPE}} in the
> following code from \{{PartitionReplicaLifecycleManager}}:
> {code:title=|language=java|collapse=false}
> return localServicesStartFuture
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
>         .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
>             boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
>                     replicaGrpId,
>                     stableAssignments,
>                     pendingAssignments,
>                     revision
>             );
>             if (!isLocalNodeInStableOrPending && !isLeaseholder) {
>                 return;
>             }
>             assert isLocalNodeInStableOrPending || isLeaseholder
>                     : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
>                     + ", isLeaseholder=" + isLeaseholder + "].";
>             // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
>             // Union is not required in such a case.
>             Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
>                     ? pendingAssignmentsNodes
>                     : union(pendingAssignmentsNodes, stableAssignments.nodes());
>             replicaMgr.replica(replicaGrpId)
>                     .thenApply(Replica::raftClient)
>                     .thenAccept(raftClient -> raftClient.updateConfiguration(fromAssignments(newAssignments)));
>         }), ioExecutor);
> {code}
> The node returned by \{{TestPlacementDriver}} always passes the
> \{{isLocalNodeIsPrimary}} check and all subsequent checks. If that node does
> not host the replication group, there is no replica future for it, so
> \{{replicaMgr#replica}} returns \{{null}} and an \{{NPE}} is thrown on that
> \{{null}} value.
> The solution is to add to \{{TestPlacementDriver}} a mapping from
> \{{ZonePartitionId}} to the \{{ClusterNode}} that hosts the "primary" replica.
> But there is another problem: when debugging, we can see 25 partitions for
> zone 0. Writing 25 mappings by hand is not very convenient, but zone 0 is the
> common public zone and is the subject of the test. So the solution is either
> to reduce the default zone's partition count or to add mappings for all of
> its partitions.
> *Motivation*
> The crucial test should be fixed.
> *Definition of done*
> The test passes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
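
The mapping proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual \{{TestPlacementDriver}} change: \{{PartitionKey}} is a hypothetical stand-in for \{{ZonePartitionId}}, node names (plain strings) stand in for \{{ClusterNode}}, and \{{buildPrimaryMap}} with its \{{assignmentsOf}} parameter is an invented helper. It fills the map in a loop, so 25 partitions need not be listed by hand, and it always picks a node that hosts the partition.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.IntFunction;

/** Illustrative sketch only; all names here are hypothetical, not Ignite API. */
final class PrimaryReplicaMappingSketch {
    /** Hypothetical stand-in for ZonePartitionId: a (zoneId, partitionId) pair. */
    static final class PartitionKey {
        final int zoneId;
        final int partitionId;

        PartitionKey(int zoneId, int partitionId) {
            this.zoneId = zoneId;
            this.partitionId = partitionId;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof PartitionKey)) {
                return false;
            }
            PartitionKey other = (PartitionKey) o;
            return zoneId == other.zoneId && partitionId == other.partitionId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(zoneId, partitionId);
        }
    }

    /**
     * Maps every partition of a zone to a "primary" node name.
     * assignmentsOf (assumed) returns the names of the nodes hosting a partition;
     * the first hosting node is chosen, so the primary always has a replica.
     */
    static Map<PartitionKey, String> buildPrimaryMap(
            int zoneId,
            int partitionCount,
            IntFunction<List<String>> assignmentsOf
    ) {
        Map<PartitionKey, String> primaries = new HashMap<>();
        for (int partId = 0; partId < partitionCount; partId++) {
            List<String> hosts = assignmentsOf.apply(partId);
            if (hosts.isEmpty()) {
                throw new IllegalStateException("No hosting nodes for partition " + partId);
            }
            primaries.put(new PartitionKey(zoneId, partId), hosts.get(0));
        }
        return primaries;
    }

    public static void main(String[] args) {
        // Zone 0 with 25 partitions, spread over three made-up nodes.
        Map<PartitionKey, String> primaries =
                buildPrimaryMap(0, 25, partId -> List.of("node-" + (partId % 3)));
        System.out.println(primaries.get(new PartitionKey(0, 4))); // prints "node-1"
    }
}
```

A test placement driver could then look up the primary per replication group instead of returning one fixed node for every group.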