[ 
https://issues.apache.org/jira/browse/IGNITE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22928:
-------------------------------------
    Ignite Flags:   (was: Docs Required,Release Notes Required)
          Labels: ignite-3  (was: )

> Fix testZoneReplicaListener
> ---------------------------
>
>                 Key: IGNITE-22928
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22928
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mikhail Efremov
>            Assignee: Mikhail Efremov
>            Priority: Major
>              Labels: ignite-3
>
> *Description*
> The problem with the test is that {{TestPlacementDriver}} returns only one node, 
> which may not be in the replication group, at least at the start of the test, and 
> thus has no replica or Raft entities. This leads to an {{NPE}} in the following 
> code from {{PartitionReplicaLifecycleManager}}:
> {code:title=|language=java|collapse=false}
> return localServicesStartFuture
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
>         .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
>             boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
>                     replicaGrpId,
>                     stableAssignments,
>                     pendingAssignments,
>                     revision
>             );
>             if (!isLocalNodeInStableOrPending && !isLeaseholder) {
>                 return;
>             }
>             assert isLocalNodeInStableOrPending || isLeaseholder
>                     : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
>                     + ", isLeaseholder=" + isLeaseholder + "].";
>             // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
>             // Union is not required in such a case.
>             Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
>                     ? pendingAssignmentsNodes
>                     : union(pendingAssignmentsNodes, stableAssignments.nodes());
>             replicaMgr.replica(replicaGrpId)
>                     .thenApply(Replica::raftClient)
>                     .thenAccept(raftClient -> raftClient.updateConfiguration(fromAssignments(newAssignments)));
>         }), ioExecutor);
> {code}
> The node returned by {{TestPlacementDriver}} always passes the 
> {{isLocalNodeIsPrimary}} check and therefore all the following checks. But if that 
> node doesn't host the replication group, there is no replica future, so 
> {{replicaMgr#replica}} returns {{null}}, and the chained call on that 
> {{null}} value throws an {{NPE}}.
> The solution is to add to {{TestPlacementDriver}} a mapping from 
> {{ZonePartitionId}} to the {{ClusterNode}} that hosts the "primary" replica. But 
> there is another problem: in the debugger we can see 25 partitions for zone 0. 
> Writing 25 mappings by hand is not very convenient, but zone 0 is the common 
> public zone and is the subject of the test. So the solution is either to reduce 
> the default zone's partition count or to add mappings for all its partitions.
> *Motivation*
> The crucial test should be fixed.
> *Definition of done*
> The test passes.
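
The mapping proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual {{TestPlacementDriver}} change: the {{ZonePartitionId}} and {{ClusterNode}} records below are simplified stand-ins for the Ignite classes, while {{MappedPlacementDriver}}, {{setPrimary}}, {{isPrimary}} and {{forZone}} are hypothetical names. Filling the map in a loop avoids writing 25 mappings by hand.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrimaryReplicaMappingSketch {
    /** Simplified stand-in for ZonePartitionId; the fields are assumed for illustration. */
    record ZonePartitionId(int zoneId, int partitionId) {}

    /** Simplified stand-in for ClusterNode; only a name is modelled. */
    record ClusterNode(String name) {}

    /** Hypothetical placement driver that answers per partition instead of with one fixed node. */
    static class MappedPlacementDriver {
        private final Map<ZonePartitionId, ClusterNode> primaries = new HashMap<>();

        void setPrimary(ZonePartitionId partition, ClusterNode node) {
            primaries.put(partition, node);
        }

        /** True only if the given node was registered as primary for this partition. */
        boolean isPrimary(ZonePartitionId partition, ClusterNode node) {
            return node.equals(primaries.get(partition));
        }
    }

    /** Registers a primary for every partition of the zone, so no mapping is written by hand. */
    static MappedPlacementDriver forZone(int zoneId, int partitions, List<ClusterNode> hosts) {
        MappedPlacementDriver driver = new MappedPlacementDriver();
        for (int p = 0; p < partitions; p++) {
            // Round-robin is an assumption; a real test must pick a node from the partition's assignments.
            driver.setPrimary(new ZonePartitionId(zoneId, p), hosts.get(p % hosts.size()));
        }
        return driver;
    }

    public static void main(String[] args) {
        ClusterNode n0 = new ClusterNode("node-0");
        ClusterNode n1 = new ClusterNode("node-1");
        MappedPlacementDriver driver = forZone(0, 25, List.of(n0, n1));

        System.out.println(driver.isPrimary(new ZonePartitionId(0, 3), n1)); // true
        System.out.println(driver.isPrimary(new ZonePartitionId(0, 3), n0)); // false
    }
}
```

With such a mapping, a node that does not host a partition would no longer pass the primary check for it, so the lifecycle manager would return early instead of reaching {{replicaMgr#replica}}.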



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
