[ https://issues.apache.org/jira/browse/IGNITE-22774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Efremov updated IGNITE-22774: ------------------------------------- Description: *Description* The goal is to remove {{raftClient}} from {{ItTxTestCluster}} due to all raft-clients are created inside replcas through {{ReplicaManager#startReplica}}, and all usages of {{raftClient}} are effectively useless: this map is used in the single common case: {code:title=|language=java|collapse=false}assertSame( txTestCluster.raftClients.get(ACC_TABLE_NAME).get(0).clusterService(), txTestCluster.getLeader(ACC_TABLE_NAME).service() ); {code} Effectively, the first statement get raft-client for a table and it's 0th partition, then extract it's {{ClusterService}}. The second statement inside of {{#getLeader}} do the similar thing: {code:title=|language=java|collapse=false}protected Loza getLeader(String tableName) { var services = raftClients.get(tableName); Peer leader = services.get(0).leader(); assertNotNull(leader); return raftServers.get(leader.consistentId()); } {code} Thus, the second statement means that we get a {{ClusterService}} from {{Loza}} instance that related to leader's raft-client. But when we put clients inside the map, we put only raft-clients from leaders of partition's replication group: {code:title=|language=java|collapse=false}if (startClient) { RaftGroupService service = RaftGroupServiceImpl .start(grpId, client, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); clients.put(p, service); } else { // Create temporary client to find a leader address. ClusterService tmpSvc = cluster.get(0); RaftGroupService service = RaftGroupServiceImpl .start(grpId, tmpSvc, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); Peer leader = service.leader(); service.shutdown(); ClusterService leaderSrv = cluster.stream() .filter(cluster -> cluster.topologyService().localMember().name().equals(leader.consistentId())) .findAny() .orElseThrow(); RaftGroupService leaderClusterSvc = RaftGroupServiceImpl .start(grpId, leaderSrv, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); clients.put(p, leaderClusterSvc); } {code} The first branch relates to tests that requres separated node as transaction coordinator and legacy approach demanded for separated raft-client. Now it all works on replication layer and the first branch will be fully removed from [https://issues.apache.org/jira/browse/IGNITE-22691]. Then, the second branch create tempoaral raft-client just to find out a leader peer, and then get leader's {{ClusterService}} and create an instance of raft-client based on found service. Again, after beforementioned ticket there wouldn't any raft-clients instantinations, but still the logic after this branch is get a leader's raft-client and save it to the map. Then, the first statement get a leader {{ClusterService}} from the map for given table and 0th partition, and then the second statement get {{Loza}} instance for leader's node consistent ID and and ask it for a {{ClusterService}}. The both services are always identical because it's the same replication layer leader node. Then, the assertion is redundand and all the code with {{raftClients}} and {{#leader()}} are the same. The only place that didn't mention, but shouldn't be removed, but rewritten is {{getLeaderId}} that is required by {{ItTxDistributedTestThreeNodesThreeReplicas#testPrimaryReplicaDirectUpdateForExplicitTxn}} test, returns more explict by method's name {{Peer}} type. The test is internally bloking message handling on RAFT replication layer before entity will be written in RAFT-log, and then it must be done exactly on RAFT-group leader. Then, taking into the account {{raftClients}} removal, the method's body should be: {code:title=|language=java|collapse=false}int partId = 0; int partId = 0; return replicaManagers.get(extractConsistentId(cluster.get(partId))) .replica(new TablePartitionId(tables.get(tableName).tableId(), partId)) .thenApply(replica -> replica.raftClient().leader()) .join(); {code} The function {{extractConsistentId(cluster.get(partId))}} is equivalent to {{cluster.get(partId).topologyService().localMember().name()}} and {{tables}} is a mapping of tables' names to {{TableImpl}} instances that are inserted to the map in {{ItTxTestCluster#startTable}} right after instantiation. *Motivation* The map and related code are redundand, aren't working as intented and should be removed. *Definition of done* # {{raftClients}} map is removed. # {{getLeader}} and related assertions are removed. # {{getLeaderId}} method is rewriten as provided above. was: *Description* The goal is to remove {{raftClient}} from {{ItTxTestCluster}} due to all raft-clients are created inside replcas through {{ReplicaManager#startReplica}}, and all usages of {{raftClient}} are effectively useless: this map is used in the single common case: {code:title=|language=java|collapse=false}assertSame( txTestCluster.raftClients.get(ACC_TABLE_NAME).get(0).clusterService(), txTestCluster.getLeader(ACC_TABLE_NAME).service() ); {code} Effectively, the first statement get raft-client for a table and it's 0th partition, then extract it's {{ClusterService}}. The second statement inside of {{#getLeader}} do the similar thing: {code:title=|language=java|collapse=false}protected Loza getLeader(String tableName) { var services = raftClients.get(tableName); Peer leader = services.get(0).leader(); assertNotNull(leader); return raftServers.get(leader.consistentId()); } {code} Thus, the second statement means that we get a {{ClusterService}} from {{Loza}} instance that related to leader's raft-client. But when we put clients inside the map, we put only raft-clients from leaders of partition's replication group: {code:title=|language=java|collapse=false}if (startClient) { RaftGroupService service = RaftGroupServiceImpl .start(grpId, client, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); clients.put(p, service); } else { // Create temporary client to find a leader address. ClusterService tmpSvc = cluster.get(0); RaftGroupService service = RaftGroupServiceImpl .start(grpId, tmpSvc, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); Peer leader = service.leader(); service.shutdown(); ClusterService leaderSrv = cluster.stream() .filter(cluster -> cluster.topologyService().localMember().name().equals(leader.consistentId())) .findAny() .orElseThrow(); RaftGroupService leaderClusterSvc = RaftGroupServiceImpl .start(grpId, leaderSrv, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller) .get(5, TimeUnit.SECONDS); clients.put(p, leaderClusterSvc); } {code} The first branch relates to tests that requres separated node as transaction coordinator and legacy approach demanded for separated raft-client. Now it all works on replication layer and the first branch will be fully removed from [https://issues.apache.org/jira/browse/IGNITE-22691]. Then, the second branch create tempoaral raft-client just to find out a leader peer, and then get leader's {{ClusterService}} and create an instance of raft-client based on found service. Again, after beforementioned ticket there wouldn't any raft-clients instantinations, but still the logic after this branch is get a leader's raft-client and save it to the map. Then, the first statement get a leader {{ClusterService}} from the map for given table and 0th partition, and then the second statement get {{Loza}} instance for leader's node consistent ID and and ask it for a {{ClusterService}}. The both services are always identical because it's the same replication layer leader node. Then, the assertion is redundand and all the code with {{raftClients}} and {{#leader()}} are the same. The only place that didn't mention, but shouldn't be removed, but rewritten is {{getLeaderId}} that is required by {{ItTxDistributedTestThreeNodesThreeReplicas#testPrimaryReplicaDirectUpdateForExplicitTxn}} test, returns more explict by method's name {{Peer}} type. The test is internally bloking message handling on RAFT replication layer before entity will be written in RAFT-log, and then it must be done exactly on RAFT-group leader. Then, taking into the account {{raftClients}} removal, the method's body should be: {code:title=|language=java|collapse=false}int partId = 0; reutrn replicaManagers.get(extractConsistentId(cluster.get(partId))) .replica(new TablePartitionId(tables.get(tableName), partId)) .thenApply(replica -> replica.raftClient().leader()); {code} The function {{extractConsistentId(cluster.get(partId))}} is equivalent to {{cluster.get(partId).topologyService().localMember().name()}} and {{tables}} is a mapping of tables' names to {{TableImpl}} instances that are inserted to the map in {{ItTxTestCluster#startTable}} right after instantiation. *Motivation* The map and related code are redundand, aren't working as intented and should be removed. *Definition of done* # {{raftClients}} map is removed. # {{getLeader}} and related assertions are removed. # {{getLeaderId}} method is rewriten as provided above. > Remove raftClient map from ItTxTestCluster > ------------------------------------------ > > Key: IGNITE-22774 > URL: https://issues.apache.org/jira/browse/IGNITE-22774 > Project: Ignite > Issue Type: Improvement > Reporter: Mikhail Efremov > Assignee: Mikhail Efremov > Priority: Major > Labels: ignite-3 > > *Description* > The goal is to remove {{raftClient}} from {{ItTxTestCluster}} due to all > raft-clients are created inside replcas through > {{ReplicaManager#startReplica}}, and all usages of {{raftClient}} are > effectively useless: this map is used in the single common case: > {code:title=|language=java|collapse=false}assertSame( > txTestCluster.raftClients.get(ACC_TABLE_NAME).get(0).clusterService(), > txTestCluster.getLeader(ACC_TABLE_NAME).service() > ); > {code} > Effectively, the first statement get raft-client for a table and it's 0th > partition, then extract it's {{ClusterService}}. The second statement inside > of {{#getLeader}} do the similar thing: > {code:title=|language=java|collapse=false}protected Loza getLeader(String > tableName) { > var services = raftClients.get(tableName); > Peer leader = services.get(0).leader(); > assertNotNull(leader); > return raftServers.get(leader.consistentId()); > } > {code} > Thus, the second statement means that we get a {{ClusterService}} from > {{Loza}} instance that related to leader's raft-client. > But when we put clients inside the map, we put only raft-clients from leaders > of partition's replication group: > {code:title=|language=java|collapse=false}if (startClient) { > RaftGroupService service = RaftGroupServiceImpl > .start(grpId, client, FACTORY, raftConfig, membersConf, true, > executor, commandsMarshaller) > .get(5, TimeUnit.SECONDS); > clients.put(p, service); > } else { > // Create temporary client to find a leader address. > ClusterService tmpSvc = cluster.get(0); > RaftGroupService service = RaftGroupServiceImpl > .start(grpId, tmpSvc, FACTORY, raftConfig, membersConf, true, > executor, commandsMarshaller) > .get(5, TimeUnit.SECONDS); > Peer leader = service.leader(); > service.shutdown(); > ClusterService leaderSrv = cluster.stream() > .filter(cluster -> > cluster.topologyService().localMember().name().equals(leader.consistentId())) > .findAny() > .orElseThrow(); > RaftGroupService leaderClusterSvc = RaftGroupServiceImpl > .start(grpId, leaderSrv, FACTORY, raftConfig, membersConf, true, > executor, commandsMarshaller) > .get(5, TimeUnit.SECONDS); > clients.put(p, leaderClusterSvc); > } > {code} > The first branch relates to tests that requres separated node as transaction > coordinator and legacy approach demanded for separated raft-client. Now it > all works on replication layer and the first branch will be fully removed > from [https://issues.apache.org/jira/browse/IGNITE-22691]. Then, the second > branch create tempoaral raft-client just to find out a leader peer, and then > get leader's {{ClusterService}} and create an instance of raft-client based > on found service. Again, after beforementioned ticket there wouldn't any > raft-clients instantinations, but still the logic after this branch is get a > leader's raft-client and save it to the map. > Then, the first statement get a leader {{ClusterService}} from the map for > given table and 0th partition, and then the second statement get {{Loza}} > instance for leader's node consistent ID and and ask it for a > {{ClusterService}}. The both services are always identical because it's the > same replication layer leader node. Then, the assertion is redundand and all > the code with {{raftClients}} and {{#leader()}} are the same. > The only place that didn't mention, but shouldn't be removed, but rewritten > is {{getLeaderId}} that is required by > {{ItTxDistributedTestThreeNodesThreeReplicas#testPrimaryReplicaDirectUpdateForExplicitTxn}} > test, returns more explict by method's name {{Peer}} type. The test is > internally bloking message handling on RAFT replication layer before entity > will be written in RAFT-log, and then it must be done exactly on RAFT-group > leader. Then, taking into the account {{raftClients}} removal, the method's > body should be: > {code:title=|language=java|collapse=false}int partId = 0; > int partId = 0; > return replicaManagers.get(extractConsistentId(cluster.get(partId))) > .replica(new > TablePartitionId(tables.get(tableName).tableId(), partId)) > .thenApply(replica -> replica.raftClient().leader()) > .join(); > {code} > The function {{extractConsistentId(cluster.get(partId))}} is equivalent to > {{cluster.get(partId).topologyService().localMember().name()}} and {{tables}} > is a mapping of tables' names to {{TableImpl}} instances that are inserted to > the map in {{ItTxTestCluster#startTable}} right after instantiation. > *Motivation* > The map and related code are redundand, aren't working as intented and should > be removed. > *Definition of done* > # {{raftClients}} map is removed. > # {{getLeader}} and related assertions are removed. > # {{getLeaderId}} method is rewriten as provided above. -- This message was sent by Atlassian Jira (v8.20.10#820010)