[jira] [Assigned] (IGNITE-15889) Add 'contains' method to Record API

2024-01-15 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-15889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov reassigned IGNITE-15889:


Assignee: Mikhail Efremov

> Add 'contains' method to Record API
> ---
>
> Key: IGNITE-15889
> URL: https://issues.apache.org/jira/browse/IGNITE-15889
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Andrey Mashenkov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3, newbie
>
> There is no method in the Record API with the same semantics as the 'contains' 
> method in KV views.
> Add *RecordView.contains* similar to *KeyValueView.contains*.
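The requested addition can be sketched as a default method expressed through {{get}}. This is a minimal sketch under the assumption that "contains" means "get returns non-null"; the interfaces below are simplified stand-ins, not the real Ignite 3 API (the real view methods also take a Transaction argument, omitted here):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the Record API view; not the actual Ignite interface.
interface RecordView<R> {
    R get(R keyRec);

    // Proposed addition: true iff a record with the given key exists,
    // mirroring KeyValueView.contains semantics.
    default boolean contains(R keyRec) {
        return get(keyRec) != null;
    }
}

// Tiny in-memory implementation used only to illustrate the default method.
class InMemoryRecordView implements RecordView<String> {
    private final Map<String, String> store = new HashMap<>();

    void upsert(String rec) {
        store.put(rec, rec);
    }

    @Override
    public String get(String keyRec) {
        return store.get(keyRec);
    }
}
```

With this shape, every view implementation gets {{contains}} for free, and implementations with a cheaper existence check can override the default.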



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-23010) Raft-client's starting parametrization with start leader option

2024-08-16 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-23010:


 Summary: Raft-client's starting parametrization with start leader 
option
 Key: IGNITE-23010
 URL: https://issues.apache.org/jira/browse/IGNITE-23010
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

After IGNITE-22373 and IGNITE-22691 several tests take longer to pass because 
of unintended (for the test cases) raft-node starting processes and the 
related leader elections. This adds roughly 10-15 seconds per test. If we pass 
the "getLeader" flag to {{RaftGroupServiceImpl#start}} as {{false}}, it will 
mitigate the slowdown.

*Motivation*

The main idea is to speed up tests that don't require raft nodes and their 
internal logic.

*Definition of done*

# {{getLeader}} is passed as {{false}} from tests to 
{{RaftGroupServiceImpl#start}}, depending on the test case.
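The effect of the flag can be sketched as follows; {{RaftClient}}, {{start}}, and {{refreshLeader}} here are illustrative stand-ins, not the real {{RaftGroupServiceImpl}} API:

```java
// Hypothetical model of the getLeader flag: when false, the client skips the
// initial leader lookup (and the associated election wait), which is what
// makes tests that don't need raft internals faster.
class RaftClient {
    boolean leaderKnown;

    static RaftClient start(boolean getLeader) {
        RaftClient client = new RaftClient();
        if (getLeader) {
            client.refreshLeader();
        }
        return client;
    }

    void refreshLeader() {
        // In the real code this is a network round-trip that may wait for a
        // leader election; modeled here as a simple flag flip.
        leaderKnown = true;
    }
}
```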





[jira] [Updated] (IGNITE-23010) Raft-client's starting parametrization with getLeader option

2024-08-16 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-23010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-23010:
-
Summary: Raft-client's starting parametrization with getLeader option  
(was: Raft-client's starting parametrization with start leader option)

> Raft-client's starting parametrization with getLeader option
> 
>
> Key: IGNITE-23010
> URL: https://issues.apache.org/jira/browse/IGNITE-23010
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>
> *Description*
> After IGNITE-22373 and IGNITE-22691 several tests take longer to pass because 
> of unintended (for the test cases) raft-node starting processes and the 
> related leader elections. This adds roughly 10-15 seconds per test. If we 
> pass the "getLeader" flag to {{RaftGroupServiceImpl#start}} as {{false}}, it 
> will mitigate the slowdown.
> *Motivation*
> The main idea is to speed up tests that don't require raft nodes and their 
> internal logic.
> *Definition of done*
> # {{getLeader}} is passed as {{false}} from tests to 
> {{RaftGroupServiceImpl#start}}, depending on the test case.





[jira] [Updated] (IGNITE-23010) Raft-client's starting parametrization with getLeader option

2024-08-16 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-23010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-23010:
-
Labels: ignite-3  (was: )

> Raft-client's starting parametrization with getLeader option
> 
>
> Key: IGNITE-23010
> URL: https://issues.apache.org/jira/browse/IGNITE-23010
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-22373 and IGNITE-22691 several tests take longer to pass because 
> of unintended (for the test cases) raft-node starting processes and the 
> related leader elections. This adds roughly 10-15 seconds per test. If we 
> pass the "getLeader" flag to {{RaftGroupServiceImpl#start}} as {{false}}, it 
> will mitigate the slowdown.
> *Motivation*
> The main idea is to speed up tests that don't require raft nodes and their 
> internal logic.
> *Definition of done*
> # {{getLeader}} is passed as {{false}} from tests to 
> {{RaftGroupServiceImpl#start}}, depending on the test case.





[jira] [Created] (IGNITE-23019) Research before IGNITE-23010

2024-08-19 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-23019:


 Summary: Research before IGNITE-23010
 Key: IGNITE-23019
 URL: https://issues.apache.org/jira/browse/IGNITE-23019
 Project: Ignite
  Issue Type: Task
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

This task is a preparation for IGNITE-23010. We should try setting the 
{{getLeader}} flag passed to {{RaftGroupServiceImpl#start}} to {{false}} for 
all partition cases, then check how many tests fail. The idea is that 
partition raft groups don't really need a leader for the raft-client start. 
If that is the case, we shouldn't parameterize the flag, but just turn it off.

*Motivation*

This investigation may drastically decrease the complexity of the ticket.

*Definition of done*

Results of the investigation are provided in the comment section of this 
ticket, and a proper solution for IGNITE-23010 is proposed.





[jira] [Updated] (IGNITE-23019) Research before IGNITE-23010

2024-08-19 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-23019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-23019:
-
Labels: ignite-3  (was: )

> Research before IGNITE-23010
> 
>
> Key: IGNITE-23019
> URL: https://issues.apache.org/jira/browse/IGNITE-23019
> Project: Ignite
>  Issue Type: Task
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> This task is a preparation for IGNITE-23010. We should try setting the 
> {{getLeader}} flag passed to {{RaftGroupServiceImpl#start}} to {{false}} for 
> all partition cases, then check how many tests fail. The idea is that 
> partition raft groups don't really need a leader for the raft-client start. 
> If that is the case, we shouldn't parameterize the flag, but just turn it off.
> *Motivation*
> This investigation may drastically decrease the complexity of the ticket.
> *Definition of done*
> Results of the investigation are provided in the comment section of this 
> ticket, and a proper solution for IGNITE-23010 is proposed.





[jira] [Updated] (IGNITE-22622) PartitionReplicaLifecycleManager must use weakStart/weakStop approach

2024-08-20 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22622:
-
Description: 
*Description*

This task is a possible solution for all issues that are related to Lifecycle 
Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
Manager should have the same logic for starting and stopping zone replicas as 
Table Manager. *Motivation*

This ticket may fix \{{ItReplicaLifecycleTest}}. 

*Definition of done*

# start of zone replicas should be wrapped with weak start.
# stop of zone replicas should be wrapped with weak stop. 

> PartitionReplicaLifecycleManager must use weakStart/weakStop approach
> --
>
> Key: IGNITE-22622
> URL: https://issues.apache.org/jira/browse/IGNITE-22622
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> This task is a possible solution for all issues that are related to Lifecycle 
> Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
> Manager should have the same logic for starting and stopping zone replicas as 
> Table Manager. *Motivation*
> This ticket may fix \{{ItReplicaLifecycleTest}}. 
> *Definition of done*
> # start of zone replicas should be wrapped with weak start.
> # stop of zone replicas should be wrapped with weak stop. 
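The weak start/stop wrapping can be sketched as a guard that silently skips the operation once the component is stopping; {{LifecycleGuard}}, {{weakRun}}, and {{beginStop}} are hypothetical names, not the actual Ignite helpers:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: runs replica start/stop actions only while the
// component is alive, mirroring the weakStart/weakStop wrapping described
// in the definition of done.
class LifecycleGuard {
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    // Returns true if the action ran, false if it was skipped because the
    // component is already stopping.
    boolean weakRun(Runnable action) {
        if (stopping.get()) {
            return false;
        }
        action.run();
        return true;
    }

    void beginStop() {
        stopping.set(true);
    }
}
```

The point of the "weak" variant is that a start or stop racing with component shutdown becomes a no-op instead of an error.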





[jira] [Updated] (IGNITE-22622) PartitionReplicaLifecycleManager must use weakStart/weakStop approach

2024-08-20 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22622:
-
Description: 
*Description*

This task is a possible solution for all issues that are related to Lifecycle 
Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
Manager should have the same logic for starting and stopping zone replicas as 
Table Manager. 

*Motivation*

This ticket may fix {{{}ItReplicaLifecycleTest{}}}. 

*Definition of done*
 # start of zone replicas should be wrapped with weak start.
 # stop of zone replicas should be wrapped with weak stop. 

  was:
*Description*

This task is a possible solution for all issues that are related to Lifecycle 
Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
Manager should have the same logic for starting and stopping zone replicas as 
Table Manager. *Motivation*

This ticket may fix \{{ItReplicaLifecycleTest}}. 

*Definition of done*

# start of zone replicas should be wrapped with weak start.
# stop of zone replicas should be wrapped with weak stop. 


> PartitionReplicaLifecycleManager must use weakStart/weakStop approach
> --
>
> Key: IGNITE-22622
> URL: https://issues.apache.org/jira/browse/IGNITE-22622
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> This task is a possible solution for all issues that are related to Lifecycle 
> Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
> Manager should have the same logic for starting and stopping zone replicas as 
> Table Manager. 
> *Motivation*
> This ticket may fix {{{}ItReplicaLifecycleTest{}}}. 
> *Definition of done*
>  # start of zone replicas should be wrapped with weak start.
>  # stop of zone replicas should be wrapped with weak stop. 





[jira] [Assigned] (IGNITE-22622) PartitionReplicaLifecycleManager must use weakStart/weakStop approach

2024-08-21 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov reassigned IGNITE-22622:


Assignee: Mikhail Efremov

> PartitionReplicaLifecycleManager must use weakStart/weakStop approach
> --
>
> Key: IGNITE-22622
> URL: https://issues.apache.org/jira/browse/IGNITE-22622
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> This task is a possible solution for all issues that are related to Lifecycle 
> Manager tests like IGNITE-22928 on the one hand. On the other hand, Lifecycle 
> Manager should have the same logic for starting and stopping zone replicas as 
> Table Manager. 
> *Motivation*
> This ticket may fix {{{}ItReplicaLifecycleTest{}}}. 
> *Definition of done*
>  # start of zone replicas should be wrapped with weak start.
>  # stop of zone replicas should be wrapped with weak stop. 





[jira] [Assigned] (IGNITE-23063) ItReplicaLifecycleTest.testStableAreWrittenAfterRestart(TestInfo) is flaky

2024-08-26 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-23063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov reassigned IGNITE-23063:


Assignee: Mikhail Efremov

> ItReplicaLifecycleTest.testStableAreWrittenAfterRestart(TestInfo) is flaky
> --
>
> Key: IGNITE-23063
> URL: https://issues.apache.org/jira/browse/IGNITE-23063
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> {code:java}
> java.lang.AssertionError: java.util.concurrent.ExecutionException: 
> java.lang.AssertionError: Raft nodes [RaftNodeId [groupId=0_part_13, 
> peer=Peer [consistentId=irlt_tsawar_2, idx=0]], RaftNodeId 
> [groupId=0_part_14, peer=Peer [consistentId=irlt_tsawar_2, idx=0]]] are 
> still running on the Ignite node irlt_tsawar_2  at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>   at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:83)  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)  at 
> org.apache.ignite.internal.partition.replicator.ItReplicaLifecycleTest$Node.stop(ItReplicaLifecycleTest.java:1339)
>   at java.base/java.util.HashMap$Values.forEach(HashMap.java:977)  at 
> org.apache.ignite.internal.partition.replicator.ItReplicaLifecycleTest.after(ItReplicaLifecycleTest.java:303)
>  {code}
> Failed both on 
> [TC|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests?branch=%3Cdefault%3E&mode=builds&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&expandBuildProblemsSection=true#8421225]
>  and locally; locally it fails in 4 out of 10 runs.





[jira] [Updated] (IGNITE-23063) ItReplicaLifecycleTest.testStableAreWrittenAfterRestart(TestInfo) is flaky

2024-08-27 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-23063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-23063:
-
Epic Link: IGNITE-21389

> ItReplicaLifecycleTest.testStableAreWrittenAfterRestart(TestInfo) is flaky
> --
>
> Key: IGNITE-23063
> URL: https://issues.apache.org/jira/browse/IGNITE-23063
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> java.lang.AssertionError: java.util.concurrent.ExecutionException: 
> java.lang.AssertionError: Raft nodes [RaftNodeId [groupId=0_part_13, 
> peer=Peer [consistentId=irlt_tsawar_2, idx=0]], RaftNodeId 
> [groupId=0_part_14, peer=Peer [consistentId=irlt_tsawar_2, idx=0]]] are 
> still running on the Ignite node irlt_tsawar_2  at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>   at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:83)  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)  at 
> org.apache.ignite.internal.partition.replicator.ItReplicaLifecycleTest$Node.stop(ItReplicaLifecycleTest.java:1339)
>   at java.base/java.util.HashMap$Values.forEach(HashMap.java:977)  at 
> org.apache.ignite.internal.partition.replicator.ItReplicaLifecycleTest.after(ItReplicaLifecycleTest.java:303)
>  {code}
> Failed both on 
> [TC|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests?branch=%3Cdefault%3E&mode=builds&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&expandBuildProblemsSection=true#8421225]
>  and locally; locally it fails in 4 out of 10 runs.





[jira] [Created] (IGNITE-23109) Add a correct raft-clients shutdown process in Replica Manager

2024-08-29 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-23109:


 Summary: Add a correct raft-clients shutdown process in Replica 
Manager
 Key: IGNITE-23109
 URL: https://issues.apache.org/jira/browse/IGNITE-23109
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

For now there is no proper shutdown process for raft clients 
({{RaftGroupService}}) in {{ReplicaManager}}. The latter is the main place 
that creates raft clients and thus must shut them down properly.

There is also an intention to make {{RaftGroupService}} implement 
{{ManuallyCloseable}}. This will enable us to use proper closing methods like 
{{IgniteUtils#closeAllManually}} on the one hand, and unify the interface on 
the other. The closing process should perform both {{unsubscribeLeader}} and 
{{shutdown}} calls.

*Motivation*

The main reason is to properly close resources such as raft clients, which is 
not done at all now.

*Definition of done*

# {{RaftGroupService}} implements {{ManuallyCloseable}}.
# {{RaftGroupService#close}} consists of {{unsubscribeLeader}} and 
{{shutdown}} calls.
# Raft clients are closed in {{ReplicaManager#stopAsync}} through the 
{{IgniteUtils#closeAllManually}} method.
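The intended shutdown flow can be sketched as below. {{ManuallyCloseable}} and {{closeAllManually}} are modeled after the Ignite utilities named in the ticket, but the bodies and the stub class are illustrative, not the real implementation:

```java
import java.util.List;

// Simplified model of the manually-closeable contract from the ticket.
interface ManuallyCloseable {
    void closeManually();
}

// Stand-in for a raft client; per the definition of done, closing must
// first unsubscribe from leader elections, then release resources.
class RaftGroupServiceStub implements ManuallyCloseable {
    boolean unsubscribed;
    boolean shutDown;

    @Override
    public void closeManually() {
        unsubscribeLeader();
        shutdown();
    }

    void unsubscribeLeader() { unsubscribed = true; }

    void shutdown() { shutDown = true; }

    // Illustrative analogue of IgniteUtils#closeAllManually: close every
    // resource in order.
    static void closeAllManually(List<? extends ManuallyCloseable> resources) {
        for (ManuallyCloseable resource : resources) {
            resource.closeManually();
        }
    }
}
```

This shape is what lets {{ReplicaManager#stopAsync}} close all raft clients uniformly with a single helper call.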





[jira] [Updated] (IGNITE-19713) Start partition storages only for locally assigned partitions

2024-08-30 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-19713:
-
Description: 
*Description*

IGNITE-19363 introduces {{PartitionSet}} which is populated with all 
partitions by default (not only the assigned ones). We need to set only the 
partitions that are assigned locally.

*Motivation*

We shouldn't start resources if there is no need.

*Definition of done*

# There is a check on local assignments (the existing 
{{isLocalNodeInAssignments}} method may be used).
# If the check is true, then and only then set the partition in {{PartitionSet}}.

  was:IGNITE-19363 introduces PartitionSet which is populated with all 
partitions by default (not only assigned ones). Need to only set partitions 
that are assigned locally.


> Start partition storages only for locally assigned partitions
> -
>
> Key: IGNITE-19713
> URL: https://issues.apache.org/jira/browse/IGNITE-19713
> Project: Ignite
>  Issue Type: Bug
>Reporter: Semyon Danilov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> IGNITE-19363 introduces {{PartitionSet}} which is populated with all 
> partitions by default (not only the assigned ones). We need to set only the 
> partitions that are assigned locally.
> *Motivation*
> We shouldn't start resources if there is no need.
> *Definition of done*
> # There is a check on local assignments (the existing 
> {{isLocalNodeInAssignments}} method may be used).
> # If the check is true, then and only then set the partition in {{PartitionSet}}.
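The filtering step can be sketched as follows; {{PartitionSet}} is modeled as a {{BitSet}}, and the membership check stands in for {{isLocalNodeInAssignments}} (names and shapes are illustrative, not the actual Ignite code):

```java
import java.util.BitSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: populate the partition set only with partitions
// whose assignments include the local node, so storages are started only
// for locally assigned partitions.
class PartitionFilter {
    static BitSet localPartitions(String localNode, List<Set<String>> assignments) {
        BitSet partitionSet = new BitSet();
        for (int p = 0; p < assignments.size(); p++) {
            // Stand-in for isLocalNodeInAssignments(localNode, assignments.get(p)).
            if (assignments.get(p).contains(localNode)) {
                partitionSet.set(p);
            }
        }
        return partitionSet;
    }
}
```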





[jira] [Commented] (IGNITE-15889) Add 'contains' method to Record API

2024-01-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-15889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812174#comment-17812174
 ] 

Mikhail Efremov commented on IGNITE-15889:
--

[~amashenkov] Hi, I created the pull request, but I have several questions:
 # I have registered in TC, but I have no access to the project and can't view 
check logs without help. Can you grant me some kind of viewer access in TC? The 
username is the same as in JIRA.
 # I launched the Gradle test task; it failed, but the failing test relates to 
the logging module ({{ClientLoggingTest::testBasicLogging}}). I'm not sure 
whether I should fix it or not.

Thank you~

> Add 'contains' method to Record API
> ---
>
> Key: IGNITE-15889
> URL: https://issues.apache.org/jira/browse/IGNITE-15889
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Andrey Mashenkov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3, newbie
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is no method in the Record API with the same semantics as the 'contains' 
> method in KV views.
> Add *RecordView.contains* similar to *KeyValueView.contains*.





[jira] [Updated] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-02-27 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-21387:
-
Summary: Recovery is not possible, if  node have no needed storage profile  
(was: Recovery is not possible, if node have no needed storage profile)

> Recovery is not possible, if  node have no needed storage profile
> -
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovering node, even if 
> it shouldn't be there because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#testFilteredDataNodesPropagatedToStable, so the 
> test must be enabled or reworked in this ticket.





[jira] [Updated] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-02-27 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-21387:
-
Summary: Recovery is not possible, if node have no needed storage profile  
(was: Recovery is not possible, if  node have no needed storage profile)

> Recovery is not possible, if node have no needed storage profile
> 
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovering node, even if 
> it shouldn't be there because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#testFilteredDataNodesPropagatedToStable, so the 
> test must be enabled or reworked in this ticket.





[jira] [Updated] (IGNITE-15889) Add 'contains' method to Record API

2024-02-27 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-15889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-15889:
-
Summary: Add 'contains' method to Record  API  (was: Add 'contains' method 
to Record API)

> Add 'contains' method to Record  API
> 
>
> Key: IGNITE-15889
> URL: https://issues.apache.org/jira/browse/IGNITE-15889
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Andrey Mashenkov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3, newbie
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is no method in the Record API with the same semantics as the 'contains' 
> method in KV views.
> Add *RecordView.contains* similar to *KeyValueView.contains*.





[jira] [Updated] (IGNITE-15889) Add 'contains' method to Record API

2024-02-27 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-15889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-15889:
-
Summary: Add 'contains' method to Record API  (was: Add 'contains' method 
to Record  API)

> Add 'contains' method to Record API
> ---
>
> Key: IGNITE-15889
> URL: https://issues.apache.org/jira/browse/IGNITE-15889
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Andrey Mashenkov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3, newbie
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is no method in the Record API with the same semantics as the 'contains' 
> method in KV views.
> Add *RecordView.contains* similar to *KeyValueView.contains*.





[jira] [Updated] (IGNITE-21641) OOM in PartitionReplicaListenerTest

2024-03-01 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-21641:
-
Attachment: image-2024-03-01-20-36-08-577.png

> OOM in PartitionReplicaListenerTest
> ---
>
> Key: IGNITE-21641
> URL: https://issues.apache.org/jira/browse/IGNITE-21641
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
> Attachments: image-2024-03-01-12-22-32-053.png, 
> image-2024-03-01-20-36-08-577.png
>
>
> TC run failed with OOM
> Problem occurred after 
> PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite run, 
> {noformat}
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469368142283
> [2024-03-01T05:12:50,638][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Stopping test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true, cost: 8ms.
> [05:12:50] :   [testReadOnlyGetAfterRowRewrite(boolean, 
> boolean, boolean, boolean)] 
> org.apache.ignite.internal.table.distributed.replication.PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite([15]
>  true, true, true, false) (10m:22s)
> [05:12:50] :   [:ignite-table:test] PartitionReplicaListenerTest > 
> testReadOnlyGetAfterRowRewrite(boolean, boolean, boolean, boolean) > [15] 
> true, true, true, false STANDARD_OUT
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [15] true, true, true, false
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469386328241
> [05:18:42] :   [:ignite-table:test] java.lang.OutOfMemoryError: Java 
> heap space
> [05:18:42] :   [:ignite-table:test] Dumping heap to 
> java_pid2349600.hprof ...
> [05:19:06] :   [:ignite-table:test] Heap dump file created 
> [3645526743 bytes in 24.038 secs]
> {noformat}
> https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7898564?hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandCode+Inspection=true&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildDeploymentsSection=false
> After analysing the heap dump, it appears that the reason for the OOM is a 
> problem with Mockito.
>  !image-2024-03-01-12-22-32-053.png! 
> We need to investigate the reason for the problem with Mockito.





[jira] [Commented] (IGNITE-21641) OOM in PartitionReplicaListenerTest

2024-03-01 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822643#comment-17822643
 ] 

Mikhail Efremov commented on IGNITE-21641:
--

The main suspects here are the while(true) loop, which is a separate issue, and 
the delete-upsert-read chain that produces a large chain of row versions. 
!image-2024-03-01-20-36-08-577.png!

So, the solution may be, on the one hand, to restrict the loop's iteration 
count or set a timeout for the test, and on the other hand, to figure out 
what's wrong with delete-upsert-read. The flags case is: insertFirst=true, 
upsertAfterDelete=true, committed=true, multiple=false. The reason may be in 
the {{if (row == null)}} statement.

> OOM in PartitionReplicaListenerTest
> ---
>
> Key: IGNITE-21641
> URL: https://issues.apache.org/jira/browse/IGNITE-21641
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
> Attachments: image-2024-03-01-12-22-32-053.png, 
> image-2024-03-01-20-36-08-577.png
>
>
> TC run failed with OOM
> Problem occurred after 
> PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite run, 
> {noformat}
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469368142283
> [2024-03-01T05:12:50,638][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Stopping test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true, cost: 8ms.
> [05:12:50] :   [testReadOnlyGetAfterRowRewrite(boolean, 
> boolean, boolean, boolean)] 
> org.apache.ignite.internal.table.distributed.replication.PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite([15]
>  true, true, true, false) (10m:22s)
> [05:12:50] :   [:ignite-table:test] PartitionReplicaListenerTest > 
> testReadOnlyGetAfterRowRewrite(boolean, boolean, boolean, boolean) > [15] 
> true, true, true, false STANDARD_OUT
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [15] true, true, true, false
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469386328241
> [05:18:42] :   [:ignite-table:test] java.lang.OutOfMemoryError: Java 
> heap space
> [05:18:42] :   [:ignite-table:test] Dumping heap to 
> java_pid2349600.hprof ...
> [05:19:06] :   [:ignite-table:test] Heap dump file created 
> [3645526743 bytes in 24.038 secs]
> {noformat}
> https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7898564?hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandCode+Inspection=true&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildDeploymentsSection=false
> After analysing the heap dump, it appears that the reason for the OOM is a 
> problem with Mockito.
>  !image-2024-03-01-12-22-32-053.png! 
> We need to investigate the reason for the problem with Mockito.





[jira] [Commented] (IGNITE-21808) CREATE ZONE syntax must work with any case of zone name

2024-03-28 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831788#comment-17831788
 ] 

Mikhail Efremov commented on IGNITE-21808:
--

[~korlov] could you take a look too?

> CREATE ZONE syntax must work with any case of zone name
> ---
>
> Key: IGNITE-21808
> URL: https://issues.apache.org/jira/browse/IGNITE-21808
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3, sql
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Steps to reproduce
> If we try to run such code
> {code:java}
> cluster.doInSession(0, session -> {
> session.execute(null, "CREATE ZONE test_zone");
> session.execute(null, "CREATE TABLE TEST (id INT PRIMARY KEY, 
> name INT) WITH PRIMARY_ZONE='test_zone'");
> });
> {code}
> It fails with {{Failed to validate query. Distribution zone with name 
> 'test_zone' not found}}, but works if the zone name is in upper case.
> This behaviour must be changed.
> h3. Definition of done
>  * Zone name must support any case, not only upper case
>  * Corresponding tests must be added
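The failure above is consistent with the common SQL convention of folding unquoted identifiers to upper case. The following is a minimal sketch of that convention (an illustrative assumption, not Ignite's actual parser code): {{CREATE ZONE test_zone}} is stored as {{TEST_ZONE}}, while the string literal {{'test_zone'}} is then matched case-sensitively and fails.

```java
// Hypothetical sketch of SQL identifier normalization (illustrative only, not
// Ignite's implementation): unquoted identifiers fold to upper case, quoted
// identifiers keep their original case.
public class IdentifierCase {
    static String normalizeIdentifier(String id) {
        boolean quoted = id.length() >= 2 && id.startsWith("\"") && id.endsWith("\"");
        // Unquoted identifiers are folded to upper case; quoted ones are kept as-is.
        return quoted ? id.substring(1, id.length() - 1) : id.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(normalizeIdentifier("test_zone"));      // folded
        System.out.println(normalizeIdentifier("\"test_zone\"")); // preserved
    }
}
```

Under this assumption, the fix is to resolve the zone name with the same case rules in both places, so any case works without quoting.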



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22001) Throw a specific exception if the writeTableAssignmentsToMetastore process is interrupted

2024-04-08 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22001:


 Summary: Throw a specific exception if the 
writeTableAssignmentsToMetastore process is interrupted
 Key: IGNITE-22001
 URL: https://issues.apache.org/jira/browse/IGNITE-22001
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


h2. The problem

In {{TableManager#writeTableAssignmentsToMetastore:752}} the resulting 
{{CompletableFuture}} completes with a {{{}null{}}} value. Wherever 
{{writeTableAssignmentsToMetastore}} is used, this leads to a sudden 
{{NullPointerException}} with no clear indication of the reason for it.
h2. The solution

Instead of returning a {{{}null{}}} value, re-throw a more specific exception that 
carries the assignments list being written to the metastorage and the table 
identifier, which should help to investigate cases of the method's interruption.
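A minimal sketch of the proposed change (class and method names are illustrative assumptions, not the actual Ignite API): on interruption the future completes exceptionally with an exception carrying the table id and the assignments, instead of completing with {{null}}.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch, not Ignite's actual code: a descriptive exception that
// carries the table identifier and the assignments being written, so callers
// see the cause instead of a NullPointerException.
public class AssignmentsWriteSketch {
    static class AssignmentsWriteException extends RuntimeException {
        AssignmentsWriteException(int tableId, List<String> assignments, Throwable cause) {
            super("Failed to write assignments to metastore [tableId=" + tableId
                    + ", assignments=" + assignments + "]", cause);
        }
    }

    static CompletableFuture<Void> writeAssignments(int tableId, List<String> assignments) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        // Simulated interruption: fail with a specific exception rather than
        // completing the future with null.
        result.completeExceptionally(
                new AssignmentsWriteException(tableId, assignments, new InterruptedException()));
        return result;
    }
}
```

With this shape, a caller that joins the future gets a clear message with the table id and assignments in the cause chain rather than a bare {{NullPointerException}}.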



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-04-15 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837121#comment-17837121
 ] 

Mikhail Efremov commented on IGNITE-21387:
--

At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}}.

> Recovery is not possible, if node have no needed storage profile
> 
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovery node, even if it 
> shouldn't be there, because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#testFilteredDataNodesPropagatedToStable. So, the 
> test must be enabled or reworked in this ticket.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-04-15 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837121#comment-17837121
 ] 

Mikhail Efremov edited comment on IGNITE-21387 at 4/15/24 9:05 AM:
---

At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}};
 ** the goal of the previous bullet is to introduce the new methods 
{{isLocalNodeInAssignmentList}} and {{isLocalNodeMatchAssignment}}, which a node's 
{{TableManager}} and {{IndexManager}} should use to check whether the node belongs 
to the current stable assignments. If it doesn't, the node shouldn't create 
storages and indices: {{MvTableStorage}} and {{TxStateTableStorage}} on the table 
create event in {{TableManager#createStorages}} and {{IndexManager#startIndexAsync}} 
correspondingly.
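The check described above can be sketched as follows (hypothetical signatures, not the actual Ignite code): storages are created on a node only when that node is present in the table's stable assignments.

```java
import java.util.List;

// Illustrative sketch of the local-node assignment check; names and shapes are
// assumptions, not the real TableManager/IndexManager API.
public class LocalAssignmentCheck {
    static boolean isLocalNodeInAssignmentList(String localConsistentId, List<String> assignments) {
        // The node belongs to the stable assignments if its consistent id is listed.
        return assignments.stream().anyMatch(localConsistentId::equals);
    }

    static boolean shouldCreateStorages(String localConsistentId, List<String> stableAssignments) {
        // Create MvTableStorage / TxStateTableStorage only for assigned nodes;
        // otherwise skip storage creation on this node.
        return isLocalNodeInAssignmentList(localConsistentId, stableAssignments);
    }
}
```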
 

Current issue is related to versioned values' race:
 * {{TableManager.java:1395}} on {{tableVv.update(...)}}
 * {{IndexManager.java:270}} on {{handleMetastoreEventVv.update(...)}}

There is an {{AssertionError}} with a message like "Causality token is outdated, 
previous token 21, got 21", where the number 21 may vary.

The reason is that the metastore is called in {{createOrGetLocallyAssignmentsAsync}} 
as well as in {{{}writeTableAssignmentsToMetastore{}}}:
 * the first one calls the metastore to check whether stable assignments equal null in 
{{{}TableManager.java:1298{}}};
 * the second one calls the metastore to write stable assignments in 
{{TableManager.java:705-706}} via the {{.invoke(...)}} method.

There should be a way to call the versioned value's update before addressing the 
metastore.
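The "Causality token is outdated" assertion can be illustrated with a small sketch (assumption for illustration only, not Ignite's actual {{VersionedValue}} implementation): each update must carry a token strictly greater than the previous one, so re-applying the same token fails.

```java
// Illustrative sketch of a versioned value that rejects non-increasing
// causality tokens, reproducing the "Causality token is outdated" failure mode.
public class VersionedValueSketch {
    private long lastToken = -1;
    private Object value;

    synchronized void update(long causalityToken, Object newValue) {
        if (causalityToken <= lastToken) {
            // Mirrors the assertion message quoted in the comment above.
            throw new AssertionError("Causality token is outdated, previous token "
                    + lastToken + ", got " + causalityToken);
        }
        lastToken = causalityToken;
        value = newValue;
    }
}
```

This is why two components racing to update their versioned values with the same token (here, {{tableVv}} and {{handleMetastoreEventVv}}) can trip the assertion.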


was (Author: JIRAUSER303791):
At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}}.

> Recovery is not possible, if node have no needed storage profile
> 
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovery node, even if it 
> shouldn't be there, because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#testFilteredDataNodesPropagatedToStable. So, the 
> test must be enabled or reworked in this ticket.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-04-15 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837121#comment-17837121
 ] 

Mikhail Efremov edited comment on IGNITE-21387 at 4/15/24 1:10 PM:
---

At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}};
 ** the goal of the previous bullet is to introduce the new methods 
{{isLocalNodeInAssignmentList}} and {{isLocalNodeMatchAssignment}}, which a node's 
{{TableManager}} and {{IndexManager}} should use to check whether the node belongs 
to the current stable assignments. If it doesn't, the node shouldn't create 
storages and indices: {{MvTableStorage}} and {{TxStateTableStorage}} on the table 
create event in {{TableManager#createStorages}} and {{IndexManager#startIndexAsync}} 
correspondingly.
 

Current issue is related to versioned values' race:
 * {{TableManager.java:1395}} on {{tableVv.update(...)}}
 * {{IndexManager.java:270}} on {{handleMetastoreEventVv.update(...)}}

There is an {{AssertionError}} with a message like "Causality token is outdated, 
previous token 21, got 21", where the number 21 may vary.

The reason is that the metastore is called in {{createOrGetLocallyAssignmentsAsync}} 
as well as in {{{}writeTableAssignmentsToMetastore{}}}:
 * the first one calls the metastore to check whether stable assignments equal null in 
{{{}TableManager.java:1298{}}};
 * the second one calls the metastore to write stable assignments in 
{{TableManager.java:705-706}} via the {{.invoke(...)}} method.

There should be a way to call the versioned value's update before addressing the 
metastore.


was (Author: JIRAUSER303791):
At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}};
 ** the goal of the previous bullet is to introduce the new methods 
{{isLocalNodeInAssignmentList}} and {{isLocalNodeMatchAssignment}}, which a node's 
{{TableManager}} and {{IndexManager}} should use to check whether the node belongs 
to the current stable assignments. If it doesn't, the node shouldn't create 
storages and indices: {{MvTableStorage}} and {{TxStateTableStorage}} on the table 
create event in {{TableManager#createStorages}} and {{IndexManager#startIndexAsync}} 
correspondingly.
 

Current issue is related to versioned values' race:
 * {{TableManager.java:1395}} on {{tableVv.update(...)}}
 * {{IndexManager.java:270}} on {{handleMetastoreEventVv.update(...)}}

There is an {{AssertionError}} with a message like "Causality token is outdated, 
previous token 21, got 21", where the number 21 may vary.

The reason is that the metastore is called in {{createOrGetLocallyAssignmentsAsync}} 
as well as in {{{}writeTableAssignmentsToMetastore{}}}:
 * the first one calls the metastore to check whether stable assignments equal null in 
{{{}TableManager.java:1298{}}};
 * the second one calls the metastore to write stable assignments in 
{{TableManager.java:705-706}} via the {{.invoke(...)}} method.

There should be a way to call the versioned value's update before addressing the 
metastore.

> Recovery is not possible, if node have no needed storage profile
> 
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovery node, even if it 
> shouldn't be there, because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#te

[jira] [Comment Edited] (IGNITE-21387) Recovery is not possible, if node have no needed storage profile

2024-04-15 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837121#comment-17837121
 ] 

Mikhail Efremov edited comment on IGNITE-21387 at 4/15/24 1:10 PM:
---

At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}};
 ** the goal of the previous bullet is to introduce the new methods 
{{isLocalNodeInAssignmentList}} and {{isLocalNodeMatchAssignment}}, which a node's 
{{TableManager}} and {{IndexManager}} should use to check whether the node belongs 
to the current stable assignments. If it doesn't, the node shouldn't create 
storages and indices: {{MvTableStorage}} and {{TxStateTableStorage}} on the table 
create event in {{TableManager#createStorages}} and {{IndexManager#startIndexAsync}} 
correspondingly.
 

Current issue is related to versioned values' race:
 * {{TableManager.java:1395}} on {{tableVv.update(...)}}
 * {{IndexManager.java:270}} on {{handleMetastoreEventVv.update(...)}}

There is an {{AssertionError}} with a message like "Causality token is outdated, 
previous token 21, got 21", where the number 21 may vary.

The reason is that the metastore is called in {{createOrGetLocallyAssignmentsAsync}} 
as well as in {{{}writeTableAssignmentsToMetastore{}}}:
 * the first one calls the metastore to check whether stable assignments equal null in 
{{{}TableManager.java:1298{}}};
 * the second one calls the metastore to write stable assignments in 
{{TableManager.java:705-706}} via the {{.invoke(...)}} method.

There should be a way to call the versioned value's update before addressing the 
metastore.


was (Author: JIRAUSER303791):
At the current state, the following changes have been made:
 * In {{ItDistributionZonesFiltersTest}} the single 
testFilteredDataNodesPropagatedToStable test was divided into 3 separate tests:
 ** {{testDataNodesByFilterPropagatedToStable}} tests filtering only by the 
attributes: the specified region and storage filter;
 ** {{testDataNodesByProfilePropagatedToStable}} tests filtering only by a 
matching profile name;
 ** {{testDataNodesByFilterAndProfilePropagatedToStable}} tests both cases 
together: attribute filters and profiles.
 * In {{TableManager}} the following refactoring was done:
 ** assignments creation was extracted into the separate method 
{{createOrGetLocallyAssignmentsAsync}} along with 
{{{}writeTableAssignmentsToMetastore{}}};
 ** the goal of the previous bullet is to introduce the new methods 
{{isLocalNodeInAssignmentList}} and {{isLocalNodeMatchAssignment}}, which a node's 
{{TableManager}} and {{IndexManager}} should use to check whether the node belongs 
to the current stable assignments. If it doesn't, the node shouldn't create 
storages and indices: {{MvTableStorage}} and {{TxStateTableStorage}} on the table 
create event in {{TableManager#createStorages}} and {{IndexManager#startIndexAsync}} 
correspondingly.
 

Current issue is related to versioned values' race:
 * {{TableManager.java:1395}} on {{tableVv.update(...)}}
 * {{IndexManager.java:270}} on {{handleMetastoreEventVv.update(...)}}

There is an {{AssertionError}} with a message like "Causality token is outdated, 
previous token 21, got 21", where the number 21 may vary.

The reason is that the metastore is called in {{createOrGetLocallyAssignmentsAsync}} 
as well as in {{{}writeTableAssignmentsToMetastore{}}}:
 * the first one calls the metastore to check whether stable assignments equal null in 
{{{}TableManager.java:1298{}}};
 * the second one calls the metastore to write stable assignments in 
{{TableManager.java:705-706}} via the {{.invoke(...)}} method.

There should be a way to call the versioned value's update before addressing the 
metastore.

> Recovery is not possible, if node have no needed storage profile
> 
>
> Key: IGNITE-21387
> URL: https://issues.apache.org/jira/browse/IGNITE-21387
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Looks like any table tries to create storages on the recovery node, even if it 
> shouldn't be there, because of the zone storage profile filter.
> The issue is reproduced by 
> ItDistributionZonesFiltersTest#test

[jira] [Updated] (IGNITE-20137) testOneNodeRestartWithGap's assert throws TransactionException is failed

2024-05-10 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-20137:
-
Summary: testOneNodeRestartWithGap's assert throws TransactionException is 
failed  (was: testOneNodeRestartWithGap is flaky)

> testOneNodeRestartWithGap's assert throws TransactionException is failed
> 
>
> Key: IGNITE-20137
> URL: https://issues.apache.org/jira/browse/IGNITE-20137
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 3.0
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> Looks like sometime the stopped node is not removed from the topology



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20137) testOneNodeRestartWithGap's assert throws TransactionException is failed

2024-05-10 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-20137:
-
Description: 
*Description*

With unchanged code there are 2 ways to fail, both in the same assert that expects 
{{TransactionException}} to be thrown:
 # the assert fails because no exception is thrown at all;
 # the assert fails because the caught exception is a timeout with 
{{{}PrimaryReplicaAwaitTimeoutException{}}}.

We could remove the assert and fix the following line:
{code:java}
TableManager tableManager = unwrapTableManager(ignite1.tables());{code}
{color:#383838}Without unwrapping there will be a {{{}ClassCastException{}}}. 
With these 2 changes the test will pass with a 100% rate.{color}
 
*What to fix*

Probably, the assert should be left, but the issue needs investigation: why are 
there 2 ways to fail, and what exactly do we expect there?
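The second failure mode shows the expected exception arriving wrapped in an {{IgniteException}}. A plain-Java sketch (no Ignite dependencies, names are illustrative) of a more robust assertion approach is to walk the cause chain for the expected type instead of asserting on the exact thrown type:

```java
// Sketch: when the expected exception may be wrapped (e.g. in IgniteException
// or CompletionException), asserting the exact thrown type is brittle.
// Searching the cause chain tolerates wrapping.
public class CauseChainCheck {
    static boolean hasCause(Throwable t, Class<? extends Throwable> expected) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (expected.isInstance(cur)) {
                return true;
            }
        }
        return false;
    }
}
```

Whether the test should tolerate the timeout case at all is exactly the open question above; this only shows how to make the assertion wrapping-agnostic once the expected behaviour is decided.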
 
 

  was:Looks like sometime the stopped node is not removed from the topology


> testOneNodeRestartWithGap's assert throws TransactionException is failed
> 
>
> Key: IGNITE-20137
> URL: https://issues.apache.org/jira/browse/IGNITE-20137
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 3.0
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> *Description*
> With unchanged code there are 2 ways to fail, both in the same assert that expects 
> {{TransactionException}} to be thrown:
>  # the assert fails because no exception is thrown at all;
>  # the assert fails because the caught exception is a timeout with 
> {{{}PrimaryReplicaAwaitTimeoutException{}}}.
> We could remove the assert and fix the following line:
> {code:java}
> TableManager tableManager = unwrapTableManager(ignite1.tables());{code}
> {color:#383838}Without unwrapping there will be a {{{}ClassCastException{}}}. 
> With these 2 changes the test will pass with a 100% rate.{color}
>  
> *What to fix*
> Probably, the assert should be left, but the issue needs investigation: why are 
> there 2 ways to fail, and what exactly do we expect there?
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20137) testOneNodeRestartWithGap's assert throws TransactionException is failed

2024-05-10 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845456#comment-17845456
 ] 

Mikhail Efremov commented on IGNITE-20137:
--

Message in the first case:
{code:java}
org.opentest4j.AssertionFailedError: Expected 
org.apache.ignite.tx.TransactionException to be thrown, but nothing was thrown.
    at 
org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:152)
    at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:73)
    at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:35)
    at org.junit.jupiter.api.Assertions.assertThrows(Assertions.java:3115)
    at 
org.apache.ignite.internal.runner.app.ItIgniteNodeRestartTest.testOneNodeRestartWithGap(ItIgniteNodeRestartTest.java:1190)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
    at 
java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
    at 
java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at 
java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
    at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541){code}

> testOneNodeRestartWithGap's assert throws TransactionException is failed
> 
>
> Key: IGNITE-20137
> URL: https://issues.apache.org/jira/browse/IGNITE-20137
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 3.0
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> *Description*
> With unchanged code there are 2 ways to fail, both in the same assert that expects 
> {{TransactionException}} to be thrown:
>  # the assert fails because no exception is thrown at all;
>  # the assert fails because the caught exception is a timeout with 
> {{{}PrimaryReplicaAwaitTimeoutException{}}}.
> We could remove the assert and fix the following line:
> {code:java}
> TableManager tableManager = unwrapTableManager(ignite1.tables());{code}
> {color:#383838}Without unwrapping there will be a {{{}ClassCastException{}}}. 
> With these 2 changes the test will pass with a 100% rate.{color}
>  
> *What to fix*
> Probably, the assert should be left, but the issue needs investigation: why are 
> there 2 ways to fail, and what exactly do we expect there?
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20137) testOneNodeRestartWithGap's assert throws TransactionException is failed

2024-05-10 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845457#comment-17845457
 ] 

Mikhail Efremov commented on IGNITE-20137:
--

Message in the second case:

 
{code:java}
Unexpected exception type thrown, expected: 
<class org.apache.ignite.tx.TransactionException> but was: 
<class org.apache.ignite.lang.IgniteException>

Expected :class org.apache.ignite.tx.TransactionException
Actual   :class org.apache.ignite.lang.IgniteException

org.opentest4j.AssertionFailedError: Unexpected exception type thrown, 
expected: <class org.apache.ignite.tx.TransactionException> but was: 
<class org.apache.ignite.lang.IgniteException>

    at 
org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
    at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:67)
    at org.junit.jupiter.api.AssertThrows.assertThrows(AssertThrows.java:35)
    at org.junit.jupiter.api.Assertions.assertThrows(Assertions.java:3115)
    at 
org.apache.ignite.internal.runner.app.ItIgniteNodeRestartTest.testOneNodeRestartWithGap(ItIgniteNodeRestartTest.java:1190)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
    at 
java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
    at 
java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at 
java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
    at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
Caused by: org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:73d0cfb1-6abf-4091-a038-b660d8581e84 The primary replica await timed 
out [replicationGroupId=8_part_0, referenceTimestamp=HybridTimestamp 
[physical=2024-05-11 00:43:48:686 +0600, logical=0, 
composite=112418267377565696], currentLease=Lease [leaseholder=iinrt_tonrwg_0, 
leaseholderId=99dc8a9e-8dd2-47eb-908e-59e984a0ff54, accepted=false, 
startTime=HybridTimestamp [physical=2024-05-11 00:43:53:183 +0600, logical=0, 
composite=112418267672281088], expirationTime=HybridTimestamp 
[physical=2024-05-11 00:45:53:183 +0600, logical=0, 
composite=112418275536601088], prolongable=false, proposedCandidate=null, 
replicationGroupId=8_part_0]]
    at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:118)
    at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.lambda$convertToPublicFuture$2(IgniteExceptionMapperUtil.java:138)
    at 
java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
    at 
java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
    at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at 
java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at 
java.base/jav

[jira] [Created] (IGNITE-22222) Move ThreadLocalPartitionCommandsMarshaller to ReplicaManager module with cycle dependency fix

2024-05-14 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-2:


 Summary: Move ThreadLocalPartitionCommandsMarshaller to 
ReplicaManager module with cycle dependency fix
 Key: IGNITE-2
 URL: https://issues.apache.org/jira/browse/IGNITE-2
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

Now the marshaller is created in {{IgniteImpl}} and used only for 
{{ReplicaManager}} creation, so there is a reason to move it into 
{{{}ReplicaManager{}}}'s constructor; however, a cyclic dependency arises there.

Also, in some test classes, such as 
{{org.apache.ignite.internal.replicator.ItPlacementDriverReplicaSideTest#beforeTest}},
 we can't pass or mock this specific marshaller class.

*Definition of done*

{{ThreadLocalPartitionCommandsMarshaller}} instantiation is moved into 
{{ReplicaManager}}, and there is no other place where its instance is created.
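One common way to move construction into the consumer without creating a module cycle is dependency inversion: {{ReplicaManager}} receives a factory for a narrow marshaller interface instead of the concrete class. The sketch below is purely illustrative (the interface, factory, and method names are assumptions, not the actual Ignite API), and it also shows why tests can then pass a mock easily.

```java
import java.util.function.Supplier;

// Hypothetical sketch of breaking the cycle: the replicator side depends only
// on a narrow marshaller interface and a factory for it, so the concrete
// marshaller class no longer has to be instantiated in IgniteImpl.
public class MarshallerInjectionSketch {
    interface CommandsMarshaller {
        byte[] marshal(Object command);
    }

    static class ReplicaManager {
        private final CommandsMarshaller marshaller;

        // The factory is injected; the manager creates the marshaller itself,
        // and tests can supply a stub or mock instead.
        ReplicaManager(Supplier<CommandsMarshaller> marshallerFactory) {
            this.marshaller = marshallerFactory.get();
        }

        byte[] marshalCommand(Object cmd) {
            return marshaller.marshal(cmd);
        }
    }
}
```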

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22290) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-21 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22290:


 Summary: Moving LogSyncer from Loza and RaftServer and pass it to 
usages as IgniteComponent
 Key: IGNITE-22290
 URL: https://issues.apache.org/jira/browse/IGNITE-22290
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
{{Loza#getLogSyncer}} method.

*Motivation*
 # There is an intention to hide all raft-specific entities behind the 
replication layer, and the log syncer is an entity that sticks out. Besides, it 
is only needed by the storage engines and may be passed through {{IgniteImpl}} 
as an ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} do not 
implement the sync method properly, which looks like a reason for interface 
segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*
 # The {{Loza#getLogSyncer}} method and all its calls are removed.
 # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
 # {{LogSyncer}} is created and injected into the engines as an 
{{IgniteComponent}} in {{IgniteImpl}}.
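The interface segregation described above could look roughly like this. It is a hypothetical sketch with simplified signatures — the real {{LogSyncer}} and {{LogStorageFactory}} interfaces differ — showing that only {{DefaultLogStorageFactory}} implements both, and that engines depend on {{LogSyncer}} alone:

```java
public class LogSyncSegregation {
    interface LogSyncer {
        void sync();
    }

    interface LogStorageFactory {
        String createLogStorage(String groupId);
    }

    // Only the default factory has a meaningful sync implementation,
    // which is the stated reason for splitting the two interfaces.
    static class DefaultLogStorageFactory implements LogStorageFactory, LogSyncer {
        @Override
        public String createLogStorage(String groupId) {
            return "log-storage-" + groupId;
        }

        @Override
        public void sync() {
            // Flush the shared log storage to disk.
        }
    }

    // Engines depend only on LogSyncer, injected from the top level,
    // instead of reaching into Loza#getLogSyncer.
    static class StorageEngine {
        private final LogSyncer logSyncer;

        StorageEngine(LogSyncer logSyncer) {
            this.logSyncer = logSyncer;
        }

        void flush() {
            logSyncer.sync();
        }
    }

    public static void main(String[] args) {
        DefaultLogStorageFactory factory = new DefaultLogStorageFactory();
        StorageEngine engine = new StorageEngine(factory);
        engine.flush();
        // prints "log-storage-p0"
        System.out.println(factory.createLogStorage("p0"));
    }
}
```

With this split, implementations that cannot sync simply do not implement {{LogSyncer}}, instead of carrying an empty method.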



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22291) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-21 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22291:


 Summary: Moving LogSyncer from Loza and RaftServer and pass it to 
usages as IgniteComponent
 Key: IGNITE-22291
 URL: https://issues.apache.org/jira/browse/IGNITE-22291
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
{{Loza#getLogSyncer}} method.

*Motivation*
 # There is an intention to hide all raft-specific entities behind the 
replication layer, and the log syncer is an entity that sticks out. Besides, it 
is only needed by the storage engines and may be passed through {{IgniteImpl}} 
as an ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} do not 
implement the sync method properly, which looks like a reason for interface 
segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*
 # The {{Loza#getLogSyncer}} method and all its calls are removed.
 # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
 # {{LogSyncer}} is created and injected into the engines as an 
{{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-21 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22292:


 Summary: Moving LogSyncer from Loza and RaftServer and pass it to 
usages as IgniteComponent
 Key: IGNITE-22292
 URL: https://issues.apache.org/jira/browse/IGNITE-22292
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
{{Loza#getLogSyncer}} method.

*Motivation*
 # There is an intention to hide all raft-specific entities behind the 
replication layer, and the log syncer is an entity that sticks out. Besides, it 
is only needed by the storage engines and may be passed through {{IgniteImpl}} 
as an ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} do not 
implement the sync method properly, which looks like a reason for interface 
segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*
 # The {{Loza#getLogSyncer}} method and all its calls are removed.
 # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
 # {{LogSyncer}} is created and injected into the engines as an 
{{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-22291) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-21 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov resolved IGNITE-22291.
--
Resolution: Invalid

> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22291
> URL: https://issues.apache.org/jira/browse/IGNITE-22291
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by the storage engines and may be passed through 
> {{IgniteImpl}} as an ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly, which looks like a reason for interface 
> segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # The {{Loza#getLogSyncer}} method and all its calls are removed.
>  # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into the engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-22290) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-21 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov resolved IGNITE-22290.
--
Resolution: Invalid

> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22290
> URL: https://issues.apache.org/jira/browse/IGNITE-22290
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by the storage engines and may be passed through 
> {{IgniteImpl}} as an ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly, which looks like a reason for interface 
> segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # The {{Loza#getLogSyncer}} method and all its calls are removed.
>  # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into the engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Labels: ignite-3  (was: )

> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by the storage engines and may be passed through 
> {{IgniteImpl}} as an ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly, which looks like a reason for interface 
> segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # The {{Loza#getLogSyncer}} method and all its calls are removed.
>  # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into the engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22313) Moving RAFT-specific code to replication layer

2024-05-23 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22313:


 Summary: Moving RAFT-specific code to replication layer
 Key: IGNITE-22313
 URL: https://issues.apache.org/jira/browse/IGNITE-22313
 Project: Ignite
  Issue Type: Epic
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


An epic for tickets that move RAFT-related entities and their calls under the 
replication layer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Epic Link: IGNITE-22313  (was: IGNITE-22115)

> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by the storage engines and may be passed through 
> {{IgniteImpl}} as an ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly, which looks like a reason for interface 
> segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # The {{Loza#getLogSyncer}} method and all its calls are removed.
>  # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into the engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-05-23 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22315:


 Summary: Make raft-client starting only once and only with 
raft-client and replica together
 Key: IGNITE-22315
 URL: https://issues.apache.org/jira/browse/IGNITE-22315
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The goal is to make sure that when a partition starts on an ignite node and the 
node is in the assignments, exactly one raft-node, one raft-client and one 
replica are started, and none of these entities are started otherwise.

*Motivation*

Currently, on the one hand, there is a path in {{TableManager}} where the 
raft-node and replica are not started, but the raft-client is.

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} resolves to {{false}}: {{localMemberAssignment}} is {{null}} 
or {{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. 
Then the raft-node and replica will not be started, but regardless of the 
{{startGroupFut}} result the raft client will be started:
{code:java}
startGroupFut
        .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
            TableRaftService tableRaftService = table.internalTable().tableRaftService();

            try {
                // Return existing service if it's already started.
                return completedFuture(
                        (TopologyAwareRaftGroupService) tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
                );
            } catch (IgniteInternalException e) {
                // We use "IgniteInternalException" in accordance with the javadoc of "partitionRaftGroupService" method.
                try {
                    // TODO IGNITE-19614 This procedure takes 10 seconds if there's no majority online.
                    return raftMgr.startRaftGroupService(replicaGrpId, newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
                } catch (NodeStoppingException ex) {
                    return failedFuture(ex);
                }
            }
        }), ioExecutor)
        .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () -> {
            ((InternalTableImpl) internalTbl).tableRaftService()
                    .updateInternalTableRaftGroupService(partId, updatedRaftGroupService);

            boolean startedRaftNode = startGroupFut.join();
            if (localMemberAssignment == null || !startedRaftNode || replicaMgr.isReplicaStarted(replicaGrpId)) {
                return;
            } {code}
The code shows that the {{v}} argument ({{startGroupFut}}'s result) in the 
lambda doesn't affect the raft-client's starting, and the client is then put 
into {{TableRaftService}} regardless of whether the replica has started.

On the other hand, there are places that rely on raft-client creation on every 
node whose table ID is in the {{TableManager#tables}} map: inside 
{{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
 # the same-name method {{TableManager#handleChangePendingAssignmentEvent}}
 # {{TableManager#changePeersOnRebalance}}

Both ask the internal table for 
{{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
checks of whether the local node is eligible to start the raft entities and 
replica, or whether they are already started.
 
Then, if we try to fix the unconditional raft-client start, it will lead to an 
immediate {{IgniteInternalException}} with _"No such partition P in table T"_ 
from {{TableRaftService#partitionRaftGroupService}}. This is the problem that 
should be solved.
*Definition of done*
1. Raft-client must be started only together with raft-node and replica, and 
shouldn't be started otherwise.
2. {{TableRaftService}} shouldn't be asked for a raft-client if the local node 
isn't in the raft group.
3. Some tests may rely on the described behavior; failing tests should be 
investigated and, probably, fixed.
4. New tests are required that check that raft-node, raft-client and replica 
are started exactly once and together.
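The intended gating can be illustrated with a minimal sketch. The names here ({{startClientIfNodeStarted}}, the string stand-in for the raft client) are hypothetical; the real code would call {{raftMgr.startRaftGroupService}} in place of the placeholder. The key change is that the client-start branch consumes the {{startGroupFut}} result instead of ignoring it:

```java
import java.util.concurrent.CompletableFuture;

public class StartGating {
    // Start the raft client only if the raft node was actually started;
    // otherwise complete with null and start nothing else for this partition.
    static CompletableFuture<String> startClientIfNodeStarted(
            CompletableFuture<Boolean> startGroupFut) {
        return startGroupFut.thenCompose(startedRaftNode -> {
            if (!startedRaftNode) {
                // Node is not in the assignments: no raft-client, no replica.
                return CompletableFuture.completedFuture(null);
            }
            // Placeholder for raftMgr.startRaftGroupService(...).
            return CompletableFuture.completedFuture("raft-client");
        });
    }

    public static void main(String[] args) {
        String started = startClientIfNodeStarted(
                CompletableFuture.completedFuture(true)).join();
        String skipped = startClientIfNodeStarted(
                CompletableFuture.completedFuture(false)).join();
        // prints "raft-client / null"
        System.out.println(started + " / " + skipped);
    }
}
```

In the real fix the same condition would also guard the {{TableRaftService}} update, so no client is registered for a partition the node doesn't host.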



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-21805:
-
Epic Link: IGNITE-22313  (was: IGNITE-19170)

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact blocks us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move all RAFT-connected operations to 
> the Replica class. Moreover, they belong there semantically.
> *Definition of done*
>  * All code inside the handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same about handleChangeStableAssignmentEvent - the stop of Replica 
> must stop appropriate raft node
> All calls for 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by the replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and removed after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replica.raftClient() - which returns the replica's RaftGroupService (raft 
> client)
>  
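The temporary accessors from the implementation notes above could be sketched as follows; the types here are simplified stand-ins for the real {{ReplicaManager}}, {{Replica}} and {{RaftGroupService}}, and the string group id stands in for {{ReplicationGroupId}}:

```java
import java.util.HashMap;
import java.util.Map;

public class ReplicaAccess {
    // Hypothetical stand-in for the raft client type.
    static class RaftGroupService { }

    static class Replica {
        private final RaftGroupService raftClient = new RaftGroupService();

        // Temporary accessor: the raft client owned by this replica.
        RaftGroupService raftClient() {
            return raftClient;
        }
    }

    static class ReplicaManager {
        private final Map<String, Replica> replicas = new HashMap<>();

        void startReplica(String groupId) {
            replicas.put(groupId, new Replica());
        }

        // Temporary accessor: look up a Replica by its replication group id.
        Replica replica(String groupId) {
            return replicas.get(groupId);
        }
    }

    public static void main(String[] args) {
        ReplicaManager mgr = new ReplicaManager();
        mgr.startReplica("part_0");
        // Replaces table.internalTable().tableRaftService()
        //         .partitionRaftGroupService(...) call chains.
        // prints "true"
        System.out.println(mgr.replica("part_0").raftClient() != null);
    }
}
```

This keeps raft-client ownership inside the replica layer: callers go through {{replica(...).raftClient()}} rather than through the table's raft service.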



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22218) Remove TableRaftService in favor of using RaftGroupService from Replica instances

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22218:
-
Epic Link: IGNITE-22313  (was: IGNITE-22115)

> Remove TableRaftService in favor of using RaftGroupService from Replica 
> instances
> -
>
> Key: IGNITE-22218
> URL: https://issues.apache.org/jira/browse/IGNITE-22218
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> All {{RaftGroupService}} instances should be obtained from 
> {{Replica#raftClient}} instead of 
> {{TableRaftService#partitionRaftGroupService}}.
> *Motivation*
> The main goal is to follow the IGNITE-21805 track and localize raft-client 
> instance management inside {{ReplicaManager}} and {{Replica}}. That requires 
> removing {{TableRaftService}} as the entity that collects raft-clients.
> *Definition of done*
> 1. {{TableRaftService}} is removed.
> 2. All usages of {{TableRaftService#partitionRaftGroupService}} are replaced 
> with the {{ReplicaManager#getReplica#raftClient}} call chain.
> 3. {{TableViewInternal#leaderAssignment}} is removed from the interface.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22222) Move ThreadLocalPartitionCommandsMarshaller to ReplicaManager module with cycle dependency fix

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22222:
-
Epic Link: IGNITE-22313

> Move ThreadLocalPartitionCommandsMarshaller to ReplicaManager module with 
> cycle dependency fix
> --
>
> Key: IGNITE-22222
> URL: https://issues.apache.org/jira/browse/IGNITE-22222
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> Now the marshaller is created in {{IgniteImpl}} and used only for 
> {{ReplicaManager}} creation, so there is a reason to move it into 
> {{ReplicaManager}}'s constructor; however, doing so introduces a cyclic 
> dependency.
> Also, in some test classes, such as 
> {{org.apache.ignite.internal.replicator.ItPlacementDriverReplicaSideTest#beforeTest}},
>  we can't pass or mock this specific marshaller class.
> *Definition of done*
> {{ThreadLocalPartitionCommandsMarshaller}} instantiation is moved into 
> {{ReplicaManager}}, and there is no other place where its instance is created.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22036) Replace TableManager.changePeersOnRebalance by the broadcast ReplicaRequest

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22036:
-
Epic Link: IGNITE-22313  (was: IGNITE-19170)

> Replace TableManager.changePeersOnRebalance by the broadcast ReplicaRequest
> ---
>
> Key: IGNITE-22036
> URL: https://issues.apache.org/jira/browse/IGNITE-22036
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> To simplify the process of moving to the zone-based collocation we need to 
> remove all raft-connected code from TableManager. After IGNITE-21805 we will 
> still have:
> * TableManager.changePeersOnRebalance
> This method must be replaced by appropriate request to Replica.
> *Definition of done*
> * TableManager sends the broadcast ChangePeersReplicaRequest to all nodes 
> from the table partition assignments instead of a direct call to raft 
> changePeersAsync
> *Implementation details*
> We need to:
> * Introduce ChangePeersReplicaRequest(PeersAndLearners peersAndLearners) - 
> the new replica request type
> * Move the code from TableManager.changePeersOnRebalance to the Replica 
> itself, as a reaction to ChangePeersReplicaRequest
> * TableManager must send the ChangePeersReplicaRequest to all nodes from the 
> table partition assignments instead of a direct 
> TableManager.changePeersOnRebalance call
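The broadcast replacement described above could be sketched like this. Everything here is hypothetical (the request record, the {{send}} placeholder): it only illustrates fanning one {{ChangePeersReplicaRequest}} out to all assignment nodes instead of making a single direct {{changePeersAsync}} call:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class ChangePeersBroadcast {
    // Hypothetical request carrying the new peer configuration.
    record ChangePeersReplicaRequest(List<String> peersAndLearners) { }

    // Sketch: the request is sent to every node from the partition
    // assignments; each Replica would react by updating its raft group.
    static CompletableFuture<Void> broadcast(List<String> assignmentNodes,
            ChangePeersReplicaRequest request) {
        List<CompletableFuture<Void>> sent = assignmentNodes.stream()
                .map(node -> send(node, request))
                .collect(Collectors.toList());
        return CompletableFuture.allOf(sent.toArray(new CompletableFuture[0]));
    }

    static CompletableFuture<Void> send(String node, ChangePeersReplicaRequest r) {
        // Placeholder for the messaging service; a real Replica would call
        // changePeersAsync on its raft client when it receives the request.
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        broadcast(List.of("A", "B", "C"),
                new ChangePeersReplicaRequest(List.of("A", "B"))).join();
        System.out.println("broadcast done");
    }
}
```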



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-21805:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact blocks us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move all RAFT-connected operations to 
> the Replica class. Moreover, they belong there semantically.
> *Definition of done*
>  * All code inside the handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same about handleChangeStableAssignmentEvent - the stop of Replica 
> must stop appropriate raft node
> All calls for 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by the replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and removed after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replica.raftClient() - which returns the replica's RaftGroupService (raft 
> client)
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22315:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Make raft-client starting only once and only with raft-client and replica 
> together
> --
>
> Key: IGNITE-22315
> URL: https://issues.apache.org/jira/browse/IGNITE-22315
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The goal is to make sure that when a partition starts on an ignite node and 
> the node is in the assignments, exactly one raft-node, one raft-client and 
> one replica are started, and none of these entities are started otherwise.
> *Motivation*
> Currently, on the one hand, there is a path in {{TableManager}} where the 
> raft-node and replica are not started, but the raft-client is.
> In {{TableManager#startPartitionAndStartClient}} there is a place where 
> {{startGroupFut}} resolves to {{false}}: {{localMemberAssignment}} is 
> {{null}} or {{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns 
> {{false}}. Then the raft-node and replica will not be started, but regardless 
> of the {{startGroupFut}} result the raft client will be started:
> {code:java}
> startGroupFut
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
>             TableRaftService tableRaftService = table.internalTable().tableRaftService();
>
>             try {
>                 // Return existing service if it's already started.
>                 return completedFuture(
>                         (TopologyAwareRaftGroupService) tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
>                 );
>             } catch (IgniteInternalException e) {
>                 // We use "IgniteInternalException" in accordance with the javadoc of "partitionRaftGroupService" method.
>                 try {
>                     // TODO IGNITE-19614 This procedure takes 10 seconds if there's no majority online.
>                     return raftMgr.startRaftGroupService(replicaGrpId, newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
>                 } catch (NodeStoppingException ex) {
>                     return failedFuture(ex);
>                 }
>             }
>         }), ioExecutor)
>         .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () -> {
>             ((InternalTableImpl) internalTbl).tableRaftService()
>                     .updateInternalTableRaftGroupService(partId, updatedRaftGroupService);
>
>             boolean startedRaftNode = startGroupFut.join();
>             if (localMemberAssignment == null || !startedRaftNode || replicaMgr.isReplicaStarted(replicaGrpId)) {
>                 return;
>             } {code}
> The code shows that the {{v}} argument ({{startGroupFut}}'s result) in the 
> lambda doesn't affect the raft-client's starting, and the client is then put 
> into {{TableRaftService}} regardless of whether the replica has started.
> On the other hand, there are places that rely on raft-client creation on 
> every node whose table ID is in the {{TableManager#tables}} map: inside 
> {{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
>  # the same-name method {{TableManager#handleChangePendingAssignmentEvent}}
>  # {{TableManager#changePeersOnRebalance}}
> Both ask the internal table for 
> {{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
> checks of whether the local node is eligible to start the raft entities and 
> replica, or whether they are already started.
>  
> Then, if we try to fix the unconditional raft-client start, it will lead to 
> an immediate {{IgniteInternalException}} with _"No such partition P in table 
> T"_ from {{TableRaftService#partitionRaftGroupService}}. This is the problem 
> that should be solved.
> *Definition of done*
> 1. Raft-client must be started only together with raft-node and replica, and 
> shouldn't be started otherwise.
> 2. {{TableRaftService}} shouldn't be asked for a raft-client if the local 
> node isn't in the raft group.
> 3. Some tests may rely on the described behavior; failing tests should be 
> investigated and, probably, fixed.
> 4. New tests are required that check that raft-node, raft-client and replica 
> are started exactly once and together.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22218) Remove TableRaftService in favor of using RaftGroupService from Replica instances

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22218:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Remove TableRaftService in favor of using RaftGroupService from Replica 
> instances
> -
>
> Key: IGNITE-22218
> URL: https://issues.apache.org/jira/browse/IGNITE-22218
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> All {{RaftGroupService}} instances should be obtained from 
> {{Replica#raftClient}} instead of 
> {{TableRaftService#partitionRaftGroupService}}.
> *Motivation*
> The main goal is to follow the IGNITE-21805 track and localize raft-client 
> instance management inside {{ReplicaManager}} and {{Replica}}. That requires 
> removing {{TableRaftService}} as the entity that collects raft-clients.
> *Definition of done*
> 1. {{TableRaftService}} is removed.
> 2. All usages of {{TableRaftService#partitionRaftGroupService}} are replaced 
> with the {{ReplicaManager#getReplica#raftClient}} call chain.
> 3. {{TableViewInternal#leaderAssignment}} is removed from the interface.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by the storage engines and may be passed through 
> {{IgniteImpl}} as an ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly, which looks like a reason for interface 
> segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # The {{Loza#getLogSyncer}} method and all its calls are removed.
>  # The {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into the engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22315:
-
Description: 
*Description*

The goal is to make sure that when a partition starts on an ignite node and the 
node is in the assignments, exactly one raft-node, one raft-client and one 
replica are started, and none of these entities are started otherwise.

*Motivation*

Currently, on the one hand, there is a path in {{TableManager}} where the 
raft-node and replica are not started, but the raft-client is.

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} resolves to {{false}}: {{localMemberAssignment}} is {{null}} 
or {{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. 
Then the raft-node and replica will not be started, but regardless of the 
{{startGroupFut}} result the raft client will be started:
{code:java}
startGroupFut
        .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
            TableRaftService tableRaftService = table.internalTable().tableRaftService();

            try {
                // Return existing service if it's already started.
                return completedFuture(
                        (TopologyAwareRaftGroupService) tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
                );
            } catch (IgniteInternalException e) {
                // We use "IgniteInternalException" in accordance with the javadoc of "partitionRaftGroupService" method.
                try {
                    // TODO IGNITE-19614 This procedure takes 10 seconds if there's no majority online.
                    return raftMgr.startRaftGroupService(replicaGrpId, newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
                } catch (NodeStoppingException ex) {
                    return failedFuture(ex);
                }
            }
        }), ioExecutor)
        .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () -> {
            ((InternalTableImpl) internalTbl).tableRaftService()
                    .updateInternalTableRaftGroupService(partId, updatedRaftGroupService);

            boolean startedRaftNode = startGroupFut.join();
            if (localMemberAssignment == null || !startedRaftNode || replicaMgr.isReplicaStarted(replicaGrpId)) {
                return;
            } {code}
The code shows that the {{v}} argument ({{startGroupFut}}'s result) in the 
lambda doesn't affect the raft-client's start, and the client is then put into 
{{TableRaftService}} regardless of whether the replica started.

On the other hand, there is a place that relies on raft-client creation on 
every node whose table ID is in the {{TableManager#tables}} map: inside 
{{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
 # the method with the same name, {{TableManager#handleChangePendingAssignmentEvent}}
 # {{TableManager#changePeersOnRebalance}}

Both ask the internal table for 
{{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
checks whether the local node is suitable for starting the raft entities and 
replica, or whether they are already started.

Then, an attempt to fix the unconditional raft-client start will lead to an 
immediate {{IgniteInternalException}} with _"No such partition P in table T"_ 
from {{TableRaftService#partitionRaftGroupService}}. This is the problem that 
should be solved.
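The desired guard can be sketched with plain JDK futures. All names here ({{startPartition}}, the boolean flags, the string result) are illustrative stand-ins for the Ignite internals, not the real API:

```java
import java.util.concurrent.CompletableFuture;

public class GuardedClientStart {
    /**
     * Hypothetical sketch: the raft-client is created only when the local node
     * is in the assignments AND the raft-node actually started; otherwise
     * nothing is started on this node.
     */
    static CompletableFuture<String> startPartition(boolean inAssignments,
            CompletableFuture<Boolean> startGroupFut) {
        return startGroupFut.thenApply(startedRaftNode -> {
            if (!inAssignments || !startedRaftNode) {
                return null; // No raft-node -> no raft-client and no replica.
            }
            return "raft-client"; // Raft-node, raft-client and replica start together.
        });
    }

    public static void main(String[] args) {
        CompletableFuture<Boolean> started = CompletableFuture.completedFuture(true);
        CompletableFuture<Boolean> notStarted = CompletableFuture.completedFuture(false);

        System.out.println(startPartition(true, started).join());    // raft-client
        System.out.println(startPartition(false, started).join());   // null
        System.out.println(startPartition(true, notStarted).join()); // null
    }
}
```

In contrast, the current code above starts the client inside {{thenComposeAsync}} without consulting either flag.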


*Definition of done*
1. Raft-client must be started only together with the raft-node and replica, 
and shouldn't be started otherwise.
2. {{TableRaftService}} shouldn't be requested for a raft-client if the local 
node isn't in the raft group.
3. Some tests may rely on the described behavior; failed tests should be 
investigated and, probably, fixed.
4. New tests that check the single start of raft-node, raft-client and replica 
together are required.

  was:
*Description*

The goal is to make sure that when partition starting on an ignite-node, if the 
node in assignments, then there should be only ones raft-node, raft-client and 
replica and no any started entities otherwise.

*Motivation*

Now on the one hand in {{TableManager}} there is a path when raft-node and 
replica didn't start, but raft-client is:

In {{TableManager#startPartitionAndStartClient}} there is a place where 
startGroupFut returns false: {{localMemberAssignment}} is {{null}} or 
{{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{{}false{}}}. 
Then, raft-node and replica will not be started, but regardless of 
{{startGroupFut}} result raft client will be started:
{code:java}
startGroupFut
.thenComposeAsync(v -> inBusyLock(busyLock, () -> {
TableRaftServi

[jira] [Updated] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22315:
-
Description: 
*Description*

The goal is to make sure that, when a partition starts on an Ignite node, 
exactly one raft-node, one raft-client and one replica are started if the node 
is in the assignments, and none of these entities are started otherwise.

*Motivation*

Now, on the one hand, there is a path in {{TableManager}} where the raft-node 
and replica are not started, but the raft-client is:

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} returns {{false}}: {{localMemberAssignment}} is {{null}} or 
{{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. Then 
the raft-node and replica will not be started, but the raft-client will be 
started regardless of {{startGroupFut}}'s result:
{code:java}
startGroupFut
        .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
            TableRaftService tableRaftService = table.internalTable().tableRaftService();

            try {
                // Return existing service if it's already started.
                return completedFuture(
                        (TopologyAwareRaftGroupService) tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
                );
            } catch (IgniteInternalException e) {
                // We use "IgniteInternalException" in accordance with the javadoc of "partitionRaftGroupService" method.
                try {
                    // TODO IGNITE-19614 This procedure takes 10 seconds if there's no majority online.
                    return raftMgr.startRaftGroupService(replicaGrpId, newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
                } catch (NodeStoppingException ex) {
                    return failedFuture(ex);
                }
            }
        }), ioExecutor)
        .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () -> {
            ((InternalTableImpl) internalTbl).tableRaftService()
                    .updateInternalTableRaftGroupService(partId, updatedRaftGroupService);

            boolean startedRaftNode = startGroupFut.join();
            if (localMemberAssignment == null || !startedRaftNode || replicaMgr.isReplicaStarted(replicaGrpId)) {
                return;
            } {code}
The code shows that the {{v}} argument ({{startGroupFut}}'s result) in the 
lambda doesn't affect the raft-client's start, and the client is then put into 
{{TableRaftService}} regardless of whether the replica started.

On the other hand, there is a place that relies on raft-client creation on 
every node whose table ID is in the {{TableManager#tables}} map: inside 
{{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
 # the method with the same name, {{TableManager#handleChangePendingAssignmentEvent}}
 # {{TableManager#changePeersOnRebalance}}

Both ask the internal table for 
{{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
checks whether the local node is suitable for starting the raft entities and 
replica, or whether they are already started.

Then, an attempt to fix the unconditional raft-client start will lead to an 
immediate {{IgniteInternalException}} with _"No such partition P in table T"_ 
from {{TableRaftService#partitionRaftGroupService}}. This is the problem that 
should be solved.

*Definition of done*
1. Raft-client must be started only together with the raft-node and replica, 
and shouldn't be started otherwise.
2. {{TableRaftService}} shouldn't be requested for a raft-client if the local 
node isn't in the raft group.
3. Some tests may rely on the described behavior; failed tests should be 
investigated and, probably, fixed.
4. New tests that check the single start of raft-node, raft-client and replica 
together are required.

  was:
*Description*

The goal is to make sure that, when a partition starts on an Ignite node, 
exactly one raft-node, one raft-client and one replica are started if the node 
is in the assignments, and none of these entities are started otherwise.

*Motivation*

Now, on the one hand, there is a path in {{TableManager}} where the raft-node 
and replica are not started, but the raft-client is:

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} returns {{false}}: {{localMemberAssignment}} is {{null}} or 
{{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. Then 
the raft-node and replica will not be started, but the raft-client will be 
started regardless of {{startGroupFut}}'s result:
{code:java}
startGroupFut
.thenComposeAsync(v -> inBusyLock(busyLock, () -> {
TableRaftService tab

[jira] [Updated] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22315:
-
Description: 
*Description*

The goal is to make sure that, when a partition starts on an Ignite node, 
exactly one raft-node, one raft-client and one replica are started if the node 
is in the assignments, and none of these entities are started otherwise.

*Motivation*

Now, on the one hand, there is a path in {{TableManager}} where the raft-node 
and replica are not started, but the raft-client is:

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} returns {{false}}: {{localMemberAssignment}} is {{null}} or 
{{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. Then 
the raft-node and replica will not be started, but the raft-client will be 
started regardless of {{startGroupFut}}'s result:
{code:java}
startGroupFut
        .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
            TableRaftService tableRaftService = table.internalTable().tableRaftService();

            try {
                // Return existing service if it's already started.
                return completedFuture(
                        (TopologyAwareRaftGroupService) tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
                );
            } catch (IgniteInternalException e) {
                // We use "IgniteInternalException" in accordance with the javadoc of "partitionRaftGroupService" method.
                try {
                    // TODO IGNITE-19614 This procedure takes 10 seconds if there's no majority online.
                    return raftMgr.startRaftGroupService(replicaGrpId, newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
                } catch (NodeStoppingException ex) {
                    return failedFuture(ex);
                }
            }
        }), ioExecutor)
        .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () -> {
            ((InternalTableImpl) internalTbl).tableRaftService()
                    .updateInternalTableRaftGroupService(partId, updatedRaftGroupService);

            boolean startedRaftNode = startGroupFut.join();
            if (localMemberAssignment == null || !startedRaftNode || replicaMgr.isReplicaStarted(replicaGrpId)) {
                return;
            } {code}
The code shows that the {{v}} argument ({{startGroupFut}}'s result) in the 
lambda doesn't affect the raft-client's start, and the client is then put into 
{{TableRaftService}} regardless of whether the replica started.

On the other hand, there is a place that relies on raft-client creation on 
every node whose table ID is in the {{TableManager#tables}} map: inside 
{{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
 # the method with the same name, {{TableManager#handleChangePendingAssignmentEvent}}
 # {{TableManager#changePeersOnRebalance}}

Both ask the internal table for 
{{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
checks whether the local node is suitable for starting the raft entities and 
replica, or whether they are already started.

Then, an attempt to fix the unconditional raft-client start will lead to an 
immediate {{IgniteInternalException}} with _"No such partition P in table T"_ 
from {{TableRaftService#partitionRaftGroupService}}. This is the problem that 
should be solved.

*Definition of done*
 # Raft-client must be started only together with the raft-node and replica, 
and shouldn't be started otherwise.
 # {{TableRaftService}} shouldn't be requested for a raft-client if the local 
node isn't in the raft group.
 # Some tests may rely on the described behavior; failed tests should be 
investigated and, probably, fixed.
 # New tests that check the single start of raft-node, raft-client and replica 
together are required.

  was:
*Description*

The goal is to make sure that, when a partition starts on an Ignite node, 
exactly one raft-node, one raft-client and one replica are started if the node 
is in the assignments, and none of these entities are started otherwise.

*Motivation*

Now, on the one hand, there is a path in {{TableManager}} where the raft-node 
and replica are not started, but the raft-client is:

In {{TableManager#startPartitionAndStartClient}} there is a place where 
{{startGroupFut}} returns {{false}}: {{localMemberAssignment}} is {{null}} or 
{{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns {{false}}. Then 
the raft-node and replica will not be started, but the raft-client will be 
started regardless of {{startGroupFut}}'s result:
{code:java}
startGroupFut
.thenComposeAsync(v -> inBusyLock(busyLock, () -> {
TableRaftService ta

[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer and pass it to usages as IgniteComponent

2024-05-23 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Description: 
*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
{{Loza#getLogSyncer}} method.

*Motivation*
 # There is an intention to hide all raft-specific entities behind the 
replication layer, and the log syncer is an entity that sticks out. Besides, 
it's only needed by engines and may be passed through {{IgniteImpl}} as an 
Ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} do not 
implement the sync method properly at all, which looks like the reason for 
interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*
 # There is no more {{Loza#getLogSyncer}} method, and no calls to it.
 # {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
 # {{LogSyncer}} is created and injected into engines as an {{IgniteComponent}} 
in {{IgniteImpl}}.
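The intended segregation can be sketched as follows. The method names ({{sync}}, {{createLogStorage}}) and bodies are assumptions for illustration, not the exact Ignite signatures:

```java
// LogSyncer no longer lives inside LogStorageFactory: engines that only need
// fsync-like behavior depend on the narrow LogSyncer view alone, which can be
// injected as a component.
interface LogSyncer {
    void sync();
}

interface LogStorageFactory {
    String createLogStorage(String groupId);
}

// Only the default factory genuinely implements both responsibilities.
class DefaultLogStorageFactory implements LogStorageFactory, LogSyncer {
    @Override
    public String createLogStorage(String groupId) {
        return "log-storage-" + groupId;
    }

    @Override
    public void sync() {
        // Would flush the shared write-ahead log to disk here.
    }
}

public class LogSyncerSketch {
    public static void main(String[] args) {
        DefaultLogStorageFactory factory = new DefaultLogStorageFactory();
        LogSyncer syncer = factory; // Engines receive only this narrow view.
        syncer.sync();
        System.out.println(factory.createLogStorage("group-1"));
    }
}
```

Implementations that cannot sync properly would then simply not implement {{LogSyncer}} at all, instead of stubbing the method.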

  was:
*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}} and especially method 
{{{}Loza#getLogSyncer{}}}.

*Motivation*
 # There is an intention to hide all raft-specific entities behind replication 
layer and log syncer is an entity that sticking out. Besides it's only needs 
for engines and may be passed though {{IgniteImpl}} as an ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} are not 
implements properly sync method at all that looks like the reason for interface 
segregation between {{LogSyncer}} and {{LogStorageFactory.}}

*Definition of done*
 # There no more {{Loza#getLogSyncer}} method and its' calls.
 # {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.{{{}
{}}}
 # {{LogSyncer}} creates and injects into engines as {{IgniteComponent}} in 
{{{}IgniteIml{}}}.


> Moving LogSyncer from Loza and RaftServer and pass it to usages as 
> IgniteComponent
> --
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> {{Loza#getLogSyncer}} method.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it's only needed by engines and may be passed through {{IgniteImpl}} as an 
> Ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> implement the sync method properly at all, which looks like the reason for 
> interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # There is no more {{Loza#getLogSyncer}} method, and no calls to it.
>  # {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into engines as an {{IgniteComponent}} in 
> {{IgniteImpl}}.





[jira] [Created] (IGNITE-22330) ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is flacky

2024-05-24 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22330:


 Summary: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flacky
 Key: IGNITE-22330
 URL: https://issues.apache.org/jira/browse/IGNITE-22330
 Project: Ignite
  Issue Type: Bug
Reporter: Mikhail Efremov
 Attachments: image-2024-05-25-01-56-21-327.png

This test fails, at least on main {{4c6662}}, at a strange rate (a series of 
failures):  !image-2024-05-25-01-56-21-327.png!

The common issue is {{{}TimeoutException{}}}:

 
{code:java}
java.lang.AssertionError: java.util.concurrent.TimeoutException
    at 
org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
    at 
org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
    at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at 
org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest.java:306)
    at 
org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.testManualRebalanceIfMajorityIsLost(ItDisasterRecoveryReconfigurationTest.java:209)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
    at 
java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
    at 
java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at 
java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
    at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
Caused by: java.util.concurrent.TimeoutException
    at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
    at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
    at 
org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
    ... 32 more

java.util.concurrent.TimeoutException
    at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
    at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
    at 
org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
    at 
org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
    at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at 
org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest

[jira] [Updated] (IGNITE-22330) ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is flaky

2024-05-24 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22330:
-
Summary: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flaky  (was: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flacky)

> ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
> flaky
> --
>
> Key: IGNITE-22330
> URL: https://issues.apache.org/jira/browse/IGNITE-22330
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
> Attachments: image-2024-05-25-01-56-21-327.png
>
>
> This test fails, at least on main {{4c6662}}, at a strange rate (a series of 
> failures):  !image-2024-05-25-01-56-21-327.png!
> The common issue is {{{}TimeoutException{}}}:
>  
> {code:java}
> java.lang.AssertionError: java.util.concurrent.TimeoutException
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>     at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
>     at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
>     at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
>     at 
> org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest.java:306)
>     at 
> org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.testManualRebalanceIfMajorityIsLost(ItDisasterRecoveryReconfigurationTest.java:209)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
>     at 
> java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
>     at 
> java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
>     at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>     at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>     at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>     at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
>     at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>     at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>     at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>     at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> Caused by: java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>     at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
>     ... 32 more
> java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>     at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2

[jira] [Comment Edited] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-27 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849819#comment-17849819
 ] 

Mikhail Efremov edited comment on IGNITE-21805 at 5/27/24 8:07 PM:
---

*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all former {{RaftManager}} calls are now internal 
{{ReplicaManager}} calls.
 # The {{ReplicaManager#replica()}} method is created; it returns a future 
replica by the given replication group ID. The method allows access to 
raft-clients from outside the replication layer, but only through it.
 # The following RAFT-related entities are moved into {{ReplicaManager}}: 
{{RaftGroupOptions}}, {{RaftGroupEventsListener}} 
({{RebalanceRaftGroupEventsListener}}), {{PartitionMover}}, 
{{TopologyAwareRaftGroupServiceFactory}}, {{LogStorageFactoryCreator}}.
 # The raft-client creation place in {{TableManager}}, through 
{{ReplicaManager#startRaftClient}}, is now localized and will be removed in 
IGNITE-22315.

*Related next tickets*
 # IGNITE-22315 is about bringing the code to a state where, if and only if a 
node is in the assignments, exactly one raft-node, one raft-client and one 
replica are created. {{ReplicaManager#startRaftClient}} will be removed, since 
this is its single call site.
 # IGNITE-22218 is about removing {{TableRaftService}} from the code; its 
calls for raft-clients will be replaced with chains like 
{{replicaMgr.replica(replicaGrpId).thenCompose(raftClient -> ...)}}.
 # IGNITE-22292 is about {{ReplicaManager#getLogSyncer}} removal.

 


was (Author: JIRAUSER303791):
*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all former {{RaftManager}} calls are now internal 
{{ReplicaManager}} calls.
 # The {{ReplicaManager#replica()}} method is created; it returns a future 
replica by the given replication group ID. The method allows access to 
raft-clients from outside the replication layer, but only through it.
 # The following RAFT-related entities are moved into {{ReplicaManager}}: 
{{RaftGroupOptions}}, {{RaftGroupEventsListener}} 
({{RebalanceRaftGroupEventsListener}}), {{PartitionMover}}, 
{{TopologyAwareRaftGroupServiceFactory}}, {{LogStorageFactoryCreator}}.
 # The raft-client creation place in {{TableManager}}, through 
{{ReplicaManager#startRaftClient}}, is now localized and will be removed in 
IGNITE-22315.

*Related next tickets*
 # IGNITE-22315 is about bringing the code to a state where, if and only if a 
node is in the assignments, exactly one raft-node, one raft-client and one 
replica are created. {{ReplicaManager#startRaftClient}} will be removed, since 
this is its single call site.
 # IGNITE-22218 is about removing {{TableRaftService}} from the code; its 
calls for raft-clients will be replaced with chains like 
{{replicaMgr.replica(replicaGrpId).thenCompose(raftClient -> ...)}}.

 

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact blocks us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move the whole RAFT-connected 
> operation to the Replica class. Moreover, it belongs there semantically.
> *Definition of done*
>  * All code inside handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same applies to handleChangeStableAssignmentEvent - the stop of a 
> Replica must stop the appropriate raft node
>  * All calls to 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and remove after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replica.raftClient() - which return replica's RaftGroupService (raft 
> client)
>  
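The call shape described in the implementation notes can be sketched with stub types; {{ReplicaMgr}}, {{Replica}} and the string payloads below are simplified stand-ins for the real Ignite classes:

```java
import java.util.concurrent.CompletableFuture;

public class ReplicaChainSketch {
    // Minimal stand-in for a Replica exposing its raft client.
    static class Replica {
        String raftClient() {
            return "raft-client";
        }
    }

    // Minimal stand-in for ReplicaManager#replica(ReplicationGroupId).
    static class ReplicaMgr {
        CompletableFuture<Replica> replica(String replicaGrpId) {
            return CompletableFuture.completedFuture(new Replica());
        }
    }

    public static void main(String[] args) {
        ReplicaMgr replicaMgr = new ReplicaMgr();

        // Callers go through the replication layer instead of TableRaftService:
        String result = replicaMgr.replica("part-0")
                .thenApply(Replica::raftClient)
                .thenCompose(raftClient ->
                        CompletableFuture.completedFuture(raftClient + ": run command"))
                .join();

        System.out.println(result);
    }
}
```

The future-based chain matters because the replica may not be started yet when a caller asks for it, unlike the synchronous {{partitionRaftGroupService}} lookup, which throws if the partition is absent.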




[jira] [Comment Edited] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-27 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849819#comment-17849819
 ] 

Mikhail Efremov edited comment on IGNITE-21805 at 5/27/24 8:10 PM:
---

*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all {{RaftManager}} calls are now internal {{ReplicaManager}} calls.
 # The {{ReplicaManager#replica()}} method is created; it returns a future replica by the given replication group ID. The method allows access to raft clients from outside the replication layer, but only through it.
 # The following RAFT-related entities are moved into {{ReplicaManager}}: {{RaftGroupOptions}}, {{RaftGroupEventsListener}} ({{RebalanceRaftGroupEventsListener}}), {{PartitionMover}}, {{TopologyAwareRaftGroupServiceFactory}}, {{LogStorageFactoryCreator}}.
 # The raft-client creation in {{TableManager}} through {{ReplicaManager#startRaftClient}} is now localized to a single place and will be removed in IGNITE-22315.

*Related next tickets*
 # IGNITE-22315 is about bringing the code to a state where, if and only if a node is present in assignments, exactly one raft node, one raft client, and one replica are created. {{ReplicaManager#startRaftClient}} will be removed along with its single call site.
 # IGNITE-22218 is about removing {{TableRaftService}} from the code; its raft-client calls will be replaced with chains like {{replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient).thenCompose(raftClient -> ...)}}.
 # IGNITE-22292 is about removing {{ReplicaManager#getLogSyncer}}.
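For illustration, the accessor chain planned in IGNITE-22218 can be sketched with simplified stand-in types; {{ReplicaManager}}, {{Replica}} and the raft client below are minimal mock-ups for this sketch, not the real Ignite classes or signatures:

```java
import java.util.concurrent.CompletableFuture;

/** Sketch of the planned replicaMgr.replica(...) -> raftClient() chain; all types are stand-ins. */
public class ReplicaChainSketch {
    /** Stand-in for the raft client returned by Replica#raftClient(). */
    interface RaftClient {
        CompletableFuture<String> readLeader();
    }

    /** Stand-in Replica that owns its raft client. */
    static class Replica {
        private final RaftClient raftClient;

        Replica(RaftClient raftClient) {
            this.raftClient = raftClient;
        }

        RaftClient raftClient() {
            return raftClient;
        }
    }

    /** Stand-in ReplicaManager: the returned future completes once the replica is started. */
    static class ReplicaManager {
        CompletableFuture<Replica> replica(String replicaGrpId) {
            RaftClient client = () -> CompletableFuture.completedFuture("leader-of-" + replicaGrpId);
            return CompletableFuture.completedFuture(new Replica(client));
        }
    }

    public static void main(String[] args) {
        ReplicaManager replicaMgr = new ReplicaManager();

        // The chain the ticket proposes instead of tableRaftService().partitionRaftGroupService(...):
        String leader = replicaMgr.replica("part-0")
                .thenApply(Replica::raftClient)
                .thenCompose(RaftClient::readLeader)
                .join();

        System.out.println(leader);
    }
}
```

The point of the chain is that callers never hold a raft client directly; they always go through the replica future, so the client cannot outlive its replica.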

 


was (Author: JIRAUSER303791):
*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all calls from {{RaftManager}} are made as internal 
{{{}ReplicaManager{}}}'s calls.
 # {{ReplicaManager#replica()}} method is created and it returns a future 
replica by given replication group ID; the method allows to get access to 
raft-clients outside replication layer, but through it.
 # The follow RAFT-related entities are moved into {{{}ReplicaManager{}}}: 
{{{}RaftGroupOptions{}}}, {{RaftGroupEventsListener}} 
({{{}RebalanceRaftGroupEventsListener{}}}), {{{}PartitionMover{}}}, 
{{{}TopologyAwareRaftGroupServiceFactory{}}}, {{{}LogStorageFactoryCreator{}}}.
 # The creation place of raft-client in {{TableManager}} through 
{{ReplicaManager#startRaftClient}} is localized now and will be removed in  
IGNITE-22315

*Related next tickets*
 #  IGNITE-22315 is about making code to a state where if a node is related in 
assignments, then and only then there will be created only one raft-node, only 
one raft-client and only one replica. {{ReplicaManager#startRaftClient}} will 
be removed as it's single call.
 # IGNITE-22218 is about {{TableRaftService}} removing from the code and it's 
calls for raft-clients will be replaced with a chains like 
{{{}replicaMgr.replica(replicaGrpId).thenCompose(raftClient -> ...){}}}.
 # IGNITE-22292 is about {{ReplicaManager#getLogSyncer}} removal.

 

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact hinders us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move the whole RAFT-connected 
> operation to the Replica class. Moreover, it belongs there semantically.
> *Definition of done*
>  * All code inside the handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same applies to handleChangeStableAssignmentEvent - stopping the Replica 
> must stop the appropriate raft node
>  * All calls for 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by the replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and removed after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replica.raftClient() - which return re

[jira] [Comment Edited] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-27 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849819#comment-17849819
 ] 

Mikhail Efremov edited comment on IGNITE-21805 at 5/27/24 8:11 PM:
---

*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all {{RaftManager}} calls are now internal {{ReplicaManager}} calls.
 # The {{ReplicaManager#replica()}} method is created; it returns a future replica by the given replication group ID. The method allows access to raft clients from outside the replication layer, but only through it.
 # The following RAFT-related entities are moved into {{ReplicaManager}}: {{RaftGroupOptions}}, {{RaftGroupEventsListener}} ({{RebalanceRaftGroupEventsListener}}), {{PartitionMover}}, {{TopologyAwareRaftGroupServiceFactory}}, {{LogStorageFactoryCreator}}.
 # The raft-client creation in {{TableManager}} through {{ReplicaManager#startRaftClient}} is now localized to a single place and will be removed in IGNITE-22315.

*Next related tickets*
 # IGNITE-22315 is about bringing the code to a state where, if and only if a node is present in assignments, exactly one raft node, one raft client, and one replica are created. {{ReplicaManager#startRaftClient}} will be removed along with its single call site.
 # IGNITE-22218 is about removing {{TableRaftService}} from the code; its raft-client calls will be replaced with chains like {{replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient).thenCompose(raftClient -> ...)}}.
 # IGNITE-22292 is about removing {{ReplicaManager#getLogSyncer}}.

 


was (Author: JIRAUSER303791):
*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all calls from {{RaftManager}} are made as internal 
{{{}ReplicaManager{}}}'s calls.
 # {{ReplicaManager#replica()}} method is created and it returns a future 
replica by given replication group ID; the method allows to get access to 
raft-clients outside replication layer, but through it.
 # The follow RAFT-related entities are moved into {{{}ReplicaManager{}}}: 
{{{}RaftGroupOptions{}}}, {{RaftGroupEventsListener}} 
({{{}RebalanceRaftGroupEventsListener{}}}), {{{}PartitionMover{}}}, 
{{{}TopologyAwareRaftGroupServiceFactory{}}}, {{{}LogStorageFactoryCreator{}}}.
 # The creation place of raft-client in {{TableManager}} through 
{{ReplicaManager#startRaftClient}} is localized now and will be removed in  
IGNITE-22315

*Related next tickets*
 #  IGNITE-22315 is about making code to a state where if a node is related in 
assignments, then and only then there will be created only one raft-node, only 
one raft-client and only one replica. {{ReplicaManager#startRaftClient}} will 
be removed as it's single call.
 # IGNITE-22218 is about {{TableRaftService}} removing from the code and it's 
calls for raft-clients will be replaced with a chains like 
{{{}replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient).thenCompose(raftClient
 -> ...){}}}.
 # IGNITE-22292 is about {{ReplicaManager#getLogSyncer}} removal.

 

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact hinders us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move the whole RAFT-connected 
> operation to the Replica class. Moreover, it belongs there semantically.
> *Definition of done*
>  * All code inside the handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same applies to handleChangeStableAssignmentEvent - stopping the Replica 
> must stop the appropriate raft node
>  * All calls for 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by the replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and removed after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replic

[jira] [Comment Edited] (IGNITE-21805) Refactor TableManager and move all RAFT related pieces to Replica

2024-05-27 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849819#comment-17849819
 ] 

Mikhail Efremov edited comment on IGNITE-21805 at 5/27/24 8:12 PM:
---

*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all {{RaftManager}} calls are now internal {{ReplicaManager}} calls.
 # The {{ReplicaManager#replica()}} method is created; it returns a future replica by the given replication group ID. The method allows access to raft clients from outside the replication layer, but only through it.
 # The following RAFT-related entities are moved into {{ReplicaManager}}: {{RaftGroupOptions}}, {{RaftGroupEventsListener}} ({{RebalanceRaftGroupEventsListener}}), {{PartitionMover}}, {{TopologyAwareRaftGroupServiceFactory}}, {{LogStorageFactoryCreator}}.
 # The raft-client creation in {{TableManager}} through {{ReplicaManager#startRaftClient}} is now localized to a single place and will be removed in IGNITE-22315.

*Next related tickets*
 # IGNITE-22315 is about bringing the code to a state where, if and only if a node is present in assignments, exactly one raft node, one raft client, and one replica are created. {{ReplicaManager#startRaftClient}} will be removed along with its single call site.
 # IGNITE-22218 is about removing {{TableRaftService}} from the code; its raft-client calls will be replaced with chains like {{replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient).thenCompose(raftClient -> ...)}}.
 # IGNITE-22292 is about removing {{ReplicaManager#getLogSyncer}}.

 


was (Author: JIRAUSER303791):
*Done work*
 # {{RaftManager}} is moved into {{ReplicaManager}} completely.
 # Almost all calls from {{RaftManager}} are made as internal 
{{{}ReplicaManager{}}}'s calls.
 # {{ReplicaManager#replica()}} method is created and it returns a future 
replica by given replication group ID; the method allows to get access to 
raft-clients outside replication layer, but through it.
 # The follow RAFT-related entities are moved into {{{}ReplicaManager{}}}: 
{{{}RaftGroupOptions{}}}, {{RaftGroupEventsListener}} 
({{{}RebalanceRaftGroupEventsListener{}}}), {{{}PartitionMover{}}}, 
{{{}TopologyAwareRaftGroupServiceFactory{}}}, {{{}LogStorageFactoryCreator{}}}.
 # The creation place of raft-client in {{TableManager}} through 
{{ReplicaManager#startRaftClient}} is localized now and will be removed in  
IGNITE-22315

*Next related tickets*
 #  IGNITE-22315 is about making code to a state where if a node is related in 
assignments, then and only then there will be created only one raft-node, only 
one raft-client and only one replica. {{ReplicaManager#startRaftClient}} will 
be removed as it's single call.
 # IGNITE-22218 is about {{TableRaftService}} removing from the code and it's 
calls for raft-clients will be replaced with a chains like 
{{{}replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient).thenCompose(raftClient
 -> ...){}}}.
 # IGNITE-22292 is about {{ReplicaManager#getLogSyncer}} removal.

 

> Refactor TableManager and move all RAFT related pieces to Replica
> -
>
> Key: IGNITE-21805
> URL: https://issues.apache.org/jira/browse/IGNITE-21805
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment we have some places inside the TableManager, which:
>  * Use RaftManager to start/stop RAFT nodes 
> (handleChangePendingAssignmentEvent/handleChangeStableAssignmentEvent)
>  * Use RaftGroupService through 
> table.internalTable().tableRaftService().partitionRaftGroupService calls
> This fact hinders us on the track of zone-based collocation. The further 
> collocation steps will be easier if we move the whole RAFT-connected 
> operation to the Replica class. Moreover, it belongs there semantically.
> *Definition of done*
>  * All code inside the handleChangePendingAssignmentEvent connected with the 
> start of raft groups (PartitionListener/RebalanceRaftGroupEventsListener) and 
> raft clients must be moved to the start of the Replica itself
>  * The same applies to handleChangeStableAssignmentEvent - stopping the Replica 
> must stop the appropriate raft node
>  * All calls for 
> table.internalTable().tableRaftService().partitionRaftGroupService must be 
> replaced by the replicaMgr.replica(replicaGrpId).raftClient()
> *Implementation notes*
>  * The new temporary methods must be implemented and removed after IGNITE-22036
>  ** ReplicaManager.replica(ReplicationGroupId replicaGrpId) - which returns 
> the appropriate Replica by group id
>  ** Replica.raftClient() - which return re

[jira] [Created] (IGNITE-22355) TableManager's static mocks from #mockManagersAndCreateTableWithDelay don't work

2024-05-28 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22355:


 Summary: TableManager's static mocks from 
#mockManagersAndCreateTableWithDelay don't work
 Key: IGNITE-22355
 URL: https://issues.apache.org/jira/browse/IGNITE-22355
 Project: Ignite
  Issue Type: Bug
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

In the tests' table-creation method there is code like the below:

 
{code:java}
try (MockedStatic<AffinityUtils> affinityServiceMock = mockStatic(AffinityUtils.class)) {
    ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

    for (int part = 0; part < PARTITIONS; part++) {
        assignment.add(new ArrayList<>(Collections.singleton(node)));
    }

    affinityServiceMock.when(() -> AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
            .thenReturn(assignment);
}{code}
As a result, {{AffinityUtils#calculateAssignments}} calls outside of the 
try-with-resources block return a list of empty sets, while the desired 
behavior is a list of sets containing the given {{node}}.

*Definition of done*

The {{AffinityUtils#calculateAssignments}} call inside 
{{TableManager#getOrCreateAssignments}} must return a list of non-empty sets, 
each containing the single node.
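The scoping part of this bug can be reproduced without Mockito at all. The self-contained sketch below mimics how a static stub registered in a try-with-resources block is deregistered on close; {{StaticStub}} and {{calculateAssignments}} here are hypothetical stand-ins for {{MockedStatic}} and {{AffinityUtils#calculateAssignments}}, not the real APIs:

```java
/** Stand-in showing why stubbing set up inside try-with-resources does not survive it. */
public class TryWithResourcesScopeSketch {
    private static boolean stubActive = false;

    /** Hypothetical stand-in for MockedStatic: registers on open, deregisters on close. */
    static class StaticStub implements AutoCloseable {
        StaticStub() {
            stubActive = true;
        }

        @Override
        public void close() {
            stubActive = false;
        }
    }

    /** Stand-in for the stubbed static call. */
    static String calculateAssignments() {
        return stubActive ? "stubbed-assignments" : "empty-assignments";
    }

    public static void main(String[] args) {
        try (StaticStub ignored = new StaticStub()) {
            System.out.println(calculateAssignments()); // stubbing is active inside the block
        }
        // Mirrors the bug: by the time the production code runs, the stub is already closed.
        System.out.println(calculateAssignments());
    }
}
```

In the test, the table is created after the try block exits, so the production code sees the unstubbed behavior, exactly as described above.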

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-28 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22355:
-
Summary: TableManagerTest's static mocks from 
#mockManagersAndCreateTableWithDelay don't work properly  (was: TableManager's 
static mocks from #mockManagersAndCreateTableWithDelay don't work)

> TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay 
> don't work properly
> -
>
> Key: IGNITE-22355
> URL: https://issues.apache.org/jira/browse/IGNITE-22355
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> In the tests' table-creation method there is code like the below:
>  
> {code:java}
> try (MockedStatic<AffinityUtils> affinityServiceMock = mockStatic(AffinityUtils.class)) {
>     ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);
>     for (int part = 0; part < PARTITIONS; part++) {
>         assignment.add(new ArrayList<>(Collections.singleton(node)));
>     }
>     affinityServiceMock.when(() -> AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
>             .thenReturn(assignment);
> }{code}
> As a result, {{AffinityUtils#calculateAssignments}} calls outside of the 
> try-with-resources block return a list of empty sets, while the desired 
> behavior is a list of sets containing the given {{node}}.
> *Definition of done*
> The {{AffinityUtils#calculateAssignments}} call inside 
> {{TableManager#getOrCreateAssignments}} must return a list of non-empty sets, 
> each containing the single node.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22372) Moving RaftManager into Replica

2024-05-29 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22372:


 Summary: Moving RaftManager into Replica
 Key: IGNITE-22372
 URL: https://issues.apache.org/jira/browse/IGNITE-22372
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The main goal is to pass {{RaftManager}} into {{Replica}} through its 
constructor, start the raft node and raft client inside {{Replica}}, and stop 
them during the corresponding replica's shutdown.

*Motivation*

After IGNITE-21805, {{RaftManager}} was moved from {{TableManager}} into 
{{ReplicaManager}}. Therefore raft nodes and replicas are started and stopped 
inside {{ReplicaManager}}. The main idea of the previous ticket is to hide all 
RAFT-related entities as far as possible inside {{Replica}} itself. Hence, 
{{RaftManager}} and its related usages should now be moved into {{Replica}}.

*Definition of Done*
 # {{RaftManager}} component is moved to {{Replica}}.
 # {{Loza#startRaftGroupNode}} and {{RaftManager#stopRaftNodes}} should be 
called inside {{Replica}}.
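The intended ownership can be sketched as a lifecycle: the replica starts its raft node on construction and stops it on shutdown. The types below are simplified stand-ins for this sketch; the real {{Replica}} and {{RaftManager}} (Loza) signatures may differ:

```java
/** Sketch: a Replica that starts its raft node on construction and stops it on close. */
public class ReplicaLifecycleSketch {
    /** Minimal stand-in for the RaftManager API used in this sketch. */
    interface RaftManager {
        void startRaftGroupNode(String groupId);

        void stopRaftNodes(String groupId);
    }

    /** Stand-in Replica owning its raft node's lifecycle, per the definition of done. */
    static class Replica implements AutoCloseable {
        private final RaftManager raftManager;
        private final String groupId;

        Replica(RaftManager raftManager, String groupId) {
            this.raftManager = raftManager;
            this.groupId = groupId;
            raftManager.startRaftGroupNode(groupId); // raft node is started inside the Replica itself
        }

        @Override
        public void close() {
            raftManager.stopRaftNodes(groupId); // stopping the replica stops its raft node
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        RaftManager raftManager = new RaftManager() {
            @Override
            public void startRaftGroupNode(String groupId) {
                log.append("start:").append(groupId).append(';');
            }

            @Override
            public void stopRaftNodes(String groupId) {
                log.append("stop:").append(groupId).append(';');
            }
        };

        try (Replica replica = new Replica(raftManager, "grp-1")) {
            // the replica would serve requests here
        }

        System.out.println(log);
    }
}
```

The design choice illustrated here is that no caller outside {{Replica}} ever needs to pair a start call with a stop call; the replica's own lifecycle guarantees it.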



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22372) Moving RaftManager into Replica

2024-05-29 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22372:
-
Description: 
*Description*

The main goal is to pass {{RaftManager}} into {{Replica}} through its 
constructor, start the raft node and raft client inside {{Replica}}, and stop 
them during the corresponding replica's shutdown.

*Motivation*

After IGNITE-21805, {{RaftManager}} was moved from {{TableManager}} into 
{{ReplicaManager}}. Therefore raft nodes and replicas are started and stopped 
inside {{ReplicaManager}}. The main idea of the previous ticket is to hide all 
RAFT-related entities as far as possible inside {{Replica}} itself. Hence, 
{{RaftManager}} and its related usages should now be moved into {{Replica}}.

*Definition of Done*
 # {{RaftManager}} component is moved to {{Replica}}.
 # {{Loza#startRaftGroupNode}} and {{RaftManager#stopRaftNodes}} should be 
called inside {{Replica}}.

*Current issues*
It is not yet determined how to handle this code: {{((Loza) 
raftManager).volatileRaft().logStorage().value();}} In theory, its only 
purpose is to create raft options and pass them into 
{{Loza#startRaftGroupNode}}, so no critical challenges should arise there; 
still, this point needs attention.

  was:
*Description*

The main goal is to move {{RaftManager}} into {{Replica}} through it's 
constructor and starting raft-node and raft-client in {{Replica}} and stopping 
on {{Replica}}'s shutdown process inside a corresponding replica.

*Motivation*

After IGNITE-21805 {{RaftManager}} was moved from {{TableManager}} into 
{{ReplicaManager}}. Therefore raft-nodes and raft-replicas are starting and 
stopping inside {{ReplicaManager}}. The main idea of previous ticket is to hide 
all RAFT-related entities as far as possible into Replica itself. Then, 
{{RaftManager}} and related usages should be moved into {{ReplicaManager.}}

*Definition of Done*
 # {{RaftManager}} component is moved to {{Replica}}.
 # {{Loza#startRaftGroupNode}} and {{RaftManager#stopRaftNodes}} should be 
called inside {{Replica}}.


> Moving RaftManager into Replica
> ---
>
> Key: IGNITE-22372
> URL: https://issues.apache.org/jira/browse/IGNITE-22372
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The main goal is to pass {{RaftManager}} into {{Replica}} through its 
> constructor, start the raft node and raft client inside {{Replica}}, and 
> stop them during the corresponding replica's shutdown.
> *Motivation*
> After IGNITE-21805, {{RaftManager}} was moved from {{TableManager}} into 
> {{ReplicaManager}}. Therefore raft nodes and replicas are started and 
> stopped inside {{ReplicaManager}}. The main idea of the previous ticket is 
> to hide all RAFT-related entities as far as possible inside {{Replica}} 
> itself. Hence, {{RaftManager}} and its related usages should now be moved 
> into {{Replica}}.
> *Definition of Done*
>  # {{RaftManager}} component is moved to {{Replica}}.
>  # {{Loza#startRaftGroupNode}} and {{RaftManager#stopRaftNodes}} should be 
> called inside {{Replica}}.
> *Current issues*
> It is not yet determined how to handle this code: {{((Loza) 
> raftManager).volatileRaft().logStorage().value();}} In theory, its only 
> purpose is to create raft options and pass them into 
> {{Loza#startRaftGroupNode}}, so no critical challenges should arise there; 
> still, this point needs attention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22373) Delete startReplica overloadings from ReplicaManager

2024-05-29 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22373:


 Summary: Delete startReplica overloadings from ReplicaManager
 Key: IGNITE-22373
 URL: https://issues.apache.org/jira/browse/IGNITE-22373
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*
After IGNITE-21805 there are still two redundant 
{{ReplicaManager#startReplica}} overloads:

{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Consumer updateTableRaftService,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}

{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newReplicaListenerFut
) { ... }
{code}

Both are now marked {{@VisibleForTesting}} and {{@Deprecated}}, and their only 
purpose is to be used in tests.

The main goal of this ticket is to delete them or make them private.

*Motivation*
There should be only one public {{ReplicaManager#startReplica}} method for 
replication group creation.

*Definition of Done*
1. Both overloads of {{ReplicaManager#startReplica}} provided above should be 
deleted or made private.
2. All tests that call the overloaded methods should be fixed in favor of the 
single {{ReplicaManager#startReplica}} method that is now called from 
{{TableManager}}.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850479#comment-17850479
 ] 

Mikhail Efremov edited comment on IGNITE-22355 at 5/29/24 7:25 PM:
---

Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use the static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;
@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
        String tableName,
        CompletableFuture<TableManager> tblManagerFut,
        @Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. In contrast, we can check that right after the 
{{.when}} statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} to 
return a non-empty set containing {{node1}}. Then the non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) non-empty node sets with 
{{node1}} inside, as desired.

The test class demands rewriting; a related ticket must be created.
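The cross-thread part of the problem can be shown without Mockito. The sketch below uses a plain {{ThreadLocal}} to mimic how, as I understand the linked issue, the inline mock maker registers a static stubbing per thread, which is why a stub created on the test thread is invisible to code running on other threads:

```java
/** Stand-in: a per-thread "static stub" registration, mimicking MockedStatic's thread-locality. */
public class ThreadLocalStubSketch {
    private static final ThreadLocal<Boolean> stubActive = ThreadLocal.withInitial(() -> false);

    /** Stand-in for Mockito.mockingDetails(...).isMock() on the current thread. */
    static boolean isMocked() {
        return stubActive.get();
    }

    public static void main(String[] args) throws InterruptedException {
        stubActive.set(true); // "mockStatic" performed on the test thread

        System.out.println("test thread sees mock: " + isMocked());

        boolean[] seenInWorker = new boolean[1];
        Thread worker = new Thread(() -> seenInWorker[0] = isMocked());
        worker.start();
        worker.join();

        System.out.println("worker thread sees mock: " + seenInWorker[0]);
    }
}
```

This matches the observed behavior: the stub registered in the test thread is simply not visible from the threads where {{TableManager}} actually invokes the static method.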


 


was (Author: JIRAUSER303791):
Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from a different threads. I tired to use static mock as a field:

 
{code:java}
private MockedStatic affinityUtilsStaticMock;
@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside inside {{TableManager#getOrCreateAssignments}} it didn't works, the 
place:


{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list with size {{zoneDescriptor.partitions()}} of empty node set as 
the ticket describes as problem. In debug we can check is the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. In oppisite, we can check that after {{.when}} 
statement the same check returns {{true}}.

The temporal solution is to mock {{DistributionZoneManager#dataNodes}} with 
non-empty set with {{node1}} inside. Then, non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list with size 
{{zoneDescriptor.partitions()}} (in the case partitions equals 32) of non-empty 
node set with {{node1} inside as desired.

The test class demands rewriting, the related ticket must be created.


 

> TableManagerTest's static mocks from #moc

[jira] [Commented] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850479#comment-17850479
 ] 

Mikhail Efremov commented on IGNITE-22355:
--

Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use the static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;
@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
        String tableName,
        CompletableFuture<TableManager> tblManagerFut,
        @Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the place:


{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. In contrast, we can check that right after the 
{{.when}} statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} to 
return a non-empty set containing {{node1}}. Then the non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) non-empty node sets with 
{{node1}} inside, as desired.

The test class demands rewriting; a related ticket must be created.


 

> TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay 
> don't work properly
> -
>
> Key: IGNITE-22355
> URL: https://issues.apache.org/jira/browse/IGNITE-22355
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> In the tests' table-creation method there is code like the below:
>  
> {code:java}
> try (MockedStatic<AffinityUtils> affinityServiceMock = mockStatic(AffinityUtils.class)) {
>     ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);
>     for (int part = 0; part < PARTITIONS; part++) {
>         assignment.add(new ArrayList<>(Collections.singleton(node)));
>     }
>     affinityServiceMock.when(() -> AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
>             .thenReturn(assignment);
> }{code}
> As a result, {{AffinityUtils#calculateAssignments}} calls outside of the 
> try-with-resources block return a list of empty sets, while the desired 
> behavior is a list of sets containing the given {{node}}.
> *Definition of done*
> The {{AffinityUtils#calculateAssignments}} call inside 
> {{TableManager#getOrCreateAssignments}} must return a list of non-empty 
> sets, each containing the single node.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850479#comment-17850479
 ] 

Mikhail Efremov edited comment on IGNITE-22355 at 5/29/24 7:26 PM:
---

Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;

@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with a 
non-empty set containing {{node1}}. Then the non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) non-empty 
node sets with {{node1}} inside, as desired.

The test class needs rewriting; a related ticket must be created.
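For illustration, the thread confinement that defeats the field-based static mock can be reproduced with plain JDK classes, without Mockito: {{MockedStatic}} keeps its registration in thread-local state (see the linked issue #2142), so stubbing registered on the test thread is invisible to async stages running on another executor. The class and names below are an editor's sketch of that mechanism, not Mockito internals:

```java
// Sketch: a per-thread "static mock" registry, analogous to how Mockito
// scopes MockedStatic stubbing to the thread that created it.
public class ThreadLocalMockSketch {
    // Imitates Mockito's internal per-thread flag for an active static mock.
    static final ThreadLocal<Boolean> MOCK_ACTIVE = ThreadLocal.withInitial(() -> false);

    static boolean isMockActive() {
        return MOCK_ACTIVE.get();
    }

    public static void main(String[] args) throws InterruptedException {
        MOCK_ACTIVE.set(true); // the "mockStatic(...)" call on the test thread

        boolean[] seenFromWorker = new boolean[1];
        Thread worker = new Thread(() -> seenFromWorker[0] = isMockActive());
        worker.start();
        worker.join();

        System.out.println("test thread sees mock:   " + isMockActive());    // true
        System.out.println("worker thread sees mock: " + seenFromWorker[0]); // false
    }
}
```

This mirrors why a {{thenApply}} stage that may run on a pool thread observes the real {{AffinityUtils}} instead of the mock registered by the test thread.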


 


was (Author: JIRAUSER303791):
Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;
@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with a 
non-empty set containing {{node1}}. Then the non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) non-empty 
node sets with {{node1}} inside, as desired.

The test class needs rewriting; a related ticket must be created.


 

> TableManagerTest's static mocks from #mockM

[jira] [Comment Edited] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850479#comment-17850479
 ] 

Mikhail Efremov edited comment on IGNITE-22355 at 5/29/24 7:34 PM:
---

Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;

@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with a 
non-empty set containing {{node1}}:

{code:java}
when(distributionZoneManager.dataNodes(anyLong(), anyInt(), anyInt()))
.thenReturn(completedFuture(Set.of(NODE_NAME)));
{code}

where {{NODE_NAME}} is {{node1}}.

Then the non-mocked {{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) 
non-empty node sets with {{node1}} inside, as desired.

The test class needs rewriting; a related ticket must be created.


 


was (Author: JIRAUSER303791):
Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;

@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with 
a non-empty set containing {{node1}}. Then, the non-mocked 
{{AffinityUtils#calculateAssignments}} returns a list with size 
{{zoneDescriptor.partitions()}} (in the case partiti

[jira] [Comment Edited] (IGNITE-22355) TableManagerTest's static mocks from #mockManagersAndCreateTableWithDelay don't work properly

2024-05-29 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850479#comment-17850479
 ] 

Mikhail Efremov edited comment on IGNITE-22355 at 5/29/24 8:03 PM:
---

Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;

@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with a 
non-empty set containing {{node1}}:

{code:java}
when(distributionZoneManager.dataNodes(anyLong(), anyInt(), anyInt()))
.thenReturn(completedFuture(Set.of(NODE_NAME)));
{code}

where {{NODE_NAME}} is {{node1}}.

Then the non-mocked {{AffinityUtils#calculateAssignments}} returns a list of 
{{zoneDescriptor.partitions()}} (32 in this case) 
non-empty node sets with {{node1}} inside, as desired.

The test class needs rewriting; a related ticket must be created.


 


was (Author: JIRAUSER303791):
Due to [https://github.com/mockito/mockito/issues/2142] static mocks aren't 
accessible from different threads. I tried to use a static mock as a field:

{code:java}
private MockedStatic<AffinityUtils> affinityUtilsStaticMock;

@BeforeEach
void before() throws NodeStoppingException {
affinityUtilsStaticMock = mockStatic(AffinityUtils.class);
// ...
}
// ...
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
// ...
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityUtilsStaticMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), 
anyInt())).thenReturn(assignment);

TableManager tableManager = createTableManager(tblManagerFut);
// ...
}
// ...
void after() throws Exception {
ComponentContext componentContext = new ComponentContext();
closeAll(
// ...
affinityUtilsStaticMock
);
}
{code}

But inside {{TableManager#getOrCreateAssignments}} it didn't work; the relevant 
place:

{code:java}
assignmentsFuture = distributionZoneManager.dataNodes(causalityToken, 
catalogVersion, zoneDescriptor.id())
.thenApply(dataNodes -> AffinityUtils.calculateAssignments(
dataNodes,
zoneDescriptor.partitions(),
zoneDescriptor.replicas()
)//...
{code}

returns a list of {{zoneDescriptor.partitions()}} empty node sets, as the 
ticket describes. In debug we can check whether the mocking still 
works with:

{code:java}
Mockito.mockingDetails(AffinityUtils.class).isMock()
{code}

and it returns {{false}}. Conversely, we can check that after the {{.when}} 
statement the same check returns {{true}}.

The temporary solution is to mock {{DistributionZoneManager#dataNodes}} with a 
non-empty set containing {{node1}}:

{code:java}
when(distributionZoneManager.dataNodes(anyLong(), anyInt(), anyInt()))
.thenReturn(completedFuture(Set.of

[jira] [Created] (IGNITE-22388) TableManagerTest revision

2024-05-31 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22388:


 Summary: TableManagerTest revision
 Key: IGNITE-22388
 URL: https://issues.apache.org/jira/browse/IGNITE-22388
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov


*Description* 

During IGNITE-22315 we found that the 
{{TableManagerTest#tableManagerStopTest*}} tests fail, and after investigation 
it became clear that several static mocks aren't in effect in the tests, which 
means the tests don't actually test anything. A special case was fixed in 
IGNITE-22355, but there are more problematic places, described below. The main 
goal of this ticket is to revise the test class and rewrite it in accordance 
with its purpose, without any excessive, unused code lines.

*Motivation* 

Three examples of problematic code found in 
{{TableManagerTest#tableManagerStopTest*}} are provided below, but the issues 
aren't limited to them.

The first code smell is the tests' names: 
{{TableManagerTest#tableManagerStopTest1-4}}. The names don't tell how the 1st 
and the 4th tests differ.

The second is another static mock (besides the one from IGNITE-22355), 
{{schemaServiceMock}}, placed in a try-with-resources block and therefore 
ineffective outside the block:

{code:title=|language=java|collapse=false}private TableViewInternal 
mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
String consistentId = "node0";

when(ts.getByConsistentId(any())).thenReturn(new ClusterNodeImpl(
UUID.randomUUID().toString(),
consistentId,
new NetworkAddress("localhost", 47500)
));

try (MockedStatic<SchemaUtils> schemaServiceMock = 
mockStatic(SchemaUtils.class)) {
schemaServiceMock.when(() -> SchemaUtils.prepareSchemaDescriptor(any()))
.thenReturn(mock(SchemaDescriptor.class));
}

try (MockedStatic<AffinityUtils> affinityServiceMock = 
mockStatic(AffinityUtils.class)) {
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityServiceMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
.thenReturn(assignment);
}
//...
{code}

In the test class there are only two such mocks and {{affinityServiceMock}} is 
removed in  IGNITE-22355, but {{schemaServiceMock}} still requires reworking.
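The scoping problem described above can be illustrated with a plain {{AutoCloseable}}, no Mockito required: whatever a try-with-resources resource enables is rolled back when the block exits, so stubbing configured inside the block is already gone when the code under test runs afterwards. The names below are an editor's sketch, not the actual Mockito API:

```java
// Sketch: a try-with-resources resource undoes its effect at block exit,
// just like MockedStatic deregisters its stubbing when closed.
public class TryWithResourcesSketch {
    static boolean stubbingActive = false;

    // Stands in for MockedStatic: constructing it activates the stubbing,
    // close() (invoked automatically at block exit) deactivates it.
    static class StaticStub implements AutoCloseable {
        StaticStub() { stubbingActive = true; }
        @Override public void close() { stubbingActive = false; }
    }

    public static void main(String[] args) {
        try (StaticStub stub = new StaticStub()) {
            System.out.println("inside the block: " + stubbingActive); // true
        }
        // This is where the test would go on to create the TableManager --
        // but the stubbing has already been rolled back.
        System.out.println("after the block:  " + stubbingActive); // false
    }
}
```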

The third issue in the code above is {{String consistentId = "node0"}} for the 
{{when(ts.getByConsistentId(any()))}} mock. The problem is that {{ts}} isn't 
really used: {{ts}} is the topology service field in the test class, but for 
{{TableManager}} creation the topology service comes from a mocked cluster 
service:

{code:title=|language=java|collapse=false}private TableManager 
createTableManager(/*...*/) {
  TableManager tableManager = new TableManager(
  // ...
  clusterService.topologyService(),
  //...
{code}

{{topologyService}} is mocked in {{before}} with another mock:

{code:title=|language=java|collapse=false}@BeforeEach
 void before() throws NodeStoppingException {
 // ...
 TopologyService topologyService = mock(TopologyService.class);

 when(clusterService.topologyService()).thenReturn(topologyService);
 // ...
{code}

That means the {{ts}} field isn't in use, and the code above does 
nothing.

These are only three examples, but they show serious problems with the test 
class, and the ticket calls for reworking it.

*Definition of done*

# Tests should have meaningful names that differ from one another.
# Tests shouldn't contain unused or meaningless code.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22388) TableManagerTest revision

2024-05-31 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22388:
-
Description: 
*Description*

During IGNITE-22315 we found that the 
{{TableManagerTest#tableManagerStopTest*}} tests fail, and after investigation 
it became clear that several static mocks aren't in effect in the tests, which 
means the tests don't actually test anything. A special case was fixed in 
IGNITE-22355, but there are more problematic places, described below. The main 
goal of this ticket is to revise the test class and rewrite it in accordance 
with its purpose, without any excessive, unused code lines.

*Motivation*

Three examples of problematic code found in 
{{TableManagerTest#tableManagerStopTest*}} are provided below, but the issues 
aren't limited to them.

The first code smell is the tests' names: 
{{TableManagerTest#tableManagerStopTest1-4}}. The names don't tell how the 1st 
and the 4th tests differ.

The second is another static mock (besides the one from IGNITE-22355), 
{{schemaServiceMock}}, placed in a try-with-resources block and therefore 
ineffective outside the block:
{code:java}
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
String consistentId = "node0";
// ...
when(ts.getByConsistentId(any())).thenReturn(new ClusterNodeImpl(
UUID.randomUUID().toString(),
consistentId,
new NetworkAddress("localhost", 47500)
));

try (MockedStatic<SchemaUtils> schemaServiceMock = 
mockStatic(SchemaUtils.class)) {
schemaServiceMock.when(() -> SchemaUtils.prepareSchemaDescriptor(any()))
.thenReturn(mock(SchemaDescriptor.class));
}

try (MockedStatic<AffinityUtils> affinityServiceMock = 
mockStatic(AffinityUtils.class)) {
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityServiceMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
.thenReturn(assignment);
}
//...
{code}
In the test class there are only two such mocks and {{affinityServiceMock}} is 
removed in IGNITE-22355, but {{schemaServiceMock}} still requires reworking.

The third issue in the code above: {{ts}} isn't really used. {{ts}} is the 
topology service field in the test class, but for {{TableManager}} creation 
the topology service comes from a mocked cluster 
service:
{code:java}
private TableManager createTableManager(/*...*/) {
  TableManager tableManager = new TableManager(
  // ...
  clusterService.topologyService(),
  //...
{code}
{{topologyService}} is mocked in {{before}} with another mock:
{code:java}
@BeforeEach
 void before() throws NodeStoppingException {
 // ...
 TopologyService topologyService = mock(TopologyService.class);

 when(clusterService.topologyService()).thenReturn(topologyService);
 // ...
{code}
That means the {{ts}} field isn't in use, and the code above does 
nothing.

These are only three examples, but they show serious problems with the test 
class, and the ticket calls for reworking it.

*Definition of done*
 # Tests should have meaningful names that differ from one another.
 # Tests shouldn't contain unused or meaningless code.

  was:
*Description* 

During IGNITE-22315 we found that the 
{{TableManagerTest#tableManagerStopTest*}} tests fail, and after investigation 
it became clear that several static mocks aren't in effect in the tests, which 
means the tests don't actually test anything. A special case was fixed in 
IGNITE-22355, but there are more problematic places, described below. The main 
goal of this ticket is to revise the test class and rewrite it in accordance 
with its purpose, without any excessive, unused code lines.

*Motivation* 

Three examples of problematic code found in 
{{TableManagerTest#tableManagerStopTest*}} are provided below, but the issues 
aren't limited to them.

The first code smell is the tests' names: 
{{TableManagerTest#tableManagerStopTest1-4}}. The names don't tell how the 1st 
and the 4th tests differ.

The second is another static mock (besides the one from IGNITE-22355), 
{{schemaServiceMock}}, placed in a try-with-resources block and therefore 
ineffective outside the block:

{code:title=|language=java|collapse=false}private TableViewInternal 
mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
String consistentId = "node0";

when(ts.getByConsistentId(any())).thenReturn(new ClusterNodeImpl(
UUID.randomUUID().toString(),
consistentId,
new NetworkAddress("localhost", 47500)
));

try (MockedStatic 

[jira] [Updated] (IGNITE-22388) TableManagerTest revision

2024-05-31 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22388:
-
Description: 
*Description*

During IGNITE-22315 we found that the 
{{TableManagerTest#tableManagerStopTest*}} tests fail, and after investigation 
it became clear that several static mocks aren't in effect in the tests, which 
means the tests don't actually test anything. A special case was fixed in 
IGNITE-22355, but there are more problematic places, described below. The main 
goal of this ticket is to revise the test class and rewrite it in accordance 
with its purpose, without any excessive, unused code lines.

*Motivation*

Three examples of problematic code found in 
{{TableManagerTest#tableManagerStopTest*}} are provided below, but the issues 
aren't limited to them.

The first code smell is the tests' names: 
{{TableManagerTest#tableManagerStopTest1-4}}. The names don't tell how the 1st 
and the 4th tests differ.

The second is another static mock (besides the one from IGNITE-22355), 
{{schemaServiceMock}}, placed in a try-with-resources block and therefore 
ineffective outside the block:
{code:java}
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
String consistentId = "node0";
// ...
when(ts.getByConsistentId(any())).thenReturn(new ClusterNodeImpl(
UUID.randomUUID().toString(),
consistentId,
new NetworkAddress("localhost", 47500)
));

try (MockedStatic<SchemaUtils> schemaServiceMock = 
mockStatic(SchemaUtils.class)) {
schemaServiceMock.when(() -> SchemaUtils.prepareSchemaDescriptor(any()))
.thenReturn(mock(SchemaDescriptor.class));
}

try (MockedStatic<AffinityUtils> affinityServiceMock = 
mockStatic(AffinityUtils.class)) {
ArrayList<List<ClusterNode>> assignment = new ArrayList<>(PARTITIONS);

for (int part = 0; part < PARTITIONS; part++) {
assignment.add(new ArrayList<>(Collections.singleton(node)));
}

affinityServiceMock.when(() -> 
AffinityUtils.calculateAssignments(any(), anyInt(), anyInt()))
.thenReturn(assignment);
}
//...
{code}
In the test class there are only two such mocks and {{affinityServiceMock}} is 
removed in IGNITE-22355, but {{schemaServiceMock}} still requires reworking.

The third issue in the code above: {{ts}} isn't really used. {{ts}} is the 
topology service field in the test class, but for {{TableManager}} creation 
the topology service comes from a mocked cluster 
service:
{code:java}
private TableManager createTableManager(/*...*/) {
  TableManager tableManager = new TableManager(
  // ...
  clusterService.topologyService(),
  //...
{code}
{{topologyService}} is mocked in {{before}} with another mock:
{code:java}
@BeforeEach
 void before() throws NodeStoppingException {
 // ...
 TopologyService topologyService = mock(TopologyService.class);

 when(clusterService.topologyService()).thenReturn(topologyService);
 // ...
{code}
That means the {{ts}} field isn't in use, and the code above does 
nothing.

These are only three examples, but they show serious problems with the test 
class, and the ticket calls for reworking it.

*Definition of done*
 # Tests should have meaningful names that differ from one another.
 # Tests shouldn't contain unused or meaningless code.

  was:
*Description*

During IGNITE-22315 we found that the 
{{TableManagerTest#tableManagerStopTest*}} tests fail, and after investigation 
it became clear that several static mocks aren't in effect in the tests, which 
means the tests don't actually test anything. A special case was fixed in 
IGNITE-22355, but there are more problematic places, described below. The main 
goal of this ticket is to revise the test class and rewrite it in accordance 
with its purpose, without any excessive, unused code lines.

*Motivation*

Three examples of problematic code found in 
{{TableManagerTest#tableManagerStopTest*}} are provided below, but the issues 
aren't limited to them.

The first code smell is the tests' names: 
{{TableManagerTest#tableManagerStopTest1-4}}. The names don't tell how the 1st 
and the 4th tests differ.

The second is another static mock (besides the one from IGNITE-22355), 
{{schemaServiceMock}}, placed in a try-with-resources block and therefore 
ineffective outside the block:
{code:java}
private TableViewInternal mockManagersAndCreateTableWithDelay(
String tableName,
CompletableFuture<TableManager> tblManagerFut,
@Nullable Phaser phaser
) throws Exception {
String consistentId = "node0";
// ...
when(ts.getByConsistentId(any())).thenReturn(new ClusterNodeImpl(
UUID.randomUUID().toString(),
consistentId,
new NetworkAddress("localhost", 47500)
));

try (MockedStatic schemaServiceMock = 

[jira] [Commented] (IGNITE-22315) Make raft-client starting only once and only with raft-client and replica together

2024-06-21 Thread Mikhail Efremov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856838#comment-17856838
 ] 

Mikhail Efremov commented on IGNITE-22315:
--

Actual PR link: https://github.com/apache/ignite-3/pull/3956

> Make raft-client starting only once and only with raft-client and replica 
> together
> --
>
> Key: IGNITE-22315
> URL: https://issues.apache.org/jira/browse/IGNITE-22315
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Description*
> The goal is to make sure that, when a partition starts on an Ignite node, if 
> the node is in the assignments, then exactly one raft-node, raft-client and 
> replica are started, and no such entities are started otherwise.
> *Motivation*
> On the one hand, in {{TableManager}} there is a path where the raft-node and 
> replica don't start, but the raft-client does.
> In {{TableManager#startPartitionAndStartClient}} there is a place where 
> {{startGroupFut}} returns {{false}}: {{localMemberAssignment}} is 
> {{null}} or {{PartitionReplicatorNodeRecovery#shouldStartGroup}} returns 
> {{false}}. Then the raft-node and replica will not be started, but 
> regardless of the {{startGroupFut}} result the raft-client will be started:
> {code:java}
> startGroupFut
> .thenComposeAsync(v -> inBusyLock(busyLock, () -> {
> TableRaftService tableRaftService = 
> table.internalTable().tableRaftService();
> try {
> // Return existing service if it's already started.
> return completedFuture(
> (TopologyAwareRaftGroupService) 
> tableRaftService.partitionRaftGroupService(replicaGrpId.partitionId())
> );
> } catch (IgniteInternalException e) {
> // We use "IgniteInternalException" in accordance with the 
> javadoc of "partitionRaftGroupService" method.
> try {
> // TODO IGNITE-19614 This procedure takes 10 seconds if 
> there's no majority online.
> return raftMgr
> .startRaftGroupService(replicaGrpId, 
> newConfiguration, raftGroupServiceFactory, raftCommandsMarshaller);
> } catch (NodeStoppingException ex) {
> return failedFuture(ex);
> }
> }
> }), ioExecutor)
> .thenAcceptAsync(updatedRaftGroupService -> inBusyLock(busyLock, () 
> -> {
> ((InternalTableImpl) internalTbl).tableRaftService()
> .updateInternalTableRaftGroupService(partId, 
> updatedRaftGroupService);
> boolean startedRaftNode = startGroupFut.join();
> if (localMemberAssignment == null || !startedRaftNode || 
> replicaMgr.isReplicaStarted(replicaGrpId)) {
> return;
> } {code}
> The code shows that the {{v}} argument (the {{startGroupFut}} result) in the 
> lambda doesn't affect the raft-client's start, and the client is then put 
> into {{TableRaftService}} regardless of whether the replica started.
> On the other hand, there is a place that relies on raft-client creation on 
> every node whose table ID is in the {{TableManager#tables}} map: inside 
> {{TableManager#handleChangePendingAssignmentEvent}} there are two calls:
>  # the overload of the same name, 
> {{TableManager#handleChangePendingAssignmentEvent}}
>  # {{TableManager#changePeersOnRebalance}}
> Both ask the internal table for 
> {{tableRaftService().partitionRaftGroupService(partId)}}, but there are no 
> checks of whether the local node is suitable for starting the raft entities 
> and replica, or whether they are already started.
>  
> Then, if we try to fix the unconditional raft-client start, it will lead to 
> an immediate {{IgniteInternalException}} with _"No such partition P in table 
> T"_ from {{TableRaftService#partitionRaftGroupService}}. This is a 
> problem that should be solved.
> *Definition of done*
>  # Raft-client must be started only together with the raft-node and replica, 
> and shouldn't be started otherwise.
>  # {{TableRaftService}} shouldn't be asked for a raft-client 
> if the local node isn't in a raft group.
>  # Some tests may rely on the described behavior; failed 
> tests should be investigated and, probably, fixed.
>  # New tests that check the single start of raft-node, 
> raft-client and replica together are required.



--
This message was sent by Atlassian Jira
(v

[jira] [Assigned] (IGNITE-22567) Add containsAll to RecordView and KeyValueView

2024-06-25 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov reassigned IGNITE-22567:


Assignee: Mikhail Efremov

> Add containsAll to RecordView and KeyValueView
> --
>
> Key: IGNITE-22567
> URL: https://issues.apache.org/jira/browse/IGNITE-22567
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Ignite 2.x has *containsKeys*, and there is nothing similar in Ignite 3 - we 
> are missing a method to check if all specified keys exist in a table.
> Add *containsAll* method to *RecordView* and *KeyValueView*:
> {code:java}
> public interface RecordView<R> {
>     boolean containsAll(@Nullable Transaction tx, Collection<R> keyRecs);
> }
> {code}
> {code:java}
> public interface KeyValueView<K, V> {
>     boolean containsAll(@Nullable Transaction tx, Collection<K> keys);
> }
> {code}
> * Include corresponding async methods too
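The containsAll semantics can be sketched as follows. The {{KeyView}} interface here is a simplified stand-in, not the actual Ignite API: every key must be present for the check to succeed, and an empty collection trivially yields true.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

public class ContainsAllSketch {
    // Simplified stand-in for a view that can check a single key.
    interface KeyView<K> {
        boolean contains(K key);

        // containsAll holds only if every key is present; empty input is true.
        default boolean containsAll(Collection<K> keys) {
            return keys.stream().allMatch(this::contains);
        }
    }

    public static void main(String[] args) {
        Set<Integer> stored = Set.of(1, 2, 3);
        KeyView<Integer> view = stored::contains;

        System.out.println(view.containsAll(List.of(1, 2)));  // true
        System.out.println(view.containsAll(List.of(1, 42))); // false
        System.out.println(view.containsAll(List.of()));      // true
    }
}
```

A real implementation would batch the lookup into one network call rather than issuing per-key checks; the sketch only fixes the contract.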



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22373) Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22373:
-
Summary: Delete startReplica(ReplicationGroupId, PeersAndLearners, 
Function, PendingComparableValuesTracker, CompletableFuture)  (was: Delete 
startReplica overloadings from ReplicaManager)

> Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, 
> PendingComparableValuesTracker, CompletableFuture)
> --
>
> Key: IGNITE-22373
> URL: https://issues.apache.org/jira/browse/IGNITE-22373
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-21805 there are still 2 redundant {{Replica#startReplica}} 
> overloads:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PeersAndLearners newConfiguration,
> Consumer updateTableRaftService,
> Function createListener,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newRaftClientFut
> ) { ... }
> {code}
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newReplicaListenerFut
> ) { ... }
> {code}
> They are now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and 
> their only purpose is to be used in tests.
> The main goal of this ticket is to delete them or make them private.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. Both {{Replica#startReplica}} overloads provided above should be deleted 
> or made private.
> 2. All tests that called the overloaded methods should be fixed in favor of 
> the single {{Replica#startReplica}} method which is now called in 
> {{TableManager}}.
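The target shape can be sketched with a toy model. The types here are simplified stand-ins (strings instead of {{ReplicationGroupId}}, listeners and trackers); it only illustrates the "single public entry point" goal, not Ignite's real signatures.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch: the test-only overloads collapse into one public startReplica,
// with any remaining variation hidden behind a private helper.
public class ReplicaManagerSketch {
    private final Map<String, String> replicas = new ConcurrentHashMap<>();

    // The only public way to start a replica, as called from TableManager.
    public CompletableFuture<String> startReplica(String replicaGrpId, String listener) {
        return CompletableFuture.completedFuture(doStart(replicaGrpId, listener));
    }

    // Former overload logic becomes a private detail instead of public API.
    private String doStart(String replicaGrpId, String listener) {
        replicas.put(replicaGrpId, listener);
        return replicaGrpId;
    }

    public boolean isStarted(String replicaGrpId) {
        return replicas.containsKey(replicaGrpId);
    }
}
```

Tests then exercise the same entry point production code uses, instead of a parallel deprecated path.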



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22691) Delete startReplica(ReplicationGroupId, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22691:


 Summary: Delete startReplica(ReplicationGroupId, 
PendingComparableValuesTracker, CompletableFuture)
 Key: IGNITE-22691
 URL: https://issues.apache.org/jira/browse/IGNITE-22691
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


WIP



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22373) Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22373:
-
Description: 
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Consumer updateTableRaftService,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}
It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * {{TableManagerRecoveryTest#startComponents}}
 * {{ReplicaManagerTest#testReplicaEvents}}
 * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
 * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
 * {{ReplicaUnavailableTest#testWithNotReadyReplica}}

The main goal of this ticket is to delete the method and fix all its previous 
usages.

*Motivation*
There should be only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be deleted.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is now called in 
{{TableManager}}.

  was:
*Description*
After IGNITE-21805 there are still 2 redundant {{Replica#startReplica}} 
overloads:

{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Consumer updateTableRaftService,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}

{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newReplicaListenerFut
) { ... }
{code}

They are marked now as {{@VisibleForTesting}} and {{@Deprecated}} both and 
their only purpose is to be used in tests. 

The main goal of this ticket is to delete them or make them private.

*Motivation*
There should the only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. Both provided above overloads of {{Replica#startReplica}} should be deleted 
or be private.
2. All tests that called the overloaded methods should be fixed in favor of the 
single {{Replica#startReplica}} method which is called now in {{TableManager}}.



> Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, 
> PendingComparableValuesTracker, CompletableFuture)
> --
>
> Key: IGNITE-22373
> URL: https://issues.apache.org/jira/browse/IGNITE-22373
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PeersAndLearners newConfiguration,
> Consumer updateTableRaftService,
> Function createListener,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newRaftClientFut
> ) { ... }
> {code}
> It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its 
> only purpose is to be used in tests:
>  * {{TableManagerRecoveryTest#startComponents}}
>  * {{ReplicaManagerTest#testReplicaEvents}}
>  * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
>  * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
>  * {{ReplicaUnavailableTest#testWithNotReadyReplica}}
> The main goal of this ticket is to delete the method and fix all its previous 
> usages.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be deleted.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is now called in 
> {{TableManager}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22373) Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22373:
-
Description: 
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}
It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * {{TableManagerRecoveryTest#startComponents}}
 * {{ReplicaManagerTest#testReplicaEvents}}
 * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
 * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
 * {{ReplicaUnavailableTest#testWithNotReadyReplica}}

The main goal of this ticket is to delete the method and fix all its previous 
usages.

*Motivation*
There should be only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be deleted.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is now called in 
{{TableManager}}.

  was:
*Description*
After IGNITE-21805 there are redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Consumer updateTableRaftService,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}
It's marked now as {{@VisibleForTesting}} and {{@Deprecated}} both and its only 
purpose is to be used in tests:
 * {{TableManagerRecoveryTest#startComponents}}
 * {{ReplicaManagerTest#testReplicaEvents}}
 * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
 * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
 * {{ReplicaUnavailableTest#testWithNotReadyReplica}}

The main goal of this ticket is to delete the method and fix all its previous 
usages.

*Motivation*
There should the only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. Titled {{Replica#startReplica}} should be deleted.
2. All mentioned tests that called the overloaded methods should be fixed in 
favor of the single {{Replica#startReplica}} method which is called now in 
{{{}TableManager{}}}.


> Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, 
> PendingComparableValuesTracker, CompletableFuture)
> --
>
> Key: IGNITE-22373
> URL: https://issues.apache.org/jira/browse/IGNITE-22373
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PeersAndLearners newConfiguration,
> Function createListener,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newRaftClientFut
> ) { ... }
> {code}
> It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its 
> only purpose is to be used in tests:
>  * {{TableManagerRecoveryTest#startComponents}}
>  * {{ReplicaManagerTest#testReplicaEvents}}
>  * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
>  * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
>  * {{ReplicaUnavailableTest#testWithNotReadyReplica}}
> The main goal of this ticket is to delete the method and fix all its previous 
> usages.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be deleted.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is now called in 
> {{TableManager}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22373) Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22373:
-
Description: 
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}
It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * {{TableManagerRecoveryTest#startComponents}}
 * {{ReplicaManagerTest#testReplicaEvents}}
 * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
 * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
 * {{ReplicaUnavailableTest#testWithNotReadyReplica}}

The main goal of this ticket is to delete the method and fix all its previous 
usages.

*Motivation*
There should be only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be deleted.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is now called in 
{{TableManager}}.

  was:
*Description*
After IGNITE-21805 there are redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PeersAndLearners newConfiguration,
Function createListener,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newRaftClientFut
) { ... }
{code}
It's marked now as {{@VisibleForTesting}} and {{@Deprecated}} both and its only 
purpose is to be used in tests:
 * {{TableManagerRecoveryTest#startComponents}}
 * {{ReplicaManagerTest#testReplicaEvents}}
 * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
 * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
 * {{ReplicaUnavailableTest#testWithNotReadyReplica}}

The main goal of this ticket is to delete the method and fix all its previous 
usages.

*Motivation*
There should the only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. Titled {{Replica#startReplica}} should be deleted.
2. All mentioned tests that called the overloaded methods should be fixed in 
favor of the single {{Replica#startReplica}} method which is called now in 
{{{}TableManager{}}}.


> Delete startReplica(ReplicationGroupId, PeersAndLearners, Function, 
> PendingComparableValuesTracker, CompletableFuture)
> --
>
> Key: IGNITE-22373
> URL: https://issues.apache.org/jira/browse/IGNITE-22373
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PeersAndLearners newConfiguration,
> Function createListener,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newRaftClientFut
> ) { ... }
> {code}
> It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its 
> only purpose is to be used in tests:
>  * {{TableManagerRecoveryTest#startComponents}}
>  * {{ReplicaManagerTest#testReplicaEvents}}
>  * {{ItPlacementDriverReplicaSideTest#createReplicationGroup}}
>  * {{ReplicaUnavailableTest#testWithReplicaStartedAfterRequestSending}}
>  * {{ReplicaUnavailableTest#testWithNotReadyReplica}}
> The main goal of this ticket is to delete the method and fix all its previous 
> usages.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be deleted.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is now called in 
> {{TableManager}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22691) Delete startReplica(ReplicationGroupId, PendingComparableValuesTracker, CompletableFuture)

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22691:
-
Description: 
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newReplicaListenerFut
) { ... }
{code}
It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * 
{{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}, 
which checks that {{startReplica}} and the raft-node start only once; it looks 
like we just have to add several missing arguments.
 * {{ItTxTestCluster#startTable}}

The main goal of this ticket is to delete the method.

*Motivation*
There should be only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be deleted.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is now called in 
{{TableManager}}.

  was:WIP


> Delete startReplica(ReplicationGroupId, PendingComparableValuesTracker, 
> CompletableFuture)
> --
>
> Key: IGNITE-22691
> URL: https://issues.apache.org/jira/browse/IGNITE-22691
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newReplicaListenerFut
> ) { ... }
> {code}
> It is now marked as both {{@VisibleForTesting}} and {{@Deprecated}}, and its 
> only purpose is to be used in tests:
>  * 
> {{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}, 
> which checks that {{startReplica}} and the raft-node start only once; it looks 
> like we just have to add several missing arguments.
>  * {{ItTxTestCluster#startTable}}
> The main goal of this ticket is to delete the method.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be deleted.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is now called in 
> {{TableManager}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-22699) Anti-hijack protection of containsAllAsync() is broken

2024-07-09 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov reassigned IGNITE-22699:


Assignee: Mikhail Efremov

> Anti-hijack protection of containsAllAsync() is broken
> --
>
> Key: IGNITE-22699
> URL: https://issues.apache.org/jira/browse/IGNITE-22699
> Project: Ignite
>  Issue Type: Bug
>Reporter: Roman Puchkovskiy
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Both KeyValueView and RecordView have containsAllAsync(). Its implementations 
> in PublicApiThreadingKeyValueView and PublicApiThreadingRecordView are broken 
> because they treat the operation as a sync one, so no anti-hijack protection 
> is added.
> Also, the new operation should be added to 
> {{ItKvRecordApiThreadingTest.KeyValueViewAsyncOperation}} and 
> {{ItKvRecordApiThreadingTest.RecordViewAsyncOperation}} to cover the 
> anti-hijack protection for the new methods.
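The anti-hijack idea can be sketched as follows. The helper name {{preventThreadHijack}} and the executor wiring are illustrative assumptions, not Ignite's actual classes: the point is that user continuations of an async API call must not run on the internal thread that completed the future, so the public view re-completes it on a dedicated async-continuation executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AntiHijackSketch {
    // Re-dispatch completion onto the public executor; user callbacks chained
    // on the returned future run there instead of on internal threads.
    static <T> CompletableFuture<T> preventThreadHijack(
            CompletableFuture<T> internalFut, ExecutorService asyncContinuationExecutor) {
        return internalFut.whenCompleteAsync((res, err) -> { }, asyncContinuationExecutor);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService publicPool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "public"));

        CompletableFuture<Boolean> internalFut = new CompletableFuture<>();

        // The continuation is attached before the future completes, so it is
        // triggered by whichever thread completes the guarded stage.
        CompletableFuture<String> continuationThread = preventThreadHijack(internalFut, publicPool)
                .thenApply(v -> Thread.currentThread().getName());

        internalFut.complete(true); // completed by an "internal" actor (here: main)

        System.out.println(continuationThread.get()); // public
        publicPool.shutdown();
    }
}
```

A {{containsAllAsync()}} treated as a sync operation skips exactly this wrapping, which is the bug the ticket describes.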



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer

2024-07-17 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Description: 
*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
method {{Loza#getLogSyncer}}.

*Motivation*
There is an intention to hide all raft-specific entities behind the replication 
layer, and the log syncer is an entity that sticks out. Besides, it is only 
needed by engines and may be passed through {{IgniteImpl}} as an Ignite 
component.

*Definition of done*
There is no longer a {{Loza#getLogSyncer}} method, nor any calls to it.


  was:
*Description*

The main goal is to remove {{LogSyncer}} from {{Loza}} and especially method 
{{{}Loza#getLogSyncer{}}}.

*Motivation*
 # There is an intention to hide all raft-specific entities behind replication 
layer and log syncer is an entity that sticking out. Besides it's only needs 
for engines and may be passed though {{IgniteImpl}} as an ignite component.
 # All current implementations except {{DefaultLogStorageFactory}} are not 
implements properly sync method at all that looks like the reason for interface 
segregation between {{LogSyncer}} and {{LogStorageFactory.}}

*Definition of done*
 # There no more {{Loza#getLogSyncer}} method and its' calls.
 # {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
 # {{LogSyncer}} creates and injects into engines as {{IgniteComponent}} in 
{{{}IgniteImpl{}}}.


> Moving LogSyncer from Loza and RaftServer
> -
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> method {{Loza#getLogSyncer}}.
> *Motivation*
> There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by engines and may be passed through {{IgniteImpl}} as an 
> Ignite component.
> *Definition of done*
> There is no longer a {{Loza#getLogSyncer}} method, nor any calls to it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22292) Moving LogSyncer from Loza and RaftServer

2024-07-17 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22292:
-
Summary: Moving LogSyncer from Loza and RaftServer  (was: Moving LogSyncer 
from Loza and RaftServer and pass it to usages as IgniteComponent)

> Moving LogSyncer from Loza and RaftServer
> -
>
> Key: IGNITE-22292
> URL: https://issues.apache.org/jira/browse/IGNITE-22292
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Description*
> The main goal is to remove {{LogSyncer}} from {{Loza}}, and especially the 
> method {{Loza#getLogSyncer}}.
> *Motivation*
>  # There is an intention to hide all raft-specific entities behind the 
> replication layer, and the log syncer is an entity that sticks out. Besides, 
> it is only needed by engines and may be passed through {{IgniteImpl}} as an 
> Ignite component.
>  # All current implementations except {{DefaultLogStorageFactory}} do not 
> properly implement the sync method at all, which looks like the reason for 
> interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
>  # There is no longer a {{Loza#getLogSyncer}} method, nor any calls to it.
>  # {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
>  # {{LogSyncer}} is created and injected into engines as an {{IgniteComponent}} 
> in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22766) Make LogSyncer and LogStorageFactory interfaces are separated.

2024-07-17 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22766:


 Summary: Make LogSyncer and LogStorageFactory interfaces are 
separated.
 Key: IGNITE-22766
 URL: https://issues.apache.org/jira/browse/IGNITE-22766
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


Make {{LogSyncer}} and {{LogStorageFactory}} separate interfaces.

*Description*

The goal of the ticket is to separate {{LogSyncer}}'s implementations from the 
{{LogStorageFactory}} class, which, at least by its name, isn't intended for 
the log sync operation.

*Motivation*

All current implementations except {{DefaultLogStorageFactory}} do not properly 
implement the {{sync()}} method at all, which looks like the reason for 
interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*

1. {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
2. {{LogSyncer}} is created and injected into engines as an {{IgniteComponent}} 
in {{IgniteImpl}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22766) Make LogSyncer and LogStorageFactory interfaces are separated.

2024-07-17 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22766:
-
Description: 
Make {{LogSyncer}} and {{LogStorageFactory}} separate interfaces.

*Description*

The goal of the ticket is to separate {{LogSyncer}}'s implementations from 
the {{LogStorageFactory}} class, which, at least by its name, isn't intended 
for the log sync operation.

*Motivation*

All current implementations except {{DefaultLogStorageFactory}} do not 
properly implement the {{sync()}} method at all, which looks like the reason 
for interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.

*Definition of done*

1. {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
2. {{LogSyncer}} is created and injected into engines as an {{IgniteComponent}} 
in {{IgniteImpl}}.

  was:
Make =LogSyncer= and =LogStorageFactory= interfaces are separated.

*Description*

The goal of the ticket is to make =LogSyncer='s implementations are separated 
from the =LogStorageFactory= class that at
least by the name isn't intented for log's sync operation.

*Motivation*

All current implementations except =DefaultLogStorageFactory= are not 
implements properly =sync()= method at all that looks like the reason for 
interface segregation between =LogSyncer= and =LogStorageFactory=.

*Definition of done*

1. =LogSyncer= and =LogStorageFactory= interfaces are separated.
2. =LogSyncer= creates and injects into engines as =IgniteComponent= in 
=IgniteImpl=.


> Make LogSyncer and LogStorageFactory interfaces are separated.
> --
>
> Key: IGNITE-22766
> URL: https://issues.apache.org/jira/browse/IGNITE-22766
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> Make {{LogSyncer}} and {{LogStorageFactory}} separate interfaces.
> *Description*
> The goal of the ticket is to separate {{LogSyncer}}'s implementations from 
> the {{LogStorageFactory}} class, which, at least by its name, isn't intended 
> for the log sync operation.
> *Motivation*
> All current implementations except {{DefaultLogStorageFactory}} do not 
> properly implement the {{sync()}} method at all, which looks like the reason 
> for interface segregation between {{LogSyncer}} and {{LogStorageFactory}}.
> *Definition of done*
> 1. {{LogSyncer}} and {{LogStorageFactory}} interfaces are separated.
> 2. {{LogSyncer}} is created and injected into engines as an 
> {{IgniteComponent}} in {{IgniteImpl}}.
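The proposed segregation can be sketched as follows. The interface and class names follow the ticket, but the method signatures are assumptions for illustration, not Ignite's actual API: after the split, engines depend only on the narrow {{LogSyncer}} interface, and {{DefaultLogStorageFactory}} is the one class implementing both.

```java
public class LogInterfacesSketch {
    interface LogSyncer {
        void sync(); // force the shared raft log to disk
    }

    interface LogStorageFactory {
        String createLogStorage(String raftGroupId);
    }

    // Only the default factory has a meaningful sync, so after the split it
    // is the one class implementing both narrow interfaces.
    static class DefaultLogStorageFactory implements LogStorageFactory, LogSyncer {
        private int syncCalls;

        @Override
        public String createLogStorage(String raftGroupId) {
            return "log-storage-" + raftGroupId;
        }

        @Override
        public void sync() {
            syncCalls++; // stand-in for an actual fsync of the log
        }

        int syncCalls() {
            return syncCalls;
        }
    }

    public static void main(String[] args) {
        DefaultLogStorageFactory factory = new DefaultLogStorageFactory();
        LogSyncer syncerForEngines = factory; // engines see only LogSyncer
        syncerForEngines.sync();
        System.out.println(factory.createLogStorage("g1") + " " + factory.syncCalls());
    }
}
```

Factories that cannot sync simply stop pretending to implement {{LogSyncer}}, which is the interface-segregation payoff.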



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-22774) Remove raftClient map from ItTxTestCluster

2024-07-18 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22774:


 Summary: Remove raftClient map from ItTxTestCluster
 Key: IGNITE-22774
 URL: https://issues.apache.org/jira/browse/IGNITE-22774
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

The goal is to remove {{raftClient}} from {{ItTxTestCluster}}, because all 
raft-clients are created inside replicas through 
{{ReplicaManager#startReplica}}, and all usages of {{raftClient}} are 
effectively useless: the map is used in a single common case:

{code:title=|language=java|collapse=false}assertSame(
txTestCluster.raftClients.get(ACC_TABLE_NAME).get(0).clusterService(),
txTestCluster.getLeader(ACC_TABLE_NAME).service()
);
{code}

Effectively, the first statement gets the raft-client for a table and its 0th 
partition, then extracts its {{ClusterService}}. The second statement, inside 
{{#getLeader}}, does a similar thing:

{code:title=|language=java|collapse=false}
protected Loza getLeader(String tableName) {
    var services = raftClients.get(tableName);

    Peer leader = services.get(0).leader();

    assertNotNull(leader);

    return raftServers.get(leader.consistentId());
}
{code}

Thus, the second statement means that we get a {{ClusterService}} from the 
{{Loza}} instance related to the leader's raft-client.

But when we put clients into the map, we put only the raft-clients of the 
leaders of the partition's replication group:

{code:title=|language=java|collapse=false}
if (startClient) {
    RaftGroupService service = RaftGroupServiceImpl
            .start(grpId, client, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller)
            .get(5, TimeUnit.SECONDS);

    clients.put(p, service);
} else {
    // Create temporary client to find a leader address.
    ClusterService tmpSvc = cluster.get(0);

    RaftGroupService service = RaftGroupServiceImpl
            .start(grpId, tmpSvc, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller)
            .get(5, TimeUnit.SECONDS);

    Peer leader = service.leader();

    service.shutdown();

    ClusterService leaderSrv = cluster.stream()
            .filter(cluster -> cluster.topologyService().localMember().name().equals(leader.consistentId()))
            .findAny()
            .orElseThrow();

    RaftGroupService leaderClusterSvc = RaftGroupServiceImpl
            .start(grpId, leaderSrv, FACTORY, raftConfig, membersConf, true, executor, commandsMarshaller)
            .get(5, TimeUnit.SECONDS);

    clients.put(p, leaderClusterSvc);
}
{code}

The first branch relates to tests that require a separate node as the transaction 
coordinator; the legacy approach demanded a separate raft-client for it. Now this 
all works on the replication layer, and the first branch will be fully removed in 
[https://issues.apache.org/jira/browse/IGNITE-22691]. The second branch creates a 
temporary raft-client just to find out the leader peer, then takes the leader's 
{{ClusterService}} and creates a raft-client instance based on the found service. 
Again, after the aforementioned ticket there won't be any raft-client 
instantiations, but the logic after this branch is still to get the leader's 
raft-client and save it to the map.

Then, the first statement gets the leader's {{ClusterService}} from the map for 
the given table and its 0th partition, while the second statement gets the {{Loza}} 
instance for the leader node's consistent ID and asks it for a {{ClusterService}}. 
Both services are always identical because it is the same replication-layer leader 
node. Hence the assertion is redundant, and so is all the code around 
{{raftClients}} and {{#leader()}}.
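To make the redundancy concrete, here is a minimal self-contained sketch. The {{Node}} record and plain {{java.util.Map}} instances below are illustrative stand-ins for {{raftClients}} and {{raftServers}}, not Ignite's real API: both lookup paths resolve to the very same leader object, so comparing their services can never fail.

```java
import java.util.Map;

public class LeaderLookupDemo {
    // Illustrative stand-in for a node that owns a single ClusterService (assumption).
    record Node(String consistentId, Object clusterService) {}

    // Returns true: both lookup paths resolve to the same service instance.
    static boolean servicesAreSame() {
        Node leader = new Node("node-1", new Object());

        // Path 1: raftClients.get(tableName).get(0).clusterService().
        Map<String, Node> raftClients = Map.of("ACC", leader);
        Object viaRaftClient = raftClients.get("ACC").clusterService();

        // Path 2: getLeader(tableName).service(), i.e. raftServers keyed by the leader's consistent ID.
        Map<String, Node> raftServers = Map.of(leader.consistentId(), leader);
        Object viaLoza = raftServers.get(leader.consistentId()).clusterService();

        return viaRaftClient == viaLoza;
    }

    public static void main(String[] args) {
        System.out.println(servicesAreSame()); // prints "true"
    }
}
```

In this toy model the `assertSame` check is vacuous by construction, which mirrors why the real assertion adds no coverage.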

The only place that wasn't mentioned yet, and that shouldn't be removed but rather 
rewritten, is {{getLeaderId}}, which is required by the 
{{ItTxDistributedTestThreeNodesThreeReplicas#testPrimaryReplicaDirectUpdateForExplicitTxn}}
 test and, as its name suggests, returns a {{Peer}}. The test internally blocks 
message handling on the RAFT replication layer before an entity is written to the 
RAFT log, and this must happen exactly on the RAFT-group leader. Taking the 
{{raftClients}} removal into account, the method's body should be:

{code:title=|language=java|collapse=false}int partId = 0;

return replicaManagers.get(extractConsistentId(cluster.get(partId)))
        .replica(new TablePartitionId(tables.get(tableName).tableId(), partId))
        .thenApply(replica -> replica.raftClient().leader())
        .join();
{code}
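The proposed body above is just a map lookup followed by an async transformation. As a hedged sketch, the records below ({{Peer}}, {{RaftClient}}, {{Replica}}, {{ReplicaManager}}) are toy stand-ins for Ignite's types, not the real signatures:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class GetLeaderIdSketch {
    // Toy stand-ins (assumptions) for Ignite's Peer, raft client, and replica types.
    record Peer(String consistentId) {}
    record RaftClient(Peer leader) {}
    record Replica(RaftClient raftClient) {}

    // A toy ReplicaManager: resolves a replica future by a partition key.
    record ReplicaManager(Map<String, Replica> replicas) {
        CompletableFuture<Replica> replica(String partitionId) {
            return CompletableFuture.completedFuture(replicas.get(partitionId));
        }
    }

    // Mirrors the proposed body: look up the manager, resolve the replica, take its raft client's leader.
    static Peer getLeaderId(Map<String, ReplicaManager> replicaManagers, String consistentId, String partitionId) {
        return replicaManagers.get(consistentId)
                .replica(partitionId)
                .thenApply(replica -> replica.raftClient().leader())
                .join();
    }

    public static void main(String[] args) {
        Peer leader = new Peer("node-2");
        ReplicaManager mgr = new ReplicaManager(Map.of("t1_part_0", new Replica(new RaftClient(leader))));
        System.out.println(getLeaderId(Map.of("node-0", mgr), "node-0", "t1_part_0").consistentId()); // prints "node-2"
    }
}
```

Since the replica future is already completed in the test cluster by the time the leader is queried, the `join()` at the end is a blocking convenience, matching the synchronous style of the surrounding test helpers.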

The function {{extractConsistentId(cluster.get(partId))}} is equivalent to 
{{cluster.get(partId).topologyService().localMember().name()}}, and {{tables}} 
is a mapping of table names to {{TableImpl}} instances that are inserted into 
the map in {{ItTxTestCluster#startTable}} right after instantiation.
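The helper amounts to a one-liner. In this sketch the nested interfaces are minimal stand-ins for Ignite's network API (assumed shapes, not the real interfaces), which also makes it easy to stub the whole chain with lambdas:

```java
public class ConsistentIds {
    // Minimal stand-ins (assumptions) for the Ignite interfaces mentioned above.
    interface ClusterNode { String name(); }
    interface TopologyService { ClusterNode localMember(); }
    interface ClusterService { TopologyService topologyService(); }

    // Equivalent to cluster.get(partId).topologyService().localMember().name().
    static String extractConsistentId(ClusterService service) {
        return service.topologyService().localMember().name();
    }

    public static void main(String[] args) {
        // Each interface has a single abstract method, so nested lambdas build a stub chain.
        ClusterService svc = () -> () -> () -> "node-0";
        System.out.println(extractConsistentId(svc)); // prints "node-0"
    }
}
```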

*Motivation*

The map and related code are redundant, don't work as intended, and should be removed.

[jira] [Updated] (IGNITE-22774) Remove raftClient map from ItTxTestCluster

2024-07-18 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22774:
-
Description: 

*Definition of done*

# {{raftClients}} map is removed.
# {{getLeader}} and related assertions are removed.


[jira] [Updated] (IGNITE-22774) Remove raftClient map from ItTxTestCluster

2024-07-18 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22774:
-
Description: 

And the last place the map is used is the raft-clients stopping process, which 
should be replaced (actually this should have been done much earlier) with the 
replicas stopping process for every possible table-partition pair:

{code:ti


[jira] [Updated] (IGNITE-22774) Remove raftClient map from ItTxTestCluster

2024-07-25 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22774:
-
Epic Link: IGNITE-22313

> Remove raftClient map from ItTxTestCluster
> --
>
> Key: IGNITE-22774
> URL: https://issues.apache.org/jira/browse/IGNITE-22774
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Updated] (IGNITE-22774) Remove raftClient map from ItTxTestCluster

2024-07-25 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22774:
-
Epic Link:   (was: IGNITE-22313)


[jira] [Created] (IGNITE-22851) Move raft client's update configuration process into replcas

2024-07-26 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22851:


 Summary: Move raft client's update configuration process into 
replcas
 Key: IGNITE-22851
 URL: https://issues.apache.org/jira/browse/IGNITE-22851
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov


*Description*

*Motivation*

*Definition of done*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22851) Move raft client's update configuration process into replicas

2024-07-30 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22851:
-
Summary: Move raft client's update configuration process into replicas  
(was: Move raft client's update configuration process into replcas)

> Move raft client's update configuration process into replicas
> -
>
> Key: IGNITE-22851
> URL: https://issues.apache.org/jira/browse/IGNITE-22851
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>
> *Description*
> *Motivation*
> *Definition of done*





[jira] [Created] (IGNITE-22928) Fix testZoneReplicaListener

2024-08-05 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22928:


 Summary: Fix testZoneReplicaListener
 Key: IGNITE-22928
 URL: https://issues.apache.org/jira/browse/IGNITE-22928
 Project: Ignite
  Issue Type: Improvement
Reporter: Mikhail Efremov
Assignee: Mikhail Efremov






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22928) Fix testZoneReplicaListener

2024-08-06 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22928:
-
Description: 
*Description*

The issue with the test is {{TestPlacementDriver}}: it returns a single node that 
may not be in the replication group, at least at the start of the test, and thus 
hosts no replica or raft entities. This leads to an {{NPE}} in the following code 
from {{PartitionReplicaLifecycleManager}}:

{code:title=|language=java|collapse=false}
return localServicesStartFuture
        .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
        .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
            boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
                    replicaGrpId,
                    stableAssignments,
                    pendingAssignments,
                    revision
            );

            if (!isLocalNodeInStableOrPending && !isLeaseholder) {
                return;
            }

            assert isLocalNodeInStableOrPending || isLeaseholder
                    : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
                    + ", isLeaseholder=" + isLeaseholder + "].";

            // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
            // Union is not required in such a case.
            Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
                    ? pendingAssignmentsNodes
                    : union(pendingAssignmentsNodes, stableAssignments.nodes());

            replicaMgr.replica(replicaGrpId)
                    .thenApply(Replica::raftClient)
                    .thenAccept(raftClient -> raftClient.updateConfiguration(fromAssignments(newAssignments)));
        }), ioExecutor);
{code}

The node returned from {{TestPlacementDriver}} will pass the 
{{isLocalNodeIsPrimary}} check, and all subsequent checks, in any case; but the 
node doesn't host the replication group, so there is no replica future, 
{{replicaMgr#replica}} returns {{null}}, and an {{NPE}} is thrown on the 
{{null}} value.
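The failure can be reproduced in isolation. In the following minimal sketch the 
class and the {{replica}}/{{replicas}} names are hypothetical stand-ins, not the 
actual Ignite API: when the lookup yields {{null}} instead of a future, the very 
first chained call throws before any future machinery runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class NullReplicaFutureDemo {
    // Stand-in for ReplicaManager#replica: returns null when the local
    // node hosts no replica for the given group id.
    static final Map<Integer, CompletableFuture<String>> replicas = new HashMap<>();

    static CompletableFuture<String> replica(int groupId) {
        return replicas.get(groupId); // null for unknown groups
    }

    public static void main(String[] args) {
        replicas.put(0, CompletableFuture.completedFuture("raftClient-0"));

        // Known group: the chain completes normally.
        System.out.println(replica(0).thenApply(c -> c).join());

        // Unknown group: replica(...) is null, so the chained call
        // throws NPE on the dereference itself, not inside the future.
        try {
            replica(1).thenApply(c -> c).join();
        } catch (NullPointerException e) {
            System.out.println("NPE");
        }
    }
}
```

In the production code the same shape appears as 
{{replicaMgr.replica(replicaGrpId).thenApply(Replica::raftClient)}}.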

The solution is to add to {{TestPlacementDriver}} a mapping from 
{{ZonePartitionId}} to the {{ClusterNode}} that hosts the "primary" replica. But 
there is another problem: in debug we can see 25 partitions for zone 0. Writing 
25 entries into the map is hardly convenient, but zone 0 is the common public 
zone and is the subject of the test. So the solution is either to reduce the 
default zone's partition count or to add a mapping for all of its partitions. 
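The proposed mapping can be sketched as follows. All type and node names here 
are simplified stand-ins for the real {{TestPlacementDriver}}, 
{{ZonePartitionId}} and {{ClusterNode}} classes; the sketch also shows why a 
reduced partition count keeps the map small:

```java
import java.util.HashMap;
import java.util.Map;

public class PrimaryMappingSketch {
    // Simplified stand-ins for ZonePartitionId and ClusterNode.
    record ZonePartitionId(int zoneId, int partId) {}

    static final Map<ZonePartitionId, String> primaryByPartition = new HashMap<>();

    // Resolve the "primary" replica host for a partition; a real
    // TestPlacementDriver would return a ClusterNode, here a name is enough.
    static String primaryReplica(ZonePartitionId id) {
        return primaryByPartition.get(id);
    }

    public static void main(String[] args) {
        int zoneId = 0;
        int partitions = 2; // reduced from the default, as the ticket suggests

        // One entry per partition instead of one fixed node for everything.
        for (int p = 0; p < partitions; p++) {
            primaryByPartition.put(new ZonePartitionId(zoneId, p), "node-" + p);
        }

        System.out.println(primaryReplica(new ZonePartitionId(0, 1))); // node-1
    }
}
```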

*Motivation*

The crucial test should be fixed.

*Definition of done*

The test is passed.

> Fix testZoneReplicaListener
> ---
>
> Key: IGNITE-22928
> URL: https://issues.apache.org/jira/browse/IGNITE-22928
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>
> *Description*
> The issue with the test is {{TestPlacementDriver}}: it returns a single node 
> that may not be in the replication group, at least at the start of the test, and 
> thus hosts no replica or raft entities. This leads to an {{NPE}} in the 
> following code from {{PartitionReplicaLifecycleManager}}:
> {code:title=|language=java|collapse=false}
> return localServicesStartFuture
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
>         .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
>             boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
>                     replicaGrpId,
>                     stableAssignments,
>                     pendingAssignments,
>                     revision
>             );
>             if (!isLocalNodeInStableOrPending && !isLeaseholder) {
>                 return;
>             }
>             assert isLocalNodeInStableOrPending || isLeaseholder
>                     : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
>                     + ", isLeaseholder=" + isLeaseholder + "].";
>             // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
>             // Union is not required in such a case.
>             Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
> 

[jira] [Updated] (IGNITE-22928) Fix testZoneReplicaListener

2024-08-06 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22928:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)
  Labels: ignite-3  (was: )

> Fix testZoneReplicaListener
> ---
>
> Key: IGNITE-22928
> URL: https://issues.apache.org/jira/browse/IGNITE-22928
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>
> *Description*
> The issue with the test is {{TestPlacementDriver}}: it returns a single node 
> that may not be in the replication group, at least at the start of the test, and 
> thus hosts no replica or raft entities. This leads to an {{NPE}} in the 
> following code from {{PartitionReplicaLifecycleManager}}:
> {code:title=|language=java|collapse=false}
> return localServicesStartFuture
>         .thenComposeAsync(v -> inBusyLock(busyLock, () -> isLocalNodeIsPrimary(replicaGrpId)), ioExecutor)
>         .thenAcceptAsync(isLeaseholder -> inBusyLock(busyLock, () -> {
>             boolean isLocalNodeInStableOrPending = isNodeInReducedStableOrPendingAssignments(
>                     replicaGrpId,
>                     stableAssignments,
>                     pendingAssignments,
>                     revision
>             );
>             if (!isLocalNodeInStableOrPending && !isLeaseholder) {
>                 return;
>             }
>             assert isLocalNodeInStableOrPending || isLeaseholder
>                     : "The local node is outside of the replication group [inStableOrPending=" + isLocalNodeInStableOrPending
>                     + ", isLeaseholder=" + isLeaseholder + "].";
>             // For forced assignments, we exclude dead stable nodes, and all alive stable nodes are already in pending assignments.
>             // Union is not required in such a case.
>             Set<Assignment> newAssignments = pendingAssignmentsAreForced || stableAssignments == null
>                     ? pendingAssignmentsNodes
>                     : union(pendingAssignmentsNodes, stableAssignments.nodes());
>             replicaMgr.replica(replicaGrpId)
>                     .thenApply(Replica::raftClient)
>                     .thenAccept(raftClient -> raftClient.updateConfiguration(fromAssignments(newAssignments)));
>         }), ioExecutor);
> {code}
> The node returned from {{TestPlacementDriver}} will pass the 
> {{isLocalNodeIsPrimary}} check, and all subsequent checks, in any case; but the 
> node doesn't host the replication group, so there is no replica future, 
> {{replicaMgr#replica}} returns {{null}}, and an {{NPE}} is thrown on the 
> {{null}} value.
> The solution is to add to {{TestPlacementDriver}} a mapping from 
> {{ZonePartitionId}} to the {{ClusterNode}} that hosts the "primary" replica. But 
> there is another problem: in debug we can see 25 partitions for zone 0. Writing 
> 25 entries into the map is hardly convenient, but zone 0 is the common public 
> zone and is the subject of the test. So the solution is either to reduce the 
> default zone's partition count or to add a mapping for all of its partitions. 
> *Motivation*
> The crucial test should be fixed.
> *Definition of done*
> The test is passed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22691) Make startReplica(ReplicationGroupId, PendingComparableValuesTracker, CompletableFuture) private

2024-08-06 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22691:
-
Summary: Make startReplica(ReplicationGroupId, 
PendingComparableValuesTracker, CompletableFuture) private  (was: Delete 
startReplica(ReplicationGroupId, PendingComparableValuesTracker, 
CompletableFuture))

> Make startReplica(ReplicationGroupId, PendingComparableValuesTracker, 
> CompletableFuture) private
> 
>
> Key: IGNITE-22691
> URL: https://issues.apache.org/jira/browse/IGNITE-22691
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newReplicaListenerFut
> ) { ... }
> {code}
> It's now marked both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
> purpose is to be used in tests:
>  * 
> {{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}
> Checks that {{startReplica}} and the raft node are started only once. It looks 
> like we just have to add several missing arguments.
>  * {{ItTxTestCluster#startTable}}
> The main goal of this ticket is to delete the method.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be deleted.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is called now in 
> {{TableManager}}.
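
The intended end state can be sketched like this (signatures are simplified 
placeholders, not the real Ignite ones): the former test-only overload becomes 
private, and both production code and tests go through the single public method.

```java
import java.util.concurrent.CompletableFuture;

public class ReplicaManagerSketch {
    // The single public entry point: it builds the listener future itself,
    // as TableManager-style callers do.
    public CompletableFuture<String> startReplica(String replicaGrpId) {
        CompletableFuture<String> listenerFut =
                CompletableFuture.completedFuture("listener:" + replicaGrpId);
        return startReplica(replicaGrpId, listenerFut);
    }

    // The former @VisibleForTesting/@Deprecated overload, now private:
    // tests reach it only through the public method above.
    private CompletableFuture<String> startReplica(
            String replicaGrpId,
            CompletableFuture<String> newReplicaListenerFut
    ) {
        return newReplicaListenerFut.thenApply(l -> "replica[" + replicaGrpId + ", " + l + "]");
    }

    public static void main(String[] args) {
        System.out.println(new ReplicaManagerSketch().startReplica("part-0").join());
    }
}
```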



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22691) Make startReplica(ReplicationGroupId, PendingComparableValuesTracker, CompletableFuture) private

2024-08-06 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22691:
-
Description: 
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newReplicaListenerFut
) { ... }
{code}
It's now marked both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * 
{{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}
Checks that {{startReplica}} and the raft node are started only once. It looks 
like we just have to add several missing arguments.
 * {{ItTxTestCluster#startTable}}

The main goal of this ticket is to delete the method.

*Motivation*
There should be only one public {{Replica#startReplica}} method for replication 
group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be made private.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is called now in 
{{TableManager}}.

  was:
*Description*
After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
{code:java}
public CompletableFuture startReplica(
ReplicationGroupId replicaGrpId,
PendingComparableValuesTracker storageIndexTracker,
CompletableFuture newReplicaListenerFut
) { ... }
{code}
It's now marked both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
purpose is to be used in tests:
 * 
{{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}
Checks that {{startReplica}} and the raft node are started only once. It looks 
like we just have to add several missing arguments.
 * {{ItTxTestCluster#startTable}}

The main goal of this ticket is to delete the method.

*Motivation*
There should be only one public {{Replica#startReplica}} method for 
replication group creation.

*Definition of Done*
1. The {{Replica#startReplica}} overload above should be deleted.
2. All mentioned tests that called the overloaded method should be fixed in 
favor of the single {{Replica#startReplica}} method which is called now in 
{{TableManager}}.


> Make startReplica(ReplicationGroupId, PendingComparableValuesTracker, 
> CompletableFuture) private
> 
>
> Key: IGNITE-22691
> URL: https://issues.apache.org/jira/browse/IGNITE-22691
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Efremov
>Assignee: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Description*
> After IGNITE-21805 there is a redundant {{Replica#startReplica}} overload:
> {code:java}
> public CompletableFuture startReplica(
> ReplicationGroupId replicaGrpId,
> PendingComparableValuesTracker storageIndexTracker,
> CompletableFuture newReplicaListenerFut
> ) { ... }
> {code}
> It's now marked both {{@VisibleForTesting}} and {{@Deprecated}}, and its only 
> purpose is to be used in tests:
>  * 
> {{ItRebalanceDistributedTest#verifyThatRaftNodesAndReplicasWereStartedOnlyOnce}}
> Checks that {{startReplica}} and the raft node are started only once. It looks 
> like we just have to add several missing arguments.
>  * {{ItTxTestCluster#startTable}}
> The main goal of this ticket is to delete the method.
> *Motivation*
> There should be only one public {{Replica#startReplica}} method for 
> replication group creation.
> *Definition of Done*
> 1. The {{Replica#startReplica}} overload above should be made private.
> 2. All mentioned tests that called the overloaded method should be fixed in 
> favor of the single {{Replica#startReplica}} method which is called now in 
> {{TableManager}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)