[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations

2023-08-30 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-20279:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Reordering of altering zone operations
> --
>
> Key: IGNITE-20279
> URL: https://issues.apache.org/jira/browse/IGNITE-20279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Sergey Uttsel
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue shows up in a test that performs several zone change operations. 
> In this case, the operations can be reordered and left incomplete at the end 
> of the test. The "Received update for replicas number" messages appear in the 
> test log in the wrong order. The reproducer is based on 
> ItRebalanceDistributedTest#testThreeQueuedRebalances:
> {code:java}
> @Test
> void testThreeQueuedRebalances() throws Exception {
>     Node node = getNode(0);
>     createZone(node, ZONE_NAME, 1, 1);
>     createTable(node, ZONE_NAME, TABLE_NAME);
>     assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));
>     // Queue several alterations in a row; the last requested value is 2.
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     // Partition 0 is expected to end up on 2 nodes.
>     waitPartitionAssignmentsSyncedToExpected(0, 2);
>     checkPartitionNodes(0, 2);
> }
> {code}
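
One way to picture the ordering requirement in the description above is to chain asynchronous updates so that each one starts only after the previous one has finished, and to wait on the last one before checking the result. The following is a minimal standalone sketch, not Ignite code: the class OrderedAlterSketch and its alterReplicas method are hypothetical names introduced for illustration, and the "update" merely records the requested replica count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical illustration only; it does not use any Ignite API. */
class OrderedAlterSketch {
    /** Tail of the chain; every new update is appended after it. */
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    /** Replica counts in the order in which they were actually applied. */
    final List<Integer> applied = Collections.synchronizedList(new ArrayList<>());

    private final ExecutorService pool;

    OrderedAlterSketch(ExecutorService pool) {
        this.pool = pool;
    }

    /** Queues an update that runs only after all earlier updates have finished. */
    synchronized CompletableFuture<Void> alterReplicas(int replicas) {
        tail = tail.thenRunAsync(() -> applied.add(replicas), pool);
        return tail;
    }

    /** Issues five updates back to back and returns the order in which they ran. */
    static List<Integer> run() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            OrderedAlterSketch zone = new OrderedAlterSketch(pool);
            CompletableFuture<Void> last = null;
            for (int replicas : new int[] {2, 3, 2, 3, 2}) {
                last = zone.alterReplicas(replicas);
            }
            // Wait for the last update before inspecting the result.
            last.join();
            return new ArrayList<>(zone.applied);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because every call appends to the tail of the chain, the recorded order matches the call order even though the work runs on a thread pool.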



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations

2023-08-29 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20279:
---
Description: 
The issue shows up in a test that performs several zone change operations. In 
this case, the operations can be reordered and left incomplete at the end of 
the test. The "Received update for replicas number" messages appear in the test 
log in the wrong order. The reproducer is based on 
ItRebalanceDistributedTest#testThreeQueuedRebalances:

{code:java}
@Test
void testThreeQueuedRebalances() throws Exception {
    Node node = getNode(0);

    createZone(node, ZONE_NAME, 1, 1);

    createTable(node, ZONE_NAME, TABLE_NAME);

    assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));

    // Queue several alterations in a row; the last requested value is 2.
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);

    // Partition 0 is expected to end up on 2 nodes.
    waitPartitionAssignmentsSyncedToExpected(0, 2);

    checkPartitionNodes(0, 2);
}
{code}


  was:
The issue shows up in a test that performs several zone change operations. In 
this case, the operations can be reordered and left incomplete at the end of 
the test. The "Received update for replicas number" messages appear in the test 
log in the wrong order. The reproducer is based on 
ItRebalanceDistributedTest#testThreeQueuedRebalances. See the exception in the 
test log:

{code:java}
@Test
void testThreeQueuedRebalances() throws Exception {
    Node node = getNode(0);

    createZone(node, ZONE_NAME, 1, 1);

    createTable(node, ZONE_NAME, TABLE_NAME);

    assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));

    // Queue several alterations in a row; the last requested value is 2.
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);

    // Partition 0 is expected to end up on 2 nodes.
    waitPartitionAssignmentsSyncedToExpected(0, 2);

    checkPartitionNodes(0, 2);
}
{code}



> Reordering of altering zone operations
> --
>
> Key: IGNITE-20279
> URL: https://issues.apache.org/jira/browse/IGNITE-20279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Sergey Uttsel
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue shows up in a test that performs several zone change operations. 
> In this case, the operations can be reordered and left incomplete at the end 
> of the test. The "Received update for replicas number" messages appear in the 
> test log in the wrong order. The reproducer is based on 
> ItRebalanceDistributedTest#testThreeQueuedRebalances:
> {code:java}
> @Test
> void testThreeQueuedRebalances() throws Exception {
>     Node node = getNode(0);
>     createZone(node, ZONE_NAME, 1, 1);
>     createTable(node, ZONE_NAME, TABLE_NAME);
>     assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));
>     // Queue several alterations in a row; the last requested value is 2.
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     // Partition 0 is expected to end up on 2 nodes.
>     waitPartitionAssignmentsSyncedToExpected(0, 2);
>     checkPartitionNodes(0, 2);
> }
> {code}





[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations

2023-08-29 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20279:
---
Reviewer: Mirza Aliev

> Reordering of altering zone operations
> --
>
> Key: IGNITE-20279
> URL: https://issues.apache.org/jira/browse/IGNITE-20279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Sergey Uttsel
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue shows up in a test that performs several zone change operations. 
> In this case, the operations can be reordered and left incomplete at the end 
> of the test. The "Received update for replicas number" messages appear in the 
> test log in the wrong order. The reproducer is based on 
> ItRebalanceDistributedTest#testThreeQueuedRebalances. See the exception in 
> the test log:
> {code:java}
> @Test
> void testThreeQueuedRebalances() throws Exception {
>     Node node = getNode(0);
>     createZone(node, ZONE_NAME, 1, 1);
>     createTable(node, ZONE_NAME, TABLE_NAME);
>     assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));
>     // Queue several alterations in a row; the last requested value is 2.
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     // Partition 0 is expected to end up on 2 nodes.
>     waitPartitionAssignmentsSyncedToExpected(0, 2);
>     checkPartitionNodes(0, 2);
> }
> {code}





[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations

2023-08-29 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20279:
---
Description: 
The issue shows up in a test that performs several zone change operations. In 
this case, the operations can be reordered and left incomplete at the end of 
the test. The "Received update for replicas number" messages appear in the test 
log in the wrong order. The reproducer is based on 
ItRebalanceDistributedTest#testThreeQueuedRebalances. See the exception in the 
test log:

{code:java}
@Test
void testThreeQueuedRebalances() throws Exception {
    Node node = getNode(0);

    createZone(node, ZONE_NAME, 1, 1);

    createTable(node, ZONE_NAME, TABLE_NAME);

    assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));

    // Queue several alterations in a row; the last requested value is 2.
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);
    alterZone(node, ZONE_NAME, 3);
    alterZone(node, ZONE_NAME, 2);

    // Partition 0 is expected to end up on 2 nodes.
    waitPartitionAssignmentsSyncedToExpected(0, 2);

    checkPartitionNodes(0, 2);
}
{code}


  was:
The issue shows up in a test that performs several zone change operations. On 
my laptop, the test ({{ItRebalanceDistributedTest#testThreeQueuedRebalances}}) 
fails at least twice in 30 runs.
 # The first issue that I see is that the test does not wait for the last zone 
change operation, alterZone(node, ZONE_NAME, 2), to complete. In this case, the 
operation can be incomplete at the end of the test.
 # The second issue is that the next operation may start before the previous 
one has completed.
{noformat}
[2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event
org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?]
    at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?]
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{noformat}
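
The first issue in the list above (not waiting for the last operation) is usually addressed by polling for the expected state with a timeout before the test asserts anything. The reproducer already calls a waitForCondition helper; the version below is only a hypothetical stand-in with an assumed signature, written so that the idea can be run in isolation. The class name WaitForConditionSketch and the demo method are likewise assumptions, and an AtomicInteger plays the role of the replica count.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only; it does not use any Ignite API. */
class WaitForConditionSketch {
    /** Polls the condition until it holds or the timeout elapses. */
    static boolean waitForCondition(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Simulates an update that lands a little after it was requested. */
    static boolean demo() {
        AtomicInteger replicas = new AtomicInteger(3);
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            replicas.set(2);
        });
        updater.start();
        // Without this wait, a check made right after the request could still observe 3.
        return waitForCondition(() -> replicas.get() == 2, 5_000);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Waiting on the observable state, rather than on the call that requested the change, is what keeps the final check from racing with the last update.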


> Reordering of altering zone operations
> --
>
> Key: IGNITE-20279
> URL: https://issues.apache.org/jira/browse/IGNITE-20279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Sergey Uttsel
>Priority: Major
>  Labels: ignite-3
>
> The issue shows up in a test that performs several zone change operations. 
> In this case, the operations can be reordered and left incomplete at the end 
> of the test. The "Received update for replicas number" messages appear in the 
> test log in the wrong order. The reproducer is based on 
> ItRebalanceDistributedTest#testThreeQueuedRebalances. See the exception in 
> the test log:
> {code:java}
> @Test
> void testThreeQueuedRebalances() throws Exception {
>     Node node = getNode(0);
>     createZone(node, ZONE_NAME, 1, 1);
>     createTable(node, ZONE_NAME, TABLE_NAME);
>     assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS));
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
>     alterZone(node, ZONE_NAME, 2);
>     alterZone(node, ZONE_NAME, 3);
> a

[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations

2023-08-24 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-20279:
---
Description: 
The issue shows up in a test that performs several zone change operations. On 
my laptop, the test ({{ItRebalanceDistributedTest#testThreeQueuedRebalances}}) 
fails at least twice in 30 runs.
 # The first issue that I see is that the test does not wait for the last zone 
change operation, alterZone(node, ZONE_NAME, 2), to complete. In this case, the 
operation can be incomplete at the end of the test.
 # The second issue is that the next operation may start before the previous 
one has completed.
{noformat}
[2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event
org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?]
    at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?]
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{noformat}
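
The "already started" error in the log above is the kind of symptom one would expect when a start request for a group is handled before the stop that belongs to the previous operation. The sketch below is an assumption about the general failure mode, not a description of how Loza or TableManager work internally: GroupStartGuardSketch, start, stop and replay are hypothetical names, and only the group id 1_part_0 is taken from the log.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; it does not model Ignite internals. */
class GroupStartGuardSketch {
    /** Ids of the groups that are currently started on this (imaginary) node. */
    private final Set<String> started = ConcurrentHashMap.newKeySet();

    /** Starts a group and rejects a second start of the same group. */
    void start(String groupId) {
        if (!started.add(groupId)) {
            throw new IllegalStateException("Group is already started [groupId=" + groupId + "]");
        }
    }

    /** Stops a group; stopping an unknown group is a no-op. */
    void stop(String groupId) {
        started.remove(groupId);
    }

    /** Applies "start"/"stop" events in the given order and reports the outcome. */
    static String replay(String... events) {
        GroupStartGuardSketch node = new GroupStartGuardSketch();
        try {
            for (String event : events) {
                if (event.equals("start")) {
                    node.start("1_part_0");
                } else {
                    node.stop("1_part_0");
                }
            }
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Events applied in the intended order.
        System.out.println(replay("start", "stop", "start"));
        // The second start overtakes the stop of the previous operation.
        System.out.println(replay("start", "start", "stop"));
    }
}
```

The same three events succeed or fail depending only on their order, which is why serializing the handling of consecutive operations matters here.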

  was:
The issue shows up in a test that performs several zone change operations. On 
my laptop, the test (ItRebalanceDistributedTest#testThreeQueuedRebalances) fails 
at least twice in 30 runs.

# The first issue that I see is that the test does not wait for the last zone 
change operation, alterZone(node, ZONE_NAME, 2), to complete. In this case, the 
operation can be incomplete at the end of the test.
# The second issue is that the next operation may start before the previous 
one has completed.
{noformat}
[2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event
org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?]
    at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?]
    at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?]
    at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?]
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{noformat}


> Reordering of altering zone operations
> --
>
> Key: IGNITE-20279
> URL: https://issues.apache.org/jira/browse/IGNITE-20279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> The issue shows up in a test that performs several zone change operations. 
> On my laptop, the test 
> ({{ItRebalanceDistributedTest#testThreeQueuedRebalances}}) fails at least 
> twice in 30 runs.
>  # The first issue that I see is that the test does not wait for the last 
> zone change operation, alterZone(node, ZONE_NAME, 2), to complete. In this case, the 
> oper