[jira] [Updated] (IGNITE-26414) [Flaky] Cannot upgrade from 3.0.0 to 3.1.0

Igor (Jira) Fri, 12 Sep 2025 17:54:08 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-26414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Igor updated IGNITE-26414:
--------------------------
    Description: 
*Steps to reproduce:*
 # Start a 3-node cluster on version 3.0.0, either on a single host or with 
each node on a separate host.
 # Create tables and insert data:
{code:sql}
create zone "DEFAULT_ZONE" with storage_profiles='default_aipersist';
create TABLE test_table(id INTEGER not null, field_1 TINYINT not null, field_2 
SMALLINT not null, field_3 INTEGER not null, field_4 FLOAT not null, field_5 
VARCHAR(50) not null, primary key (id)) ZONE "DEFAULT_ZONE";
insert into test_table(id, field_1, field_2, field_3, field_4, field_5) values 
(3356::INTEGER, 45::TINYINT, 22062::SMALLINT, -861252049::INTEGER, 0.0::FLOAT, 
'1field_5_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1'::VARCHAR);
{code}

3. Validate the data in the tables:
{code:sql}
select id, field_1, field_2, field_3, field_4, field_5 from test_table;
{code}
4. Wait until all partitions of all tables in zones [DEFAULT_ZONE] have local 
state "HEALTHY" and global state "AVAILABLE".

5. Install the new distribution (3.1.0) into the cluster with a full restart.

6. Assert that the physical and logical topologies are correct.

7. Wait again until all partitions of all tables in zones [DEFAULT_ZONE] have 
local state "HEALTHY" and global state "AVAILABLE".

*Expected:*
All partitions reach the "HEALTHY" state.

*Actual:*
For 120 seconds, the partitions remained in states `[INITIALIZING, HEALTHY]`.
The logs on one node contain repeating messages:
{code:java}
2025-09-10 04:02:04:678 +0000 
[INFO][%PdsCompatibility3NodesTest_cluster_0%JRaft-ElectionTimer-1][NodeImpl] 
Unsuccessful election round number 35, group '17_part_4' {code}
Server logs are attached.

The issue is flaky, but it reproduces in about 90% of runs.
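
When triaging the attached logs, it may help to see which Raft groups are stuck in elections and how far the round counter has climbed. The following is only a minimal sketch: the class and method names are hypothetical and not part of Ignite, and the pattern assumes the message text is exactly as quoted above and sits on a single line in the real log file.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for log triage; not part of Ignite.
public class ElectionRoundScanner {
    // Assumes the message format quoted in the report.
    private static final Pattern ELECTION = Pattern.compile(
            "Unsuccessful election round number (\\d+), group '([^']+)'");

    /** Returns the highest unsuccessful election round seen for each Raft group. */
    public static Map<String, Integer> maxRoundPerGroup(List<String> logLines) {
        Map<String, Integer> result = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = ELECTION.matcher(line);
            while (m.find()) {
                // Keep the largest round number per group name.
                result.merge(m.group(2), Integer.parseInt(m.group(1)), Math::max);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample line taken from the report, joined onto one line.
        List<String> sample = List.of(
                "2025-09-10 04:02:04:678 +0000 [INFO][%PdsCompatibility3NodesTest_cluster_0%"
                        + "JRaft-ElectionTimer-1][NodeImpl] "
                        + "Unsuccessful election round number 35, group '17_part_4'");
        maxRoundPerGroup(sample).forEach((group, round) ->
                System.out.println(group + " -> " + round));
    }
}
```

For a real log file, the lines could be loaded with `java.nio.file.Files.readAllLines` and passed to `maxRoundPerGroup`; groups whose round number keeps growing are the ones that never elect a leader.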

  was:
*Steps to reproduce:*
 # Start a 3-node cluster on version 3.0.0, either on a single host or with 
each node on a separate host.
 # Create tables and insert data:
{code:sql}
create zone "DEFAULT_ZONE" with storage_profiles='default_aipersist';
create TABLE test_table(id INTEGER not null, field_1 TINYINT not null, field_2 
SMALLINT not null, field_3 INTEGER not null, field_4 FLOAT not null, field_5 
VARCHAR(50) not null, primary key (id)) ZONE "DEFAULT_ZONE";
insert into test_table(id, field_1, field_2, field_3, field_4, field_5) values 
(3356::INTEGER, 45::TINYINT, 22062::SMALLINT, -861252049::INTEGER, 0.0::FLOAT, 
'1field_5_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1'::VARCHAR);
{code}

 # Validate the data in the tables:

{code:sql}
select id, field_1, field_2, field_3, field_4, field_5 from test_table;
{code}

 # Wait until all partitions of all tables in zones [DEFAULT_ZONE] have local 
state "HEALTHY" and global state "AVAILABLE".
 # Install the new distribution (3.1.0) into the cluster with a full restart.
 # Assert that the physical and logical topologies are correct.
 # Wait again until all partitions of all tables in zones [DEFAULT_ZONE] have 
local state "HEALTHY" and global state "AVAILABLE".

*Expected:*
All partitions reach the "HEALTHY" state.

*Actual:*
For 120 seconds, the partitions remained in states `[INITIALIZING, HEALTHY]`.
The logs on one node contain repeating messages:
{code:java}
2025-09-10 04:02:04:678 +0000 
[INFO][%PdsCompatibility3NodesTest_cluster_0%JRaft-ElectionTimer-1][NodeImpl] 
Unsuccessful election round number 35, group '17_part_4' {code}
Server logs are attached.

The issue is flaky, but it reproduces in about 90% of runs.


> [Flaky] Cannot upgrade from 3.0.0 to 3.1.0
> ------------------------------------------
>
>                 Key: IGNITE-26414
>                 URL: https://issues.apache.org/jira/browse/IGNITE-26414
>             Project: Ignite
>          Issue Type: Bug
>          Components: storage engines ai3
>    Affects Versions: 3.1
>         Environment: 3-node cluster, either on a single host or with each 
> node on a separate host.
>            Reporter: Igor
>            Priority: Blocker
>              Labels: ignite-3
>         Attachments: server_logs.zip
>
>
> *Steps to reproduce:*
>  # Start a 3-node cluster on version 3.0.0, either on a single host or with 
> each node on a separate host.
>  # Create tables and insert data:
> {code:sql}
> create zone "DEFAULT_ZONE" with storage_profiles='default_aipersist';
> create TABLE test_table(id INTEGER not null, field_1 TINYINT not null, 
> field_2 SMALLINT not null, field_3 INTEGER not null, field_4 FLOAT not null, 
> field_5 VARCHAR(50) not null, primary key (id)) ZONE "DEFAULT_ZONE";
> insert into test_table(id, field_1, field_2, field_3, field_4, field_5) 
> values (3356::INTEGER, 45::TINYINT, 22062::SMALLINT, -861252049::INTEGER, 
> 0.0::FLOAT, '1field_5_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1'::VARCHAR);
> {code}
> 3. Validate the data in the tables:
> {code:sql}
> select id, field_1, field_2, field_3, field_4, field_5 from test_table;
> {code}
> 4. Wait until all partitions of all tables in zones [DEFAULT_ZONE] have 
> local state "HEALTHY" and global state "AVAILABLE".
> 5. Install the new distribution (3.1.0) into the cluster with a full restart.
> 6. Assert that the physical and logical topologies are correct.
> 7. Wait again until all partitions of all tables in zones [DEFAULT_ZONE] 
> have local state "HEALTHY" and global state "AVAILABLE".
> *Expected:*
> All partitions reach the "HEALTHY" state.
> *Actual:*
> For 120 seconds, the partitions remained in states `[INITIALIZING, HEALTHY]`.
> The logs on one node contain repeating messages:
> {code:java}
> 2025-09-10 04:02:04:678 +0000 
> [INFO][%PdsCompatibility3NodesTest_cluster_0%JRaft-ElectionTimer-1][NodeImpl] 
> Unsuccessful election round number 35, group '17_part_4' {code}
> Server logs are attached.
> The issue is flaky, but it reproduces in about 90% of runs.
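
Steps 4 and 7 wait for every partition to report a given state within a time limit. A rough sketch of such a wait loop is shown here for illustration only. The report does not say how the test obtains partition states, so `stateSource` is a hypothetical callback standing in for that lookup, and `PartitionStateAwaiter` is an invented name rather than an Ignite class.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper; the state lookup itself is supplied by the caller.
public class PartitionStateAwaiter {
    /**
     * Polls stateSource until every reported state equals expected,
     * or returns false once the timeout has elapsed.
     */
    public static boolean awaitAllInState(Supplier<Collection<String>> stateSource,
                                          String expected,
                                          Duration timeout,
                                          Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Collection<String> states = stateSource.get();
            // An empty answer is treated as "not ready yet", not as success.
            if (!states.isEmpty() && states.stream().allMatch(expected::equals)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Mimics the reported failure: one partition never leaves INITIALIZING.
        boolean healthy = awaitAllInState(
                () -> List.of("INITIALIZING", "HEALTHY"),
                "HEALTHY", Duration.ofMillis(200), Duration.ofMillis(20));
        System.out.println("all HEALTHY: " + healthy);
    }
}
```

In the scenario from the report the timeout would be 120 seconds, and the same loop could be run a second time with the global states and "AVAILABLE" as the expected value.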



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
