[jira] [Updated] (IGNITE-22242) Inject LogStorageFactory dependency in Loza & JRaftServer

2024-05-24 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-22242:
---
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Inject LogStorageFactory dependency in Loza & JRaftServer
> -
>
> Key: IGNITE-22242
> URL: https://issues.apache.org/jira/browse/IGNITE-22242
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Tiago Marques Godinho
>Assignee: Tiago Marques Godinho
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Currently, the default LogStorageFactory is being created in 
> the JRaftServerImpl constructor.
> This makes the current solution tightly coupled to this implementation.
> In order to better control the lifecycle of the Raft Log we should inject 
> this dependency.
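
A minimal sketch of the constructor injection proposed above. The types below are simplified, hypothetical stand-ins (they are not the actual Ignite 3 `LogStorageFactory` or `JRaftServerImpl` signatures); the point is only that the caller creates the factory, passes it in, and therefore controls when it is started and closed.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for the real interface.
interface LogStorageFactory {
    void start();

    void close();
}

// Hypothetical implementation that only records lifecycle calls.
final class RecordingLogStorageFactory implements LogStorageFactory {
    boolean started;
    boolean closed;

    @Override
    public void start() {
        started = true;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical server: it receives the factory instead of constructing one itself.
final class RaftServerSketch {
    private final LogStorageFactory logStorageFactory;

    RaftServerSketch(LogStorageFactory logStorageFactory) {
        this.logStorageFactory = Objects.requireNonNull(logStorageFactory, "logStorageFactory");
    }

    LogStorageFactory logStorageFactory() {
        return logStorageFactory;
    }
}
```

With this shape, the component that builds the server can start the factory before the server and close it after the server stops.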



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19544) Thin 3.0: Data Streamer with Receiver

2024-05-24 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19544:

Ignite Flags: Docs Required,Release Notes Required
Release Note: Java thin: Added Data Streamer with Receiver

> Thin 3.0: Data Streamer with Receiver
> -
>
> Key: IGNITE-19544
> URL: https://issues.apache.org/jira/browse/IGNITE-19544
> Project: Ignite
>  Issue Type: Task
>  Components: thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: iep-102, iep-121, ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Implement data streamer with receiver in Java client - see 
> [IEP-121|https://cwiki.apache.org/confluence/display/IGNITE/IEP-121%3A+Data+Streamer+with+Receiver]





[jira] [Commented] (IGNITE-19544) Thin 3.0: Data Streamer with Receiver

2024-05-24 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-19544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849246#comment-17849246
 ] 

Pavel Tupitsyn commented on IGNITE-19544:
-

Merged to main: 
[d3ab33124127dd18f36727148100d632fffbd02e|https://github.com/apache/ignite-3/commit/d3ab33124127dd18f36727148100d632fffbd02e]

> Thin 3.0: Data Streamer with Receiver
> -
>
> Key: IGNITE-19544
> URL: https://issues.apache.org/jira/browse/IGNITE-19544
> Project: Ignite
>  Issue Type: Task
>  Components: thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: iep-102, iep-121, ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Implement data streamer with receiver in Java client - see 
> [IEP-121|https://cwiki.apache.org/confluence/display/IGNITE/IEP-121%3A+Data+Streamer+with+Receiver]





[jira] [Created] (IGNITE-22323) Remove duplicate tests from ItFunctionsTest

2024-05-24 Thread Andrey Mashenkov (Jira)
Andrey Mashenkov created IGNITE-22323:
-

 Summary: Remove duplicate tests from ItFunctionsTest
 Key: IGNITE-22323
 URL: https://issues.apache.org/jira/browse/IGNITE-22323
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Andrey Mashenkov


Some tests in ItFunctionsTest duplicate SQL Logic tests and can be safely 
dropped.
Most of the remaining tests could be moved to the SQL Logic test suite.
Let's just do this.





[jira] [Updated] (IGNITE-22322) Removal of MVCC WAL record types

2024-05-24 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-22322:
---
Description: 
Record types to remove:
* {{RecordType.MVCC_DATA_PAGE_MARK_UPDATED_RECORD}}
* {{RecordType.MVCC_DATA_PAGE_NEW_TX_STATE_HINT_UPDATED_RECORD}}
* {{RecordType.MVCC_DATA_PAGE_TX_STATE_HINT_UPDATED_RECORD}}
* {{RecordType.MVCC_DATA_RECORD}}
* {{RecordType.MVCC_TX_RECORD}}

Also test for possible compatibility issues.

  was:
Classes to remove
# NestedTxMode
# Remove txAllowed flag in JdbcConnection

JDBC code must be compliant with the specification (see IGNITE-5339). Changes 
must provide full compatibility with older versions of servers and clients.


> Removal of MVCC WAL record types
> 
>
> Key: IGNITE-22322
> URL: https://issues.apache.org/jira/browse/IGNITE-22322
> Project: Ignite
>  Issue Type: Sub-task
>  Components: mvcc
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: ise
>
> Record types to remove:
> * {{RecordType.MVCC_DATA_PAGE_MARK_UPDATED_RECORD}}
> * {{RecordType.MVCC_DATA_PAGE_NEW_TX_STATE_HINT_UPDATED_RECORD}}
> * {{RecordType.MVCC_DATA_PAGE_TX_STATE_HINT_UPDATED_RECORD}}
> * {{RecordType.MVCC_DATA_RECORD}}
> * {{RecordType.MVCC_TX_RECORD}}
> Also test for possible compatibility issues.





[jira] [Updated] (IGNITE-22321) MVCC: cleanup tests

2024-05-24 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-22321:
---
Description: Cleanup test code, added in MVCC commits  (was: Cleanup code, 
added in MVCC commits)

> MVCC: cleanup tests
> ---
>
> Key: IGNITE-22321
> URL: https://issues.apache.org/jira/browse/IGNITE-22321
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ilya Shishkov
>Assignee: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>
> Cleanup test code, added in MVCC commits





[jira] [Updated] (IGNITE-21835) MVCC removal: final cleanup

2024-05-24 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-21835:
---
Summary: MVCC removal: final cleanup  (was: MVCC: final cleanup)

> MVCC removal: final cleanup
> ---
>
> Key: IGNITE-21835
> URL: https://issues.apache.org/jira/browse/IGNITE-21835
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Julia Bakulina
>Assignee: Ilya Shishkov
>Priority: Major
>  Labels: ise
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Final cleanup of MVCC code.





[jira] [Updated] (IGNITE-22148) BinaryTupleFormatException for UUID

2024-05-24 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-22148:
--
Fix Version/s: 3.0.0-beta2

> BinaryTupleFormatException for UUID
> ---
>
> Key: IGNITE-22148
> URL: https://issues.apache.org/jira/browse/IGNITE-22148
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Iurii Gerzhedovich
>Assignee: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Simple example to reproduce the issue:
> {code:java}
> sql("create table tdd(a uuid default gen_random_uuid, b int, primary key (a) 
> )"); 
> sql("insert into tdd(b) values(22)");{code}
> we got
> {code:java}
> Caused by: org.apache.ignite.internal.binarytuple.BinaryTupleFormatException: 
> IGN-CMN-65535 TraceId:5dfdd34c-6722-41ad-85f3-13aa0c483454 Invalid length for 
> a tuple element: 36
>     at 
> org.apache.ignite.internal.binarytuple.BinaryTupleParser.uuidValue(BinaryTupleParser.java:377)
>     at 
> org.apache.ignite.internal.binarytuple.BinaryTupleReader.uuidValue(BinaryTupleReader.java:305)
>     at 
> org.apache.ignite.internal.sql.engine.util.Commons.readValue(Commons.java:487)
>     at 
> org.apache.ignite.internal.sql.engine.exec.SqlOutputBinaryRow.newRow(SqlOutputBinaryRow.java:85)
>     at 
> org.apache.ignite.internal.sql.engine.exec.TableRowConverterImpl.toBinaryRow(TableRowConverterImpl.java:83)
>     at 
> org.apache.ignite.internal.sql.engine.exec.UpdatableTableImpl.insert(UpdatableTableImpl.java:187)
>     at 
> org.apache.ignite.internal.sql.engine.prepare.KeyValueModifyPlan.lambda$execute$1(KeyValueModifyPlan.java:133)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1187)
>  {code}
> Let's add validation for `create table` at least for defaults
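
As a side note on the message above: the reported length of 36 happens to match the canonical text form of a UUID, while the two longs that make up a `java.util.UUID` occupy 16 bytes. The report does not state that this is the cause, so the sketch below only illustrates the two sizes; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidSizeSketch {
    // Packs a UUID into its 16-byte binary form (most significant long first).
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Length of the canonical "8-4-4-4-12" text form.
    static int textLength(UUID id) {
        return id.toString().length();
    }
}
```

A validation step at `create table` time, as the issue suggests, would reject a default whose type does not match the column before any row is written.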





[jira] [Updated] (IGNITE-22148) BinaryTupleFormatException for UUID

2024-05-24 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-22148:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> BinaryTupleFormatException for UUID
> ---
>
> Key: IGNITE-22148
> URL: https://issues.apache.org/jira/browse/IGNITE-22148
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Iurii Gerzhedovich
>Assignee: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Simple example to reproduce the issue:
> {code:java}
> sql("create table tdd(a uuid default gen_random_uuid, b int, primary key (a) 
> )"); 
> sql("insert into tdd(b) values(22)");{code}
> we got
> {code:java}
> Caused by: org.apache.ignite.internal.binarytuple.BinaryTupleFormatException: 
> IGN-CMN-65535 TraceId:5dfdd34c-6722-41ad-85f3-13aa0c483454 Invalid length for 
> a tuple element: 36
>     at 
> org.apache.ignite.internal.binarytuple.BinaryTupleParser.uuidValue(BinaryTupleParser.java:377)
>     at 
> org.apache.ignite.internal.binarytuple.BinaryTupleReader.uuidValue(BinaryTupleReader.java:305)
>     at 
> org.apache.ignite.internal.sql.engine.util.Commons.readValue(Commons.java:487)
>     at 
> org.apache.ignite.internal.sql.engine.exec.SqlOutputBinaryRow.newRow(SqlOutputBinaryRow.java:85)
>     at 
> org.apache.ignite.internal.sql.engine.exec.TableRowConverterImpl.toBinaryRow(TableRowConverterImpl.java:83)
>     at 
> org.apache.ignite.internal.sql.engine.exec.UpdatableTableImpl.insert(UpdatableTableImpl.java:187)
>     at 
> org.apache.ignite.internal.sql.engine.prepare.KeyValueModifyPlan.lambda$execute$1(KeyValueModifyPlan.java:133)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1187)
>  {code}
> Let's add validation for `create table` at least for defaults





[jira] [Updated] (IGNITE-21676) Sql. Move system view definitions to a separate package of a catalog module

2024-05-24 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-21676:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Sql. Move system view definitions to a separate package of a catalog module
> ---
>
> Key: IGNITE-21676
> URL: https://issues.apache.org/jira/browse/IGNITE-21676
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Maksim Zhuravkov
>Assignee: Maksim Zhuravkov
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Let's move code related to system view definitions to a separate package, so 
> CatalogManagerImpl
> only contains an implementation of the SystemViewProvider interface and/or a 
> list of "imports" of system view definitions.





[jira] [Updated] (IGNITE-21676) Sql. Move system view definitions to a separate package of a catalog module

2024-05-24 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-21676:
--
Fix Version/s: 3.0.0-beta2

> Sql. Move system view definitions to a separate package of a catalog module
> ---
>
> Key: IGNITE-21676
> URL: https://issues.apache.org/jira/browse/IGNITE-21676
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Maksim Zhuravkov
>Assignee: Maksim Zhuravkov
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Let's move code related to system view definitions to a separate package, so 
> CatalogManagerImpl
> only contains an implementation of the SystemViewProvider interface and/or a 
> list of "imports" of system view definitions.





[jira] [Updated] (IGNITE-22324) The exception "The primary replica has changed" on creation of 1000 tables

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22324:
--
Affects Version/s: 3.0.0-beta2
   (was: 3.0.0-beta1)

> The exception "The primary replica has changed" on creation of 1000 tables
> --
>
> Key: IGNITE-22324
> URL: https://issues.apache.org/jira/browse/IGNITE-22324
> Project: Ignite
>  Issue Type: Bug
>  Components: general, persistence
>Affects Versions: 3.0.0-beta2
>Reporter: Igor
>Priority: Major
>  Labels: ignite-3
>
> *Steps to reproduce:*
> 1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m"
> 2. Create 1000 tables with 200 varchar columns each  and insert 1 row into 
> each. One by one.
> *Expected result:*
> Tables are created.
> *Actual result:*
> On table 949 the exception is thrown:
> {code:java}
> java.sql.SQLException: Primary replica has expired, transaction will be 
> rolled back: [groupId = 1850_part_21, expected enlistment consistency token = 
> 112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 
> 03:13:16:057 +, logical=396, composite=112069207395991948], current 
> primary replica = null]
>   at 
> org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
>   at 
> org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)
>   at 
> org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765)
>   at 
> org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173)
>   at 
> org.gridgain.ai3tests.tests.TablesAmountCapacityTest.lambda$insertRowAndAssertTimeout$1(TablesAmountCapacityTest.java:166)
>   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834) {code}
> In server logs there is an exception:
> {code:java}
> 2024-03-10 03:13:24:222 + 
> [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-8][TxManagerImpl]
>  Failed to finish Tx. The operation will be retried 
> [txId=018e2659-b09f-009c-23c0-6ab50001].
> java.util.concurrent.CompletionException: 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed 
> out [replicaGrpId=1850_part_21]
>     at 
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704)
>     at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at 
> org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplica$0(ReplicaService.java:110)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed 
> out [replicaGrpId=1850_part_21]
>     ... 4 more
> 2024-03-10 03:13:24:290 + 
> [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-22][TrackableNetworkMessageHandler]
>  Message handling has been too long [duration=67ms, message=[class 
> org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
> 2024-03-10 03:13:24:290 + 
> [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-11][TrackableNetworkMessageHandler]
>  Message handling has been too long [duration=67ms, message=[class 
> org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
> 2024-03-10 03:13:24:290 + 
> [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-19][TrackableNetworkMessageHandler]
>  Message handling has been too long [duration=67ms, message=[class 
> org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
> 2024-03-10 03:13:24:290 + 
> [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-17][TrackableNetworkMessageHandler]
>  Message handling has b

[jira] [Created] (IGNITE-22324) The exception "The primary replica has changed" on creation of 1000 tables

2024-05-24 Thread Igor (Jira)
Igor created IGNITE-22324:
-

 Summary: The exception "The primary replica has changed" on 
creation of 1000 tables
 Key: IGNITE-22324
 URL: https://issues.apache.org/jira/browse/IGNITE-22324
 Project: Ignite
  Issue Type: Bug
  Components: general, persistence
Affects Versions: 3.0.0-beta1
Reporter: Igor


*Steps to reproduce:*

1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m"

2. Create 1000 tables with 200 varchar columns each  and insert 1 row into 
each. One by one.
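
Step 2 above can be sketched as a small DDL generator. This is only an illustration under assumptions: the table and column names, the extra `id` primary-key column, and the class itself are hypothetical and not taken from the test referenced in the stack trace. Each generated statement would be executed one at a time over a JDBC connection, followed by a single-row insert.

```java
import java.util.ArrayList;
import java.util.List;

final class TableDdlSketch {
    // Builds one CREATE TABLE statement with an INT key and the requested number of VARCHAR columns.
    static String createTableSql(String tableName, int varcharColumns) {
        StringBuilder sql = new StringBuilder("CREATE TABLE ")
                .append(tableName)
                .append(" (id INT PRIMARY KEY");
        for (int i = 0; i < varcharColumns; i++) {
            sql.append(", c").append(i).append(" VARCHAR");
        }
        return sql.append(')').toString();
    }

    // Builds the statements for tables t0..t(n-1), in creation order.
    static List<String> allCreateStatements(int tables, int varcharColumns) {
        List<String> statements = new ArrayList<>(tables);
        for (int t = 0; t < tables; t++) {
            statements.add(createTableSql("t" + t, varcharColumns));
        }
        return statements;
    }
}
```

For the scenario in this report the call would be `TableDdlSketch.allCreateStatements(1000, 200)`.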

*Expected result:*
Tables are created.

*Actual result:*

On table 949 the exception is thrown:
{code:java}
java.sql.SQLException: Primary replica has expired, transaction will be rolled 
back: [groupId = 1850_part_21, expected enlistment consistency token = 
112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 
03:13:16:057 +, logical=396, composite=112069207395991948], current primary 
replica = null]
  at 
org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
  at 
org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)
  at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765)
  at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173)
  at 
org.gridgain.ai3tests.tests.TablesAmountCapacityTest.lambda$insertRowAndAssertTimeout$1(TablesAmountCapacityTest.java:166)
  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  at java.base/java.lang.Thread.run(Thread.java:834) {code}
In server logs there is an exception:
{code:java}
2024-03-10 03:13:24:222 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-8][TxManagerImpl]
 Failed to finish Tx. The operation will be retried 
[txId=018e2659-b09f-009c-23c0-6ab50001].
java.util.concurrent.CompletionException: 
org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out 
[replicaGrpId=1850_part_21]
    at 
java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at 
java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at 
java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704)
    at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at 
org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplica$0(ReplicaService.java:110)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: 
org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out 
[replicaGrpId=1850_part_21]
    ... 4 more
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-22][TrackableNetworkMessageHandler]
 Message handling has been too long [duration=67ms, message=[class 
org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-11][TrackableNetworkMessageHandler]
 Message handling has been too long [duration=67ms, message=[class 
org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-19][TrackableNetworkMessageHandler]
 Message handling has been too long [duration=67ms, message=[class 
org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-17][TrackableNetworkMessageHandler]
 Message handling has been too long [duration=67ms, message=[class 
org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-23][TrackableNetworkMessageHandler]
 Message handling has been too long [duration=67ms, message=[class 
org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
2024-03-10 03:13:24:290 + 
[WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-6][TrackableNetworkMessage

[jira] [Assigned] (IGNITE-22288) ItTxResourcesVacuumTest may fail with NPE

2024-05-24 Thread Denis Chudov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov reassigned IGNITE-22288:
-

Assignee: Denis Chudov

> ItTxResourcesVacuumTest may fail with NPE
> -
>
> Key: IGNITE-22288
> URL: https://issues.apache.org/jira/browse/IGNITE-22288
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> [2024-05-20T10:02:21,361][INFO ][Test worker][ItTxResourcesVacuumTest] >>> 
> Stopping test: 
> ItTxResourcesVacuumTest#testCommitPartitionPrimaryChangesBeforeVacuum, 
> displayName: testCommitPartitionPrimaryChangesBeforeVacuum(), cost: 11523ms.  
> java.lang.NullPointerException  java.lang.NullPointerException
> at 
> org.apache.ignite.internal.table.ItTxResourcesVacuumTest.checkValueReadOnly(ItTxResourcesVacuumTest.java:803)
> at 
> org.apache.ignite.internal.table.ItTxResourcesVacuumTest.testCommitPartitionPrimaryChangesBeforeVacuum(ItTxResourcesVacuumTest.java:536)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566) {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/8127603?hideProblemsFromDependencies=false&expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&expandCode+Inspection=true&expandBuildChangesSection=true&expandBuildTestsSection=true&expandBuildProblemsSection=true]
> The failure is caused by a bug in vacuum. The persistent tx state is vacuumed 
> before cleanup is completed, which causes tx recovery on an RO tx with a 
> rollback outcome for an RW tx that was committed, and the abort of the write 
> intent.
> Scenario: the tx resource TTL is over, so we check the tx state meta (see
> VolatileTxStateMetaStorage#vacuum() ). There is a commitPartId (because it's 
> the commit partition), and we add the tx id to the collection that should be 
> passed to the persistent vacuumizer. The persistent vacuumizer erases the 
> persistent state only if we are on the commit partition primary replica, but 
> it doesn't check the presence of cleanupCompletionTimestamp, as it should. As 
> a result, it erases the persistent state before cleanupCompletionTimestamp 
> appears.
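
The missing check described above could look roughly like the sketch below. The `TxMeta` class and both methods are hypothetical simplifications (only the field names `commitPartId` and `cleanupCompletionTimestamp` come from the report), and the primary-replica check is left out; this is not the actual Ignite 3 code or the actual fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class VacuumGuardSketch {
    // Hypothetical, simplified view of the volatile tx state.
    static final class TxMeta {
        final UUID txId;
        final Integer commitPartId;            // null when this node is not the commit partition
        final Long cleanupCompletionTimestamp; // null until cleanup has completed

        TxMeta(UUID txId, Integer commitPartId, Long cleanupCompletionTimestamp) {
            this.txId = txId;
            this.commitPartId = commitPartId;
            this.cleanupCompletionTimestamp = cleanupCompletionTimestamp;
        }
    }

    // Persistent state may be erased only after cleanup has completed.
    static boolean eligibleForPersistentVacuum(TxMeta meta) {
        return meta.commitPartId != null && meta.cleanupCompletionTimestamp != null;
    }

    // Collects the tx ids that may be handed to the persistent vacuumizer.
    static List<UUID> selectForPersistentVacuum(List<TxMeta> metas) {
        List<UUID> ids = new ArrayList<>();
        for (TxMeta meta : metas) {
            if (eligibleForPersistentVacuum(meta)) {
                ids.add(meta.txId);
            }
        }
        return ids;
    }
}
```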





[jira] [Updated] (IGNITE-22324) The exception "The primary replica has changed" on creation of 1000 tables

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22324:
--
Description: 
*Steps to reproduce:*

1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m"

2. Create 1000 tables with 200 varchar columns each  and insert 1 row into 
each. One by one.

*Expected result:*
Tables are created.

*Actual result:*

On table 850 the exception is thrown:
{code:java}
java.sql.SQLException: The primary replica has changed 
[expectedLeaseholderName=TablesAmountCapacityTest_cluster_0, 
currentLeaseholderName=null, 
expectedLeaseholderId=bf69f842-d6c8-4f7a-b7e4-96458a4d92cb, 
currentLeaseholderId=null, 
expectedEnlistmentConsistencyToken=112491691050598880, 
currentEnlistmentConsistencyToken=null]  at 
org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
  at 
org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)  
at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765)
  at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173)
  at 
org.gridgain.ai3tests.tests.amountcapacity.TablesAmountCapacityBaseTest.lambda$insertRowAndAssertTimeout$2(TablesAmountCapacityBaseTest.java:92)
  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)  at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  at java.base/java.lang.Thread.run(Thread.java:834){code}
In server logs there is an exception:
{code:java}
2024-05-23 17:57:19:570 + 
[WARNING][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable 
error during the request occurred (will be retried on the randomly selected 
node) [request=WriteActionRequestImpl [command=[0, 9, 41, -58, -128, -112, -21, 
-103, -45, -23, -57, 1], deserializedCommand=SafeTimeSyncCommandImpl 
[safeTimeLong=112491694408335429], groupId=3402_part_7], peer=Peer 
[consistentId=TablesAmountCapacityTest_cluster_0, idx=0], newPeer=Peer 
[consistentId=TablesAmountCapacityTest_cluster_0, idx=0]].
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
at 
java.base/java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
at 
java.base/java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.TimeoutException
... 7 more
2024-05-23 17:57:19:570 + 
[WARNING][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable 
error during the request occurred (will be retried on the randomly selected 
node) [request=WriteActionRequestImpl [command=[0, 9, 41, -106, -128, -108, 
-21, -103, -45, -23, -57, 1], deserializedCommand=SafeTimeSyncCommandImpl 
[safeTimeLong=112491694408400917], groupId=3402_part_21], peer=Peer 
[consistentId=TablesAmountCapacityTest_cluster_0, idx=0], newPeer=Peer 
[consistentId=TablesAmountCapacityTest_cluster_0, idx=0]].
java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
at 
java.base/java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
at 
java.base/java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 

[jira] [Created] (IGNITE-22325) BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be represented by BinaryTuple

2024-05-24 Thread Maksim Zhuravkov (Jira)
Maksim Zhuravkov created IGNITE-22325:
-

 Summary: BinaryTupleBuilder accepts LocalDate/LocalDateTime that 
can not be represented by BinaryTuple
 Key: IGNITE-22325
 URL: https://issues.apache.org/jira/browse/IGNITE-22325
 Project: Ignite
  Issue Type: Improvement
Reporter: Maksim Zhuravkov
 Fix For: 3.0.0-beta2


For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year that can be stored in 14 bit + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(200, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+200-01-01
Actual   :1152-01-01
{code}

It should be possible to store such values in BinaryTuple format.
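
The wrap-around can be illustrated with a sketch of what happens when a year is squeezed into 14 value bits plus a sign bit. This is an assumption-based illustration, not the BinaryTuple implementation: the class and method names are hypothetical, and the input 2,000,000 is an assumed value, chosen because its low 15 bits equal the 1152 seen in the failure output.

```java
final class YearWrapSketch {
    // Keeps the low 15 bits of the year and sign-extends them,
    // which is what storing the year in 14 value bits + 1 sign bit amounts to.
    static int wrapTo15Bits(int year) {
        return (year << 17) >> 17;
    }

    // True when the year survives the 15-bit round trip unchanged.
    static boolean fitsIn15Bits(int year) {
        return year >= -(1 << 14) && year < (1 << 14);
    }
}
```

Under these assumptions `YearWrapSketch.wrapTo15Bits(2_000_000)` yields 1152, and a builder-side check equivalent to `fitsIn15Bits` would let such a value be rejected up front instead of being silently altered.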







[jira] [Updated] (IGNITE-22325) BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be represented by BinaryTuple

2024-05-24 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-22325:
--
Description: 
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year as 14 value bits + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(200, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+200-01-01
Actual   :1152-01-01
{code}

It should be possible to store such values in BinaryTuple format.



  was:
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year that can be stored in 14 bit + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(2000000, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+2000000-01-01
Actual   :1152-01-01
{code}

It should be possible to store such values in BinaryTuple format.




> BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be 
> represented by BinaryTuple
> -
>
> Key: IGNITE-22325
> URL: https://issues.apache.org/jira/browse/IGNITE-22325
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> For both LocalDate and LocalDateTime BinaryTuple format 
> (https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
>  can only store year as 14 value bits + 1 sign bit.
> {code:java}
> @Test
> public void dateTest2() {
>     LocalDate value = LocalDate.of(2000000, 1, 1);
>     BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
>     ByteBuffer bytes = builder.appendDate(value).build();
>     assertEquals(3, bytes.get(1));
>     assertEquals(5, bytes.limit());
>     BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
>     assertEquals(value, reader.dateValue(0));
> }
> org.opentest4j.AssertionFailedError: 
> Expected :+2000000-01-01
> Actual   :1152-01-01
> {code}
> It should be possible to store such values in BinaryTuple format.





[jira] [Updated] (IGNITE-22325) BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be represented by BinaryTuple

2024-05-24 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-22325:
--
Description: 
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year as 14 value bits + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(2000000, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+2000000-01-01
Actual   :1152-01-01
{code}

It should be possible to store values that do not lie within a range defined by 
BinaryTuple format.



  was:
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year as 14 value bits + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(2000000, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+2000000-01-01
Actual   :1152-01-01
{code}

It should be possible to store values that do not lie within the year range 
defined by BinaryTuple format.




> BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be 
> represented by BinaryTuple
> -
>
> Key: IGNITE-22325
> URL: https://issues.apache.org/jira/browse/IGNITE-22325
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> For both LocalDate and LocalDateTime BinaryTuple format 
> (https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
>  can only store year as 14 value bits + 1 sign bit.
> {code:java}
> @Test
> public void dateTest2() {
>     LocalDate value = LocalDate.of(2000000, 1, 1);
>     BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
>     ByteBuffer bytes = builder.appendDate(value).build();
>     assertEquals(3, bytes.get(1));
>     assertEquals(5, bytes.limit());
>     BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
>     assertEquals(value, reader.dateValue(0));
> }
> org.opentest4j.AssertionFailedError: 
> Expected :+2000000-01-01
> Actual   :1152-01-01
> {code}
> It should be possible to store values that do not lie within a range defined 
> by BinaryTuple format.





[jira] [Updated] (IGNITE-22325) BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be represented by BinaryTuple

2024-05-24 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-22325:
--
Description: 
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year as 14 value bits + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(2000000, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+2000000-01-01
Actual   :1152-01-01
{code}

It should be possible to store values that do not lie within the year range 
defined by BinaryTuple format.



  was:
For both LocalDate and LocalDateTime BinaryTuple format 
(https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
 can only store year as 14 value bits + 1 sign bit.

{code:java}
@Test
public void dateTest2() {
    LocalDate value = LocalDate.of(2000000, 1, 1);

    BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
    ByteBuffer bytes = builder.appendDate(value).build();
    assertEquals(3, bytes.get(1));
    assertEquals(5, bytes.limit());

    BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
    assertEquals(value, reader.dateValue(0));
}

org.opentest4j.AssertionFailedError: 
Expected :+2000000-01-01
Actual   :1152-01-01
{code}

It should be possible to store such values in BinaryTuple format.




> BinaryTupleBuilder accepts LocalDate/LocalDateTime that can not be 
> represented by BinaryTuple
> -
>
> Key: IGNITE-22325
> URL: https://issues.apache.org/jira/browse/IGNITE-22325
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> For both LocalDate and LocalDateTime BinaryTuple format 
> (https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach)
>  can only store year as 14 value bits + 1 sign bit.
> {code:java}
> @Test
> public void dateTest2() {
>     LocalDate value = LocalDate.of(2000000, 1, 1);
>     BinaryTupleBuilder builder = new BinaryTupleBuilder(1);
>     ByteBuffer bytes = builder.appendDate(value).build();
>     assertEquals(3, bytes.get(1));
>     assertEquals(5, bytes.limit());
>     BinaryTupleReader reader = new BinaryTupleReader(1, bytes);
>     assertEquals(value, reader.dateValue(0));
> }
> org.opentest4j.AssertionFailedError: 
> Expected :+2000000-01-01
> Actual   :1152-01-01
> {code}
> It should be possible to store values that do not lie within the year range 
> defined by BinaryTuple format.





[jira] [Created] (IGNITE-22326) KeyValue/RecordView. Update marshaller code to restrict range of values legal for year field of LocalDate/LocalDateTime types

2024-05-24 Thread Maksim Zhuravkov (Jira)
Maksim Zhuravkov created IGNITE-22326:
-

 Summary: KeyValue/RecordView. Update marshaller code to restrict 
range of values legal for year field of LocalDate/LocalDateTime types
 Key: IGNITE-22326
 URL: https://issues.apache.org/jira/browse/IGNITE-22326
 Project: Ignite
  Issue Type: Improvement
Reporter: Maksim Zhuravkov


The date-handling logic of KeyValue/RecordView should be consistent with SQL 
and allow only values that are legal for both APIs.

Let's update the marshaller code to reject dates that cannot be stored via the 
SQL API.
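
A range check of the kind proposed here could look roughly like the sketch below. It is an illustration only, not the Ignite marshaller: the class name, method name and the bounds passed to the constructor are hypothetical, and the actual range accepted by the SQL API is not stated in this ticket.

```java
import java.time.LocalDate;

// Hypothetical sketch of a year-range check; not part of the Ignite code base.
class DateYearValidator {
    private final int minYear;
    private final int maxYear;

    /** The bounds are assumptions supplied by the caller; the ticket does not specify them. */
    DateYearValidator(int minYear, int maxYear) {
        if (minYear > maxYear) {
            throw new IllegalArgumentException("minYear must not exceed maxYear");
        }
        this.minYear = minYear;
        this.maxYear = maxYear;
    }

    /** Returns the value unchanged if its year is within bounds, otherwise throws. */
    LocalDate requireLegalYear(LocalDate value) {
        int year = value.getYear();
        if (year < minYear || year > maxYear) {
            throw new IllegalArgumentException(
                    "Year out of range [" + minYear + ", " + maxYear + "]: " + year);
        }
        return value;
    }

    public static void main(String[] args) {
        // 1..9999 is an assumed example range, chosen only for the demonstration.
        DateYearValidator validator = new DateYearValidator(1, 9999);
        System.out.println(validator.requireLegalYear(LocalDate.of(2024, 5, 24)));
        try {
            validator.requireLegalYear(LocalDate.of(2_000_000, 1, 1));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the value up front keeps both APIs from storing a date that the other one cannot read back.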





[jira] [Created] (IGNITE-22327) Error "StateMachine meet critical error" on restart

2024-05-24 Thread Igor (Jira)
Igor created IGNITE-22327:
-

 Summary:  Error "StateMachine meet critical error" on restart
 Key: IGNITE-22327
 URL: https://issues.apache.org/jira/browse/IGNITE-22327
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 3.0.0-beta2
 Environment: 2 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *1 
host*
cpuCount=10
memorySizeMb=15360
Reporter: Igor


*Steps to reproduce:*
 # Start cluster of 2 nodes on single host.
 # Create 5 tables and insert 1000 rows into each.
 # Kill 1 server.
 # Start the killed server.
 # Check logs for errors.

*Expected:*

No errors in logs.

*Actual:*
Errors in logs
{code:java}
2024-05-17 04:26:37:808 + [ERROR][%ClusterFailover2NodesTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog] A critical thread is blocked for 688 ms that is more than the allowed 500 ms, it is "ClusterFailover2NodesTest_cluster_0-srv-worker-3" prio=10 Id=41 RUNNABLE
    at app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:25)
    at app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:11)
    at app//org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:136)
    at app//io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base@17.0.6/java.lang.Thread.run(Thread.java:833)
{code}
GC calls of node ClusterFailover2NodesTest_cluster_0 (LOG: [^ignite3db-0.log])
!image-2024-05-17-18-06-23-594.png!
GC calls of node ClusterFailover2NodesTest_cluster_1 (LOG: [^ignite3db-0.log])
!image-2024-05-17-18-06-04-081.png!





[jira] [Updated] (IGNITE-22327) Error "StateMachine meet critical error" on restart

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22327:
--
Environment: 3 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *3 
hosts*  (was: 3 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *3 host*)

>  Error "StateMachine meet critical error" on restart
> 
>
> Key: IGNITE-22327
> URL: https://issues.apache.org/jira/browse/IGNITE-22327
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 3.0.0-beta2
> Environment: 3 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *3 
> hosts*
>Reporter: Igor
>Priority: Major
>  Labels: ignite-3
>
> *Steps to reproduce:*
>  # Start cluster of 2 nodes on single host.
>  # Create 5 tables and insert 1000 rows into each.
>  # Kill 1 server.
>  # Start the killed server.
>  # Check logs for errors.
> *Expected:*
> No errors in logs.
> *Actual:*
> Errors in logs
> {code:java}
> 2024-05-17 04:26:37:808 + 
> [ERROR][%ClusterFailover2NodesTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog]
>  A critical thread is blocked for 688 ms that is more than the allowed 500 
> ms, it is "ClusterFailover2NodesTest_cluster_0-srv-worker-3" prio=10 Id=41 
> RUNNABLE
>     at 
> app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:25)
>     at 
> app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:11)
>     at 
> app//org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:136)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>     at 
> app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>     at 
> app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>     at 
> app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
>     at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
>     at 
> app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>     at 
> app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at 
> app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base@17.0.6/java.lang.Thread.run(Thread.java:833) {code}
> GC calls of node ClusterFailover2NodesTest_cluster_0 (LOG: [^ignite3db-0.log])
> !image-2024-05-17-18-06-23-594.png! GC calls of node 
> ClusterFailover2NodesTest_cluster_1 (LOG: [^ignite3db-0.log])
> !image-2024-05-17-18-06-04-081.png!





[jira] [Updated] (IGNITE-22327) Error "StateMachine meet critical error" on restart

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22327:
--
Environment: 3 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *3 host* 
 (was: 2 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *1 host*
cpuCount=10
memorySizeMb=15360)

>  Error "StateMachine meet critical error" on restart
> 
>
> Key: IGNITE-22327
> URL: https://issues.apache.org/jira/browse/IGNITE-22327
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 3.0.0-beta2
> Environment: 3 nodes (with arguments "-Xms4096m", "-Xmx4096m" ) on *3 
> host*
>Reporter: Igor
>Priority: Major
>  Labels: ignite-3
>
> *Steps to reproduce:*
>  # Start cluster of 2 nodes on single host.
>  # Create 5 tables and insert 1000 rows into each.
>  # Kill 1 server.
>  # Start the killed server.
>  # Check logs for errors.
> *Expected:*
> No errors in logs.
> *Actual:*
> Errors in logs
> {code:java}
> 2024-05-17 04:26:37:808 + 
> [ERROR][%ClusterFailover2NodesTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog]
>  A critical thread is blocked for 688 ms that is more than the allowed 500 
> ms, it is "ClusterFailover2NodesTest_cluster_0-srv-worker-3" prio=10 Id=41 
> RUNNABLE
>     at 
> app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:25)
>     at 
> app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:11)
>     at 
> app//org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:136)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
>     at 
> app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>     at 
> app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
>     at 
> app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>     at 
> app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>     at 
> app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
>     at 
> app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
>     at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
>     at 
> app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
>     at 
> app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>     at 
> app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base@17.0.6/java.lang.Thread.run(Thread.java:833) {code}
> GC calls of node ClusterFailover2NodesTest_cluster_0 (LOG: [^ignite3db-0.log])
> !image-2024-05-17-18-06-23-594.png! GC calls of node 
> ClusterFailover2NodesTest_cluster_1 (LOG: [^ignite3db-0.log])
> !image-2024-05-17-18-06-04-081.png!





[jira] [Updated] (IGNITE-22327) Error "StateMachine meet critical error" on restart

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22327:
--
Description: 
*Steps to reproduce:*
 # Start cluster of 3 nodes on 3 hosts.
 # Create 10 tables and insert 10 rows into each.
 # Kill 1 server.
 # Start the killed server.
 # Check logs for errors.

*Expected:*

No errors in logs.

*Actual:*
Errors in logs
{code:java}
2024-05-17 04:26:37:808 + [ERROR][%ClusterFailover2NodesTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog] A critical thread is blocked for 688 ms that is more than the allowed 500 ms, it is "ClusterFailover2NodesTest_cluster_0-srv-worker-3" prio=10 Id=41 RUNNABLE
    at app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:25)
    at app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:11)
    at app//org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:136)
    at app//io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base@17.0.6/java.lang.Thread.run(Thread.java:833)
{code}
GC calls of node ClusterFailover2NodesTest_cluster_0 (LOG: [^ignite3db-0.log])
!image-2024-05-17-18-06-23-594.png!
GC calls of node ClusterFailover2NodesTest_cluster_1 (LOG: [^ignite3db-0.log])
!image-2024-05-17-18-06-04-081.png!

  was:
*Steps to reproduce:*
 # Start cluster of 2 nodes on single host.
 # Create 5 tables and insert 1000 rows into each.
 # Kill 1 server.
 # Start the killed server.
 # Check logs for errors.

*Expected:*

No errors in logs.

*Actual:*
Errors in logs
{code:java}
2024-05-17 04:26:37:808 + 
[ERROR][%ClusterFailover2NodesTest_cluster_0%common-scheduler-0][CriticalWorkerWatchdog]
 A critical thread is blocked for 688 ms that is more than the allowed 500 ms, 
it is "ClusterFailover2NodesTest_cluster_0-srv-worker-3" prio=10 Id=41 RUNNABLE
    at 
app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:25)
    at 
app//org.apache.ignite.internal.network.message.InvokeResponseDeserializer.getMessage(InvokeResponseDeserializer.java:11)
    at 
app//org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:136)
    at 
app//io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
    at 
app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
    at 
app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at 
app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at 
app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at 
app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at 
app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at 
app//io.netty.channel.AbstractC

[jira] [Updated] (IGNITE-22327) Error "StateMachine meet critical error" on restart

2024-05-24 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-22327:
--
Description: 
*Steps to reproduce:*
 # Start cluster of 3 nodes on 3 hosts.
 # Create 10 tables and insert 10 rows into each.
 # Kill 1 server.
 # Start the killed server.
 # Check logs for errors.

*Expected:*

No errors in logs.

*Actual:*
Errors in logs
{code:java}
2024-05-23 21:09:52:473 +0300 [ERROR][%ClusterFailover3NodesTest_cluster_0%JRaft-FSMCaller-Disruptor_stripe_3-0][StateMachineAdapter] Encountered an error=Status[ESTATEMACHINE<10002>: StateMachine meet critical error when applying one or more tasks since index=2, Status[ESTATEMACHINE<10002>: No serializer provider defined for group type 40 and message type 8]] on StateMachine org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.
Error [type=ERROR_TYPE_STATE_MACHINE, status=Status[ESTATEMACHINE<10002>: StateMachine meet critical error when applying one or more tasks since index=2, Status[ESTATEMACHINE<10002>: No serializer provider defined for group type 40 and message type 8]]]
    at org.apache.ignite.raft.jraft.core.IteratorImpl.getOrCreateError(IteratorImpl.java:156)
    at org.apache.ignite.raft.jraft.core.IteratorImpl.setErrorAndRollback(IteratorImpl.java:147)
    at org.apache.ignite.raft.jraft.core.IteratorWrapper.setErrorAndRollback(IteratorWrapper.java:72)
    at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:803)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:557)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:525)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:444)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:340)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:278)
    at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
    at java.base/java.lang.Thread.run(Thread.java:842)
2024-05-23 21:09:52:473 +0300 [WARNING][%ClusterFailover3NodesTest_cluster_0%JRaft-FSMCaller-Disruptor_stripe_3-0][NodeImpl] Node <18_part_19/ClusterFailover3NodesTest_cluster_0> got error: Error [type=ERROR_TYPE_STATE_MACHINE, status=Status[ESTATEMACHINE<10002>: StateMachine meet critical error when applying one or more tasks since index=2, Status[ESTATEMACHINE<10002>: No serializer provider defined for group type 40 and message type 8]]].
2024-05-23 21:09:52:473 +0300 [WARNING][%ClusterFailover3NodesTest_cluster_0%JRaft-FSMCaller-Disruptor_stripe_3-0][FSMCallerImpl] FSMCaller already in error status, ignore new error
Error [type=ERROR_TYPE_STATE_MACHINE, status=Status[ESTATEMACHINE<10002>: StateMachine meet critical error when applying one or more tasks since index=2, Status[ESTATEMACHINE<10002>: No serializer provider defined for group type 40 and message type 8]]]
    at org.apache.ignite.raft.jraft.core.IteratorImpl.getOrCreateError(IteratorImpl.java:156)
    at org.apache.ignite.raft.jraft.core.IteratorImpl.setErrorAndRollback(IteratorImpl.java:147)
    at org.apache.ignite.raft.jraft.core.IteratorWrapper.setErrorAndRollback(IteratorWrapper.java:72)
    at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:803)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:557)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:525)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:444)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:340)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:278)
    at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
    at com.lmax.disruptor.BatchEventProcess

[jira] [Created] (IGNITE-22328) Improve test coverage for SQL planner optimization for JOIN

2024-05-24 Thread Iurii Gerzhedovich (Jira)
Iurii Gerzhedovich created IGNITE-22328:
---

 Summary: Improve test coverage for SQL planner optimization for 
JOIN
 Key: IGNITE-22328
 URL: https://issues.apache.org/jira/browse/IGNITE-22328
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Iurii Gerzhedovich


During the implementation of the planner optimization for JOIN in SQL 
(IGNITE-18749), a set of tests was added. However, that set is insufficient.
Let's add the following test scenarios:
h2. Performance Testing
 # Check timings at the boundary, i.e. when joining MAX_SIZE_OF_JOIN_TO_OPTIMIZE 
tables versus MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1 tables. Take into account that 
this approach applies only to the planning *phase* of the execution engine, so 
a statement with N joins (over tables with no data rows) should take about the 
same time as an equivalent query with N+1 joins.
 # Check that there is no significant difference between queries involving N 
and N+1 tables. We need to store (somehow) the performance results of such 
checks from build to build.

h2. Functional and End-to-End Testing:

pre requisites:

Current implementation uses MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5; constant as a 
threshold for disabling JOIN COMMUTE rules, thus all sql statements need to 
have joins with more than 5 tables\sources.

 
 # All possible joins INNER, OUTER(LEFT, RIGHT), NATURAL, SELF need to be 
checked.
 # Not only tables can be used as a sources for JOIN operations, but 
subqueries(with and without table sources), *system_range* function and system 
views.
 # Due to non optimal plans for some statements are raised - fill distributed 
table (more than 3 nodes) with data step by step (10k 100k and so on rows) and 
run all from p1.
 # Mutate statements to change sequence of join order, i.e. for: ON T1.custId = 
T2.custId also need to be checked: ON T2.custId = T1.custId. Self check: you 
need to obtain two different plans (explain plan for sql statement):

 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T1]]

...

  Ignite.*Scan(table=[[PUBLIC, T2]]

 
 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T2]]

...

  Ignite.*Scan(table=[[PUBLIC, T1]]

 
 #  Check over > 1000 tables

 

Functional tests are successful if no timeouts or other exceptions appear in 
the log and all statements (up to 50 different tables\sources) pass.

Example of SELF join:

SELECT _column_name(s)_

FROM _table1 T1, table1 T2_

WHERE {_}condition{_};

 

Example of join with subqueries:

SELECT t1.a, t2.b from t1, (SELECT 1 as b) as t2 where t1.a=t2.b

SELECT t1.a, t2.b from t1, (SELECT b as b from integers1 where b>1) as t2 where 
t1.a=t2.b
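
The boundary and join-order checks above could be driven by generated statements. Below is a minimal sketch in Java, assuming hypothetical tables T1..Tn that all have a custId column. The JoinStatements class and its method are illustrative only and are not part of the Ignite code base; the constant is redeclared here with the value quoted in this ticket.

```java
/** Hypothetical helper (not part of Ignite): builds N-way JOIN statements for planner tests. */
final class JoinStatements {
    /** Threshold value quoted in the ticket; redeclared here only for illustration. */
    static final int MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5;

    /**
     * Builds a SELECT over the assumed tables T1..Tn, each joined to the previous one on custId.
     *
     * @param tables number of tables to join, at least 2
     * @param swapOperands when true, every ON condition is written as Tn.custId = Tn-1.custId
     */
    static String nWayJoin(int tables, boolean swapOperands) {
        if (tables < 2) {
            throw new IllegalArgumentException("At least two tables are required");
        }
        StringBuilder sql = new StringBuilder("SELECT T1.custId FROM T1");
        for (int i = 2; i <= tables; i++) {
            String left = "T" + (i - 1) + ".custId";
            String right = "T" + i + ".custId";
            sql.append(" JOIN T").append(i).append(" ON ")
                    .append(swapOperands ? right : left)
                    .append(" = ")
                    .append(swapOperands ? left : right);
        }
        return sql.toString();
    }
}
```

For example, nWayJoin(MAX_SIZE_OF_JOIN_TO_OPTIMIZE, false) and nWayJoin(MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1, false) give the two statements whose planning time is to be compared, and the swapped variant of each gives the mutated statement for the plan comparison.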



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22328) Improve test coverage for SQL planner optimization for JOIN

2024-05-24 Thread Iurii Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iurii Gerzhedovich updated IGNITE-22328:

Epic Link: IGNITE-20729

> Improve test coverage for SQL planner optimization for JOIN
> ---
>
> Key: IGNITE-22328
> URL: https://issues.apache.org/jira/browse/IGNITE-22328
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Iurii Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>
> While implementing the planner optimization for JOIN in SQL (IGNITE-18749), 
> a set of tests was added. However, that set of tests is insufficient.
> Let's add the following test scenarios:
> h2. Performance Testing
>  # Check timings at the boundary, i.e. between MAX_SIZE_OF_JOIN_TO_OPTIMIZE 
> and MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1 joined tables. Take into account that 
> this approach applies only to the planning *phase* of the execution engine, 
> thus a statement with N joins (with no data rows in the tables) should take 
> about the same time as an equivalent query with N+1 joins.
>  # Check that there is no significant difference between N and N+1 involved 
> tables. We need to store (somehow) the performance results of such checks 
> from build to build.
> h2. Functional and End-to-End Testing:
> Prerequisites:
> The current implementation uses the constant MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5 
> as a threshold for disabling JOIN COMMUTE rules, thus all SQL statements need 
> to have joins with more than 5 tables\sources.
>  
>  # All possible joins need to be checked: INNER, OUTER (LEFT, RIGHT), NATURAL, 
> SELF.
>  # Not only tables can be used as sources for JOIN operations, but also 
> subqueries (with and without table sources), the *system_range* function and 
> system views.
>  # Because non-optimal plans are produced for some statements, fill a 
> distributed table (more than 3 nodes) with data step by step (10k, 100k and 
> so on rows) and run all from p1.
>  # Mutate statements to change the join order, i.e. for ON T1.custId = 
> T2.custId, also check ON T2.custId = T1.custId. Self check: you need to 
> obtain two different plans (explain plan for sql statement):
>  #   Ignite.*Join
>   ...
>   Ignite.*Scan(table=[[PUBLIC, T1]]
> ...
>   Ignite.*Scan(table=[[PUBLIC, T2]]
>  
>  #   Ignite.*Join
>   ...
>   Ignite.*Scan(table=[[PUBLIC, T2]]
> ...
>   Ignite.*Scan(table=[[PUBLIC, T1]]
>  
>  #  Check over > 1000 tables
>  
> Functional tests are successful if no timeouts or other exceptions appear in 
> the log and all statements (up to 50 different tables\sources) pass.
> Example of SELF join:
> SELECT _column_name(s)_
> FROM _table1 T1, table1 T2_
> WHERE {_}condition{_};
>  
> Example of join with subqueries:
> SELECT t1.a, t2.b from t1, (SELECT 1 as b) as t2 where t1.a=t2.b
> SELECT t1.a, t2.b from t1, (SELECT b as b from integers1 where b>1) as t2 
> where t1.a=t2.b





[jira] [Updated] (IGNITE-22328) Improve test coverage for SQL planner optimization for JOIN

2024-05-24 Thread Iurii Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iurii Gerzhedovich updated IGNITE-22328:

Description: 
While implementing the planner optimization for JOIN in SQL (IGNITE-18749), 
a set of tests was added. However, that set of tests is insufficient.
Let's add the following test scenarios:
h2. Performance Testing
 # Check timings at the boundary, i.e. between MAX_SIZE_OF_JOIN_TO_OPTIMIZE 
and MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1 joined tables. Take into account that 
this approach applies only to the planning *phase* of the execution engine, 
thus a statement with N joins (with no data rows in the tables) should take 
about the same time as an equivalent query with N+1 joins.
 # Check that there is no significant difference between N and N+1 involved 
tables. We need to store (somehow) the performance results of such checks 
from build to build.

h2. Functional and End-to-End Testing:

Prerequisites:

The current implementation uses the constant MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5 
as a threshold for disabling JOIN COMMUTE rules, thus all SQL statements need 
to have joins with more than 5 tables\sources.

 
 # All possible joins need to be checked: INNER, OUTER (LEFT, RIGHT), NATURAL, 
SELF.
 # Not only tables can be used as sources for JOIN operations, but also 
subqueries (with and without table sources), the *system_range* function and 
system views.
 # Because non-optimal plans are produced for some statements, fill a 
distributed table (more than 3 nodes) with data step by step (10k, 100k and 
so on rows) and run all from p1.
 # Mutate statements to change the join order, i.e. for ON T1.custId = 
T2.custId, also check ON T2.custId = T1.custId. Self check: you need to 
obtain two different plans (explain plan for sql statement):

 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T1]]

...

  Ignite.*Scan(table=[[PUBLIC, T2]]

 
 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T2]]

...

  Ignite.*Scan(table=[[PUBLIC, T1]]

 
 # Check over > 1000 tables

 

Functional tests are successful if no timeouts or other exceptions appear in 
the log and all statements (up to 50 different tables\sources) pass.

Example of SELF join:

SELECT _column_name(s)_

FROM _table1 T1, table1 T2_

WHERE {_}condition{_};

 

Example of join with subqueries:

SELECT t1.a, t2.b from t1, (SELECT 1 as b) as t2 where t1.a=t2.b

SELECT t1.a, t2.b from t1, (SELECT b as b from integers1 where b>1) as t2 where 
t1.a=t2.b

  was:
During implementation  planner optimization for JOIN in SQL ( IGNITE-18749 ) 
were added set of tests. However added set of tests is insufficient.
Let's add the following test scenarios:
h2. Performance Testing
 # Check bound intersection timings, i.e. between  MAX_SIZE_OF_JOIN_TO_OPTIMIZE 
up to MAX_SIZE_OF_JOIN_TO_OPTIMIZE + 1 tables joining. Take into account that 
this approach is applies only to the planning *phase* of execution engine thus 
statement for N joins (with empty data rows in table) will need to consume 
equal time in comparison with an equal query but with N+1 joins instead.
 # Check there is no sufficient difference between involved N and N+1 tables. 
We need to store (somehow) performance results of such a checks from build to 
build

h2. Functional and End-to-End Testing:

pre requisites:

Current implementation uses MAX_SIZE_OF_JOIN_TO_OPTIMIZE = 5; constant as a 
threshold for disabling JOIN COMMUTE rules, thus all sql statements need to 
have joins with more than 5 tables\sources.

 
 # All possible joins INNER, OUTER(LEFT, RIGHT), NATURAL, SELF need to be 
checked.
 # Not only tables can be used as a sources for JOIN operations, but 
subqueries(with and without table sources), *system_range* function and system 
views.
 # Due to non optimal plans for some statements are raised - fill distributed 
table (more than 3 nodes) with data step by step (10k 100k and so on rows) and 
run all from p1.
 # Mutate statements to change sequence of join order, i.e. for: ON T1.custId = 
T2.custId also need to be checked: ON T2.custId = T1.custId. Self check: you 
need to obtain two different plans (explain plan for sql statement):

 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T1]]

...

  Ignite.*Scan(table=[[PUBLIC, T2]]

 
 #   Ignite.*Join

  ...

  Ignite.*Scan(table=[[PUBLIC, T2]]

...

  Ignite.*Scan(table=[[PUBLIC, T1]]

 
 #  Check over > 1000 tables

 

Functional tests complete successfully if no timeout or any other exceptions 
are defined in a log and all statements (up to 50 different tables\sources) are 
passed.





Example of SELF join:

SELECT _column_name(s)_

FROM _table1 T1, table1 T2_

WHERE {_}condition{_};

 

Example of join with subqueries:

SELECT t1.a, t2.b from t1, (SELECT 1 as b) as t2 where t1.a=t2.b

SELECT t1.a, t2.b from t1, (SELECT b as b

[jira] [Updated] (IGNITE-22279) Missed comments in PR IGNITE-22130

2024-05-24 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22279:
---
Description: 
https://github.com/apache/ignite-3/pull/3704
h3. Motivation
Unfortunately, we merged the PR before all the comments were published. Apart 
from minor formalities, there is one main note:
{code}
if (retryExecutor != null && matchAny(unwrapCause(errResp.throwable()), ACQUIRE_LOCK_ERR, REPLICA_MISS_ERR)) {
    retryExecutor.schedule(
            // Need to resubmit again to pool which is valid for synchronous IO execution.
            () -> partitionOperationsExecutor.execute(() -> res.completeExceptionally(errResp.throwable())),
            RETRY_TIMEOUT_MILLIS, MILLISECONDS);
}
{code}
In the snippet, we postpone the response by 20 milliseconds so that the request 
is retried once the previous circumstances have changed. But here we 
intentionally delay handling of the request, which could be a performance 
issue. Moreover, the delay does not have to apply to the replica miss exception 
(this exception is handled on the client side using the placement driver API).

h3. Definition of done

Exclude the handler for the replica miss exception and handle this exception 
on the client side.

Add the logic for calculating the retry count here, and do not delay the 
response when no retry is expected.

Move the timeout (it cannot be a constant) to a configuration property in the 
same place where the deadlock prevention configuration is stored.

Look at all the other comments in the PR 
(https://github.com/apache/ignite-3/pull/3704)
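
The "count retries, and do not delay when no retry is expected" item could look roughly like the following. This is only a sketch under stated assumptions: LockRetryPolicy, its constructor parameters and delayFor are hypothetical names that do not exist in Ignite, and error codes are modelled as plain strings. The set of retriable errors, the retry limit and the delay are assumed to come from configuration rather than from constants.

```java
import java.util.OptionalLong;
import java.util.Set;

/** Hypothetical sketch of a bounded, configurable retry decision; not the Ignite implementation. */
final class LockRetryPolicy {
    private final Set<String> retriableErrors;
    private final int maxRetries;
    private final long retryDelayMillis;

    /** All three values are assumed to be read from configuration. */
    LockRetryPolicy(Set<String> retriableErrors, int maxRetries, long retryDelayMillis) {
        this.retriableErrors = Set.copyOf(retriableErrors);
        this.maxRetries = maxRetries;
        this.retryDelayMillis = retryDelayMillis;
    }

    /**
     * Returns the delay to apply before completing the error response, or an empty value
     * when the response must be completed immediately because no retry is expected.
     *
     * @param errorCode error code of the failed request
     * @param attemptsSoFar number of retries already made for this request
     */
    OptionalLong delayFor(String errorCode, int attemptsSoFar) {
        boolean retryExpected = retriableErrors.contains(errorCode) && attemptsSoFar < maxRetries;
        return retryExpected ? OptionalLong.of(retryDelayMillis) : OptionalLong.empty();
    }
}
```

With a policy built from Set.of("ACQUIRE_LOCK_ERR"), the replica miss error is never delayed, which matches the first item of the definition of done. The caller would schedule the completion only when delayFor returns a value and complete the response directly otherwise.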

  was:
https://github.com/apache/ignite-3/pull/3704
h3. Motivation
{code}
if (retryExecutor != null && matchAny(unwrapCause(errResp.throwable()), ACQUIRE_LOCK_ERR, REPLICA_MISS_ERR)) {
    retryExecutor.schedule(
            // Need to resubmit again to pool which is valid for synchronous IO execution.
            () -> partitionOperationsExecutor.execute(() -> res.completeExceptionally(errResp.throwable())),
            RETRY_TIMEOUT_MILLIS, MILLISECONDS);
}
{code}


> Missed comments in PR IGNITE-22130
> --
>
> Key: IGNITE-22279
> URL: https://issues.apache.org/jira/browse/IGNITE-22279
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> https://github.com/apache/ignite-3/pull/3704
> h3. Motivation
> Unfortunately, we merged the PR before all the comments were published. Apart 
> from minor formalities, there is one main note:
> {code}
> if (retryExecutor != null && matchAny(unwrapCause(errResp.throwable()), ACQUIRE_LOCK_ERR, REPLICA_MISS_ERR)) {
>     retryExecutor.schedule(
>             // Need to resubmit again to pool which is valid for synchronous IO execution.
>             () -> partitionOperationsExecutor.execute(() -> res.completeExceptionally(errResp.throwable())),
>             RETRY_TIMEOUT_MILLIS, MILLISECONDS);
> }
> {code}
> In the snippet, we postpone the response by 20 milliseconds so that the 
> request is retried once the previous circumstances have changed. But here we 
> intentionally delay handling of the request, which could be a performance 
> issue. Moreover, the delay does not have to apply to the replica miss 
> exception (this exception is handled on the client side using the placement 
> driver API).
> h3. Definition of done
> Exclude the handler for the replica miss exception and handle this exception 
> on the client side.
> Add the logic for calculating the retry count here, and do not delay the 
> response when no retry is expected.
> Move the timeout (it cannot be a constant) to a configuration property in the 
> same place where the deadlock prevention configuration is stored.
> Look at all the other comments in the PR 
> (https://github.com/apache/ignite-3/pull/3704)





[jira] [Created] (IGNITE-22329) Check that cluster is operable after changing the deadlock prevention policy to "Timeout wait"

2024-05-24 Thread Denis Chudov (Jira)
Denis Chudov created IGNITE-22329:
-

 Summary: Check that cluster is operable after changing the 
deadlock prevention policy to "Timeout wait"
 Key: IGNITE-22329
 URL: https://issues.apache.org/jira/browse/IGNITE-22329
 Project: Ignite
  Issue Type: Task
Reporter: Denis Chudov


*Motivation* 

The "Timeout wait" deadlock prevention policy (see 
org.apache.ignite.internal.tx.TimeoutDeadlockPreventionTest) differs from 
WaitDie, which is the default. Before any benchmarking, we should check that 
changing the policy does not make the transaction engine inoperable (multiple 
TC failures besides the specific tx tests that check deadlock prevention).

*Definition of done*
 * change the deadlock prevention policy (the only way to do so is to hardcode it)
 * check the TeamCity tests after that





[jira] [Commented] (IGNITE-22286) Remove waitTimeout in deadlock prevention policy

2024-05-24 Thread Denis Chudov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849340#comment-17849340
 ] 

Denis Chudov commented on IGNITE-22286:
---

The benefit of the client-side timeout over the server-side timeout is not 
clear. We should do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.

> Remove waitTimeout in deadlock prevention policy
> 
>
> Key: IGNITE-22286
> URL: https://issues.apache.org/jira/browse/IGNITE-22286
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> After IGNITE-21540 and IGNITE-20127 we now have proper retries on client side.
> This means we no longer need 
> org.apache.ignite.internal.tx.DeadlockPreventionPolicy#waitTimeout as a part 
> of deadlock prevention policy.
> Moreover, client-side retries have a benefit in the following scenario (having 
> in mind WAIT_DIE prevention):
>  # tx1 takes lock at timestamp 10
>  # tx2 tries to take lock at timestamp 20 and goes for retry (without holding 
> lock)
>  # tx1 lock is released
>  # tx3 takes lock at timestamp 30
>  # tx3 lock is released
>  # tx2 attempts to lock after retry and succeeds
> Without retry (without holding locks) on step 2 tx3 would retry too on step 4.
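
For reference, the textbook wait-die rule that the scenario above relies on can be sketched as follows. This is a generic illustration, not the Ignite lock manager: the WaitDie class, the Decision enum and onLockRequest are hypothetical names, and timestamps are assumed to be unique per transaction, with a smaller timestamp meaning an older transaction.

```java
import java.util.OptionalLong;

/** Generic wait-die decision rule; a textbook sketch, not the Ignite implementation. */
final class WaitDie {
    enum Decision { GRANT, WAIT, DIE }

    /**
     * Decides what a requesting transaction does when it asks for a lock.
     *
     * @param requesterTs timestamp of the requesting transaction (smaller means older)
     * @param holderTs timestamp of the current lock holder, or empty if the lock is free
     */
    static Decision onLockRequest(long requesterTs, OptionalLong holderTs) {
        if (!holderTs.isPresent()) {
            return Decision.GRANT;
        }
        // An older requester is allowed to wait for a younger holder;
        // a younger requester gives up (dies) and retries later.
        return requesterTs < holderTs.getAsLong() ? Decision.WAIT : Decision.DIE;
    }
}
```

In the scenario above, tx2 (timestamp 20) requesting the lock held by tx1 (timestamp 10) gets DIE and goes for a retry, while tx3 (timestamp 30) finds the lock free in step 4 and gets GRANT.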





[jira] [Comment Edited] (IGNITE-22286) Remove waitTimeout in deadlock prevention policy

2024-05-24 Thread Denis Chudov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849340#comment-17849340
 ] 

Denis Chudov edited comment on IGNITE-22286 at 5/24/24 4:12 PM:


The profit of the client-side timeout is not clear in comparison to server-side 
timeout. With server-side waiting the locks can be acquired just after the 
concurrent lock is released, on the client we should wait for a fixed amount of 
time. We should do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.


was (Author: denis chudov):
The profit of the client-side timeout is not clear in comparison to server-side 
timeout. With server side locks can be acquired just after the concurrent lock 
is released, on the client we should wait for a fixed amount of time. We should 
do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.

> Remove waitTimeout in deadlock prevention policy
> 
>
> Key: IGNITE-22286
> URL: https://issues.apache.org/jira/browse/IGNITE-22286
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> After IGNITE-21540 and IGNITE-20127 we now have proper retries on client side.
> This means we no longer need 
> org.apache.ignite.internal.tx.DeadlockPreventionPolicy#waitTimeout as a part 
> of deadlock prevention policy.
> Moreover, client-side retries have a benefit in the following scenario (having 
> in mind WAIT_DIE prevention):
>  # tx1 takes lock at timestamp 10
>  # tx2 tries to take lock at timestamp 20 and goes for retry (without holding 
> lock)
>  # tx1 lock is released
>  # tx3 takes lock at timestamp 30
>  # tx3 lock is released
>  # tx2 attempts to lock after retry and succeeds
> Without retry (without holding locks) on step 2 tx3 would retry too on step 4.





[jira] [Comment Edited] (IGNITE-22286) Remove waitTimeout in deadlock prevention policy

2024-05-24 Thread Denis Chudov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849340#comment-17849340
 ] 

Denis Chudov edited comment on IGNITE-22286 at 5/24/24 4:12 PM:


The benefit of the client-side timeout over the server-side timeout is not 
clear. With server-side waiting, a lock can be acquired right after the 
concurrent lock is released, whereas on the client we have to wait for a fixed 
amount of time. We should do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.


was (Author: denis chudov):
The profit of the client-side timeout is not clear in comparison to server-side 
timeout. With server-side waiting the locks can be acquired just after the 
concurrent lock is released, on the client we should wait for a fixed amount of 
time. We should do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.

> Remove waitTimeout in deadlock prevention policy
> 
>
> Key: IGNITE-22286
> URL: https://issues.apache.org/jira/browse/IGNITE-22286
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> After IGNITE-21540 and IGNITE-20127 we now have proper retries on client side.
> This means we no longer need 
> org.apache.ignite.internal.tx.DeadlockPreventionPolicy#waitTimeout as a part 
> of deadlock prevention policy.
> Moreover, client-side retries have a benefit in the following scenario (having 
> in mind WAIT_DIE prevention):
>  # tx1 takes lock at timestamp 10
>  # tx2 tries to take lock at timestamp 20 and goes for retry (without holding 
> lock)
>  # tx1 lock is released
>  # tx3 takes lock at timestamp 30
>  # tx3 lock is released
>  # tx2 attempts to lock after retry and succeeds
> Without retry (without holding locks) on step 2 tx3 would retry too on step 4.





[jira] [Comment Edited] (IGNITE-22286) Remove waitTimeout in deadlock prevention policy

2024-05-24 Thread Denis Chudov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849340#comment-17849340
 ] 

Denis Chudov edited comment on IGNITE-22286 at 5/24/24 4:12 PM:


The profit of the client-side timeout is not clear in comparison to server-side 
timeout. With server side locks can be acquired just after the concurrent lock 
is released, on the client we should wait for a fixed amount of time. We should 
do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.


was (Author: denis chudov):
The profit of the client-side timeout is not clear in comparison to server-side 
timeout. We should do some benchmarks first.

I created IGNITE-22329 to check that another (timeout) deadlock prevention 
policy would be operable.

> Remove waitTimeout in deadlock prevention policy
> 
>
> Key: IGNITE-22286
> URL: https://issues.apache.org/jira/browse/IGNITE-22286
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Alexey Scherbakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0
>
>
> After IGNITE-21540 and IGNITE-20127 we now have proper retries on client side.
> This means we no longer need 
> org.apache.ignite.internal.tx.DeadlockPreventionPolicy#waitTimeout as a part 
> of deadlock prevention policy.
> Moreover, client-side retries have a benefit in the following scenario (having 
> in mind WAIT_DIE prevention):
>  # tx1 takes lock at timestamp 10
>  # tx2 tries to take lock at timestamp 20 and goes for retry (without holding 
> lock)
>  # tx1 lock is released
>  # tx3 takes lock at timestamp 30
>  # tx3 lock is released
>  # tx2 attempts to lock after retry and succeeds
> Without retry (without holding locks) on step 2 tx3 would retry too on step 4.





[jira] [Created] (IGNITE-22330) ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is flacky

2024-05-24 Thread Mikhail Efremov (Jira)
Mikhail Efremov created IGNITE-22330:


 Summary: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flacky
 Key: IGNITE-22330
 URL: https://issues.apache.org/jira/browse/IGNITE-22330
 Project: Ignite
  Issue Type: Bug
Reporter: Mikhail Efremov
 Attachments: image-2024-05-25-01-56-21-327.png

This test fails at least on main {{4c6662}} with a strange failure rate 
(failures come in series): !image-2024-05-25-01-56-21-327.png!

The common issue is {{{}TimeoutException{}}}:

 
{code:java}
java.lang.AssertionError: java.util.concurrent.TimeoutException
    at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
    at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
    at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest.java:306)
    at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.testManualRebalanceIfMajorityIsLost(ItDisasterRecoveryReconfigurationTest.java:209)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
    at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
    at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
Caused by: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
    at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
    ... 32 more

java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
    at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
    at org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
    at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest

[jira] [Updated] (IGNITE-22330) ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is flaky

2024-05-24 Thread Mikhail Efremov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Efremov updated IGNITE-22330:
-
Summary: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flaky  (was: 
ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
flacky)

> ItDisasterRecoveryReconfigurationTest#testManualRebalanceIfMajorityIsLost is 
> flaky
> --
>
> Key: IGNITE-22330
> URL: https://issues.apache.org/jira/browse/IGNITE-22330
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mikhail Efremov
>Priority: Major
>  Labels: ignite-3
> Attachments: image-2024-05-25-01-56-21-327.png
>
>
> This test fails at least on main {{4c6662}} with a strange failure rate 
> (failures come in series): !image-2024-05-25-01-56-21-327.png!
> The common issue is {{{}TimeoutException{}}}:
>  
> {code:java}
> java.lang.AssertionError: java.util.concurrent.TimeoutException
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>     at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
>     at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
>     at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
>     at 
> org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.awaitPrimaryReplica(ItDisasterRecoveryReconfigurationTest.java:306)
>     at 
> org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.testManualRebalanceIfMajorityIsLost(ItDisasterRecoveryReconfigurationTest.java:209)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>     at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
>     at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
>     at 
> java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
>     at 
> java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699)
>     at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>     at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>     at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>     at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
>     at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>     at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>     at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>     at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>     at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> Caused by: java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>     at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
>     at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
>     ... 32 more
> java.util.concurrent.TimeoutException
>     at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>     at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2