[jira] [Commented] (IGNITE-18875) Sql. Drop AbstractPlannerTest.TestTable.

2023-04-06 Thread Gael Yimen Yimga (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709542#comment-17709542
 ] 

Gael Yimen Yimga commented on IGNITE-18875:
---

[~zstan] Could you please take a look at the PR [1] again? I have made some 
progress.

[1] [https://github.com/apache/ignite-3/pull/1873]

 

> Sql. Drop AbstractPlannerTest.TestTable.
> 
>
> Key: IGNITE-18875
> URL: https://issues.apache.org/jira/browse/IGNITE-18875
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Andrey Mashenkov
>Assignee: Gael Yimen Yimga
>Priority: Major
>  Labels: ignite-3, newbie, tech-debt-test
> Fix For: 3.0.0-beta2
>
> Attachments: Screen Shot 2023-04-03 at 1.04.39 AM.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Use test framework for schema configuration in tests.
> Replace 
> {code:java}
> org.apache.ignite.internal.sql.engine.planner.AbstractPlannerTest.TestTable
> {code}
> with 
> {code:java}
> org.apache.ignite.internal.sql.engine.framework.TestTable
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19252) The incremental snapshot restore operation fails if there is a node not from the baseline.

2023-04-06 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-19252:
-
Ignite Flags:   (was: Release Notes Required)

> The incremental snapshot restore operation fails if there is a node not from 
> the baseline.
> --
>
> Key: IGNITE-19252
> URL: https://issues.apache.org/jira/browse/IGNITE-19252
> Project: Ignite
>  Issue Type: Bug
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Labels: ise
> Fix For: 2.15
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The incremental snapshot restore operation fails if there is a node not from 
> the baseline:
> {noformat}
> 21:20:40.324 [disco-notifier-worker-#147%server-1%] ERROR 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess
>  - Failed to restore snapshot cache groups 
> [reqId=55eead09-4da7-4232-8e98-976dba117d91].
> org.apache.ignite.IgniteCheckedException: Snapshot metafile cannot be read 
> due to it doesn't exist: 
> /work/snapshots/snp1/increments/0001/server_3.smf
>   at 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.readFromFile(IgniteSnapshotManager.java:2001)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.readIncrementalSnapshotMetadata(IgniteSnapshotManager.java:1098)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.IncrementalSnapshotProcessor.process(IncrementalSnapshotProcessor.java:94)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.restoreIncrementalSnapshot(SnapshotRestoreProcess.java:1466)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.lambda$incrementalSnapshotRestore$35(SnapshotRestoreProcess.java:1417)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_201]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_201]
>   at 
> org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51)
>  ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_201]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_201]
>   at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201]
> {noformat}





[jira] [Created] (IGNITE-19252) The incremental snapshot restore operation fails if there is a node not from the baseline.

2023-04-06 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-19252:


 Summary: The incremental snapshot restore operation fails if there 
is a node not from the baseline.
 Key: IGNITE-19252
 URL: https://issues.apache.org/jira/browse/IGNITE-19252
 Project: Ignite
  Issue Type: Bug
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev
 Fix For: 2.15


The incremental snapshot restore operation fails if there is a node not from 
the baseline:


{noformat}
21:20:40.324 [disco-notifier-worker-#147%server-1%] ERROR 
org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess
 - Failed to restore snapshot cache groups 
[reqId=55eead09-4da7-4232-8e98-976dba117d91].
org.apache.ignite.IgniteCheckedException: Snapshot metafile cannot be read due 
to it doesn't exist: 
/work/snapshots/snp1/increments/0001/server_3.smf
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.readFromFile(IgniteSnapshotManager.java:2001)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.readIncrementalSnapshotMetadata(IgniteSnapshotManager.java:1098)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.IncrementalSnapshotProcessor.process(IncrementalSnapshotProcessor.java:94)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.restoreIncrementalSnapshot(SnapshotRestoreProcess.java:1466)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreProcess.lambda$incrementalSnapshotRestore$35(SnapshotRestoreProcess.java:1417)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_201]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_201]
at 
org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51)
 ~[ignite-core-15.0.0-SNAPSHOT.jar:15.0.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_201]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201]
{noformat}
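The stack trace above shows the restore process trying to read an incremental snapshot metafile for a node that never wrote one. A minimal, self-contained sketch of that failure mode and the fix idea (all names here are hypothetical, not Ignite's actual API): only nodes in the baseline topology produce `.smf` files, so the metafile scan must be restricted to baseline nodes rather than all cluster nodes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustrative sketch: the restore process builds a metafile path per node,
// but only baseline nodes actually wrote an .smf file. Filtering by the
// baseline (or by file existence) avoids reading a path that was never created.
public class MetafileSketch {
    public static void main(String[] args) throws IOException {
        Path incrementDir = Files.createTempDirectory("increment");
        List<String> allNodes = List.of("server_1", "server_2", "server_3");
        List<String> baseline = List.of("server_1", "server_2");

        // Only baseline nodes produce incremental snapshot metafiles.
        for (String node : baseline) {
            Files.createFile(incrementDir.resolve(node + ".smf"));
        }

        // Buggy behavior would iterate allNodes and fail on server_3.smf.
        // Fixed behavior: restrict the scan to baseline nodes.
        for (String node : baseline) {
            Path metafile = incrementDir.resolve(node + ".smf");
            if (!Files.exists(metafile)) {
                throw new AssertionError(metafile + " is missing");
            }
        }
        System.out.println("read " + baseline.size() + " metafiles");
    }
}
```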






[jira] [Updated] (IGNITE-19238) ItDataTypesTest and ItCreateTableDdlTest are flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Summary: ItDataTypesTest and ItCreateTableDdlTest are flaky  (was: 
ItDataTypesTest and is flaky)

> ItDataTypesTest and ItCreateTableDdlTest are flaky
> --
>
> Key: IGNITE-19238
> URL: https://issues.apache.org/jira/browse/IGNITE-19238
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Attachments: Снимок экрана от 2023-04-06 10-39-32.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Description & Root cause
> 1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests 
> failed to stop replicas on node stop:
> !Снимок экрана от 2023-04-06 10-39-32.png!
> {code:java}
> java.lang.AssertionError: There are replicas alive 
> [replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
>     at 
> org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
>     at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
> 2. The reason why we failed to stop replicas is the race between 
> tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.
> On TableManager stop, we stop and cleanup all table resources like replicas 
> and raft nodes
> {code:java}
> public void stop() {
>   ...
>   Map tables = tablesByIdVv.latest();  // 1*
>   cleanUpTablesResources(tables); 
>   cleanUpTablesResources(tablesToStopInCaseOfError);
>   ...
> }{code}
> where tablesToStopInCaseOfError is a sort of pending-tables list that is 
> cleared on every configuration storage revision update.
> tablesByIdVv *listens to the same storage revision update event* in order to 
> publish the tables related to the given revision, i.e. to make them 
> accessible via tablesByIdVv.latest(), which is used to retrieve the tables 
> for cleanup on component stop (see // 1* above):
> {code:java}
> public TableManager(
>   ... 
>   tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);
>   registry.accept(token -> {
> tablesToStopInCaseOfError.clear();
> 
> return completedFuture(null);
>   });
>   {code}
> However, inside IncrementalVersionedValue the storage revision update is 
> processed asynchronously:
> {code:java}
> updaterFuture = updaterFuture.whenComplete((v, t) -> 
> versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
> As a result, tablesToStopInCaseOfError may be cleared before the tables of 
> the same revision are published to tablesByIdVv, so those tables are missing 
> from tablesByIdVv.latest(), which is used in TableManager#stop.
> h3. Implementation Notes
> 1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
> because the tables there aren't stopped only in case of error.
> 2. I've also reworked the tablesToStopInCaseOfError cleanup by replacing the 
> tablesToStopInCaseOfError.clear() call on revision change with
> {code:java}
> tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock,  
> ()-> {  
>   pendingTables.remove(tblId);
> })); {code}
> meaning that we
> 2.1. remove the specific table by id instead of clearing the whole 
> collection, and
> 2.2. perform that removal when the corresponding table is published within 
> tablesByIdVv.
> 3. This means that, for a moment right after publishing but before removal, 
> the same table can be present both in tablesByIdVv and in pendingTables. To 
> avoid stopping the same table twice (which would be safe anyway, since the 
> cleanup is idempotent), I've replaced
> {code:java}
> cleanUpTablesResources(tables);
> cleanUpTablesResources(tablesToStopInCaseOfError); {code}
> with
> {code:java}
> Map tablesToStop = 
> Stream.concat(tablesByIdVv.latest().entrySet().stream(), 
> pendingTables.entrySet().stream()).
> collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (v1, 
> v2) -> v1));
> cleanUpTablesResources(tablesToStop); {code}





[jira] [Updated] (IGNITE-19238) ItDataTypesTest and is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Summary: ItDataTypesTest and is flaky  (was: ItDataTypesTest is flaky)

> ItDataTypesTest and is flaky
> 
>
> Key: IGNITE-19238
> URL: https://issues.apache.org/jira/browse/IGNITE-19238
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Attachments: Снимок экрана от 2023-04-06 10-39-32.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Description & Root cause
> 1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests 
> failed to stop replicas on node stop:
> !Снимок экрана от 2023-04-06 10-39-32.png!
> {code:java}
> java.lang.AssertionError: There are replicas alive 
> [replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
>     at 
> org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
>     at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
> 2. The reason why we failed to stop replicas is the race between 
> tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.
> On TableManager stop, we stop and cleanup all table resources like replicas 
> and raft nodes
> {code:java}
> public void stop() {
>   ...
>   Map tables = tablesByIdVv.latest();  // 1*
>   cleanUpTablesResources(tables); 
>   cleanUpTablesResources(tablesToStopInCaseOfError);
>   ...
> }{code}
> where tablesToStopInCaseOfError is a sort of pending-tables list that is 
> cleared on every configuration storage revision update.
> tablesByIdVv *listens to the same storage revision update event* in order to 
> publish the tables related to the given revision, i.e. to make them 
> accessible via tablesByIdVv.latest(), which is used to retrieve the tables 
> for cleanup on component stop (see // 1* above):
> {code:java}
> public TableManager(
>   ... 
>   tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);
>   registry.accept(token -> {
> tablesToStopInCaseOfError.clear();
> 
> return completedFuture(null);
>   });
>   {code}
> However, inside IncrementalVersionedValue the storage revision update is 
> processed asynchronously:
> {code:java}
> updaterFuture = updaterFuture.whenComplete((v, t) -> 
> versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
> As a result, tablesToStopInCaseOfError may be cleared before the tables of 
> the same revision are published to tablesByIdVv, so those tables are missing 
> from tablesByIdVv.latest(), which is used in TableManager#stop.
> h3. Implementation Notes
> 1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
> because the tables there aren't stopped only in case of error.
> 2. I've also reworked the tablesToStopInCaseOfError cleanup by replacing the 
> tablesToStopInCaseOfError.clear() call on revision change with
> {code:java}
> tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock,  
> ()-> {  
>   pendingTables.remove(tblId);
> })); {code}
> meaning that we
> 2.1. remove the specific table by id instead of clearing the whole 
> collection, and
> 2.2. perform that removal when the corresponding table is published within 
> tablesByIdVv.
> 3. This means that, for a moment right after publishing but before removal, 
> the same table can be present both in tablesByIdVv and in pendingTables. To 
> avoid stopping the same table twice (which would be safe anyway, since the 
> cleanup is idempotent), I've replaced
> {code:java}
> cleanUpTablesResources(tables);
> cleanUpTablesResources(tablesToStopInCaseOfError); {code}
> with
> {code:java}
> Map tablesToStop = 
> Stream.concat(tablesByIdVv.latest().entrySet().stream(), 
> pendingTables.entrySet().stream()).
> collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (v1, 
> v2) -> v1));
> cleanUpTablesResources(tablesToStop); {code}





[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list that is 
cleared on every configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to 
publish the tables related to the given revision, i.e. to make them accessible 
via tablesByIdVv.latest(), which is used to retrieve the tables for cleanup on 
component stop (see // 1* above):
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, tablesToStopInCaseOfError may be cleared before the tables of the 
same revision are published to tablesByIdVv, so those tables are missing from 
tablesByIdVv.latest(), which is used in TableManager#stop.
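The ordering hazard described above can be reproduced in isolation. The sketch below (hypothetical names, not Ignite's actual classes) shows how a listener that clears the pending set synchronously can run before an async whenComplete callback publishes the same revision's tables, losing them for cleanup:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the race: publication happens in an async whenComplete
// callback, while the revision listener clears pendingTables immediately.
public class RevisionRace {
    static final Map<Integer, String> pendingTables = new ConcurrentHashMap<>();
    static final Map<Integer, String> published = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        pendingTables.put(1, "table-1");

        // Publishing is deferred: it only runs when this future completes.
        CompletableFuture<Void> updaterFuture = new CompletableFuture<>();
        updaterFuture.whenComplete((v, t) -> published.putAll(pendingTables));

        // The revision-update listener runs first and wipes the pending set...
        pendingTables.clear();

        // ...so when the async publication finally fires, the table is lost.
        updaterFuture.complete(null);

        if (!published.isEmpty()) {
            throw new AssertionError("expected table-1 to be lost");
        }
        System.out.println("published=" + published); // prints published={}
    }
}
```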
h3. Implementation Notes

1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
because the tables there aren't stopped only in case of error.

2. I've also reworked the tablesToStopInCaseOfError cleanup by replacing the 
tablesToStopInCaseOfError.clear() call on revision change with
{code:java}
tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock,  
()-> {  
  pendingTables.remove(tblId);
})); {code}
meaning that we

2.1. remove the specific table by id instead of clearing the whole collection, 
and

2.2. perform that removal when the corresponding table is published within 
tablesByIdVv.

3. This means that, for a moment right after publishing but before removal, 
the same table can be present both in tablesByIdVv and in pendingTables. To 
avoid stopping the same table twice (which would be safe anyway, since the 
cleanup is idempotent), I've replaced
{code:java}
cleanUpTablesResources(tables);
cleanUpTablesResources(tablesToStopInCaseOfError); {code}
with
{code:java}
Map tablesToStop = 
Stream.concat(tablesByIdVv.latest().entrySet().stream(), 
pendingTables.entrySet().stream()).
collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (v1, 
v2) -> v1));

cleanUpTablesResources(tablesToStop); {code}
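The deduplicating merge above can be sketched as a standalone snippet: entries that appear both in the published map and in pendingTables collide on the same key, and the toMap merge function keeps a single copy, so cleanup never processes one table twice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Standalone sketch of the merge: duplicate table ids collapse to one entry.
public class MergeSketch {
    public static void main(String[] args) {
        Map<Integer, String> publishedTables = new HashMap<>();
        publishedTables.put(1, "t1");
        publishedTables.put(2, "t2");

        Map<Integer, String> pendingTables = new HashMap<>();
        pendingTables.put(2, "t2");   // also published: duplicate key
        pendingTables.put(3, "t3");   // only pending

        Map<Integer, String> tablesToStop = Stream.concat(
                        publishedTables.entrySet().stream(),
                        pendingTables.entrySet().stream())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (v1, v2) -> v1)); // on key collision, keep one copy

        if (tablesToStop.size() != 3) {
            throw new AssertionError("expected 3 distinct tables");
        }
        System.out.println(tablesToStop); // each table appears exactly once
    }
}
```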


[jira] [Updated] (IGNITE-19231) Change thread pool for metastore raft group

2023-04-06 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-19231:
-
Reviewer: Roman Puchkovskiy

> Change thread pool for metastore raft group
> ---
>
> Key: IGNITE-19231
> URL: https://issues.apache.org/jira/browse/IGNITE-19231
> Project: Ignite
>  Issue Type: Bug
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It was discovered that a common thread pool is used for both the metastorage 
> raft group and the partition raft groups, which can lead to deadlocks. The 
> metastorage needs its own thread pool.
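The deadlock avoided by the fix can be illustrated with plain executors (this is a sketch of the idea, not Ignite's actual raft API): if metastorage and partition work share one single-thread pool, a metastorage task that blocks on a partition task never completes; with a dedicated pool per group, it does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Sketch: separate pools let a metastorage task safely wait on partition work.
// With a single shared single-thread pool, the same wait would deadlock.
public class SeparatePools {
    public static void main(String[] args) throws Exception {
        ExecutorService metastorePool = Executors.newSingleThreadExecutor();
        ExecutorService partitionPool = Executors.newSingleThreadExecutor();

        // Metastorage work that blocks on partition work.
        Future<String> result = metastorePool.submit(() -> {
            Future<String> partitionWork = partitionPool.submit(() -> "applied");
            // Safe: partition work runs on its own thread, not ours.
            return "metastore saw: " + partitionWork.get();
        });

        System.out.println(result.get(5, TimeUnit.SECONDS)); // metastore saw: applied
        metastorePool.shutdown();
        partitionPool.shutdown();
    }
}
```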





[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list that is 
cleared on every configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to 
publish the tables related to the given revision, i.e. to make them accessible 
via tablesByIdVv.latest(), which is used to retrieve the tables for cleanup on 
component stop (see // 1* above):
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, tablesToStopInCaseOfError may be cleared before the tables of the 
same revision are published to tablesByIdVv, so those tables are missing from 
tablesByIdVv.latest(), which is used in TableManager#stop.
h3. Implementation Notes

1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
because the tables there aren't stopped only in case of error.

2. I've also reworked the tablesToStopInCaseOfError cleanup by replacing the 
tablesToStopInCaseOfError.clear() call on revision change with
{code:java}
tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock,  
()-> {  
  pendingTables.remove(tblId);
})); {code}
meaning that we

2.1. remove the specific table by id instead of clearing the whole collection, 
and

2.2. perform that removal when the corresponding table is published within 
tablesByIdVv.

3. This means that, for a moment right after publishing but before removal, 
the same table can be present both in tablesByIdVv and in pendingTables. To 
avoid stopping the same table twice (which would be safe anyway, since the 
cleanup is idempotent), I've replaced
{code:java}
cleanUpTablesResources(tables);
cleanUpTablesResources(tablesToStopInCaseOfError); {code}
with
{code:java}
Stream tablesToStop =
Stream.concat(tablesByIdVv.latest().entrySet().stream(), 
pendingTables.entrySet().stream()).distinct().
map(Map.Entry::getValue);

cleanUpTablesResources(tablesToStop); {code}


[jira] [Updated] (IGNITE-19116) Sql. UPDATE statement fails with NPE when table does not exist

2023-04-06 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-19116:
--
Fix Version/s: 3.0.0-beta2

> Sql. UPDATE statement fails with NPE when table does not exist
> --
>
> Key: IGNITE-19116
> URL: https://issues.apache.org/jira/browse/IGNITE-19116
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0.0-beta2
>Reporter: Maksim Zhuravkov
>Assignee: Pavel Pereslegin
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> UPDATE statement fails with NPE when table does not exist.
> {code:java}
> @Test
> public void test() {
>sql("UPDATE unknown SET j = j + 1");
> }
> {code}
> Error:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.createSourceSelectForUpdate(IgniteSqlValidator.java:175)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1476)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.performUnconditionalRewrites(IgniteSqlValidator.java:383)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1046)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:759)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.validate(IgniteSqlValidator.java:135)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgnitePlanner.validate(IgnitePlanner.java:189)
> {code}
> *Expected behavior*
> It should throw an object-not-found error instead:
> {code:java}
> Object 'UNKNOWN' not found
> {code}
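The intended fix can be sketched with a resolve-then-validate step (all names below, such as resolveTable and the schema map, are illustrative, not Ignite's actual validator API): look up the target table before rewriting the UPDATE and fail fast with a descriptive error instead of dereferencing a null table descriptor.

```java
import java.util.Map;

// Hypothetical sketch: resolve the UPDATE target before rewriting it, and
// report "object not found" instead of hitting an NPE deeper in the planner.
public class UpdateValidationSketch {
    static final Map<String, Object> SCHEMA = Map.of("EMP", new Object());

    static Object resolveTable(String name) {
        Object table = SCHEMA.get(name.toUpperCase());
        if (table == null) {
            // Fail fast with a descriptive error rather than an NPE later.
            throw new IllegalArgumentException(
                    "Object '" + name.toUpperCase() + "' not found");
        }
        return table;
    }

    public static void main(String[] args) {
        try {
            resolveTable("unknown");
            throw new AssertionError("should have thrown");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Object 'UNKNOWN' not found
        }
    }
}
```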





[jira] [Updated] (IGNITE-19116) Sql. UPDATE statement fails with NPE when table does not exist

2023-04-06 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-19116:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Sql. UPDATE statement fails with NPE when table does not exist
> --
>
> Key: IGNITE-19116
> URL: https://issues.apache.org/jira/browse/IGNITE-19116
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0.0-beta2
>Reporter: Maksim Zhuravkov
>Assignee: Pavel Pereslegin
>Priority: Minor
>  Labels: ignite-3
>
> UPDATE statement fails with NPE when table does not exist.
> {code:java}
> @Test
> public void test() {
>sql("UPDATE unknown SET j = j + 1");
> }
> {code}
> Error:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.createSourceSelectForUpdate(IgniteSqlValidator.java:175)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1476)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.performUnconditionalRewrites(IgniteSqlValidator.java:383)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1046)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:759)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.validate(IgniteSqlValidator.java:135)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgnitePlanner.validate(IgnitePlanner.java:189)
> {code}
> *Expected behavior*
> It should throw an object-not-found error instead:
> {code:java}
> Object 'UNKNOWN' not found
> {code}





[jira] [Assigned] (IGNITE-18454) Explain threading model in corresponding README.md file for TableManager

2023-04-06 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-18454:


Assignee: Denis Chudov

> Explain threading model in corresponding README.md file for TableManager
> ---
>
> Key: IGNITE-18454
> URL: https://issues.apache.org/jira/browse/IGNITE-18454
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>
> Use ignite-3/modules/raft/README.md as a reference for the threading-model explanation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-18461) Add fish-like suggestions to CLI

2023-04-06 Thread Aleksandr (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-18461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709426#comment-17709426
 ] 

Aleksandr commented on IGNITE-18461:


Merged into main: c46a971ecbc4493fdcc30f1f95f349831d1da5be

> Add fish-like suggestions to CLI
> 
>
> Key: IGNITE-18461
> URL: https://issues.apache.org/jira/browse/IGNITE-18461
> Project: Ignite
>  Issue Type: Task
>  Components: cli
>Reporter: Aleksandr
>Assignee: Aleksandr
>Priority: Major
>  Labels: ignite-3, ignite-3-cli-tool
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We can add fish-like autosuggestions for typed text: 
> https://github.com/jline/jline3/wiki/Autosuggestions 
> The user should be able to switch off such behavior. I suggest doing it via a 
> CLI profile, but maybe there is a better way.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
system-critical threads may appear during the snapshot restore process (just 
after caches start):
{quote} 
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out. 
{quote} 

{quote} 
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}] 
{quote} 

There is also an active exchange process, which finishes with timings 
approximately equal to the blocking time of the threads: 
{quote} 
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})] 
{quote} 
 

As I understand it, such errors do not affect the restore process, but the error messages can be confusing.

 

How to reproduce:
 # Set the checkpoint frequency to less than the failure detection timeout.
 # Ensure that restoring of cache group partition states lasts longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]
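For reference, the two reproduction steps above map to the following Ignite 2 configuration. This is only a sketch: the concrete timeout values are assumptions for illustration, not taken from the attached reproducer patch.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Sketch of step 1: a checkpoint frequency shorter than the failure detection
// timeout, so a long "Restore partition states" stage can keep the checkpoint
// thread blocked for longer than the failure detector tolerates.
public class ReproducerConfig {
    public static IgniteConfiguration config() {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Checkpoint every 3 s (assumed value for illustration).
        cfg.setDataStorageConfiguration(new DataStorageConfiguration()
            .setCheckpointFrequency(3_000L));

        // Failure detection timeout of 10 s, i.e. larger than the checkpoint
        // frequency (assumed value for illustration).
        cfg.setFailureDetectionTimeout(10_000L);

        return cfg;
    }
}
```

Step 2 then requires caches large enough that the "Restore partition states" exchange stage exceeds the 10 s failure detection timeout.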

  was:
There may be possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start):
{quote} 
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out. 
{quote} 

{quote} 
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}] 
{quote} 

Also there are active exchange process, which finishes with such timings 
(timing will be approximatelly equal to blocking time of threads): 
{quote} 
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})] 
{quote} 
 

Is I understand, such errors do not affect restoring, but such error messages 
can confuse.

 

How to reproduce:
 # Set checkpoint frequency less than failure detection timeout.
 # Ensure, that cache groups partitions states restoring lasts more than 
failure detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> system-critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote} 
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out. 
> {quote} 
> {quote} 
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}] 
> {quote} 
> There is also an active exchange process, which finishes with timings 
> approximately equal to the blocking time of the threads: 
> {quote} 
> [2023-04-06T10:55:52,211][INFO 
> 

[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
system-critical threads may appear during the snapshot restore process (just 
after caches start):
{quote} 
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out. 
{quote} 

{quote} 
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}] 
{quote} 

There is also an active exchange process, which finishes with timings 
approximately equal to the blocking time of the threads: 
{quote} 
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})] 
{quote} 
 

As I understand it, such errors do not affect the restore process, but the 
error messages can be confusing.

 

How to reproduce:
 # Set the checkpoint frequency to less than the failure detection timeout.
 # Ensure that restoring of cache group partition states lasts longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]

  was:
There may be possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start):
{quote}[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}
{quote}[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:#FF}blockedFor=100s{color}]
{quote}
Also there are active exchange process, which finishes with such timings 
(timing will be approximatelly equal to blocking time of threads):
{quote}[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:#FF}100163 ms{color}), ..., stage="Total time" 
({color:#FF}100334 ms{color})]
{quote}
 

Is I understand, such errors does not affect restoring, but such error messages 
can confuse.

 

How to reproduce:
 # Set checkpoint frequency less than failure detection timeout.
 # Ensure, that cache groups partitions states restoring lasts more than 
failure detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> system-critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote} 
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out. 
> {quote} 
> {quote} 
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}] 
> {quote} 
> There is also an active exchange process, which finishes with timings 
> approximately equal to the blocking time of the threads: 
> {quote} 
> 

[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
system-critical threads may appear during the snapshot restore process (just 
after caches start):
{quote} 
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out. 
{quote} 

{quote} 
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}] 
{quote} 

There is also an active exchange process, which finishes with timings 
approximately equal to the blocking time of the threads: 
{quote} 
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})] 
{quote} 
 

As I understand it, such errors do not affect the restore process, but the 
error messages can be confusing.

 

How to reproduce:
 # Set the checkpoint frequency to less than the failure detection timeout.
 # Ensure that restoring of cache group partition states lasts longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]

  was:
There may be possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start):
{quote} 
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out. 
{quote} 

{quote} 
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}] 
{quote} 

Also there are active exchange process, which finishes with such timings 
(timing will be approximatelly equal to blocking time of threads): 
{quote} 
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})] 
{quote} 
 

Is I understand, such errors does not affect restoring, but such error messages 
can confuse.

 

How to reproduce:
 # Set checkpoint frequency less than failure detection timeout.
 # Ensure, that cache groups partitions states restoring lasts more than 
failure detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> system-critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote} 
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out. 
> {quote} 
> {quote} 
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}] 
> {quote} 
> There is also an active exchange process, which finishes with timings 
> approximately equal to the blocking time of the threads: 
> {quote} 
> 

[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
system-critical threads may appear during the snapshot restore process (just 
after caches start):
{quote}[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}
{quote}[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:#FF}blockedFor=100s{color}]
{quote}
There is also an active exchange process, which finishes with timings 
approximately equal to the blocking time of the threads:
{quote}[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ..., stage="Restore partition states" 
({color:#FF}100163 ms{color}), ..., stage="Total time" 
({color:#FF}100334 ms{color})]
{quote}
 

As I understand it, such errors do not affect the restore process, but the 
error messages can be confusing.

 

How to reproduce:
 # Set the checkpoint frequency to less than the failure detection timeout.
 # Ensure that restoring of cache group partition states lasts longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer: [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch]

  was:
There may be possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

Also there are active exchange process, which finishes with such timings 
(timing will be approximatelly equal to blocking time of threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set checkpoint frequency less than failure detection timeout.
# Ensure, that cache groups partitions states restoring lasts more than failure 
detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> system-critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote}[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:#FF}blockedFor=100s{color}]
> {quote}
> There is also an active exchange process, which finishes with timings 
> approximately equal to the blocking time of the threads:
> {quote}[2023-04-06T10:55:52,211][INFO 
> ][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion 

[jira] [Updated] (IGNITE-19211) ODBC 3.0: Align metainfo provided by driver with SQL engine in 3.0

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19211:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: Align metainfo provided by driver with SQL engine in 3.0
> --
>
> Key: IGNITE-19211
> URL: https://issues.apache.org/jira/browse/IGNITE-19211
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Scope: 
> - Make sure we return proper metainformation on SQL types. Check 
> ignite/odbc/meta, ignite/odbc/type_traits.h, etc;
> - Port tests that are applicable;
> - Add new tests where needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19208) ODBC 3.0: Port msi builder scripts properly

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19208:
-
Epic Link: IGNITE-19251  (was: IGNITE-19131)

> ODBC 3.0: Port msi builder scripts properly
> ---
>
> Key: IGNITE-19208
> URL: https://issues.apache.org/jira/browse/IGNITE-19208
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> To do:
> Make sure CMake flag ENABLE_ODBC_MSI works properly;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19210) ODBC 3.0: Make sure DSN-managing UI works properly in Windows

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19210:
-
Epic Link: IGNITE-19251  (was: IGNITE-19131)

> ODBC 3.0: Make sure DSN-managing UI works properly in Windows
> -
>
> Key: IGNITE-19210
> URL: https://issues.apache.org/jira/browse/IGNITE-19210
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Scope:
> - Properly port content of ignite/odbc/system;
> - Probably come up with some kind of automated tests for this functionality, 
> as it's always hard to make sure the UI is not broken.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19215) ODBC 3.0: Implement DML data batching

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19215:
-
Epic Link: IGNITE-19251  (was: IGNITE-19131)

> ODBC 3.0: Implement DML data batching
> -
>
> Key: IGNITE-19215
> URL: https://issues.apache.org/jira/browse/IGNITE-19215
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Scope:
> - Implement server side request handling;
> - Port client side functionality;
> - Port applicable tests;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-19251) ODBC 3.0 Enhancements

2023-04-06 Thread Igor Sapego (Jira)
Igor Sapego created IGNITE-19251:


 Summary: ODBC 3.0 Enhancements
 Key: IGNITE-19251
 URL: https://issues.apache.org/jira/browse/IGNITE-19251
 Project: Ignite
  Issue Type: Epic
  Components: odbc
Reporter: Igor Sapego
Assignee: Igor Sapego


Enhancements for the Ignite 3 ODBC driver



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19131) ODBC 3.0 Basic functionality

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19131:
-
Description: We need to implement basic ODBC driver for Ignite 3.  (was: We 
need to implement ODBC driver for Ignite 3.)

> ODBC 3.0 Basic functionality
> 
>
> Key: IGNITE-19131
> URL: https://issues.apache.org/jira/browse/IGNITE-19131
> Project: Ignite
>  Issue Type: Epic
>  Components: odbc
>Reporter: Igor Sapego
>Assignee: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We need to implement basic ODBC driver for Ignite 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19218) ODBC 3.0: Implement special columns query

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19218:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: Implement special columns query
> -
>
> Key: IGNITE-19218
> URL: https://issues.apache.org/jira/browse/IGNITE-19218
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We should probably just port the dummy functionality and tests from Ignite 2.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19214) ODBC 3.0: Implement table metadata fetching

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19214:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: Implement table metadata fetching
> ---
>
> Key: IGNITE-19214
> URL: https://issues.apache.org/jira/browse/IGNITE-19214
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Scope:
> - Implement server side request handling;
> - Implement client side metadata handling;
> - Port applicable tests;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19217) ODBC 3.0: Implement foreign keys query

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19217:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: Implement foreign keys query
> --
>
> Key: IGNITE-19217
> URL: https://issues.apache.org/jira/browse/IGNITE-19217
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> As we do not support them natively, we should probably just port the dummy 
> functionality and tests from Ignite 2.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19131) ODBC 3.0 Basic functionality

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19131:
-
Epic Name: ODBC 3.0 Basic functionality  (was: ODBC 3.0)

> ODBC 3.0 Basic functionality
> 
>
> Key: IGNITE-19131
> URL: https://issues.apache.org/jira/browse/IGNITE-19131
> Project: Ignite
>  Issue Type: Epic
>  Components: odbc
>Reporter: Igor Sapego
>Assignee: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We need to implement ODBC driver for Ignite 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19216) ODBC 3.0: implement type info fetching

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19216:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: implement type info fetching
> --
>
> Key: IGNITE-19216
> URL: https://issues.apache.org/jira/browse/IGNITE-19216
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Scope:
> - Decide whether we need to implement type info fetching from server or 
> whether we can implement it locally;
> - Implement chosen solution;
> - Port/Add new tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19219) ODBC 3.0: Implement primary keys query

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19219:
-
Epic Link: IGNITE-19250  (was: IGNITE-19131)

> ODBC 3.0: Implement primary keys query
> --
>
> Key: IGNITE-19219
> URL: https://issues.apache.org/jira/browse/IGNITE-19219
> Project: Ignite
>  Issue Type: Improvement
>  Components: odbc
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> This functionality was not implemented properly in Ignite 2, so we will 
> probably need to re-implement it.
> Also port and add tests as needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-19250) ODBC 3.0 Metainformation

2023-04-06 Thread Igor Sapego (Jira)
Igor Sapego created IGNITE-19250:


 Summary: ODBC 3.0 Metainformation
 Key: IGNITE-19250
 URL: https://issues.apache.org/jira/browse/IGNITE-19250
 Project: Ignite
  Issue Type: Epic
  Components: odbc
Reporter: Igor Sapego


ODBC features related to providing and handling metadata.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19131) ODBC 3.0 Basic functionality

2023-04-06 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-19131:
-
Summary: ODBC 3.0 Basic functionality  (was: ODBC 3.0)

> ODBC 3.0 Basic functionality
> 
>
> Key: IGNITE-19131
> URL: https://issues.apache.org/jira/browse/IGNITE-19131
> Project: Ignite
>  Issue Type: Epic
>  Components: odbc
>Reporter: Igor Sapego
>Assignee: Igor Sapego
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We need to implement ODBC driver for Ignite 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19116) Sql. UPDATE statement fails with NPE when table does not exist

2023-04-06 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin reassigned IGNITE-19116:
-

Assignee: Pavel Pereslegin

> Sql. UPDATE statement fails with NPE when table does not exist
> --
>
> Key: IGNITE-19116
> URL: https://issues.apache.org/jira/browse/IGNITE-19116
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0.0-beta2
>Reporter: Maksim Zhuravkov
>Assignee: Pavel Pereslegin
>Priority: Minor
>  Labels: ignite-3
>
> UPDATE statement fails with NPE when table does not exist.
> {code:java}
> @Test
> public void test() {
>sql("UPDATE unknown SET j = j + 1");
> }
> {code}
> Error:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.createSourceSelectForUpdate(IgniteSqlValidator.java:175)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1476)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.performUnconditionalRewrites(IgniteSqlValidator.java:383)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1046)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:759)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgniteSqlValidator.validate(IgniteSqlValidator.java:135)
>   at 
> org.apache.ignite.internal.sql.engine.prepare.IgnitePlanner.validate(IgnitePlanner.java:189)
> {code}
> *Expected behavior*
> It should throw an objectNotFound error:
> {code:java}
> Object 'UNKNOWN' not found
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just 
after caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

There is also an active exchange process, which finishes with timings like 
these (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 
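As a sketch (not part of the ticket), the two reproducing conditions above map onto the Ignite 2.x configuration like this; the concrete values are illustrative:

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Reproducing conditions: checkpoint frequency (1 s) well below the
// failure detection timeout (10 s). A "Restore partition states" stage
// longer than 10 s then blocks db-checkpoint-thread past the watchdog limit.
IgniteConfiguration cfg = new IgniteConfiguration()
    .setFailureDetectionTimeout(10_000L)
    .setDataStorageConfiguration(
        new DataStorageConfiguration()
            .setCheckpointFrequency(1_000L));
```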

  was:
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just 
after caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

There is also an active exchange process, which finishes with timings like 
these (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote}
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]\-#23%node%\-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}]
> {quote}
> There is also an active exchange process, which finishes with timings like 
> these (approximately equal to the blocking time of the threads):
> {quote}
> [2023-04-06T10:55:52,211][INFO 
> ]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
> resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 

[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just 
after caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

There is also an active exchange process, which finishes with timings like 
these (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 

  was:
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just 
after caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

There is also an active exchange process, which finishes with timings like 
these (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> critical threads may appear during the snapshot restore process (just 
> after caches start):
> {quote}
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour \[workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}]
> {quote}
> There is also an active exchange process, which finishes with timings like 
> these (approximately equal to the blocking time of the threads):
> {quote}
> [2023-04-06T10:55:52,211][INFO 
> ]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
> resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 

[jira] [Assigned] (IGNITE-19249) Prohibit disabling a test without mentioning a ticket

2023-04-06 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich reassigned IGNITE-19249:
--

Assignee: Yury Gerzhedovich

> Prohibit disabling a test without mentioning a ticket
> -
>
> Key: IGNITE-19249
> URL: https://issues.apache.org/jira/browse/IGNITE-19249
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Yury Gerzhedovich
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>
> Let's add a test to check that the code doesn't have any muted test without 
> a ticket mention.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-19249) Prohibit disabling a test without mentioning a ticket

2023-04-06 Thread Yury Gerzhedovich (Jira)
Yury Gerzhedovich created IGNITE-19249:
--

 Summary: Prohibit disabling a test without mentioning a ticket
 Key: IGNITE-19249
 URL: https://issues.apache.org/jira/browse/IGNITE-19249
 Project: Ignite
  Issue Type: Improvement
Reporter: Yury Gerzhedovich


Let's add a test to check that the code doesn't have any muted test without a 
ticket mention.
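A minimal sketch of such a check, assuming the JUnit 5 {{@Disabled("reason")}} form; the class name {{MutedTestChecker}} is hypothetical, and the real test would walk the source tree rather than take a string:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: flag @Disabled annotations whose reason does not mention a ticket.
// Bare @Disabled (no reason string) would need a second pattern.
public class MutedTestChecker {
    private static final Pattern DISABLED = Pattern.compile("@Disabled\\(\"([^\"]*)\"\\)");
    private static final Pattern TICKET = Pattern.compile("IGNITE-\\d+");

    /** Returns reasons of muted tests that do not reference a ticket. */
    public static List<String> violations(String source) {
        List<String> bad = new ArrayList<>();
        Matcher m = DISABLED.matcher(source);
        while (m.find()) {
            String reason = m.group(1);
            if (!TICKET.matcher(reason).find()) {
                bad.add(reason);
            }
        }
        return bad;
    }
}
```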



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19021) Support the directory deployment

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19021:
--
Summary: Support the directory deployment  (was: Support the dirrectory 
deployment)

> Support the directory deployment
> 
>
> Key: IGNITE-19021
> URL: https://issues.apache.org/jira/browse/IGNITE-19021
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli, rest
>Reporter: Aleksandr
>Priority: Major
>  Labels: ignite-3
>
> Currently it is impossible to deploy a directory. The use case: deploying 
> several jars or just a directory with class files. 
> The solution might be:
> - zip the dir on the client side
> - deploy the zip 
> - unzip on the server side
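The client-side "zip the dir" step can be sketched with the JDK alone; {{DirDeploy}} and {{zipDirectory}} are hypothetical names, not the actual deployment API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch of the client-side "zip the dir" step using only the JDK.
public class DirDeploy {
    /** Recursively packs {@code dir} into {@code zip}; entry names are relative paths. */
    public static void zipDirectory(Path dir, Path zip) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip));
                Stream<Path> files = Files.walk(dir)) {
            for (Path p : (Iterable<Path>) files.filter(Files::isRegularFile)::iterator) {
                // Normalize separators so entries unzip the same way on any OS.
                out.putNextEntry(new ZipEntry(dir.relativize(p).toString().replace('\\', '/')));
                Files.copy(p, out);
                out.closeEntry();
            }
        }
    }
}
```

The server side would reverse the process with {{java.util.zip.ZipInputStream}} before handing the unpacked files to the deployment unit.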



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19021) Support the directory deployment

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev reassigned IGNITE-19021:
-

Assignee: Vadim Pakhnushev

> Support the directory deployment
> 
>
> Key: IGNITE-19021
> URL: https://issues.apache.org/jira/browse/IGNITE-19021
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli, rest
>Reporter: Aleksandr
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>
> Currently it is impossible to deploy a directory. The use case: deploying 
> several jars or just a directory with class files. 
> The solution might be:
> - zip the dir on the client side
> - deploy the zip 
> - unzip on the server side



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19248) Fix snapshot restore hanging if the prepare stage fails.

2023-04-06 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-19248:
-
Labels: ise  (was: )

> Fix snapshot restore hanging if the prepare stage fails.
> 
>
> Key: IGNITE-19248
> URL: https://issues.apache.org/jira/browse/IGNITE-19248
> Project: Ignite
>  Issue Type: Bug
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Labels: ise
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Snapshot restore hangs if the prepare stage fails.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-19248) Fix snapshot restore hanging if the prepare stage fails.

2023-04-06 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-19248:


 Summary: Fix snapshot restore hanging if the prepare stage fails.
 Key: IGNITE-19248
 URL: https://issues.apache.org/jira/browse/IGNITE-19248
 Project: Ignite
  Issue Type: Bug
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Snapshot restore hangs if the prepare stage fails.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19164) Improve message about requested partitions during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19164:
---
Description: 
Currently, during snapshot restore, a message is logged before requesting 
partitions from remote nodes:
{quote}
[2023-03-24T18:06:59,910][INFO 
]\[disco-notifier-worker-#792%node%|#792%node%][SnapshotRestoreProcess] Trying 
to request partitions from remote nodes 
[reqId=ff682204-9554-4fbb-804c-38a79c0b286a, snapshot=snapshot_name, 
map={*{color:#FF}76e22ef5-3c76-4987-bebd-9a6222a0{color}*={*{color:#FF}-903566235{color}*=[0,2,4,6,11,12,18,98,100,170,190,194,1015],
 
*{color:#FF}1544803905{color}*=[1,11,17,18,22,25,27,35,37,42,45,51,62,64,67,68,73,76,1017]}}]
{quote}

It is necessary to make this output "human readable":
# Print messages per node instead of one message for all nodes.
# Print node consistent id and address.
# Print cache / group name.

  was:
Currently, during snapshot restore, a message is logged before requesting 
partitions from remote nodes:
{quote}
[2023-03-24T18:06:59,910][INFO 
][disco-notifier-worker-#792%node%|#792%node%][SnapshotRestoreProcess] Trying 
to request partitions from remote nodes 
[reqId=ff682204-9554-4fbb-804c-38a79c0b286a, snapshot=snapshot_name, 
map={*{color:#FF}76e22ef5-3c76-4987-bebd-9a6222a0{color}*={*{color:#FF}-903566235{color}*=[0,2,4,6,11,12,18,98,100,170,190,194,1015],
 
*{color:#FF}1544803905{color}*=[1,11,17,18,22,25,27,35,37,42,45,51,62,64,67,68,73,76,1017]}}]
{quote}

It is necessary to make this output "human readable":
# Print messages per node instead of one message for all nodes.
# Print node consistent id and address.
# Print cache / group name.


> Improve message about requested partitions during snapshot restore
> --
>
> Key: IGNITE-19164
> URL: https://issues.apache.org/jira/browse/IGNITE-19164
> Project: Ignite
>  Issue Type: Task
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> Currently, during snapshot restore, a message is logged before requesting 
> partitions from remote nodes:
> {quote}
> [2023-03-24T18:06:59,910][INFO 
> ]\[disco-notifier-worker-#792%node%|#792%node%][SnapshotRestoreProcess] 
> Trying to request partitions from remote nodes 
> [reqId=ff682204-9554-4fbb-804c-38a79c0b286a, snapshot=snapshot_name, 
> map={*{color:#FF}76e22ef5-3c76-4987-bebd-9a6222a0{color}*={*{color:#FF}-903566235{color}*=[0,2,4,6,11,12,18,98,100,170,190,194,1015],
>  
> *{color:#FF}1544803905{color}*=[1,11,17,18,22,25,27,35,37,42,45,51,62,64,67,68,73,76,1017]}}]
> {quote}
> It is necessary to make this output "human readable":
> # Print messages per node instead of one message for all nodes.
> # Print node consistent id and address.
> # Print cache / group name.
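A sketch of the per-node formatting (the class name, method name, and message layout are assumptions, not the actual SnapshotRestoreProcess code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Illustrative sketch: emit one log line per node, with the node id and
// per-group partition lists, instead of a single line covering all nodes.
public class RestoreMessages {
    /** consistentId -> (cache group id -> partitions), turned into per-node lines. */
    public static List<String> perNodeLines(String reqId, Map<String, Map<Integer, int[]>> parts) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, int[]>> node : parts.entrySet()) {
            StringBuilder sb = new StringBuilder("Requesting partitions [reqId=")
                .append(reqId).append(", node=").append(node.getKey());
            for (Map.Entry<Integer, int[]> grp : node.getValue().entrySet()) {
                sb.append(", grpId=").append(grp.getKey())
                    .append(", parts=").append(Arrays.toString(grp.getValue()));
            }
            lines.add(sb.append(']').toString());
        }
        return lines;
    }
}
```

The real implementation would additionally resolve the consistent id and address from the discovery cache and map group ids back to cache/group names.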



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19153) Fix docker compose

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev reassigned IGNITE-19153:
-

Assignee: Vadim Pakhnushev

> Fix docker compose
> --
>
> Key: IGNITE-19153
> URL: https://issues.apache.org/jira/browse/IGNITE-19153
> Project: Ignite
>  Issue Type: Task
>  Components: build
>Reporter: Vadim Pakhnushev
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After IGNITE-18581, the Ignite node entry point doesn't accept the {{--join}} 
> option, so we need to create a corresponding config file for the example 
> compose file.
> There is also leftover code in the {{IgniteRunner}} class for converting 
> {{NetworkAddress}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-18320) [IEP-94] Reimplement cache scan command to control.sh

2023-04-06 Thread Nikolay Izhikov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Izhikov reassigned IGNITE-18320:


Assignee: Aleksey Plekhanov  (was: Nikolay Izhikov)

> [IEP-94] Reimplement cache scan command to control.sh
> -
>
> Key: IGNITE-18320
> URL: https://issues.apache.org/jira/browse/IGNITE-18320
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Nikolay Izhikov
>Assignee: Aleksey Plekhanov
>Priority: Blocker
>  Labels: IEP-94
> Fix For: 2.15
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To decommission ignitevisorcmd.sh we need to move all useful commands to the 
> control script.
>  
> The cache scan command is used to view cache content, so we must provide it 
> via control.sh
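The invocation might then look like this; the exact subcommand syntax is an assumption based on this ticket, not confirmed command-line reference:

```shell
# Hypothetical: scan the content of cache "myCache" via the control script.
./control.sh --cache scan myCache
```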



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-18320) [IEP-94] Reimplement cache scan command to control.sh

2023-04-06 Thread Nikolay Izhikov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Izhikov reassigned IGNITE-18320:


Assignee: Nikolay Izhikov  (was: Aleksey Plekhanov)

> [IEP-94] Reimplement cache scan command to control.sh
> -
>
> Key: IGNITE-18320
> URL: https://issues.apache.org/jira/browse/IGNITE-18320
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Blocker
>  Labels: IEP-94
> Fix For: 2.15
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To decommission ignitevisorcmd.sh we need to move all useful commands to the 
> control script.
>  
> The cache scan command is used to view cache content, so we must provide it 
> via control.sh



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19237) Dependency copying should happen on the package phase instead of test-compile

2023-04-06 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-19237:
--
Description: 
According to the [plugin usage 
examples|https://maven.apache.org/plugins/maven-dependency-plugin/usage.html], 
the phase should be `package`.
An earlier phase may (and in multi-level projects will) cause a situation where 
artifacts have not been generated yet.
In that case you may get the following:
{noformat}
 Failed to execute goal 
org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
(copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
When used on reactor artifact, copy should be executed after packaging: see 
MDEP-187. 
{noformat}
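Bound to the {{package}} phase, the plugin configuration from the usage examples looks roughly like this (a sketch; the version and execution id mirror the error message above):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <version>3.1.1</version>
  <executions>
    <execution>
      <id>copy-libs</id>
      <!-- was test-compile; package runs after the reactor artifact is built -->
      <phase>package</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```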

  was:
Otherwise there is nothing to copy, and you may get the following:
{noformat}
 Failed to execute goal 
org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
(copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
When used on reactor artifact, copy should be executed after packaging: see 
MDEP-187. 
{noformat}

According to the [plugin usage 
examples|https://maven.apache.org/plugins/maven-dependency-plugin/usage.html], 
the phase should be `package`.
An earlier phase may (and in multi-level projects will) cause a situation where 
artifacts have not been generated yet.


> Dependency copying should happen on the package phase instead of test-compile
> -
>
> Key: IGNITE-19237
> URL: https://issues.apache.org/jira/browse/IGNITE-19237
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> According to the [plugin usage 
> examples|https://maven.apache.org/plugins/maven-dependency-plugin/usage.html], 
> the phase should be `package`.
> An earlier phase may (and in multi-level projects will) cause a situation 
> where artifacts have not been generated yet.
> In that case you may get the following:
> {noformat}
>  Failed to execute goal 
> org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
> (copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
> When used on reactor artifact, copy should be executed after packaging: see 
> MDEP-187. 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19237) Dependency copying should happen on the package phase instead of test-compile

2023-04-06 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-19237:
--
Description: 
Otherwise there is nothing to copy, and you may get the following:
{noformat}
 Failed to execute goal 
org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
(copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
When used on reactor artifact, copy should be executed after packaging: see 
MDEP-187. 
{noformat}

According to the [plugin usage 
examples|https://maven.apache.org/plugins/maven-dependency-plugin/usage.html], 
the phase should be `package`.
An earlier phase may (and in multi-level projects will) cause a situation where 
artifacts have not been generated yet.

  was:
Otherwise there is nothing to copy, and you may get the following:
{noformat}
 Failed to execute goal 
org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
(copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
When used on reactor artifact, copy should be executed after packaging: see 
MDEP-187. 
{noformat}


> Dependency copying should happen on the package phase instead of test-compile
> -
>
> Key: IGNITE-19237
> URL: https://issues.apache.org/jira/browse/IGNITE-19237
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Otherwise there is nothing to copy, and you may get the following:
> {noformat}
>  Failed to execute goal 
> org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies 
> (copy-libs) on project ignite-XXX-plugin: Artifact has not been packaged yet. 
> When used on reactor artifact, copy should be executed after packaging: see 
> MDEP-187. 
> {noformat}
> According to the [plugin usage 
> examples|https://maven.apache.org/plugins/maven-dependency-plugin/usage.html], 
> the phase should be `package`.
> An earlier phase may (and in multi-level projects will) cause a situation 
> where artifacts have not been generated yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
This is a very basic acceptance test.

The code below creates TABLES tables with COLUMNS columns each (an int key plus 
varchar columns) and inserts ROWS rows into each table, sleeping SLEEP ms 
between operations and retrying each failed batch up to RETRY times.

 
{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 10;

private static final String TABLE_NAME = "K";
private static final int ROWS = 1000;

private static final int TABLES = 10;

private static final int BATCH_SIZE = 10;

private static final int SLEEP = 30;

private static final int RETRY = 10;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table ").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static final void s() {
if (SLEEP > 0) {
try {
Thread.sleep(SLEEP);
} catch (InterruptedException e) {
// NoOp
}
}
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
s();
stmt.executeUpdate(getCreateSql(tableName));
s();
}
}

private static String getInsertSql(String tableName) {
StringBuilder sql = new StringBuilder("insert into ").append(tableName).append(" values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertBatch(PreparedStatement ps) {
int retryCounter = 0;
while(retryCounter <= RETRY) {
try {
ps.executeBatch();

return;
} catch (SQLException e) {
System.err.println(retryCounter + " error while executing " + 
ps + ":" + e);

retryCounter++;
}
}
}

private static void insertData(Connection connection, String tableName) 
throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql(tableName))) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
insertBatch(ps);
ps.clearBatch();

System.out.println("Batch " + BATCH_SIZE + " took " + 
(System.currentTimeMillis() - ts) + " to get " + i + " rows");

s();
ts = System.currentTimeMillis();
}
}

if (batch > 0) {
insertBatch(ps);
ps.clearBatch();
s();
}
}
}

private static int testData(Connection connection, String tableName) throws 
SQLException {
try (Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("select count(*) from " + 
tableName);) {
rs.next();

int count = rs.getInt(1);

int result = ROWS - count;

if (result == 0) {
System.out.println("Found " + count + " rows in " + tableName);
} else {
System.err.println("Found " + count + " rows in " + tableName + 
" instead of " + ROWS);
}

s();

return result;
}
}

public static void main(String[] args) throws SQLException {
int lostRows = 0;
try (Connection connection = DriverManager.getConnection(DB_URL)) {
for (int i = 0; i < TABLES; i++) {
String tableName = TABLE_NAME + i;
createTables(connection, tableName);

insertData(connection, tableName);

lostRows += testData(connection, tableName);
}
}

System.exit(lostRows);
}
}

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
    private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
    private static final int COLUMNS = 10;

    private static final String TABLE_NAME = "K";
    private static final int ROWS = 1000;

    private static final int TABLES = 10;

    private static final int BATCH_SIZE = 10;

    private static final int SLEEP = 30;

    private static final int RETRY = 10;

    private static String getCreateSql(String tableName) {
        StringBuilder sql = new StringBuilder("create table ").append(tableName).append(" (id int primary key");

        for (int i = 0; i < COLUMNS; i++) {
            sql.append(", col").append(i).append(" varchar NOT NULL");
        }

        sql.append(")");

        return sql.toString();
    }

    private static final void s() {
        if (SLEEP > 0) {
            try {
                Thread.sleep(SLEEP);
            } catch (InterruptedException e) {
                // NoOp
            }
        }
    }

    private static void createTables(Connection connection, String tableName) throws SQLException {
        try (Statement stmt = connection.createStatement()) {
            System.out.println("Creating " + tableName);

            stmt.executeUpdate("drop table if exists " + tableName);
            s();
            stmt.executeUpdate(getCreateSql(tableName));
            s();
        }
    }

    private static String getInsertSql(String tableName) {
        StringBuilder sql = new StringBuilder("insert into ").append(tableName).append(" values(?");

        for (int i = 0; i < COLUMNS; i++) {
            sql.append(", ?");
        }

        sql.append(")");

        return sql.toString();
    }

    private static void insertBatch(PreparedStatement ps) {
        int retryCounter = 0;
        while (retryCounter <= RETRY) {
            try {
                ps.executeBatch();

                return;
            } catch (SQLException e) {
                System.err.println(retryCounter + " error while executing " + ps + ":" + e);

                retryCounter++;
            }
        }
    }

    private static void insertData(Connection connection, String tableName) throws SQLException {
        long
{code}

[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and clean up all table resources, such as replicas and 
raft nodes:
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list, which is 
cleared on configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to publish 
the tables related to the given revision, in other words to make such tables 
accessible from tablesByIdVv.latest(), which is used to 
retrieve tables for cleanup on component stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, it is possible that we clear tablesToStopInCaseOfError before 
publishing the same revision's tables to tablesByIdVv, so those cleared 
tables are missed by tablesByIdVv.latest(), which is used in TableManager#stop.
h3. Implementation Notes

1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
because those tables aren't pending only in case of error.

2. I've also reworked the tablesToStopInCaseOfError cleanup by substituting 
tablesToStopInCaseOfError.clear() on revision change with
{code:java}
tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock,  
()-> {  
  pendingTables.remove(tblId);
})); {code}
meaning that we

2.1. remove the specific table by id instead of clearing the whole map.

2.2. do that removal when the corresponding table is published within tablesByIdVv.

3. That means that, at some point right after publishing but before removal, 
the same table may be present both in tablesByIdVv and in pendingTables; 
in order not to stop the same table twice (which would be safe anyway because of 
idempotence), I've substituted
{code:java}
cleanUpTablesResources(tables);
cleanUpTablesResources(tablesToStopInCaseOfError); {code}
with
{code:java}
Stream tablesToStop =
Stream.concat(tablesByIdVv.latest().entrySet().stream(), 
pendingTables.entrySet().stream()).
map(Map.Entry::getValue);

cleanUpTablesResources(tablesToStop); {code}
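The race described above can be reproduced in isolation. Below is a minimal, hypothetical sketch (plain maps and CompletableFuture, not the actual Ignite TableManager or IncrementalVersionedValue classes; RaceSketch and lostAfterRace are made-up names) showing how a synchronous clear() in a revision listener can win against an asynchronous whenComplete-style publication, leaving a table visible to neither collection:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the race: the revision listener clears the
// pending map synchronously, while the versioned value publishes the same
// revision's tables asynchronously (as happens via whenComplete).
public class RaceSketch {
    static boolean lostAfterRace() {
        Map<Integer, String> pending = new ConcurrentHashMap<>();
        Map<Integer, String> published = new ConcurrentHashMap<>();
        pending.put(1, "table-1");

        // Publication completes asynchronously, after a simulated delay.
        CompletableFuture<Void> publish = CompletableFuture
                .runAsync(RaceSketch::pause)
                .thenRun(() -> {
                    String t = pending.get(1); // already cleared by now
                    if (t != null) {
                        published.put(1, t);
                    }
                });

        // The revision listener runs synchronously and wins the race.
        pending.clear();

        publish.join();
        // The table ends up in neither map, so a stop() that walks both
        // collections cannot clean it up.
        return pending.isEmpty() && published.isEmpty();
    }

    static void pause() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException ignored) {
            // NoOp
        }
    }

    public static void main(String[] args) {
        System.out.println("table lost: " + lostAfterRace());
    }
}
```

Removing each table id only after its revision is actually published, as the fix does, closes this window.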

  was:
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race 

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
    private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
    private static final int COLUMNS = 10;

    private static final String TABLE_NAME = "K";
    private static final int ROWS = 10;

    private static final int TABLES = 10;

    private static final int BATCH_SIZE = 10;

    private static final int SLEEP = 30;

    private static String getCreateSql(String tableName) {
        StringBuilder sql = new StringBuilder("create table ").append(tableName).append(" (id int primary key");

        for (int i = 0; i < COLUMNS; i++) {
            sql.append(", col").append(i).append(" varchar NOT NULL");
        }

        sql.append(")");

        return sql.toString();
    }

    private static final void s() {
        if (SLEEP > 0) {
            try {
                Thread.sleep(SLEEP);
            } catch (InterruptedException e) {
                // NoOp
            }
        }
    }

    private static void createTables(Connection connection, String tableName) throws SQLException {
        try (Statement stmt = connection.createStatement()) {
            System.out.println("Creating " + tableName);

            stmt.executeUpdate("drop table if exists " + tableName);
            s();
            stmt.executeUpdate(getCreateSql(tableName));
            s();
        }
    }

    private static String getInsertSql(String tableName) {
        StringBuilder sql = new StringBuilder("insert into ").append(tableName).append(" values(?");

        for (int i = 0; i < COLUMNS; i++) {
            sql.append(", ?");
        }

        sql.append(")");

        return sql.toString();
    }

    private static void insertData(Connection connection, String tableName) throws SQLException {
        long ts = System.currentTimeMillis();
        try (PreparedStatement ps = connection.prepareStatement(getInsertSql(tableName))) {
            int batch = 0;

            for (int i = 0; i < ROWS; i++) {
                ps.setInt(1, i);

                for (int j = 2; j < COLUMNS + 2; j++) {
                    ps.setString(j, "value" + i + "_" + j);
                }

                ps.addBatch();
                batch++;

                if (batch == BATCH_SIZE) {
                    batch = 0;
                    ps.executeBatch();
                    ps.clearBatch();

                    System.out.println("Batch " + BATCH_SIZE + " took " + (System.currentTimeMillis() - ts) + " to get " + i + " rows");

                    s();
                    ts = System.currentTimeMillis();
                }
            }

            if (batch > 0) {
                batch = 0;
                ps.executeBatch();
                ps.clearBatch();
                s();
            }
        }
    }

    private static int testData(Connection connection, String tableName) throws SQLException {
        try (Statement stmt = connection.createStatement();
                ResultSet rs = stmt.executeQuery("select count(*) from " + tableName)) {
            rs.next();

            int count = rs.getInt(1);

            int result = ROWS - count;

            if (result == 0) {
                System.out.println("Found " + count + " rows in " + tableName);
            } else {
                System.err.println("Found " + count + " rows in " + tableName + " instead of " + ROWS);
            }

            s();
            return result;
        }
    }

    public static void main(String[] args) throws SQLException {
        int lostRows = 0;
        try (Connection connection = DriverManager.getConnection(DB_URL)) {
            for (int i = 0; i < TABLES; i++) {
                String tableName = TABLE_NAME + i;
                createTables(connection, tableName);

                insertData(connection, tableName);

                lostRows += testData(connection, tableName);
            }
        }

        System.exit(lostRows);
    }
}
{code}

(was: Code below just create TABLES 
tables with COLUMNS+1 columns and insert ROWS rows into each table (with SLEEP 
ms interval between operations).

Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 10;

private static final String TABLE_NAME = "K";
private static final int ROWS = 10;

private static final int TABLES = 10;

private static final int BATCH_SIZE = 10;

private static final int SLEEP = 30;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table 
").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static final void s() {
try {
Thread.sleep(SLEEP);
} catch (InterruptedException e) {
// NoOp
}
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
s();
stmt.executeUpdate(getCreateSql(tableName));
s();

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
The code below just creates TABLES tables with COLUMNS+1 columns and inserts ROWS 
rows into each table (with a SLEEP ms interval between operations).

Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 10;

private static final String TABLE_NAME = "K";
private static final int ROWS = 10;

private static final int TABLES = 10;

private static final int BATCH_SIZE = 10;

private static final int SLEEP = 30;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table 
").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static final void s() {
try {
Thread.sleep(SLEEP);
} catch (InterruptedException e) {
// NoOp
}
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
s();
stmt.executeUpdate(getCreateSql(tableName));
s();
}
}

private static String getInsertSql(String tableName) {
StringBuilder sql = new StringBuilder("insert into 
").append(tableName).append(" values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection, String tableName) 
throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql(tableName))) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();

System.out.println("Batch " + BATCH_SIZE + " took " + 
(System.currentTimeMillis() - ts) + " to get " + i + " rows");

s();
ts = System.currentTimeMillis();
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
s();
}
}
}

private static int testData(Connection connection, String tableName) throws 
SQLException {
try (Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("select count(*) from " + 
tableName);) {
rs.next();

int count = rs.getInt(1);

int result = ROWS - count;

if (result == 0) {
System.out.println("Found " + count + " rows in " + tableName);
} else {
System.err.println("Found " + count + " rows in " + tableName + 
" instead of " + ROWS);
}

return result;
}
}

public static void main(String[] args) throws SQLException {
int lostRows = 0;
try (Connection connection = DriverManager.getConnection(DB_URL)) {
for (int i = 0; i < TABLES; i++) {
String tableName = TABLE_NAME + i;
createTables(connection, tableName);

insertData(connection, tableName);

lostRows += testData(connection, tableName);
}
}

System.exit(lostRows);
}
}
   {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at 

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
The code below just creates 1000 tables with 101 columns and inserts 1000 rows into 
each table.

Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 10;

private static final String TABLE_NAME = "K";
private static final int ROWS = 10;

private static final int TABLES = 10;

private static final int BATCH_SIZE = 10;

private static final int SLEEP = 30;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table 
").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static final void s() {
try {
Thread.sleep(SLEEP);
} catch (InterruptedException e) {
// NoOp
}
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
s();
stmt.executeUpdate(getCreateSql(tableName));
s();
}
}

private static String getInsertSql(String tableName) {
StringBuilder sql = new StringBuilder("insert into 
").append(tableName).append(" values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection, String tableName) 
throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql(tableName))) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();

System.out.println("Batch " + BATCH_SIZE + " took " + 
(System.currentTimeMillis() - ts) + " to get " + i + " rows");

s();
ts = System.currentTimeMillis();
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
s();
}
}
}

private static int testData(Connection connection, String tableName) throws 
SQLException {
try (Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("select count(*) from " + 
tableName);) {
rs.next();

int count = rs.getInt(1);

int result = ROWS - count;

if (result == 0) {
System.out.println("Found " + count + " rows in " + tableName);
} else {
System.err.println("Found " + count + " rows in " + tableName + 
" instead of " + ROWS);
}

return result;
}
}

public static void main(String[] args) throws SQLException {
int lostRows = 0;
try (Connection connection = DriverManager.getConnection(DB_URL)) {
for (int i = 0; i < TABLES; i++) {
String tableName = TABLE_NAME + i;
createTables(connection, tableName);

insertData(connection, tableName);

lostRows += testData(connection, tableName);
}
}

System.exit(lostRows);
}
}
   {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at TimeoutExceptionReproducer.insertData(TimeoutExceptionReproducer.java:64)
    at 

[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and clean up all table resources, such as replicas and 
raft nodes:
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list, which is 
cleared on configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to publish 
the tables related to the given revision, in other words to make such tables 
accessible from tablesByIdVv.latest(), which is used to 
retrieve tables for cleanup on component stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, it is possible that we clear tablesToStopInCaseOfError before 
publishing the same revision's tables to tablesByIdVv, so those cleared 
tables are missed by tablesByIdVv.latest(), which is used in TableManager#stop.
h3. Implementation Notes

1. First of all, I've renamed tablesToStopInCaseOfError to pendingTables, 
because those tables aren't pending only in case of error.

2. I've also reworked the tablesToStopInCaseOfError cleanup by substituting 
tablesToStopInCaseOfError.clear() on revision change with
{code:java}
tablesByIdVv.get(causalityToken).thenAccept(ignored -> inBusyLock(busyLock, () -> {
  pendingTables.remove(tblId);
})); {code}

  was:
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending tables list which one is 
cleared on cfg storage revision update.

tablesByIdVv *listens same storage revision update event* in order to publish 
tables related to the given revision or in other words make such tables 
accessible from tablesByIdVv.latest(); that one that is used 

[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
h3. Description & Root cause

1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and clean up all table resources, such as replicas and 
raft nodes:
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list, which is 
cleared on configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to publish 
the tables related to the given revision, in other words to make such tables 
accessible from tablesByIdVv.latest(), which is used to 
retrieve tables for cleanup on component stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, it is possible that we clear tablesToStopInCaseOfError before 
publishing the same revision's tables to tablesByIdVv, so those cleared 
tables are missed by tablesByIdVv.latest(), which is used in TableManager#stop.
h3. Implementation Notes

  was:
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending tables list which one is 
cleared on cfg storage revision update.

tablesByIdVv *listens same storage revision update event* in order to publish 
tables related to the given revision or in other words make such tables 
accessible from tablesByIdVv.latest(); that one that is used in order to 
retrieve tables for cleanup on components stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However inside IncrementalVersionedValue we have async storageRevision update 
processing
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
The code below just creates 1000 tables with 101 columns and inserts 1000 rows into 
each table.

Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 100;

private static final String TABLE_NAME = "t1";
private static final int ROWS = 1000;

private static final int TABLES = 1000;

private static final int BATCH_SIZE = 100;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table 
").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
stmt.executeUpdate(getCreateSql(tableName));
}
}

private static String getInsertSql(String tableName) {
StringBuilder sql = new StringBuilder("insert into ").append(tableName).append(" values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection, String tableName) 
throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql(tableName))) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
long nextTs = System.currentTimeMillis();
System.out.println("Batch " + BATCH_SIZE + " took " + 
(nextTs - ts) + " to get " + i + " rows");
ts = nextTs;
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
}
}
}

public static void main(String[] args) throws SQLException {
try (Connection connection = DriverManager.getConnection(DB_URL)) {
for (int i = 0; i < TABLES; i++) {
String tableName = TABLE_NAME + i;
createTables(connection, tableName);

insertData(connection, tableName);
}
}
}
}
  {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at TimeoutExceptionReproducer.insertData(TimeoutExceptionReproducer.java:64)
    at TimeoutExceptionReproducer.main(TimeoutExceptionReproducer.java:84){code}

  was:
Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 100;

private static final String TABLE_NAME = "t1";
private static final int ROWS = 1000;

private static final int TABLES = 1000;

private static final int BATCH_SIZE = 100;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table ").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
  

[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Environment: (was: Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 100;

private static final String TABLE_NAME = "t1";
private static final int ROWS = 10;

private static final int BATCH_SIZE = 100;

private static String getCreateSql() {
StringBuilder sql = new StringBuilder("create table ").append(TABLE_NAME).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static void createTable(Connection connection) throws SQLException {
try (Statement stmt = connection.createStatement()) {
stmt.executeUpdate("drop table if exists " + TABLE_NAME );
stmt.executeUpdate(getCreateSql());
}
}

private static String getInsertSql() {
StringBuilder sql = new StringBuilder("insert into t1 values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection) throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql())) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
long nextTs = System.currentTimeMillis();
System.out.println("Batch " + BATCH_SIZE + " took " + 
(nextTs - ts) + " to get " + i + " rows");
ts = nextTs;
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
}
}
}

public static void main(String[] args) throws SQLException {
try (Connection connection = DriverManager.getConnection(DB_URL)) {
createTable(connection);

insertData(connection);
}
}
}
 {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at TimeoutExceptionReproducer.insertData(TimeoutExceptionReproducer.java:64)
    at 
TimeoutExceptionReproducer.main(TimeoutExceptionReproducer.java:84){code})

> Replication is timed out
> 
>
> Key: IGNITE-19247
> URL: https://issues.apache.org/jira/browse/IGNITE-19247
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Priority: Critical
>  Labels: ignite-3
> Fix For: 3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak updated IGNITE-19247:
--
Description: 
Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 100;

private static final String TABLE_NAME = "t1";
private static final int ROWS = 1000;

private static final int TABLES = 1000;

private static final int BATCH_SIZE = 100;

private static String getCreateSql(String tableName) {
StringBuilder sql = new StringBuilder("create table ").append(tableName).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static void createTables(Connection connection, String tableName) 
throws SQLException {
try (Statement stmt = connection.createStatement()) {
System.out.println("Creating " + tableName);

stmt.executeUpdate("drop table if exists " + tableName );
stmt.executeUpdate(getCreateSql(tableName));
}
}

private static String getInsertSql(String tableName) {
StringBuilder sql = new StringBuilder("insert into ").append(tableName).append(" values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection, String tableName) 
throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql(tableName))) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
long nextTs = System.currentTimeMillis();
System.out.println("Batch " + BATCH_SIZE + " took " + 
(nextTs - ts) + " to get " + i + " rows");
ts = nextTs;
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
}
}
}

public static void main(String[] args) throws SQLException {
try (Connection connection = DriverManager.getConnection(DB_URL)) {
for (int i = 0; i < TABLES; i++) {
String tableName = TABLE_NAME + i;
createTables(connection, tableName);

insertData(connection, tableName);
}
}
}
}
  {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at TimeoutExceptionReproducer.insertData(TimeoutExceptionReproducer.java:64)
    at TimeoutExceptionReproducer.main(TimeoutExceptionReproducer.java:84){code}

> Replication is timed out
> 
>
> Key: IGNITE-19247
> URL: https://issues.apache.org/jira/browse/IGNITE-19247
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Priority: Critical
>  Labels: ignite-3
> Fix For: 3.0
>
>
> Simple example:
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.PreparedStatement;
> import java.sql.SQLException;
> import java.sql.Statement;
> public class TimeoutExceptionReproducer {
> private static final String DB_URL = 
> "jdbc:ignite:thin://172.24.1.2:10800";
> private static final int COLUMNS = 100;
> private static final String TABLE_NAME = "t1";
> private static final int ROWS = 1000;
> private static final int TABLES = 1000;
> private static final int BATCH_SIZE = 100;
> private static String getCreateSql(String tableName) {
> StringBuilder sql = new StringBuilder("create table 
> 

[jira] [Created] (IGNITE-19247) Replication is timed out

2023-04-06 Thread Alexander Belyak (Jira)
Alexander Belyak created IGNITE-19247:
-

 Summary: Replication is timed out
 Key: IGNITE-19247
 URL: https://issues.apache.org/jira/browse/IGNITE-19247
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 3.0
 Environment: Simple example:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExceptionReproducer {
private static final String DB_URL = "jdbc:ignite:thin://172.24.1.2:10800";
private static final int COLUMNS = 100;

private static final String TABLE_NAME = "t1";
private static final int ROWS = 10;

private static final int BATCH_SIZE = 100;

private static String getCreateSql() {
StringBuilder sql = new StringBuilder("create table ").append(TABLE_NAME).append(" (id int primary key");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", col").append(i).append(" varchar NOT NULL");
}

sql.append(")");

return sql.toString();
}

private static void createTable(Connection connection) throws SQLException {
try (Statement stmt = connection.createStatement()) {
stmt.executeUpdate("drop table if exists " + TABLE_NAME );
stmt.executeUpdate(getCreateSql());
}
}

private static String getInsertSql() {
StringBuilder sql = new StringBuilder("insert into t1 values(?");

for (int i = 0; i < COLUMNS; i++) {
sql.append(", ?");
}

sql.append(")");

return sql.toString();
}

private static void insertData(Connection connection) throws SQLException {
long ts = System.currentTimeMillis();
try (PreparedStatement ps = 
connection.prepareStatement(getInsertSql())) {
int batch = 0;

for (int i = 0; i < ROWS; i++) {
ps.setInt(1, i);

for (int j = 2; j < COLUMNS + 2; j++) {
ps.setString(j, "value" + i + "_" + j);
}

ps.addBatch();
batch++;

if (batch == BATCH_SIZE) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
long nextTs = System.currentTimeMillis();
System.out.println("Batch " + BATCH_SIZE + " took " + 
(nextTs - ts) + " to get " + i + " rows");
ts = nextTs;
}
}

if (batch > 0) {
batch = 0;
ps.executeBatch();
ps.clearBatch();
}
}
}

public static void main(String[] args) throws SQLException {
try (Connection connection = DriverManager.getConnection(DB_URL)) {
createTable(connection);

insertData(connection);
}
}
}
 {code}
leads to a timeout exception:
{code:java}
Batch 100 took 4228 to get 2899 rows
Batch 100 took 5669 to get 2999 rows
Batch 100 took 3902 to get 3099 rows
Exception in thread "main" java.sql.BatchUpdateException: IGN-REP-3 
TraceId:b2c2c9e5-b917-482e-91df-2e0576c443c7 Replication is timed out 
[replicaGrpId=76c2b69a-a2bc-4d16-838d-5aff014c6004_part_11]
    at 
org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:124)
    at TimeoutExceptionReproducer.insertData(TimeoutExceptionReproducer.java:64)
    at TimeoutExceptionReproducer.main(TimeoutExceptionReproducer.java:84){code}
Reporter: Alexander Belyak
 Fix For: 3.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-18170) Deadlock in TableManager#updateAssignmentInternal()

2023-04-06 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy resolved IGNITE-18170.

Resolution: Fixed

Fixed in IGNITE-18203

> Deadlock in TableManager#updateAssignmentInternal()
> ---
>
> Key: IGNITE-18170
> URL: https://issues.apache.org/jira/browse/IGNITE-18170
> Project: Ignite
>  Issue Type: Bug
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Attachments: threads_report.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, {{TableManager#updateAssignmentsInternal}} is fully synchronous. 
> The scenario is as follows:
>  # {{updateAssignmentsInternal}} starts a RAFT group for a partition
>  # {{FSMCallerImpl}} finds out that its applied index is below the group 
> committed index, so it starts to apply the missing log entries in its 
> {{init()}} method (this is still done synchronously)
>  # While doing so, it invokes {{{}PartitionListener{}}}, which tries to 
> execute an insert
>  # To make an insert, a PK is needed, so the insertion code tries to 
> obtain a PK from its future like this: {{pkFuture.join()}}
>  # That future is completed from {{{}IndexManager#createIndexLocally(){}}}, 
> which is invoked by {{ConfigurationNotifier}} later than 
> {{updateAssignmentsInternal}} in the same thread
>  # As a result, the PK future cannot be completed before the sync 
> {{updateAssignmentsInternal}} finishes its job and returns, and it cannot 
> finish its job before the PK future is completed
> We should make {{updateAssignmentsInternal}} async.
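
The future-join-on-the-notification-thread pattern described above can be 
sketched with plain java.util.concurrent primitives. This is a toy model under 
stated assumptions, not Ignite code: the single-thread executor stands in for 
the configuration notification thread, and all names (JoinDeadlockSketch, 
timesOut, pkFuture) are illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class JoinDeadlockSketch {
    /** Returns true when the blocking join() outlives the timeout, i.e. the deadlock occurs. */
    static boolean timesOut(long timeoutMs) {
        // A single thread, standing in for the configuration notification thread.
        ExecutorService notifier = Executors.newSingleThreadExecutor();
        CompletableFuture<String> pkFuture = new CompletableFuture<>();
        try {
            // Task 1 models the synchronous updateAssignmentsInternal:
            // it occupies the only thread, blocked on the PK future.
            Future<String> assignment = notifier.submit(pkFuture::join);

            // Task 2 models createIndexLocally(), which would complete the future,
            // but it is queued on the SAME thread and can never start.
            notifier.submit(() -> pkFuture.complete("pk"));

            assignment.get(timeoutMs, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pkFuture.complete("unblock"); // release the stuck thread so the JVM can exit
            notifier.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(timesOut(300) ? "deadlocked" : "completed"); // prints "deadlocked"
    }
}
```

Making the first task asynchronous (returning a future instead of blocking the 
only notification thread) lets the second task run and complete the PK future, 
which is the essence of the proposed fix.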



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason we failed to stop replicas is a race between the 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and clean up all table resources, such as 
replicas and Raft nodes:
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending-tables list that is 
cleared on a configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to 
publish the tables related to the given revision, in other words to make such 
tables accessible via tablesByIdVv.latest(), the very call that is used to 
retrieve tables for cleanup on component stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously:
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, it is possible that we clear tablesToStopInCaseOfError before 
publishing the same revision's tables to tablesByIdVv, so we miss those 
cleared tables in tablesByIdVv.latest(), which is used in TableManager#stop.
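
The listener-ordering race above can be reduced to a small model: one listener 
publishes asynchronously while another clears the pending list synchronously 
for the same revision. This is a sketch under stated assumptions, not Ignite 
code; tablesById and pendingTables are illustrative stand-ins for tablesByIdVv 
and tablesToStopInCaseOfError.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class RevisionListenerRaceSketch {
    /**
     * Returns true when "table-1" is visible in NEITHER collection at the moment
     * a stop() would inspect them, i.e. its resources would be missed.
     */
    static boolean tableLeaks() {
        Map<Long, String> tablesById = new ConcurrentHashMap<>(); // models tablesByIdVv
        List<String> pendingTables = new CopyOnWriteArrayList<>(); // models tablesToStopInCaseOfError
        pendingTables.add("table-1");

        // Listener 1 models IncrementalVersionedValue: publication for the revision
        // completes asynchronously (whenComplete on an updater future).
        CompletableFuture<Void> publish = CompletableFuture.runAsync(() -> {
            sleepQuietly(100); // the async completion arrives late
            tablesById.put(1L, "table-1");
        });

        // Listener 2 clears the pending list synchronously for the same revision,
        // racing ahead of the publication above.
        pendingTables.clear();

        // A stop() that consults both collections right now misses the table.
        boolean leaked = tablesById.isEmpty() && pendingTables.isEmpty();

        publish.join(); // let the late publication finish before returning
        return leaked;
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(tableLeaks() ? "table-1 leaked" : "table-1 visible"); // prints "table-1 leaked"
    }
}
```

The table ends up in neither collection at inspection time, matching the 
"replicas alive" assertion seen on node stop.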

 

  was:
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!

 
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending tables list which one is 
cleared on cfg storage revision update.

tablesByIdVv *listens same storage revision update event* in order to publish 
tables related to the given revision or in other words make such tables 
accessible from tablesByIdVv.latest(); that one that is used in order to 
retrieve tables for cleanup on components stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However inside IncrementalVersionedValue we have async storageRevision update 
processing

 
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, 

[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Fix Version/s: 3.0.0-beta2

> ItDataTypesTest is flaky
> 
>
> Key: IGNITE-19238
> URL: https://issues.apache.org/jira/browse/IGNITE-19238
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Attachments: Снимок экрана от 2023-04-06 10-39-32.png
>
>
> 1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests 
> failed to stop replicas on node stop:
> !Снимок экрана от 2023-04-06 10-39-32.png!
>  
> {code:java}
> java.lang.AssertionError: There are replicas alive 
> [replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
>     at 
> org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
>     at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
> 2. The reason why we failed to stop replicas is the race between 
> tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.
> On TableManager stop, we stop and cleanup all table resources like replicas 
> and raft nodes
> {code:java}
> public void stop() {
>   ...
>   Map tables = tablesByIdVv.latest();  // 1*
>   cleanUpTablesResources(tables); 
>   cleanUpTablesResources(tablesToStopInCaseOfError);
>   ...
> }{code}
> where tablesToStopInCaseOfError is a sort of pending tables list which one is 
> cleared on cfg storage revision update.
> tablesByIdVv *listens same storage revision update event* in order to publish 
> tables related to the given revision or in other words make such tables 
> accessible from tablesByIdVv.latest(); that one that is used in order to 
> retrieve tables for cleanup on components stop (see // 1* above)
> {code:java}
> public TableManager(
>   ... 
>   tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);
>   registry.accept(token -> {
> tablesToStopInCaseOfError.clear();
> 
> return completedFuture(null);
>   });
>   {code}
> However inside IncrementalVersionedValue we have async storageRevision update 
> processing
>  
> {code:java}
> updaterFuture = updaterFuture.whenComplete((v, t) -> 
> versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
> As a result it's possible that we will clear tablesToStopInCaseOfError before 
> publishing same revision tables to tablesByIdVv, so that we will miss that 
> cleared tables in tablesByIdVv.latest() which is used in TableManager#stop.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19168) Command for testing that snapshot partitions will be redistributed during restore

2023-04-06 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina reassigned IGNITE-19168:
---

Assignee: Julia Bakulina

> Command for testing that snapshot partitions will be redistributed during 
> restore
> -
>
> Key: IGNITE-19168
> URL: https://issues.apache.org/jira/browse/IGNITE-19168
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> When data is restored from a snapshot taken on another baseline topology 
> (e.g. with other consistent identifiers or a different cluster size), there 
> will be two stages that can last long enough:
> # Partitions redistribution according to affinity function.
> # Index rebuilding.
> It would be nice to have a command for checking whether such redistribution 
> will occur or not, e.g.:
> {noformat}
> control.sh --snapshot distribution snapshot_name
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19163) Add logging of snapshot check

2023-04-06 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina reassigned IGNITE-19163:
---

Assignee: Julia Bakulina

> Add logging of snapshot check
> -
>
> Key: IGNITE-19163
> URL: https://issues.apache.org/jira/browse/IGNITE-19163
> Project: Ignite
>  Issue Type: Task
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> Server nodes do not log the state of the snapshot check process, but for 
> further analysis it is necessary to print messages to the log when the 
> snapshot check procedure starts and finishes. Currently, the snapshot check 
> is invoked at least by the restore and check commands.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19160) Improve message about sent partition file during snapshot restore

2023-04-06 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina reassigned IGNITE-19160:
---

Assignee: Julia Bakulina

> Improve message about sent partition file during snapshot restore
> -
>
> Key: IGNITE-19160
> URL: https://issues.apache.org/jira/browse/IGNITE-19160
> Project: Ignite
>  Issue Type: Task
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> Currently, the message about a partition is as below:
> {quote}
> [2023-03-29T18:31:44,773][INFO ][snapshot-runner-#863%node0%][SnapshotSender] 
> Partition file has been send [part=part-645.bin, 
> pair={color:red}GroupPartitionId [grpId=1544803905, partId=645]{color}, 
> length=45056]
> {quote}
> It does not tell us: 
> # Receiver node id / address / consistent id.
> # Cache or cache group name which the partition belongs to. A numerical 
> group id is not a convenient way to determine the cache or cache group.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-19158) Improve message about received partition file during snapshot restore

2023-04-06 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina reassigned IGNITE-19158:
---

Assignee: Julia Bakulina

> Improve message about received partition file during snapshot restore
> -
>
> Key: IGNITE-19158
> URL: https://issues.apache.org/jira/browse/IGNITE-19158
> Project: Ignite
>  Issue Type: Task
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> Currently, GridIoManager prints only the name of a file and the node id:
> {quote}
> [2023-03-24T18:07:00,747][INFO ]pub-#871%node1%[GridIoManager] File has been 
> received [name={color:red}part-233.bin{color}, transferred=53248, time=0.0 
> sec, {color:red}rmtId=76e22ef5-3c76-4987-bebd-9a6222a0{color}]
> {quote}
> This meager information does not allow one to determine in a simple way 
> which file was received and from which node.
> For example, such message would be more informative:
> {quote}
> [2023-03-29T17:09:42,230][INFO ][pub-#869%node0%][GridIoManager] File has 
> been received 
> [{color:red}path=/ignite/db/node0/_tmp_snp_restore_cache-default/part-647.bin{color},
>  transferred=45056, time=0.0 sec, rmtId=de43d2e8-a1ab-4d7c-9cea-72615371, 
> {color:red}rmdAddr=/127.0.0.1:51773{color}]
> {quote}
> _Other ways might be investigated_ in order to improve the logging of 
> received partition files during snapshot restore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19075) CLI should ask for SSL settings

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19075:
--
Summary: CLI should ask for SSL settings  (was: CLI should ask user for SSL 
settings, if needed )

> CLI should ask for SSL settings
> ---
>
> Key: IGNITE-19075
> URL: https://issues.apache.org/jira/browse/IGNITE-19075
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli
>Reporter: Ivan Gagarkin
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, a user just gets an error if they try to connect to a node via 
> HTTPS without SSL settings.
> The CLI should ask the user to set SSL settings if it gets an error on a call:
>  # Set trust store path
>  # Set trust store password
>  # Set key store path
>  # Set key store password
> Save the provided values to the config and repeat the call.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19246) CLI should ask for auth settings

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19246:
--
Description: 
Currently, a user just gets an error if they try to connect to a node which has 
authentication configured.
The CLI should ask the user to set auth settings if it gets an error on a call, 
save the provided values to the config, and repeat the call.

  was:
Currently, a user just gets an error, if tries to contest a node via HTTPS 
without SSL settings. 

The CLI should ask a user to set SSL settings if gets an error on a call:
 # Set trust store path
 # Set trust store password
 # Set key store path
 # Set key store password

Save provided values to the config and repeat the call. 


> CLI should ask for auth settings
> 
>
> Key: IGNITE-19246
> URL: https://issues.apache.org/jira/browse/IGNITE-19246
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli
>Reporter: Vadim Pakhnushev
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>
> Currently, a user just gets an error if they try to connect to a node which 
> has authentication configured.
> The CLI should ask the user to set auth settings if it gets an error on a 
> call, save the provided values to the config, and repeat the call.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-19246) CLI should ask for auth settings

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19246:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> CLI should ask for auth settings
> 
>
> Key: IGNITE-19246
> URL: https://issues.apache.org/jira/browse/IGNITE-19246
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli
>Reporter: Vadim Pakhnushev
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>
> Currently, a user just gets an error if they try to connect to a node which 
> has authentication configured.
> The CLI should ask the user to set auth settings if it gets an error on a 
> call, save the provided values to the config, and repeat the call.





[jira] [Created] (IGNITE-19246) CLI should ask for auth settings

2023-04-06 Thread Vadim Pakhnushev (Jira)
Vadim Pakhnushev created IGNITE-19246:
-

 Summary: CLI should ask for auth settings
 Key: IGNITE-19246
 URL: https://issues.apache.org/jira/browse/IGNITE-19246
 Project: Ignite
  Issue Type: Improvement
  Components: cli
Reporter: Vadim Pakhnushev
Assignee: Vadim Pakhnushev


Currently, a user just gets an error if they try to connect to a node via HTTPS 
without SSL settings.

The CLI should ask the user to set SSL settings when it gets an error on a call:
 # Set trust store path
 # Set trust store password
 # Set key store path
 # Set key store password

Save the provided values to the config and repeat the call.
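
A minimal sketch of mapping the four answers above to config entries that the CLI 
would persist before retrying the failed call. The key names and the helper class 
are hypothetical placeholders, not the actual Ignite CLI config keys:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The config keys below are hypothetical placeholders, not the real CLI keys.
public class SslAnswers {
    /** Collects the prompted SSL answers into the entries to persist. */
    static Map<String, String> toConfig(String trustStorePath, String trustStorePassword,
                                        String keyStorePath, String keyStorePassword) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("trust-store.path", trustStorePath);
        config.put("trust-store.password", trustStorePassword);
        config.put("key-store.path", keyStorePath);
        config.put("key-store.password", keyStorePassword);
        return config; // persisted to the config file, then the failed call is repeated
    }

    public static void main(String[] args) {
        System.out.println(toConfig("/tmp/trust.jks", "s1", "/tmp/key.jks", "s2").size()); // prints 4
    }
}
```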





[jira] [Updated] (IGNITE-19075) CLI should ask user for SSL settings, if needed

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19075:
--
Summary: CLI should ask user for SSL settings, if needed   (was: CLI should 
ask user set SSL and authentication settings, if needed )

> CLI should ask user for SSL settings, if needed 
> 
>
> Key: IGNITE-19075
> URL: https://issues.apache.org/jira/browse/IGNITE-19075
> Project: Ignite
>  Issue Type: Improvement
>  Components: cli
>Reporter: Ivan Gagarkin
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, a user just gets an error if they try to connect to a node via HTTPS 
> without SSL settings.
> The CLI should ask the user to set SSL settings when it gets an error on a call:
>  # Set trust store path
>  # Set trust store password
>  # Set key store path
>  # Set key store password
> Save the provided values to the config and repeat the call.





[jira] [Created] (IGNITE-19245) Handle SSL errors

2023-04-06 Thread Vadim Pakhnushev (Jira)
Vadim Pakhnushev created IGNITE-19245:
-

 Summary: Handle SSL errors
 Key: IGNITE-19245
 URL: https://issues.apache.org/jira/browse/IGNITE-19245
 Project: Ignite
  Issue Type: Improvement
  Components: cli
Reporter: Vadim Pakhnushev


When the SSL configuration is incorrect, a generic {{Unknown error Couldn't build 
REST client}} message is displayed. More information could be extracted from the 
underlying exceptions, such as the key store file not being found or the password 
being incorrect.





[jira] [Commented] (IGNITE-19236) orphaned_tests.txt location calculation simplification

2023-04-06 Thread Anton Vinogradov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709314#comment-17709314
 ] 

Anton Vinogradov commented on IGNITE-19236:
---

Merged to the master.
[~timonin.maksim], thanks for the review!

> orphaned_tests.txt location calculation simplification
> --
>
> Key: IGNITE-19236
> URL: https://issues.apache.org/jira/browse/IGNITE-19236
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
> Fix For: 2.15
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Let's simplify the code :)
> The current hardcoding of the 'modules' folder does not allow checking projects 
> based on Ignite (where the 'modules' folder is missing).





[jira] [Updated] (IGNITE-16778) Support timestamp through jdbc

2023-04-06 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated IGNITE-16778:

Ignite Flags:   (was: Docs Required,Release Notes Required)

> Support timestamp through jdbc
> --
>
> Key: IGNITE-16778
> URL: https://issues.apache.org/jira/browse/IGNITE-16778
> Project: Ignite
>  Issue Type: Bug
>  Components: jdbc, sql
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
> Attachments: RunnerForTestNode.java
>
>
> The timestamp data type can be used through the KV view via LocalDateTime, but not 
> through the JDBC setTimestamp method. See the example in the attachment.





[jira] [Created] (IGNITE-19244) Add file path completion to SSL config questions

2023-04-06 Thread Vadim Pakhnushev (Jira)
Vadim Pakhnushev created IGNITE-19244:
-

 Summary: Add file path completion to SSL config questions
 Key: IGNITE-19244
 URL: https://issues.apache.org/jira/browse/IGNITE-19244
 Project: Ignite
  Issue Type: Improvement
  Components: cli
Reporter: Vadim Pakhnushev


IGNITE-19075 introduced SSL configuration in REPL mode; file name completion 
should be added to the questions that ask for the key store/trust store location.





[jira] [Updated] (IGNITE-16778) Support timestamp through jdbc

2023-04-06 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated IGNITE-16778:

Component/s: sql

> Support timestamp through jdbc
> --
>
> Key: IGNITE-16778
> URL: https://issues.apache.org/jira/browse/IGNITE-16778
> Project: Ignite
>  Issue Type: Bug
>  Components: jdbc, sql
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
> Attachments: RunnerForTestNode.java
>
>
> The timestamp data type can be used through the KV view via LocalDateTime, but not 
> through the JDBC setTimestamp method. See the example in the attachment.





[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 {color:red}blockedFor=100s{color}]
{quote}

There is also an active exchange process, which finishes with the following 
timings (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
({color:red}100163 ms{color}), ..., stage="Total time" ({color:red}100334 
ms{color})]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 
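
The first precondition above can be expressed as a configuration sketch using the 
public Ignite 2.x configuration classes. The timeout values are illustrative only; 
what matters is that the checkpoint frequency (10 s) is well below the failure 
detection timeout (100 s):

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Illustrative values: checkpoint frequency below the failure detection timeout,
// matching the reproduction precondition described above.
public class ReproConfig {
    static IgniteConfiguration config() {
        return new IgniteConfiguration()
            .setFailureDetectionTimeout(100_000L)
            .setDataStorageConfiguration(new DataStorageConfiguration()
                .setCheckpointFrequency(10_000L));
    }
}
```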

  was:
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 *{color:red}blockedFor=100s{color}*]
{quote}

There is also an active exchange process, which finishes with the following 
timings (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
(*{color:red}100163 ms{color}*), ..., stage="Total time" (*{color:red}100334 
ms{color}*)]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> critical threads may appear during the snapshot restore process (just after the 
> caches start):
> {quote}
> [2023-04-06T10:55:46,561][ERROR]\[ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}
> [2023-04-06T10:55:47,487][ERROR]\[tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  {color:red}blockedFor=100s{color}]
> {quote}
> There is also an active exchange process, which finishes with the following 
> timings (approximately equal to the blocking time of the threads):
> {quote}
> [2023-04-06T10:55:52,211][INFO 
> ]\[exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
> resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 

[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 *{color:red}blockedFor=100s{color}*]
{quote}

There is also an active exchange process, which finishes with the following 
timings (approximately equal to the blocking time of the threads):
{quote}
[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
(*{color:red}100163 ms{color}*), ..., stage="Total time" (*{color:red}100334 
ms{color}*)]
{quote}

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 

  was:
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 *{color:red}blockedFor=100s{color}*]
{quote}

There is also an active exchange process; after it finishes, the exchange future 
will print the following timings:
[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
(*{color:red}100163 ms{color}*), ..., stage="Total time" (*{color:red}100334 
ms{color}*)]

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> critical threads may appear during the snapshot restore process (just after the 
> caches start):
> {quote}
> [2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}
> [2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  *{color:red}blockedFor=100s{color}*]
> {quote}
> There is also an active exchange process, which finishes with the following 
> timings (approximately equal to the blocking time of the threads):
> {quote}
> [2023-04-06T10:55:52,211][INFO 
> ][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
> resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
> exchange queue" (0 ms), ...,  stage="Restore 

[jira] [Updated] (IGNITE-19240) Use HTTPS port for dynamic completers when connected to SSL enabled node

2023-04-06 Thread Vadim Pakhnushev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Pakhnushev updated IGNITE-19240:
--
Description: 
Currently {{NodeNameRegistryImpl.urlFromClusterNode}} uses an HTTP port when 
constructing URLs for completion. HTTPS port should be used if the node is 
configured with SSL enabled.
Even if we construct a proper URL, it might still be incorrect because 
{{NodeMetadata.getRestHost}} may return an IP address that is not verifiable 
with the provided trust store.
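
A sketch of the intended scheme/port selection; the helper class is hypothetical 
(not the actual {{NodeNameRegistryImpl}} code), and the port numbers are 
illustrative:

```java
// Hypothetical helper, not the real NodeNameRegistryImpl: pick the scheme and
// port for a completion URL depending on whether the node has SSL enabled.
public class CompletionUrl {
    static String urlFor(String host, int httpPort, int httpsPort, boolean sslEnabled) {
        return sslEnabled
            ? "https://" + host + ":" + httpsPort
            : "http://" + host + ":" + httpPort;
    }

    public static void main(String[] args) {
        System.out.println(urlFor("node1", 10300, 10400, true)); // prints https://node1:10400
    }
}
```

Note that, as the description says, scheme/port selection alone is not enough when 
the host resolves to an IP address the trust store cannot verify.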

  was:Currently {{NodeNameRegistryImpl.urlFromClusterNode}} uses an HTTP port 
when constructing URLs for completion. HTTPS port should be used if the node is 
configured with SSL enabled.


> Use HTTPS port for dynamic completers when connected to SSL enabled node
> 
>
> Key: IGNITE-19240
> URL: https://issues.apache.org/jira/browse/IGNITE-19240
> Project: Ignite
>  Issue Type: Bug
>  Components: cli
>Reporter: Vadim Pakhnushev
>Assignee: Vadim Pakhnushev
>Priority: Major
>  Labels: ignite-3
>
> Currently {{NodeNameRegistryImpl.urlFromClusterNode}} uses an HTTP port when 
> constructing URLs for completion. HTTPS port should be used if the node is 
> configured with SSL enabled.
> Even if we construct a proper URL, it might still be incorrect because 
> {{NodeMetadata.getRestHost}} may return an IP address that is not verifiable 
> with the provided trust store.





[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start):
{quote}
[2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
 Checkpoint read lock acquisition has been timed out.
{quote}

{quote}
[2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
 Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
 *{color:red}blockedFor=100s{color}*]
{quote}

There is also an active exchange process; after it finishes, the exchange future 
will print the following timings:
[2023-04-06T10:55:52,211][INFO 
][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange timings 
[startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
exchange queue" (0 ms), ...,  stage="Restore partition states" 
(*{color:red}100163 ms{color}*), ..., stage="Total time" (*{color:red}100334 
ms{color}*)]

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 

  was:
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads may appear during the snapshot restore process (just after the 
caches start).

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring cache group partition states takes longer than the failure 
detection timeout, which is the case for sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> Error messages about checkpoint read lock acquisition timeouts and blocked 
> critical threads may appear during the snapshot restore process (just after the 
> caches start):
> {quote}
> [2023-04-06T10:55:46,561][ERROR][ttl-cleanup-worker-#475%node%][CheckpointTimeoutLock]
>  Checkpoint read lock acquisition has been timed out.
> {quote}
> {quote}
> [2023-04-06T10:55:47,487][ERROR][tcp-disco-msg-worker-[crd]-#23%node%-#446%node%][G]
>  Blocked system-critical thread has been detected. This can lead to 
> cluster-wide undefined behaviour [workerName=db-checkpoint-thread, 
> threadName=db-checkpoint-thread-#457%snapshot.BlockingThreadsOnSnapshotRestoreReproducerTest0%,
>  *{color:red}blockedFor=100s{color}*]
> {quote}
> There is also an active exchange process; after it finishes, the exchange future 
> will print the following timings:
> [2023-04-06T10:55:52,211][INFO 
> ][exchange-worker-#450%node%][GridDhtPartitionsExchangeFuture] Exchange 
> timings [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], 
> resVer=AffinityTopologyVersion [topVer=1, minorTopVer=5], stage="Waiting in 
> exchange queue" (0 ms), ...,  stage="Restore partition states" 
> (*{color:red}100163 ms{color}*), ..., stage="Total time" (*{color:red}100334 
> ms{color}*)]
> How to reproduce: 
> # Set the checkpoint frequency to less than the failure detection timeout.
> # Ensure that restoring cache group partition states takes longer than the failure 
> detection timeout, which is the case for sufficiently large caches.
> Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 





[jira] [Assigned] (IGNITE-19187) Sql. Handle StorageRebalanceException during rowsCount estimation

2023-04-06 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin reassigned IGNITE-19187:
-

Assignee: Pavel Pereslegin

> Sql. Handle StorageRebalanceException during rowsCount estimation
> -
>
> Key: IGNITE-19187
> URL: https://issues.apache.org/jira/browse/IGNITE-19187
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Konstantin Orlov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>
> We need to handle StorageRebalanceException which may be thrown from 
> {{org.apache.ignite.internal.storage.MvPartitionStorage#rowsCount}} during 
> row count estimation 
> ({{org.apache.ignite.internal.sql.engine.schema.IgniteTableImpl.StatisticsImpl#getRowCount}}).
> {code:java}
> Caused by: org.apache.ignite.internal.storage.StorageRebalanceException: 
> IGN-STORAGE-4 TraceId:a943b5f5-8018-4c4b-9e66-cc5060796848 Storage in the 
> process of rebalancing: [table=TEST, partitionId=0]
>   at 
> app//org.apache.ignite.internal.storage.util.StorageUtils.throwExceptionDependingOnStorageState(StorageUtils.java:129)
>   at 
> app//org.apache.ignite.internal.storage.util.StorageUtils.throwExceptionIfStorageNotInRunnableState(StorageUtils.java:51)
>   at 
> app//org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.throwExceptionIfStorageNotInRunnableState(AbstractPageMemoryMvPartitionStorage.java:894)
>   at 
> app//org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.lambda$rowsCount$24(AbstractPageMemoryMvPartitionStorage.java:707)
>   at 
> app//org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.busy(AbstractPageMemoryMvPartitionStorage.java:785)
>   at 
> app//org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.rowsCount(AbstractPageMemoryMvPartitionStorage.java:706)
>   at 
> app//org.apache.ignite.internal.sql.engine.schema.IgniteTableImpl$StatisticsImpl.getRowCount(IgniteTableImpl.java:551)
>   at 
> app//org.apache.calcite.prepare.RelOptTableImpl.getRowCount(RelOptTableImpl.java:238)
>   at 
> app//org.apache.ignite.internal.sql.engine.rel.ProjectableFilterableTableScan.computeSelfCost(ProjectableFilterableTableScan.java:156)
> {code}
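
One way the exception could be handled is to fall back to a default row-count 
estimate while the partition is rebalancing, instead of failing planning. This is 
a hedged, self-contained sketch, not the actual fix: the local types stand in for 
the real storage classes, and the fallback constant is an assumption.

```java
// Self-contained sketch: local stand-ins for StorageRebalanceException and the
// partition storage; the fallback value is an assumed planner default.
public class RowCountEstimation {
    static class StorageRebalanceException extends RuntimeException {}

    interface PartitionStorage { long rowsCount(); }

    static final double FALLBACK_ROW_COUNT = 10_000.0; // assumption, not the real default

    static double estimateRowCount(PartitionStorage storage) {
        try {
            return storage.rowsCount();
        } catch (StorageRebalanceException e) {
            // Storage is rebalancing: return a fallback instead of propagating.
            return FALLBACK_ROW_COUNT;
        }
    }

    public static void main(String[] args) {
        PartitionStorage rebalancing = () -> { throw new StorageRebalanceException(); };
        System.out.println(estimateRowCount(rebalancing)); // prints 10000.0
    }
}
```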





[jira] [Updated] (IGNITE-19243) C++ 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19243:

Language: C++  (was: Java)

> C++ 3.0: propagate table schema updates to client on write-only operations
> --
>
> Key: IGNITE-19243
> URL: https://issues.apache.org/jira/browse/IGNITE-19243
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, C++ client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Updated] (IGNITE-19242) .NET: Thin 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19242:

Language: C#  (was: Java)

> .NET: Thin 3.0: propagate table schema updates to client on write-only 
> operations
> -
>
> Key: IGNITE-19242
> URL: https://issues.apache.org/jira/browse/IGNITE-19242
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, .NET client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Created] (IGNITE-19243) C++ 3.0:: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-19243:
---

 Summary: C++ 3.0:: propagate table schema updates to client on 
write-only operations
 Key: IGNITE-19243
 URL: https://issues.apache.org/jira/browse/IGNITE-19243
 Project: Ignite
  Issue Type: Improvement
  Components: thin client
Affects Versions: 3.0.0-beta1
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn
 Fix For: 3.0.0-beta2


Currently, Java client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

The client will compare this version to the known one and perform a background 
update if necessary.
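
The version comparison and background update can be sketched as follows. This is 
a self-contained illustration, not the actual client code: the class and method 
names are invented, and the "reload" is reduced to storing the new version.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the client-side check; names are not the real client API.
public class SchemaVersionCheck {
    final AtomicInteger latestKnownVersion = new AtomicInteger(1);

    /** Called with the schema version piggybacked on a write-only response. */
    CompletableFuture<Void> onResponseSchemaVersion(int serverVersion) {
        if (serverVersion > latestKnownVersion.get()) {
            // Background update: load the newer schema without blocking the caller.
            return CompletableFuture.runAsync(() -> latestKnownVersion.set(serverVersion));
        }
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        SchemaVersionCheck check = new SchemaVersionCheck();
        check.onResponseSchemaVersion(2).join();
        System.out.println(check.latestKnownVersion.get()); // prints 2
    }
}
```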





[jira] [Updated] (IGNITE-19243) C++ 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19243:

Summary: C++ 3.0: propagate table schema updates to client on write-only 
operations  (was: C++ 3.0:: propagate table schema updates to client on 
write-only operations)

> C++ 3.0: propagate table schema updates to client on write-only operations
> --
>
> Key: IGNITE-19243
> URL: https://issues.apache.org/jira/browse/IGNITE-19243
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, Java client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Updated] (IGNITE-19243) C++ 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19243:

Description: 
Currently, C++ client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

The client will compare this version to the known one and perform a background 
update if necessary.

  was:
Currently, Java client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

Client will compare this version to the known one and perform a background 
update, if necessary.


> C++ 3.0: propagate table schema updates to client on write-only operations
> --
>
> Key: IGNITE-19243
> URL: https://issues.apache.org/jira/browse/IGNITE-19243
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, C++ client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Created] (IGNITE-19242) .NET: Thin 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-19242:
---

 Summary: .NET: Thin 3.0: propagate table schema updates to client 
on write-only operations
 Key: IGNITE-19242
 URL: https://issues.apache.org/jira/browse/IGNITE-19242
 Project: Ignite
  Issue Type: Improvement
  Components: thin client
Affects Versions: 3.0.0-beta1
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn
 Fix For: 3.0.0-beta2


Currently, Java client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

Client will compare this version to the known one and perform a background 
update, if necessary.





[jira] [Updated] (IGNITE-19241) Java thin 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19241:

Summary: Java thin 3.0: propagate table schema updates to client on 
write-only operations  (was: Java thin: propagate table schema updates to 
client on write-only operations)

> Java thin 3.0: propagate table schema updates to client on write-only 
> operations
> 
>
> Key: IGNITE-19241
> URL: https://issues.apache.org/jira/browse/IGNITE-19241
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, Java client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Updated] (IGNITE-19242) .NET: Thin 3.0: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19242:

Description: 
Currently, .NET client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

Client will compare this version to the known one and perform a background 
update, if necessary.

  was:
Currently, Java client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

Client will compare this version to the known one and perform a background 
update, if necessary.


> .NET: Thin 3.0: propagate table schema updates to client on write-only 
> operations
> -
>
> Key: IGNITE-19242
> URL: https://issues.apache.org/jira/browse/IGNITE-19242
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, .NET client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Updated] (IGNITE-19241) Java thin: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-19241:

Description: 
Currently, Java client receives table schema updates when write-read requests 
are performed. For example, client performs TUPLE_GET request, sends key tuple 
using old schema version, receives result tuple with the latest schema version, 
and retrieves the latest schema.

However, some requests are "write-only": client sends a tuple, but does not 
receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
case.

To fix this, include the latest schema version into all write-only operation 
responses:
* TUPLE_UPSERT
* TUPLE_UPSERT_ALL
* TUPLE_INSERT
* TUPLE_INSERT_ALL
* TUPLE_REPLACE
* TUPLE_REPLACE_EXACT
* TUPLE_DELETE
* TUPLE_DELETE_ALL
* TUPLE_DELETE_EXACT
* TUPLE_DELETE_ALL_EXACT
* TUPLE_CONTAINS_KEY

Client will compare this version to the known one and perform a background 
update, if necessary.

> Java thin: propagate table schema updates to client on write-only operations
> 
>
> Key: IGNITE-19241
> URL: https://issues.apache.org/jira/browse/IGNITE-19241
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Currently, Java client receives table schema updates when write-read requests 
> are performed. For example, client performs TUPLE_GET request, sends key 
> tuple using old schema version, receives result tuple with the latest schema 
> version, and retrieves the latest schema.
> However, some requests are "write-only": client sends a tuple, but does not 
> receive one back, like TUPLE_UPSERT. No schema updates are performed in this 
> case.
> To fix this, include the latest schema version into all write-only operation 
> responses:
> * TUPLE_UPSERT
> * TUPLE_UPSERT_ALL
> * TUPLE_INSERT
> * TUPLE_INSERT_ALL
> * TUPLE_REPLACE
> * TUPLE_REPLACE_EXACT
> * TUPLE_DELETE
> * TUPLE_DELETE_ALL
> * TUPLE_DELETE_EXACT
> * TUPLE_DELETE_ALL_EXACT
> * TUPLE_CONTAINS_KEY
> Client will compare this version to the known one and perform a background 
> update, if necessary.





[jira] [Created] (IGNITE-19241) Java thin: propagate table schema updates to client on write-only operations

2023-04-06 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-19241:
---

 Summary: Java thin: propagate table schema updates to client on 
write-only operations
 Key: IGNITE-19241
 URL: https://issues.apache.org/jira/browse/IGNITE-19241
 Project: Ignite
  Issue Type: Improvement
  Components: thin client
Affects Versions: 3.0.0-beta1
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn
 Fix For: 3.0.0-beta2








[jira] [Commented] (IGNITE-16778) Support timestamp through jdbc

2023-04-06 Thread Ivan Artukhov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709303#comment-17709303
 ] 

Ivan Artukhov commented on IGNITE-16778:


[~jooger] FYI ^^^

> Support timestamp through jdbc
> --
>
> Key: IGNITE-16778
> URL: https://issues.apache.org/jira/browse/IGNITE-16778
> Project: Ignite
>  Issue Type: Bug
>  Components: jdbc
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
> Attachments: RunnerForTestNode.java
>
>
> The timestamp data type can be used through the KV view via LocalDateTime, but 
> not through the JDBC setTimestamp method. Example in attachment





[jira] [Created] (IGNITE-19240) Use HTTPS port for dynamic completers when connected to SSL enabled node

2023-04-06 Thread Vadim Pakhnushev (Jira)
Vadim Pakhnushev created IGNITE-19240:
-

 Summary: Use HTTPS port for dynamic completers when connected to 
SSL enabled node
 Key: IGNITE-19240
 URL: https://issues.apache.org/jira/browse/IGNITE-19240
 Project: Ignite
  Issue Type: Bug
  Components: cli
Reporter: Vadim Pakhnushev
Assignee: Vadim Pakhnushev


Currently {{NodeNameRegistryImpl.urlFromClusterNode}} uses the HTTP port when 
constructing URLs for completion. The HTTPS port should be used if the node is 
configured with SSL enabled.
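The fix amounts to selecting the REST port by the node's SSL flag. A minimal sketch, assuming hypothetical names (the actual fields and helper live in the Ignite CLI's NodeNameRegistryImpl):

```java
// Illustrative sketch of the intended behavior: build the completion URL
// from the node's SSL flag, choosing scheme and port together. The method
// signature is an assumption, not the actual Ignite CLI API.
final class NodeUrls {
    static String urlFromClusterNode(String host, int httpPort, int httpsPort, boolean sslEnabled) {
        // Scheme and port must agree: https must never be paired with the HTTP port.
        String scheme = sslEnabled ? "https://" : "http://";
        int port = sslEnabled ? httpsPort : httpPort;
        return scheme + host + ":" + port;
    }
}
```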





[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!

 
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv.

On TableManager stop, we stop and cleanup all table resources like replicas and 
raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a list of pending tables that is cleared on 
configuration storage revision update.

tablesByIdVv *listens to the same storage revision update event* in order to 
publish the tables related to the given revision, i.e. to make them accessible 
via tablesByIdVv.latest(), which is used to retrieve tables for cleanup on 
component stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However, inside IncrementalVersionedValue the storage revision update is 
processed asynchronously

 
{code:java}
updaterFuture = updaterFuture.whenComplete((v, t) -> 
versionedValue.complete(causalityToken, localUpdaterFuture)); {code}
As a result, tablesToStopInCaseOfError may be cleared before the tables of the 
same revision are published to tablesByIdVv, so those cleared tables are missed 
by tablesByIdVv.latest(), which TableManager#stop relies on.
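The race can be modeled in a few lines. This is a simplified stand-in, not Ignite code: the names mirror the description, the async publication is represented by an Executor, and stop-time cleanup is the union of published and pending tables:

```java
import java.util.*;
import java.util.concurrent.*;

// Simplified model of the race: on a revision update, the pending list is
// cleared synchronously while publication to the versioned map happens
// asynchronously. If stop() runs in between, the table is in neither
// collection and its resources (replicas, raft nodes) leak.
class TableStopRaceModel {
    final Map<Integer, String> tablesById = new ConcurrentHashMap<>();        // published tables
    final List<String> tablesToStopInCaseOfError = new CopyOnWriteArrayList<>(); // pending tables

    void onRevisionUpdate(int revision, String table, Executor asyncPublish) {
        tablesToStopInCaseOfError.clear();                      // listener 1: synchronous
        asyncPublish.execute(() -> tablesById.put(revision, table)); // listener 2: async publish
    }

    // What TableManager#stop would see: published plus pending tables.
    List<String> tablesToCleanUp() {
        List<String> all = new ArrayList<>(tablesById.values());
        all.addAll(tablesToStopInCaseOfError);
        return all;
    }
}
```

With a deferring executor the window is easy to demonstrate: between clear() and the deferred put(), tablesToCleanUp() returns an empty list even though a table exists.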

 

  was:
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!

 
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
 

2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv. 

2.1 On TableManager stop, we stop and cleanup all table resources like replicas 
and raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending tables list which one is 
cleared on cfg storage revision update. 

*!* tablesByIdVv listens same storage revision update event in order to publish 
tables related to the given revision or in other words make such tables 
accessible from tablesByIdVv.latest(); that one that is used in order to 
retrieve tables for cleanup on components stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However inside IncrementalVersionedValue we have async storageRevision update 
processing

 

2.2 So that, we have following flow that touches tablesToStopInCaseOfError, 
tablesByIdVv

onCreateTable



[jira] [Updated] (IGNITE-19238) ItDataTypesTest is flaky

2023-04-06 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-19238:
-
Description: 
1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests failed 
to stop replicas on node stop:

!Снимок экрана от 2023-04-06 10-39-32.png!

 
{code:java}
java.lang.AssertionError: There are replicas alive 
[replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
    at 
org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
    at 
org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
    at 
org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
 

2. The reason why we failed to stop replicas is the race between 
tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv. 

2.1 On TableManager stop, we stop and cleanup all table resources like replicas 
and raft nodes
{code:java}
public void stop() {
  ...
  Map tables = tablesByIdVv.latest();  // 1*
  cleanUpTablesResources(tables); 
  cleanUpTablesResources(tablesToStopInCaseOfError);
  ...
}{code}
where tablesToStopInCaseOfError is a sort of pending tables list which one is 
cleared on cfg storage revision update. 

*!* tablesByIdVv listens same storage revision update event in order to publish 
tables related to the given revision or in other words make such tables 
accessible from tablesByIdVv.latest(); that one that is used in order to 
retrieve tables for cleanup on components stop (see // 1* above)
{code:java}
public TableManager(
  ... 
  tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);

  registry.accept(token -> {
tablesToStopInCaseOfError.clear();

return completedFuture(null);
  });
  {code}
However inside IncrementalVersionedValue we have async storageRevision update 
processing

 

2.2 So that, we have following flow that touches tablesToStopInCaseOfError, 
tablesByIdVv

onCreateTable

  was:It


> ItDataTypesTest is flaky
> 
>
> Key: IGNITE-19238
> URL: https://issues.apache.org/jira/browse/IGNITE-19238
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
> Attachments: Снимок экрана от 2023-04-06 10-39-32.png
>
>
> 1. ItDataTypesTest is flaky because previous ItCreateTableDdlTest tests 
> failed to stop replicas on node stop:
> !Снимок экрана от 2023-04-06 10-39-32.png!
>  
> {code:java}
> java.lang.AssertionError: There are replicas alive 
> [replicas=[b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_21, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_6, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_13, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_8, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_9, 
> b86c60a8-4ea3-4592-abef-6438cfc4cdb2_part_11]]
>     at 
> org.apache.ignite.internal.replicator.ReplicaManager.stop(ReplicaManager.java:341)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.lambda$stopAllComponents$1(LifecycleManager.java:133)
>     at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopAllComponents(LifecycleManager.java:131)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.stopNode(LifecycleManager.java:115){code}
>  
> 2. The reason why we failed to stop replicas is the race between 
> tablesToStopInCaseOfError cleanup and adding tables to tablesByIdVv. 
> 2.1 On TableManager stop, we stop and cleanup all table resources like 
> replicas and raft nodes
> {code:java}
> public void stop() {
>   ...
>   Map tables = tablesByIdVv.latest();  // 1*
>   cleanUpTablesResources(tables); 
>   cleanUpTablesResources(tablesToStopInCaseOfError);
>   ...
> }{code}
> where tablesToStopInCaseOfError is a sort of pending tables list which one is 
> cleared on cfg storage revision update. 
> *!* tablesByIdVv listens same storage revision update event in order to 
> publish tables related to the given revision or in other words make such 
> tables accessible from tablesByIdVv.latest(); that one that is used in order 
> to retrieve tables for cleanup on components stop (see // 1* above)
> {code:java}
> public TableManager(
>   ... 
>   tablesByIdVv = new IncrementalVersionedValue<>(registry, HashMap::new);
>   registry.accept(token -> {
> 

[jira] [Assigned] (IGNITE-19152) Named list support in local file configuration is broken.

2023-04-06 Thread Aleksandr (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr reassigned IGNITE-19152:
--

Assignee: Aleksandr

> Named list support in local file configuration is broken. 
> --
>
> Key: IGNITE-19152
> URL: https://issues.apache.org/jira/browse/IGNITE-19152
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Aleksandr
>Priority: Major
>  Labels: ignite-3
>
> After IGNITE-18581 we have started to store local configuration in local 
> config file, instead of vault.
> The current flow with the saving configuration to a file has a bug. In the 
> method {{LocalFileConfigurationStorage#write}} we call 
> {{LocalFileConfigurationStorage#saveValues}} to save configuration fields to 
> a file, where we call 
> {{{}LocalFileConfigurationStorage#renderHoconString{}}}. Named list value has 
> internal id which is {{{}UUID{}}}, but {{com.typesafe}} do not support 
> {{{}UUID{}}}, so the whole process of saving configuration to a file fails 
> with
> {noformat}
> Caused by: com.typesafe.config.ConfigException$BugOrBroken: bug in method 
> caller: not valid to create ConfigValue from: 
> 489e16e8-3123-44a3-b27d-6e410863eb24
>   at 
> app//com.typesafe.config.impl.ConfigImpl.fromAnyRef(ConfigImpl.java:282)
>   at 
> app//com.typesafe.config.impl.PropertiesParser.fromPathMap(PropertiesParser.java:165)
>   at 
> app//com.typesafe.config.impl.PropertiesParser.fromPathMap(PropertiesParser.java:95)
>   at 
> app//com.typesafe.config.impl.ConfigImpl.fromAnyRef(ConfigImpl.java:265)
>   at 
> app//com.typesafe.config.impl.ConfigImpl.fromPathMap(ConfigImpl.java:201)
>   at 
> app//com.typesafe.config.ConfigFactory.parseMap(ConfigFactory.java:1225)
>   at 
> app//com.typesafe.config.ConfigFactory.parseMap(ConfigFactory.java:1236)
>   at 
> app//org.apache.ignite.internal.configuration.storage.LocalFileConfigurationStorage.renderHoconString(LocalFileConfigurationStorage.java:208)
>   at 
> app//org.apache.ignite.internal.configuration.storage.LocalFileConfigurationStorage.saveValues(LocalFileConfigurationStorage.java:185)
>   at 
> app//org.apache.ignite.internal.configuration.storage.LocalFileConfigurationStorage.write(LocalFileConfigurationStorage.java:138)
>   at 
> app//org.apache.ignite.internal.configuration.ConfigurationChanger.changeInternally0(ConfigurationChanger.java:606)
>   at 
> app//org.apache.ignite.internal.configuration.ConfigurationChanger.lambda$changeInternally$1(ConfigurationChanger.java:541)
> {noformat}
> h3. More details
> The problem is trickier than it may seem.
> Configuration storages receive data in "flat" data format, meaning that the 
> entire tree is converted into a list of pairs:
> {code:java}
> [{ "dot-separated key string", "serializable value" }]{code}
> LocalFileConfigurationStorage interprets keys as literal paths in HOCON 
> representation, which is simply not correct. These keys and values also have 
> meta-information, associated with them, such as:
>  * order of elements in named list configuration
>  * internal ids for named list elements
> To see, what's exactly in there, you may refer to the 
> {{{}org.apache.ignite.internal.configuration.tree.NamedListNodeTest{}}}. It 
> has everything laid out explicitly.
> h3. Proposed fix
> Well, the ideal approach would be rendering the configuration more or less 
> the same way, as we do it for REST.
> It means calling {{ConfigurationUtil#fillFromPrefixMap}} for every local root.
> Local roots can be retrieved using {{{}ConfigurationModule{}}}, by reading 
> them all from the class path.
> Resulting nodes are converted to maps using {{{}ConverterToMapVisitor{}}}. 
> Then maps are converted to HOCON using its own API.
> There are several hidden problems here.
>  * {-}we must check, that HOCON preserves order of keys{-}, and that we use 
> linked hash maps in {{fillFromPrefixMap}}
> EDIT: HOCON sorts keys alphabetically. Ok
>  * {{ConverterToMapVisitor}} does not expect null nodes, because it always 
> works with "full" trees. Fixing it would require some fine-tuning, otherwise 
> one may end up with a bunch of empty nodes in the config file, which is bad
>  * {{ConverterToMapVisitor}} uses array syntax for named lists. You can see 
> it in action in {{{}HoconConverterTest{}}}.
> Yes, there are two ways of representing named lists in the system. We should 
> make rendering mode configurable, because local configuration, at the moment, 
> only needs basic tree representation (for node attributes)
> We should also add tests for most of these improvements. First of all, to 
> {{{}HoconConverterTest{}}}.
> h3. Misc
> Another extremely uncertain thing is the way we handle default values. This 
> may be a topic for another 
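The direction proposed above, converting the "flat" dot-separated keys into a nested tree and stringifying values the HOCON renderer cannot take (such as UUID), can be sketched with a small stdlib-only helper. The class and method names are illustrative, not the actual Ignite helpers ({{fillFromPrefixMap}}, {{ConverterToMapVisitor}}):

```java
import java.util.*;

// Sketch of the proposed fix: build a nested map from the flat storage
// format, converting UUIDs to strings so a HOCON renderer (e.g.
// com.typesafe.config's ConfigFactory.parseMap) accepts every leaf value.
class FlatToNested {
    @SuppressWarnings("unchecked")
    static Map<String, Object> nest(Map<String, Object> flat) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : flat.entrySet()) {
            String[] path = e.getKey().split("\\.");
            Map<String, Object> node = root;
            // Walk/create intermediate nodes for all but the last path segment.
            for (int i = 0; i < path.length - 1; i++) {
                node = (Map<String, Object>) node.computeIfAbsent(path[i], k -> new LinkedHashMap<String, Object>());
            }
            Object v = e.getValue();
            // UUID is not a valid ConfigValue; render it as a plain string.
            node.put(path[path.length - 1], v instanceof UUID ? v.toString() : v);
        }
        return root;
    }
}
```

The real fix also has to deal with the named-list meta-information (element order, internal ids) mentioned above, which this sketch ignores.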

[jira] [Commented] (IGNITE-16778) Support timestamp through jdbc

2023-04-06 Thread Alexander Belyak (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709271#comment-17709271
 ] 

Alexander Belyak commented on IGNITE-16778:
---

Now Instant, LocalTime, LocalDate, and LocalDateTime are supported, but not 
java.sql.Timestamp. Is there a particular reason for that?
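Until setTimestamp is supported, one client-side bridge is to convert the java.sql.Timestamp to a LocalDateTime and bind that instead (e.g. via setObject). A minimal sketch of the conversion; the helper name is illustrative:

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

// Client-side workaround sketch: convert the legacy java.sql.Timestamp to
// LocalDateTime, which the comment above says the driver does accept.
class TimestampBridge {
    static LocalDateTime toBindable(Timestamp ts) {
        return ts.toLocalDateTime(); // keeps nanosecond precision
    }
}
```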

> Support timestamp through jdbc
> --
>
> Key: IGNITE-16778
> URL: https://issues.apache.org/jira/browse/IGNITE-16778
> Project: Ignite
>  Issue Type: Bug
>  Components: jdbc
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
> Attachments: RunnerForTestNode.java
>
>
> The timestamp data type can be used through the KV view via LocalDateTime, but 
> not through the JDBC setTimestamp method. Example in attachment





[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
Error messages about checkpoint read lock acquisition timeouts and blocked 
critical threads are possible during the snapshot restore process (just after 
caches start).

How to reproduce: 
# Set the checkpoint frequency to less than the failure detection timeout.
# Ensure that restoring the cache group partition states lasts longer than the 
failure detection timeout, i.e. this applies to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 
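The two reproduction conditions above can be stated as a single predicate (illustrative only; both timeouts are Ignite configuration properties, the restore duration depends on cache size):

```java
// Sketch of the reproduction condition: a checkpoint begins during restore
// (frequency below the failure detection timeout), and partition-state
// restore holds things up longer than the failure detection timeout, so the
// read-lock wait trips the critical-thread watchdog.
final class ReproCondition {
    static boolean reproduces(long checkpointFreqMs, long failureDetectionTimeoutMs, long partitionRestoreMs) {
        return checkpointFreqMs < failureDetectionTimeoutMs
            && partitionRestoreMs > failureDetectionTimeoutMs;
    }
}
```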

  was:
There are possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start).

How to reproduce: 
# Set checkpoint frequency is less than failure detection timeout.
# Ensure, that cache groups partitions states restoring lasts more than failure 
detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> There are possible error messages about checkpoint read lock acquisition 
> timeouts and critical threads blocking during snapshot restore process (just 
> after caches start).
> How to reproduce: 
> # Set checkpoint frequency less than failure detection timeout.
> # Ensure, that cache groups partitions states restoring lasts more than 
> failure detection timeout, i.e. it is actual to sufficiently large caches.
> Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 





[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Description: 
There are possible error messages about checkpoint read lock acquisition 
timeouts and critical threads blocking during snapshot restore process (just 
after caches start).

How to reproduce: 
# Set checkpoint frequency is less than failure detection timeout.
# Ensure, that cache groups partitions states restoring lasts more than failure 
detection timeout, i.e. it is actual to sufficiently large caches.

Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 

  was:

When cache group restore lasts longer than the failure detection timeout


> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> There are possible error messages about checkpoint read lock acquisition 
> timeouts and about blocking of critical threads during the snapshot restore 
> process (just after caches start).
> How to reproduce: 
> # Set the checkpoint frequency to less than the failure detection timeout.
> # Ensure that restoring of cache group partition states lasts longer than the 
> failure detection timeout, i.e. this is relevant for sufficiently large caches.
> Reproducer:  [^BlockingThreadsOnSnapshotRestoreReproducerTest.patch] 
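The two reproduction conditions above can be sketched as an Ignite 2.x node configuration. This is a minimal sketch, not the attached reproducer: the concrete timeout values are illustrative, and the second condition (a slow partition-state restore) depends on data volume rather than on any configuration knob.

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SlowRestoreConfigSketch {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Step 1: checkpoint frequency below the failure detection timeout.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setCheckpointFrequency(3_000); // 3 s, illustrative value
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        cfg.setDataStorageConfiguration(storageCfg);

        cfg.setFailureDetectionTimeout(10_000); // 10 s, illustrative value

        // Step 2 is a data-volume condition, not a config setting: restoring
        // cache group partition states must outlast the failure detection
        // timeout, which happens with sufficiently large caches.
        Ignition.start(cfg);
    }
}
```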





[jira] [Updated] (IGNITE-19239) Checkpoint read lock acquisition timeouts during snapshot restore

2023-04-06 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-19239:
---
Attachment: BlockingThreadsOnSnapshotRestoreReproducerTest.patch

> Checkpoint read lock acquisition timeouts during snapshot restore
> -
>
> Key: IGNITE-19239
> URL: https://issues.apache.org/jira/browse/IGNITE-19239
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Shishkov
>Priority: Minor
>  Labels: iep-43, ise
> Attachments: BlockingThreadsOnSnapshotRestoreReproducerTest.patch
>
>
> When cache groups restore lasts more than failure detection timeout





[jira] [Resolved] (IGNITE-16779) Support decimal through jdbc

2023-04-06 Thread Alexander Belyak (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Belyak resolved IGNITE-16779.
---
Resolution: Cannot Reproduce

> Support decimal through jdbc
> 
>
> Key: IGNITE-16779
> URL: https://issues.apache.org/jira/browse/IGNITE-16779
> Project: Ignite
>  Issue Type: Bug
>  Components: jdbc
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
>
> Unable to insert values into columns of decimal type (like decimal(12, 2)) 
> through JDBC.
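The scenario the issue describes can be sketched with plain JDBC as follows. This is a minimal sketch, not code from the issue: the JDBC URL, table, and column names are illustrative, and it assumes a locally running Ignite 3 node with the thin JDBC driver on the classpath.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class DecimalInsertSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative URL; assumes a locally running Ignite 3 node.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1")) {
            try (PreparedStatement st = conn.prepareStatement(
                    "INSERT INTO prices (id, price) VALUES (?, ?)")) {
                st.setInt(1, 1);
                // Value fits a decimal(12, 2) column: 10 integer + 2 fraction digits.
                st.setBigDecimal(2, new BigDecimal("1234567890.12"));
                st.executeUpdate();
            }
        }
    }
}
```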





[jira] [Assigned] (IGNITE-19164) Improve message about requested partitions during snapshot restore

2023-04-06 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina reassigned IGNITE-19164:
---

Assignee: Julia Bakulina

> Improve message about requested partitions during snapshot restore
> --
>
> Key: IGNITE-19164
> URL: https://issues.apache.org/jira/browse/IGNITE-19164
> Project: Ignite
>  Issue Type: Task
>Reporter: Ilya Shishkov
>Assignee: Julia Bakulina
>Priority: Minor
>  Labels: iep-43, ise
>
> Currently, during snapshot restore a single message is logged before 
> requesting partitions from remote nodes:
> {quote}
> [2023-03-24T18:06:59,910][INFO 
> ][disco-notifier-worker-#792%node%|#792%node%][SnapshotRestoreProcess] Trying 
> to request partitions from remote nodes 
> [reqId=ff682204-9554-4fbb-804c-38a79c0b286a, snapshot=snapshot_name, 
> map={*{color:#FF}76e22ef5-3c76-4987-bebd-9a6222a0{color}*={*{color:#FF}-903566235{color}*=[0,2,4,6,11,12,18,98,100,170,190,194,1015],
>  
> *{color:#FF}1544803905{color}*=[1,11,17,18,22,25,27,35,37,42,45,51,62,64,67,68,73,76,1017]}}]
> {quote}
> It is necessary to make this output "human readable":
> # Print messages per node instead of one message for all nodes.
> # Print node consistent id and address.
> # Print cache / group name.
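The three improvements above could be sketched as a per-node log-line formatter. This is an illustrative sketch only, not Ignite code: the class and method names are hypothetical, and the inputs (consistent id, address, partitions grouped by cache group name) mirror the fields the issue asks to print.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a "human readable" per-node restore log line. */
public class RestorePartitionsLogSketch {
    /**
     * Builds one log line per remote node (instead of a single combined map
     * dump), printing the node consistent id and address and, for each cache
     * group name, the list of requested partitions.
     */
    public static String format(String consistentId, String addr,
                                Map<String, List<Integer>> partsByGroup) {
        StringBuilder sb = new StringBuilder("Requesting partitions from node [consistentId=")
            .append(consistentId).append(", addr=").append(addr).append("]: ");
        partsByGroup.forEach((grp, parts) ->
            sb.append(grp).append('=').append(parts).append("; "));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(format("node1", "10.0.0.1",
            Map.of("cacheGroup1", List.of(0, 2, 4))));
    }
}
```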




