[jira] [Assigned] (IGNITE-18662) Sql. Numeric to/from decimal cast with overflow does not produce an error

2023-11-01 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky reassigned IGNITE-18662:
---

Assignee: Evgeny Stanilovsky

> Sql. Numeric to/from decimal cast with overflow does not produce an error 
> --
>
> Key: IGNITE-18662
> URL: https://issues.apache.org/jira/browse/IGNITE-18662
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Maksim Zhuravkov
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: calcite2-required, calcite3-required, ignite-3
> Fix For: 3.0.0-beta2
>
>
> Casts from a numeric type to a decimal type that overflow must fail, but they 
> currently return a result:
> {code:java}
> SELECT 1000::BIGINT::DECIMAL(3,1)
> {code}
> Returns 
> {code:java}
> 
> 100
> {code}
> And the following queries:
> {code:java}
> SELECT 2147483648::DECIMAL(18,0)::INTEGER
> 
> -2147483648
> # Integer.MIN_VALUE
> {code}
> {code:java}
> query I
> SELECT 128::DECIMAL(3,0)::TINYINT
> 
> -128
> # Byte.MIN_VALUE 
> {code}
> See skipif-ed examples in cast_to_decimal.test and cast_from_decimal.test
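The expected semantics can be sketched outside SQL with plain BigDecimal arithmetic (a hypothetical helper, not Ignite code): the cast must raise an error when the rounded value needs more digits than the target precision allows, instead of silently truncating.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration of the expected cast semantics, not an Ignite API:
// a cast to DECIMAL(precision, scale) must fail on overflow.
class DecimalCastCheck {
    static BigDecimal castToDecimal(BigDecimal value, int precision, int scale) {
        // Round to the target scale first, as SQL casts do.
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        // If the rounded value needs more digits than the type allows, it overflows.
        if (scaled.precision() > precision) {
            throw new ArithmeticException(
                "Numeric value out of range for DECIMAL(" + precision + ", " + scale + ")");
        }
        return scaled;
    }

    public static void main(String[] args) {
        System.out.println(castToDecimal(new BigDecimal("99.94"), 3, 1)); // fits: 99.9
        try {
            castToDecimal(new BigDecimal("1000"), 3, 1); // 1000.0 needs 5 digits
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // the error the ticket expects
        }
    }
}
```

Under this rule, `1000::BIGINT::DECIMAL(3,1)` would fail rather than return 100.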



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-16204) Calcite integration. Muted JDBC multistatement test

2023-11-01 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky reassigned IGNITE-16204:
---

Assignee: (was: Evgeny Stanilovsky)

> Calcite integration. Muted JDBC multistatement test
> ---
>
> Key: IGNITE-16204
> URL: https://issues.apache.org/jira/browse/IGNITE-16204
> Project: Ignite
>  Issue Type: New Feature
>  Components: sql
>Reporter: Vladimir Ermakov
>Priority: Major
>  Labels: ignite-3
>
> Let's remove the muted test org.apache.ignite.jdbc.ItJdbcMultiStatementSelfTest.





[jira] [Assigned] (IGNITE-20758) Fix LongDestroyDurableBackgroundTaskTest

2023-11-01 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov reassigned IGNITE-20758:
--

Assignee: Ilya Shishkov

> Fix LongDestroyDurableBackgroundTaskTest
> 
>
> Key: IGNITE-20758
> URL: https://issues.apache.org/jira/browse/IGNITE-20758
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Shishkov
>Assignee: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>
> Some tests fail with the error:
> {code}
> java.lang.IllegalArgumentException: Value for '--check-first' property should 
> be positive.
>   at 
> org.apache.ignite.internal.management.cache.CacheValidateIndexesCommandArg.ensurePositive(CacheValidateIndexesCommandArg.java:70)
>   at 
> org.apache.ignite.internal.management.cache.CacheValidateIndexesCommandArg.checkFirst(CacheValidateIndexesCommandArg.java:165)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest.validateIndexes(LongDestroyDurableBackgroundTaskTest.java:374)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest.testLongIndexDeletion(LongDestroyDurableBackgroundTaskTest.java:339)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest.testLongIndexDeletionSimple(LongDestroyDurableBackgroundTaskTest.java:630)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2499)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {code}





[jira] [Updated] (IGNITE-20697) Move physical records from WAL to another storage

2023-11-01 Thread Aleksey Plekhanov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-20697:
---
Description: 
Currently, physical records take up most of the WAL size. But physical records in 
WAL files are required only for crash recovery, and these records are useful only 
for a short period of time (since the last checkpoint). 
The size of physical records written during a checkpoint is larger than the size 
of all modified pages between checkpoints, since we need to store a page snapshot 
record for each modified page, plus page delta records if a page is modified more 
than once between checkpoints.
We process each WAL file several times in the stable workflow (without crashes and 
rebalances):
 # We write records to WAL files
 # We copy WAL files to the archive
 # We compact WAL files (remove physical records + compress)

So, in total we write all physical records twice and read them at least twice.

To reduce the disk workload, we can move physical records to another storage and 
stop writing them to WAL files. To provide the same crash recovery guarantees, we 
can write modified pages twice during a checkpoint: first to a delta file, and 
then to the page storage. In this case, if we crash during a write to the page 
storage, we can recover any page from the delta file (instead of from the WAL, as 
we do now).

This proposal has pros and cons.
Pros:
 - Smaller amount of stored data (we don't store page delta files, only the final 
state of each page)
 - Reduced disk workload (we write all modified pages once, instead of 2 writes 
and 2 reads of a larger amount of data)
 - Potentially reduced latency (instead of writing physical records synchronously 
during data modification, we write only logical records to the WAL; physical 
pages are written by checkpointer threads)

Cons:
 - Increased checkpoint duration (we have to write a doubled amount of data 
during a checkpoint)

Let's try to implement it and benchmark.
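The double-write idea described above can be sketched as follows (all names here are illustrative, not actual Ignite internals):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch of the proposed scheme, not Ignite code: during a
// checkpoint each dirty page is first appended to a per-checkpoint delta file
// and only then written to its final offset in the page store. If we crash in
// the middle of step 2, the page is recovered from the delta file instead of
// from physical WAL records.
class DoubleWriteSketch {
    static void checkpointPage(Path deltaFile, Path pageStore, long pageOffset, ByteBuffer page) {
        try {
            // Step 1: append a durable copy of the page to the delta file.
            try (FileChannel delta = FileChannel.open(deltaFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
                delta.write(page.duplicate());
                delta.force(true);
            }
            // Step 2: overwrite the page in place; a torn write here is now
            // recoverable from the delta file.
            try (FileChannel store = FileChannel.open(pageStore,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                store.write(page.duplicate(), pageOffset);
                store.force(true);
            }
            // Step 3 (not shown): delete the delta file once the checkpoint completes.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The delta file is append-only and sequential, which is the same write pattern the WAL has today, but each page is written to it only once per checkpoint.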

  was:
Currently, physical records take most of the WAL size. But physical records in 
WAL files required only for crash recovery and these records are useful only 
for a short period of time (since last checkpoint). 
Size of physical records during checkpoint is more than size of all modified 
pages between checkpoints, since we need to store page snapshot record for each 
modified page and page delta records, if page is modified more than once 
between checkpoints.
We process WAL file several times in stable workflow (without crashes and 
rebalances):
 # We write records to WAL files
 # We copy WAL files to archive
 # We compact WAL files (remove physical records + compress)

So, totally we write all physical records twice and read physical records at 
least twice.

To reduce disc workload we can move physical records to another storage and 
don't write them to WAL files. To provide the same crash recovery guarantees we 
can write modified pages twice during checkpoint. First time to some delta file 
and second time to the page storage. In this case we can recover any page if we 
crash during write to page storage from delta file (instead of WAL, as we do 
now).

This proposal has pros and cons.
Pros:
 - Less size of stored data (we don't store page delta files, only final state 
of the page)
 - Reduced disk workload (we write all modified pages once instead of 2 writes 
and 2 reads of a larger amount of data)
 - Potentially reduced latency (instead of writing physical records 
synchronously during data modification we write to WAL only logical records and 
physical pages will be written by checkpointer threads)

Cons:
 - Increased checkpoint duration (we should write doubled amount of data during 
checkpoint)

Let's try to implement it and benchmark.


> Move physical records from WAL to another storage 
> --
>
> Key: IGNITE-20697
> URL: https://issues.apache.org/jira/browse/IGNITE-20697
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-113, ise
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, physical records take most of the WAL size. But physical records 
> in WAL files are required only for crash recovery, and these records are useful 
> only for a short period of time (since the last checkpoint). 
> Size of physical records during checkpoint is more than size of all modified 
> pages between checkpoints, since we need to store page snapshot record for 
> each modified page and page delta records, if page is modified more than once 
> between checkpoints.
> We process WAL file several times in stable workflow (without crashes and 
> rebalances):
>  # We write records to WAL files
>  # We copy WAL files to archive
>  # We compact WAL files (remove physical records + compress)
> 

[jira] [Updated] (IGNITE-20697) Move physical records from WAL to another storage

2023-11-01 Thread Aleksey Plekhanov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-20697:
---
Labels: iep-113 ise  (was: ise)

> Move physical records from WAL to another storage 
> --
>
> Key: IGNITE-20697
> URL: https://issues.apache.org/jira/browse/IGNITE-20697
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-113, ise
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, physical records take most of the WAL size. But physical records 
> in WAL files are required only for crash recovery, and these records are useful 
> only for a short period of time (since the last checkpoint). 
> Size of physical records during checkpoint is more than size of all modified 
> pages between checkpoints, since we need to store page snapshot record for 
> each modified page and page delta records, if page is modified more than once 
> between checkpoints.
> We process WAL file several times in stable workflow (without crashes and 
> rebalances):
>  # We write records to WAL files
>  # We copy WAL files to archive
>  # We compact WAL files (remove physical records + compress)
> So, totally we write all physical records twice and read physical records at 
> least twice.
> To reduce disc workload we can move physical records to another storage and 
> don't write them to WAL files. To provide the same crash recovery guarantees 
> we can write modified pages twice during checkpoint. First time to some delta 
> file and second time to the page storage. In this case we can recover any 
> page if we crash during write to page storage from delta file (instead of 
> WAL, as we do now).
> This proposal has pros and cons.
> Pros:
>  - Less size of stored data (we don't store page delta files, only final 
> state of the page)
>  - Reduced disk workload (we write all modified pages once 
> instead of 2 writes and 2 reads of a larger amount of data)
>  - Potentially reduced latency (instead of writing physical records 
> synchronously during data modification we write to WAL only logical records 
> and physical pages will be written by checkpointer threads)
> Cons:
>  - Increased checkpoint duration (we should write doubled amount of data 
> during checkpoint)
> Let's try to implement it and benchmark.





[jira] [Created] (IGNITE-20778) Provide internal API to get a list of parameters for non-executed query

2023-11-01 Thread Igor Sapego (Jira)
Igor Sapego created IGNITE-20778:


 Summary: Provide internal API to get a list of parameters for 
non-executed query
 Key: IGNITE-20778
 URL: https://issues.apache.org/jira/browse/IGNITE-20778
 Project: Ignite
  Issue Type: Bug
  Components: sql
Reporter: Igor Sapego


To properly support ODBC metadata requirements, we should be able to provide the 
user with a list of parameters, with their types, for a non-executed query.

Example
Given a query:
{noformat}
insert into some_table(id, val1, val2) values(?, ?, ?)
{noformat}

This API should return an array of size 3 that contains at least the types of 
the parameters.
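Whatever the final shape of the API, its basic contract can be illustrated with a toy placeholder counter (a hypothetical helper, not the proposed implementation; real type inference would come from the SQL validator):

```java
// Toy illustration of the contract only: the metadata API must return one
// entry per '?' placeholder, in placeholder order (question marks inside
// string literals don't count).
class ParamCount {
    static int countPlaceholders(String sql) {
        int count = 0;
        boolean inStringLiteral = false;
        for (char c : sql.toCharArray()) {
            if (c == '\'')
                inStringLiteral = !inStringLiteral; // naive: ignores '' escapes
            else if (c == '?' && !inStringLiteral)
                count++;
        }
        return count;
    }
}
```

For the insert above, the count is 3; the real API would pair each position with an inferred type, so an ODBC driver can answer parameter-description requests without executing the query.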





[jira] [Updated] (IGNITE-20778) Provide internal API to get a list of parameters for non-executed query

2023-11-01 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego updated IGNITE-20778:
-
Issue Type: New Feature  (was: Bug)

> Provide internal API to get a list of parameters for non-executed query
> ---
>
> Key: IGNITE-20778
> URL: https://issues.apache.org/jira/browse/IGNITE-20778
> Project: Ignite
>  Issue Type: New Feature
>  Components: sql
>Reporter: Igor Sapego
>Priority: Major
>  Labels: ignite-3
>
> To properly support ODBC metadata requirements, we should be able to provide 
> the user with a list of parameters, with their types, for a non-executed query.
> Example
> Given a query:
> {noformat}
> insert into some_table(id, val1, val2) values(?, ?, ?)
> {noformat}
> This API should return an array of size 3 that contains at least the types of 
> the parameters.





[jira] [Assigned] (IGNITE-20663) Thin 3.0: ItThinClientSchemaSynchronizationTest.testClientUsesLatestSchemaOnWrite is flaky

2023-11-01 Thread Igor Sapego (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Sapego reassigned IGNITE-20663:


Assignee: Igor Sapego

> Thin 3.0: 
> ItThinClientSchemaSynchronizationTest.testClientUsesLatestSchemaOnWrite is 
> flaky
> --
>
> Key: IGNITE-20663
> URL: https://issues.apache.org/jira/browse/IGNITE-20663
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Reporter: Igor Sapego
>Assignee: Igor Sapego
>Priority: Major
>  Labels: ignite-3
>
> The following test is flaky:
> https://ci.ignite.apache.org/test/-4306071594745342563?currentProjectId=ApacheIgnite3xGradle_Test_IntegrationTests&expandTestHistoryChartSection=true
> Need to investigate and fix.





[jira] [Updated] (IGNITE-20777) Exception during cluster init due to missing `add-opens` argument

2023-11-01 Thread Igor (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor updated IGNITE-20777:
--
Description: 
*Steps to reproduce:*
 # Start cluster with 2 nodes.
 # Init cluster

*Expected result:*

Cluster started

*Actual result:*

An error appears in the log and the cluster shuts down.

[^ignite3db-0.log.txt] [^stderr.log.txt]

*Workaround:*

If the option *--add-opens=java.base/sun.nio.ch=ALL-UNNAMED* is added to the 
startup script, the cluster works fine.

  was:
*Steps to reproduce:*
 # Start cluster with 2 nodes.
 # Init cluster

*Expected result:*

Cluster started

*Actual result:*

An error appears in the log and the cluster shuts down.

[^ignite3db-0.log.txt] [^stderr.log.txt]

*Workaround:*

If the option *--add-opens=java.base/sun.nio.ch=ALL-UNNAMED* is added to the 
startup script, the cluster works fine.


> Exception during cluster init due to missing `add-opens` argument
> ---
>
> Key: IGNITE-20777
> URL: https://issues.apache.org/jira/browse/IGNITE-20777
> Project: Ignite
>  Issue Type: Bug
>  Components: cli, general
>Affects Versions: 3.0.0-beta2
>Reporter: Igor
>Priority: Major
>  Labels: ignite-3
> Attachments: ignite3db-0.log.txt, stderr.log.txt
>
>
> *Steps to reproduce:*
>  # Start cluster with 2 nodes.
>  # Init cluster
> *Expected result:*
> Cluster started
> *Actual result:*
> An error appears in the log and the cluster shuts down.
> [^ignite3db-0.log.txt] [^stderr.log.txt]
> *Workaround:*
> If the option *--add-opens=java.base/sun.nio.ch=ALL-UNNAMED* is added to the 
> startup script, the cluster works fine.





[jira] [Created] (IGNITE-20777) Exception during cluster init due to missing `add-opens` argument

2023-11-01 Thread Igor (Jira)
Igor created IGNITE-20777:
-

 Summary: Exception during cluster init due to missing `add-opens` argument
 Key: IGNITE-20777
 URL: https://issues.apache.org/jira/browse/IGNITE-20777
 Project: Ignite
  Issue Type: Bug
  Components: cli, general
Affects Versions: 3.0.0-beta2
Reporter: Igor
 Attachments: ignite3db-0.log.txt, stderr.log.txt

*Steps to reproduce:*
 # Start cluster with 2 nodes.
 # Init cluster

*Expected result:*

Cluster started

*Actual result:*

An error appears in the log and the cluster shuts down.

[^ignite3db-0.log.txt] [^stderr.log.txt]

*Workaround:*

If the option *--add-opens=java.base/sun.nio.ch=ALL-UNNAMED* is added to the 
startup script, the cluster works fine.





[jira] [Assigned] (IGNITE-19447) Switch schema validation to CatalogService

2023-11-01 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy reassigned IGNITE-19447:
--

Assignee: Roman Puchkovskiy

> Switch schema validation to CatalogService
> --
>
> Key: IGNITE-19447
> URL: https://issues.apache.org/jira/browse/IGNITE-19447
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: iep-98, ignite-3
>
> Currently, CatalogService is 'hanging in the air' and real transaction 
> processing code does not use it. When CatalogService is ready, we need to 
> switch schema validation to use it.





[jira] [Updated] (IGNITE-20775) Create configuration and reasonable test defaults for logit storage

2023-11-01 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-20775:
---
Description: 
Right now, tests allocate too much disk space during runs. Literally gigabytes.

We must provide better defaults for tests (in TestIgnitionManager, for example).

We should also check that restart with updated configuration doesn't break 
anything.

Other minor fixes are welcome here as well.

  was:
Right now, tests allocate too much disk space during runs. Literally gigabytes.

We must provide better defaults for tests (in TestIgnitionManager, for example).

We should also check that restart with updated configuration doesn't break 
anything.


> Create configuration and reasonable test defaults for logit storage
> ---
>
> Key: IGNITE-20775
> URL: https://issues.apache.org/jira/browse/IGNITE-20775
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: ignite-3
>
> Right now, tests allocate too much disk space during runs. Literally 
> gigabytes.
> We must provide better defaults for tests (in TestIgnitionManager, for 
> example).
> We should also check that restart with updated configuration doesn't break 
> anything.
> Other minor fixes are welcome here as well.





[jira] [Created] (IGNITE-20776) Stop nodes in parallel on Cluster shutdown

2023-11-01 Thread Roman Puchkovskiy (Jira)
Roman Puchkovskiy created IGNITE-20776:
--

 Summary: Stop nodes in parallel on Cluster shutdown
 Key: IGNITE-20776
 URL: https://issues.apache.org/jira/browse/IGNITE-20776
 Project: Ignite
  Issue Type: Improvement
Reporter: Roman Puchkovskiy
Assignee: Roman Puchkovskiy
 Fix For: 3.0.0-beta2


When Cluster#shutdown() is invoked, it stops all started nodes. Currently, they 
are stopped sequentially. This should be done in parallel.
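The change can be sketched like this (the `Node` interface is an illustrative stand-in for the embedded node handle, not the actual Cluster API): launch all stops asynchronously, then wait for all of them.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch, not Ignite code: stop every node concurrently and
// wait for all stops to finish.
class ParallelShutdown {
    interface Node {
        void stop();
    }

    static void shutdown(List<? extends Node> nodes) {
        CompletableFuture<?>[] stops = nodes.stream()
                .map(node -> CompletableFuture.runAsync(node::stop))
                .toArray(CompletableFuture[]::new);
        // join() waits for every stop and propagates the first failure, if any.
        CompletableFuture.allOf(stops).join();
    }
}
```

With N nodes, total shutdown time becomes roughly the slowest single stop instead of the sum of all stops.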





[jira] [Created] (IGNITE-20775) Create configuration and reasonable test defaults for logit storage

2023-11-01 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-20775:
--

 Summary: Create configuration and reasonable test defaults for 
logit storage
 Key: IGNITE-20775
 URL: https://issues.apache.org/jira/browse/IGNITE-20775
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Right now, tests allocate too much disk space during runs. Literally gigabytes.

We must provide better defaults for tests (in TestIgnitionManager, for example).

We should also check that restart with updated configuration doesn't break 
anything.





[jira] [Commented] (IGNITE-20760) Drop column error message get indexes by column name only

2023-11-01 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781739#comment-17781739
 ] 

Evgeny Stanilovsky commented on IGNITE-20760:
-

[~jooger] thanks! merged to main.

> Drop column error message get indexes by column name only 
> --
>
> Key: IGNITE-20760
> URL: https://issues.apache.org/jira/browse/IGNITE-20760
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If there is an index preventing a column from being dropped, the error message 
> lists all indexes over any column with the same name, even indexes on 
> completely different tables whose columns happen to share the name:
> {code:java}
> drop table tab1;
> drop table tab2;
> create table tab1(id integer not null primary key, f1 int);
> create index tab1_f1 on tab1(f1);
> create table tab2(id integer not null primary key, f1 int, f2 int);
> create index tab2_f1 on tab2(f1);
> create index tab2_f12 on tab2(f1,f2);
> alter table tab2 drop column f1;
> >> Fail with wrong error message: 
> >> [Code: 0, SQL State: 5]  Failed to validate query. Deleting column 
> >> 'F1' used by index(es) [TAB1_F1, TAB2_F1, TAB2_F12], it is not allowed
> >> Because it contains TAB1_F1 index
> drop index tab2_f12;
> drop index tab2_f1;
> alter table tab2 drop column f1
> >> Success, so the problem is only in the error message generation. {code}
>  
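The likely fix can be sketched in a few lines (the `Index` class is illustrative, not Ignite's catalog model): when collecting indexes that block a DROP COLUMN, filter by the table being altered, not by column name alone.

```java
import java.util.List;
import java.util.stream.Collectors;

// Minimal sketch of the likely fix, with an illustrative catalog model:
// the error message must only mention indexes of the altered table.
class DropColumnCheck {
    static final class Index {
        final String name;
        final String table;
        final List<String> columns;

        Index(String name, String table, List<String> columns) {
            this.name = name;
            this.table = table;
            this.columns = columns;
        }
    }

    static List<String> indexesBlockingDrop(List<Index> all, String table, String column) {
        return all.stream()
                .filter(idx -> idx.table.equals(table)) // the missing table filter
                .filter(idx -> idx.columns.contains(column))
                .map(idx -> idx.name)
                .collect(Collectors.toList());
    }
}
```

For the schema in the report, filtering by table yields [TAB2_F1, TAB2_F12] and no longer includes TAB1_F1.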





[jira] [Assigned] (IGNITE-20540) Document configuring logging in AI3

2023-11-01 Thread Igor Gusev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Gusev reassigned IGNITE-20540:
---

Assignee: Igor Gusev

> Document configuring logging in AI3
> ---
>
> Key: IGNITE-20540
> URL: https://issues.apache.org/jira/browse/IGNITE-20540
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Vyacheslav Koptilin
>Assignee: Igor Gusev
>Priority: Major
>  Labels: ignite-3
>
> Need to provide all required information to properly configure logging on the 
> client side.
> JUL:
>  - SLF4J JDK14 Provider should be added to the classpath - 
> org.slf4j:slf4j-jdk14:2.0.x
>  - The configuration file can be specified via the `java.util.logging.config.file` 
> property
> Additional details can be found here: 
> [https://docs.oracle.com/en/java/javase/11/core/java-logging-overview.html#GUID-B83B652C-17EA-48D9-93D2-563AE1FF8EDA]
> LOG4J2:
>  - SLF4J Provider should be added to the classpath - 
> org.apache.logging.log4j:log4j-slf4j2-impl:2.x.x (2.20.0 or higher, for 
> example)
>  - LOG4J2 API and implementation: org.apache.logging.log4j:log4j-api:2.x.x, 
> org.apache.logging.log4j:log4j-core:2.x.x
>  - LOG4J2 bridge: org.apache.logging.log4j:log4j-jpl:2.x.x
>  - Configuration can be done via a properties file; additional details can 
> be found here: [https://logging.apache.org/log4j/2.x/manual/configuration.html]





[jira] [Resolved] (IGNITE-20459) Prepare test plan for the Transactions on Stable Topology feature

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin resolved IGNITE-20459.
--
Resolution: Fixed

> Prepare test plan for the Transactions on Stable Topology feature
> -
>
> Key: IGNITE-20459
> URL: https://issues.apache.org/jira/browse/IGNITE-20459
> Project: Ignite
>  Issue Type: Task
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>






[jira] [Commented] (IGNITE-20760) Drop column error message get indexes by column name only

2023-11-01 Thread Yury Gerzhedovich (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781710#comment-17781710
 ] 

Yury Gerzhedovich commented on IGNITE-20760:


[~zstan] could you please review the patch?

> Drop column error message get indexes by column name only 
> --
>
> Key: IGNITE-20760
> URL: https://issues.apache.org/jira/browse/IGNITE-20760
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If there is an index preventing a column from being dropped, the error message 
> lists all indexes over any column with the same name, even indexes on 
> completely different tables whose columns happen to share the name:
> {code:java}
> drop table tab1;
> drop table tab2;
> create table tab1(id integer not null primary key, f1 int);
> create index tab1_f1 on tab1(f1);
> create table tab2(id integer not null primary key, f1 int, f2 int);
> create index tab2_f1 on tab2(f1);
> create index tab2_f12 on tab2(f1,f2);
> alter table tab2 drop column f1;
> >> Fail with wrong error message: 
> >> [Code: 0, SQL State: 5]  Failed to validate query. Deleting column 
> >> 'F1' used by index(es) [TAB1_F1, TAB2_F1, TAB2_F12], it is not allowed
> >> Because it contains TAB1_F1 index
> drop index tab2_f12;
> drop index tab2_f1;
> alter table tab2 drop column f1
> >> Success, so the problem is only in the error message generation. {code}
>  





[jira] [Updated] (IGNITE-20301) ItIgniteInMemoryNodeRestartTest tests are flaky

2023-11-01 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20301:
---
Description: 
inMemoryNodeRestartNotLeader and inMemoryNodeFullPartitionRestart fail 50/50 on 
my machine.

For both tests, the following is seen in the console when they fail:

[2023-08-29T11:20:35,053][ERROR][%iiimnrt_imnfpr_3344%JRaft-Common-Executor-0][LogThreadPoolExecutor]
 Uncaught exception in pool: %iiimnrt_imnfpr_3344%JRaft-Common-Executor-, 
org.apache.ignite.raft.jraft.util.MetricThreadPoolExecutor@76504d30[Running, 
pool size = 4, active threads = 2, queued tasks = 0, completed tasks = 1601].
 java.lang.IllegalArgumentException: null
    at org.apache.ignite.raft.jraft.util.Requires.requireTrue(Requires.java:64) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.fillCommonFields(Replicator.java:1553)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1631) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1601) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.continueSending(Replicator.java:983)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.lambda$waitMoreEntries$9(Replicator.java:1583)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.runOnNewLog(LogManagerImpl.java:1197)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.lambda$wakeupAllWaiter$6(LogManagerImpl.java:398)
 ~[main/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]

 

{*}Update{*}: as of 2023-11-01, these failures cannot be reproduced anymore, 
but other failures can be seen (with a much lower frequency, approx. 1 in a few 
dozen runs).

These happen for the following reason. When we recover an in-memory partition, 
we want to ask the other participants of the group whether they have data in this 
partition or not. If we don't have the corresponding nodes in our local 
physical topology, we treat this as if they did not have the data. So if we 
recover a partition before SWIM has managed to fully deliver information about 
other cluster nodes (as ClusterService start does not imply that all this 
information has been received), we skip those nodes, so we might conclude there 
is no majority and fail the partition start.

A possible solution is to wait until the nodes appear in the topology, up to 
some timeout (like 3 seconds).
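The proposed wait can be sketched like this (illustrative names, not actual Ignite code): poll until the expected peers are visible in the local physical topology or the timeout expires.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Illustrative sketch of the proposed wait, not Ignite code: SWIM delivers
// membership info asynchronously, so recovery polls briefly before deciding
// whether a majority is reachable.
class TopologyWait {
    static boolean waitForPeers(BooleanSupplier allPeersVisible, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (allPeersVisible.getAsBoolean())
                return true; // safe to reason about the majority now
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        // Final check; on false the caller may still fail the partition start.
        return allPeersVisible.getAsBoolean();
    }
}
```

A timeout keeps a genuinely lost peer from blocking recovery forever, while giving SWIM a few seconds to deliver membership for peers that are merely late.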

  was:
[jira] [Assigned] (IGNITE-20295) Get rid of server-side session management

2023-11-01 Thread Konstantin Orlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Orlov reassigned IGNITE-20295:
-

Assignee: Konstantin Orlov

> Get rid of server-side session management
> -
>
> Key: IGNITE-20295
> URL: https://issues.apache.org/jira/browse/IGNITE-20295
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Yury Gerzhedovich
>Assignee: Konstantin Orlov
>Priority: Major
>  Labels: ignite-3
>
> At the beginning of Ignite 3 there were plans to have server-side sessions, 
> and they were partially implemented, but this is no longer relevant and we 
> could remove all the obsolete code related to that part.
> At the very least, the package org.apache.ignite.internal.sql.engine.session 
> with all its classes should be removed.
> We also need to remove org.apache.ignite.sql.Session#idleTimeout.
> All the state required to keep 'session' information could be moved to the 
> client handlers, like the ClientResource class.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20301) ItIgniteInMemoryNodeRestartTest tests are flaky

2023-11-01 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20301:
---
Description: 
inMemoryNodeRestartNotLeader and inMemoryNodeFullPartitionRestart fail 50/50 on 
my machine.

For both tests, the following is seen in the console when they fail:

[2023-08-29T11:20:35,053][ERROR][%iiimnrt_imnfpr_3344%JRaft-Common-Executor-0][LogThreadPoolExecutor]
 Uncaught exception in pool: %iiimnrt_imnfpr_3344%JRaft-Common-Executor-, 
org.apache.ignite.raft.jraft.util.MetricThreadPoolExecutor@76504d30[Running, 
pool size = 4, active threads = 2, queued tasks = 0, completed tasks = 1601].
 java.lang.IllegalArgumentException: null
    at org.apache.ignite.raft.jraft.util.Requires.requireTrue(Requires.java:64) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.fillCommonFields(Replicator.java:1553)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1631) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1601) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.continueSending(Replicator.java:983)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.lambda$waitMoreEntries$9(Replicator.java:1583)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.runOnNewLog(LogManagerImpl.java:1197)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.lambda$wakeupAllWaiter$6(LogManagerImpl.java:398)
 ~[main/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]

 

{*}Update{*}: as of 2023-11-01, these failures cannot be reproduced anymore, 
but others can be seen (with much lower frequency, approximately 1 in a few 
dozen runs).

These happen for the following reason. When we recover an in-memory partition, 
we want to ask other participants of the group whether they have data in this 
partition or not. If we don't have the corresponding nodes in our local 
physical topology, we treat this as if they did not have the data. So if we 
recover a partition before SWIM has managed to fully deliver information about 
other cluster nodes, we skip those nodes, so we might think there is no 
majority and fail the partition start.

A possible solution is to wait till the nodes appear in the topology, up to 
some timeout (like 3 seconds).
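The proposed wait can be sketched as follows. This is a minimal illustrative 
sketch with assumed names (TopologyWait, waitForPeers); it is not the actual 
Ignite implementation, which would likely react to topology events rather than 
poll:

```java
import java.util.Set;
import java.util.function.Supplier;

public class TopologyWait {
    /**
     * Polls the local physical topology until all expected peers appear
     * or the timeout elapses. Returns whether all peers became visible.
     */
    public static boolean waitForPeers(Supplier<Set<String>> topology,
                                       Set<String> expectedPeers,
                                       long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (topology.get().containsAll(expectedPeers)) {
                return true; // all peers visible, safe to reason about majority
            }
            Thread.sleep(50); // poll; a real implementation would subscribe to topology events
        }
        return topology.get().containsAll(expectedPeers);
    }
}
```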

  was:
inMemoryNodeRestartNotLeader and inMemoryNodeFullPartitionRestart fail 50/50 on 
my machine.

For both tests, the following is seen in the console when they fail:

[2023-08-29T11:20:35,053][ERROR][%iiimnrt_imnfpr_3344%JRaft-Common-Executor-0][LogThreadPoolExecutor]
 Uncaught exception in pool: %iiimnrt_imnfpr_3344%JRaft-Common-Executor-, 
org.apache.ignite.raft.jraft.util.MetricThreadPoolExecutor@76504d30[Running, 
pool size = 4, active threads = 2, queued tasks = 0, completed tasks = 1601].
 java.lang.IllegalArgumentException: null
    at org.apache.ignite.raft.jraft.util.Requires.requireTrue(Requires.java:64) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.fillCommonFields(Replicator.java:1553)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1631) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1601) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.continueSending(Replicator.java:983)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.lambda$waitMoreEntries$9(Replicator.java:1583)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.runOnNewLog(LogManagerImpl.java:1197)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.lambda$wakeupAllWaiter$6(LogManagerImpl.java:398)
 ~[main/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]

 

{*}Update{*}: as of 2023-11-01, these failures do not reproduce, but others can 
be seen (with much lower frequency, approx 1 in a few dozen runs). These happen 
for the following reason. When we recover an in-memory partition, we want to 
ask other participants of the group whether they have data in this partition or 
not. If we don't have the corresponding nodes in our local physical topology, 

[jira] [Updated] (IGNITE-20301) ItIgniteInMemoryNodeRestartTest tests are flaky

2023-11-01 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20301:
---
Description: 
inMemoryNodeRestartNotLeader and inMemoryNodeFullPartitionRestart fail 50/50 on 
my machine.

For both tests, the following is seen in the console when they fail:

[2023-08-29T11:20:35,053][ERROR][%iiimnrt_imnfpr_3344%JRaft-Common-Executor-0][LogThreadPoolExecutor]
 Uncaught exception in pool: %iiimnrt_imnfpr_3344%JRaft-Common-Executor-, 
org.apache.ignite.raft.jraft.util.MetricThreadPoolExecutor@76504d30[Running, 
pool size = 4, active threads = 2, queued tasks = 0, completed tasks = 1601].
 java.lang.IllegalArgumentException: null
    at org.apache.ignite.raft.jraft.util.Requires.requireTrue(Requires.java:64) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.fillCommonFields(Replicator.java:1553)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1631) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1601) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.continueSending(Replicator.java:983)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.lambda$waitMoreEntries$9(Replicator.java:1583)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.runOnNewLog(LogManagerImpl.java:1197)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.lambda$wakeupAllWaiter$6(LogManagerImpl.java:398)
 ~[main/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]

 

{*}Update{*}: as of 2023-11-01, these failures do not reproduce, but others can 
be seen (with much lower frequency, approx 1 in a few dozen runs). These happen 
for the following reason. When we recover an in-memory partition, we want to 
ask other participants of the group whether they have data in this partition or 
not. If we don't have the corresponding nodes in our local physical topology, 
we treat this as if they did not have the data. So if we recover a partition 
before SWIM has managed to fully deliver information about other cluster nodes, 
we skip those nodes, so we might think there is no majority and fail the 
partition start.

A possible solution is to wait till the nodes appear in the topology, up to 
some timeout (like 3 seconds).

  was:
inMemoryNodeRestartNotLeader and inMemoryNodeFullPartitionRestart fail 50/50 on 
my machine.

For both tests, the following is seen in the console when they fail:

[2023-08-29T11:20:35,053][ERROR][%iiimnrt_imnfpr_3344%JRaft-Common-Executor-0][LogThreadPoolExecutor]
 Uncaught exception in pool: %iiimnrt_imnfpr_3344%JRaft-Common-Executor-, 
org.apache.ignite.raft.jraft.util.MetricThreadPoolExecutor@76504d30[Running, 
pool size = 4, active threads = 2, queued tasks = 0, completed tasks = 1601].
 java.lang.IllegalArgumentException: null
    at org.apache.ignite.raft.jraft.util.Requires.requireTrue(Requires.java:64) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.fillCommonFields(Replicator.java:1553)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1631) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.sendEntries(Replicator.java:1601) 
~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.continueSending(Replicator.java:983)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.core.Replicator.lambda$waitMoreEntries$9(Replicator.java:1583)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.runOnNewLog(LogManagerImpl.java:1197)
 ~[main/:?]
    at 
org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.lambda$wakeupAllWaiter$6(LogManagerImpl.java:398)
 ~[main/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
~[?:?]
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]


> ItIgniteInMemoryNodeRestartTest tests are flaky
> ---
>
> Key: IGNITE-20301
> URL: https://issues.apache.org/jira/browse/IGNITE-20301
> Project: Ignite
>  Issue Type: Bug
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3, te

[jira] [Updated] (IGNITE-20660) Sql. Broken behavior of OCTET_LENGTH function

2023-11-01 Thread Konstantin Orlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Orlov updated IGNITE-20660:
--
Summary: Sql. Broken behavior of OCTET_LENGTH function  (was: SQL: broken 
behavior of OCTET_LENGTH function)

> Sql. Broken behavior of OCTET_LENGTH function
> -
>
> Key: IGNITE-20660
> URL: https://issues.apache.org/jira/browse/IGNITE-20660
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0.0-beta2
>Reporter: Andrey Khitrin
>Assignee: Andrey Mashenkov
>Priority: Major
>  Labels: ignite-3, sql
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Until recently, the `OCTET_LENGTH` function acted as a synonym for the 
> `LENGTH` function. But in the latest AI3 versions, it started to reject 
> regular strings:
> {code:sql}
> sql-cli> select OCTET_LENGTH('Some Text');
> SQL query execution error
> Failed to validate query. From line 0, column 0 to line 1, column 31: Cast 
> function cannot convert value of type CHAR(9) to type VARBINARY
> {code}
> It looks like it only accepts binary strings:
> {code:sql}
> sql-cli> select OCTET_LENGTH(x'0123');
> ╔═══╗
> ║ OCTET_LENGTH(X'0123') ║
> ╠═══╣
> ║ 2 ║
> ╚═══╝
> {code}
> But as the SQL:1999 (ANSI) specification says, this function must be able to 
> work with any valid string.
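For reference, the expected semantics (octet length as the byte length of the 
string's encoding) can be sketched in Java. This is illustrative only and not 
Ignite's implementation; the class and method names are assumptions:

```java
import java.nio.charset.StandardCharsets;

public class OctetLength {
    /** OCTET_LENGTH of a character string: the number of bytes in its encoding. */
    public static int octetLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(octetLength("Some Text")); // 9: one byte per ASCII character
        System.out.println(octetLength("naïve"));     // 6: 'ï' takes two bytes in UTF-8
    }
}
```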



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-18662) Sql. Numeric to/from decimal cast with overflow does not produce an error

2023-11-01 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-18662:
---
Priority: Major  (was: Minor)

> Sql. Numeric to/from decimal cast with overflow does not produce an error 
> --
>
> Key: IGNITE-18662
> URL: https://issues.apache.org/jira/browse/IGNITE-18662
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: calcite2-required, calcite3-required, ignite-3
> Fix For: 3.0.0-beta2
>
>
> Casts from a numeric type to decimal that overflow must fail, but they return 
> a result:
> {code:java}
> SELECT 1000::BIGINT::DECIMAL(3,1)
> {code}
> Returns 
> {code:java}
> 
> 100
> {code}
> And the following query:
> {code:java}
> SELECT 2147483648::DECIMAL(18,0)::INTEGER
> 
> -2147483648
> # Integer.MIN_VALUE
> {code}
> {code:java}
> query I
> SELECT 128::DECIMAL(3,0)::TINYINT
> 
> -128
> # Byte.MIN_VALUE 
> {code}
> See skipif-ed examples in cast_to_decimal.test and cast_from_decimal.test
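The expected overflow check can be sketched as follows. This is a minimal 
sketch of the intended semantics, not Ignite's actual cast implementation; the 
class and method names are illustrative:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalOverflowCheck {
    /**
     * Converts a value to DECIMAL(precision, scale), throwing instead of
     * silently truncating when the value does not fit.
     */
    public static BigDecimal convertDecimal(BigDecimal value, int precision, int scale) {
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        if (rounded.precision() > precision) {
            // e.g. 1000 does not fit DECIMAL(3,1): 1000.0 has precision 5 > 3
            throw new ArithmeticException("Numeric field overflow: " + value
                    + " does not fit DECIMAL(" + precision + ", " + scale + ")");
        }
        return rounded;
    }

    public static void main(String[] args) {
        System.out.println(convertDecimal(new BigDecimal("99.9"), 3, 1)); // 99.9 fits DECIMAL(3,1)
    }
}
```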



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-01 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20723:


Assignee: Vladislav Pyatkov

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildTestsSection=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
> ... 7 more
> at 
> app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
> at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:147)
> at 
> app//org.apache.ignite.internal.sqllogic.Statement.execute(Statement.java:112)
> at 
> app//org.apache.ignite.internal.sqllogic.SqlScriptRunner.run(SqlScriptRunner.java:70)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$assertTimeoutPreemptively$0(AssertTimeoutPreemptively.java:48)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$submitTask$3(AssertTimeoutPreemptively.java:95)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base@11.0.17/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 Replication is timed 
> out [replicaGrpId=291_part_3]
> at 
> java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(Met

[jira] [Assigned] (IGNITE-20160) NullPointerException in FSMCallerImpl.doCommitted

2023-11-01 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20160:


Assignee: Sergey Uttsel

> NullPointerException in FSMCallerImpl.doCommitted
> -
>
> Key: IGNITE-20160
> URL: https://issues.apache.org/jira/browse/IGNITE-20160
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Sergey Uttsel
>Priority: Blocker
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:496)
> at 
> org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:448)
> at 
> org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136)
> at 
> org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130)
> at 
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:226)
> at 
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191)
> at 
> com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137)
> {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7410174?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildProblemsSection=true&expandBuildChangesSection=true]
> It happens here (see FSMCallerImpl#doCommitted):
> {code:java}
> final IteratorImpl iterImpl = new IteratorImpl(this.fsm, this.logManager, 
> closures, firstClosureIndex,
> lastAppliedIndex, committedIndex, this.applyingIndex, 
> this.node.getOptions());{code}
> on the 2nd line, most likely when dereferencing the null pointer to *node*, 
> which is nullified on FSMCaller shutdown. Raft groups were being stopped at 
> that moment.
> *Implementation details*
> A simple fix to avoid the NPE at the aforementioned line would be to check 
> `node` for null.
> Additionally, it would be nice to check `shutdownLatch` in `doCommitted` and 
> make sure we call `unsubscribe` in the proper order.
> The reason is that doCommitted is called from a disruptor's callback.
> One more observation: the reference to the node is set to null in `shutdown`, 
> which is called before `join`, where we unsubscribe from Disruptor 
> notifications. There is a small chance that something comes into FSMCallerImpl 
> after shutdown but before join.
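The suggested guard can be sketched as follows. This is a simplified, 
illustrative pattern (a stand-in class, not the actual FSMCallerImpl code), 
showing a single volatile read of the node reference before use:

```java
public class FsmCallerSketch {
    // Stand-in for the node reference that shutdown() nulls out.
    private volatile Object node = new Object();

    public void shutdown() {
        node = null; // mirrors FSMCallerImpl nulling the node on shutdown
    }

    /** Returns false instead of throwing when shutdown has already nulled the node. */
    public boolean doCommitted() {
        Object localNode = node; // single volatile read avoids a check-then-act race
        if (localNode == null) {
            return false; // shutting down: skip applying committed entries
        }
        // ... build the iterator from localNode and apply committed entries ...
        return true;
    }
}
```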



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20685) Implement ability to trigger transaction recovery

2023-11-01 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20685:


Assignee: Vladislav Pyatkov

> Implement ability to trigger transaction recovery
> -
>
> Key: IGNITE-20685
> URL: https://issues.apache.org/jira/browse/IGNITE-20685
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>
> h3. Motivation
> Let's assume that the data node somehow found out that the transaction 
> coordinator is dead, but the products of its activity, such as locks and 
> write intents, are still present. In that case it's necessary to check whether 
> the corresponding transaction was actually finished and, if not, finish it.
> h3. Definition of Done
>  * A transaction X that detects (detection logic will be covered in a 
> separate ticket) that the coordinator is dead awaits the commitPartition 
> primary replica and sends initiateRecoveryReplicaRequest to it in a fully 
> asynchronous manner, meaning that transaction X should behave as specified in 
> the deadlock prevention engine and not explicitly wait for the 
> initiateRecovery result. We actually do not expect any direct response from 
> initiate recovery. Initiate recovery failover will be implemented in a 
> different way.
>  * The commit partition handles the given request somewhere. No-op handling 
> is expected for now; a proper one will be added in IGNITE-20735. Let's 
> consider either TransactionStateResolver or TxManagerImpl as the 
> initiateRecovery handler. TransactionStateResolver seems the best choice 
> here; however, it should be refactored a bit, basically because it won't be 
> only a state resolver any longer.
> h3. Implementation Notes
>  * The given ticket is trivial and should be considered a bridge between 
> durable tx coordinator liveness detection and the corresponding 
> initiateRecoveryReplicaRequest handling. Both items will be covered in 
> separate tickets.
>  
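The fully asynchronous, fire-and-forget initiation described above can be 
sketched as follows. The names and signatures (RecoveryTrigger, 
onCoordinatorDeath, sendInitiateRecovery) are assumptions for illustration, not 
the actual Ignite API:

```java
import java.util.concurrent.CompletableFuture;

public class RecoveryTrigger {
    /**
     * Awaits the commit-partition primary replica and sends the recovery
     * request without waiting for any direct response.
     */
    public static CompletableFuture<Void> onCoordinatorDeath(
            String txId, CompletableFuture<String> commitPartitionPrimary) {
        return commitPartitionPrimary
                .thenAccept(primary -> sendInitiateRecovery(primary, txId))
                .exceptionally(ex -> null); // failover is handled elsewhere, per the ticket
    }

    private static void sendInitiateRecovery(String primary, String txId) {
        // Placeholder for the real initiateRecoveryReplicaRequest send.
        System.out.println("initiateRecoveryReplicaRequest -> " + primary + " for tx " + txId);
    }
}
```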



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20774) ItRebalanceDistributedTest#testRebalanceRetryWhenCatchupFailed is flaky on TC

2023-11-01 Thread Sergey Chugunov (Jira)
Sergey Chugunov created IGNITE-20774:


 Summary: 
ItRebalanceDistributedTest#testRebalanceRetryWhenCatchupFailed is flaky on TC
 Key: IGNITE-20774
 URL: https://issues.apache.org/jira/browse/IGNITE-20774
 Project: Ignite
  Issue Type: Bug
Reporter: Sergey Chugunov
 Attachments: _Integration_Tests_Module_Runner_18723.log.zip

Test fails with timeout in different branches including main:
{code:java}
java.util.concurrent.TimeoutException: after() timed out after 60 seconds
at 
org.junit.jupiter.engine.extension.TimeoutExceptionFactory.create(TimeoutExceptionFactory.java:29)
at 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:58)
at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
...
{code}
From the stack trace it is not clear where the test timed out.

TC run with failed test in main is available 
[here|https://ci.ignite.apache.org/viewLog.html?buildId=7593604&tab=buildResultsDiv&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunner]
 test logs are attached.

TC also reports a NullPointerException in the logs; it is not clear whether the 
timeout is caused by this NPE:
{code:java}
[2023-10-29T00:37:45,888][ERROR][%irdt_trrwcf_2%JRaft-FSMCaller-Disruptor-_stripe_4-0][StripedDisruptor]
 Handle disruptor event error 
[name=%irdt_trrwcf_2%JRaft-FSMCaller-Disruptor-, 
event=org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTask@55293f4b, 
hasHandler=false]
java.lang.NullPointerException: null
at 
org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:492)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:382)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:226)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) 
[disruptor-3.3.7.jar:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20773) Adjust log resolution procedure in order to have an ability to do tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20773:
-
Description: 
h3. Motivation

In order to implement the liveness check and thus initiate transaction recovery 
if the corresponding coordinator is dead, it's required to adjust the lock 
resolution procedure.
h3. Definition of done

Liveness check described in IGNITE-20771 is triggered, meaning that txId of 
first lock is available.
h3. Implementation Notes

This ticket finishes the chain of lock-triggered tx recovery (the coordinator 
aspect), so it's also required to add some integration tests here for the whole 
chain.

  was:
h3. Motivation

In order to implement the liveness check and thus initiate transaction recovery 
if the corresponding coordinator is dead, it's required to adjust the lock 
resolution procedure.
h3. Definition of done

Liveness check described in 
[IGNITE-20771|https://issues.apache.org/jira/browse/IGNITE-20771] is triggered, 
meaning that txId of first lock is available.


> Adjust log resolution procedure in order to have an ability to do tx 
> coordinator liveness check
> ---
>
> Key: IGNITE-20773
> URL: https://issues.apache.org/jira/browse/IGNITE-20773
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Denis Chudov
>Priority: Major
>
> h3. Motivation
> In order to implement the liveness check and thus initiate transaction 
> recovery if the corresponding coordinator is dead, it's required to adjust 
> the lock resolution procedure.
> h3. Definition of done
> Liveness check described in IGNITE-20771 is triggered, meaning that txId of 
> first lock is available.
> h3. Implementation Notes
> This ticket finishes the chain of lock-triggered tx recovery (the coordinator 
> aspect), so it's also required to add some integration tests here for the 
> whole chain.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20773) Adjust log resolution procedure in order to have an ability to do tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin reassigned IGNITE-20773:


Assignee: Denis Chudov

> Adjust log resolution procedure in order to have an ability to do tx 
> coordinator liveness check
> ---
>
> Key: IGNITE-20773
> URL: https://issues.apache.org/jira/browse/IGNITE-20773
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Denis Chudov
>Priority: Major
>
> h3. Motivation
> In order to implement the liveness check and thus initiate transaction 
> recovery if the corresponding coordinator is dead, it's required to adjust 
> the lock resolution procedure.
> h3. Definition of done
> Liveness check described in 
> [IGNITE-20771|https://issues.apache.org/jira/browse/IGNITE-20771] is 
> triggered, meaning that txId of first lock is available.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20773) Adjust log resolution procedure in order to have an ability to do tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20773:
-
Description: 
h3. Motivation

In order to implement the liveness check and thus initiate transaction recovery 
if the corresponding coordinator is dead, it's required to adjust the lock 
resolution procedure.
h3. Definition of done

Liveness check described in 
[IGNITE-20771|https://issues.apache.org/jira/browse/IGNITE-20771] is triggered, 
meaning that txId of first lock is available.

> Adjust log resolution procedure in order to have an ability to do tx 
> coordinator liveness check
> ---
>
> Key: IGNITE-20773
> URL: https://issues.apache.org/jira/browse/IGNITE-20773
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Priority: Major
>
> h3. Motivation
> In order to implement the liveness check and thus initiate transaction 
> recovery if the corresponding coordinator is dead, it's required to adjust 
> the lock resolution procedure.
> h3. Definition of done
> Liveness check described in 
> [IGNITE-20771|https://issues.apache.org/jira/browse/IGNITE-20771] is 
> triggered, meaning that txId of first lock is available.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20773) Adjust log resolution procedure in order to have an ability to do tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20773:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Adjust log resolution procedure in order to have an ability to do tx 
> coordinator liveness check
> ---
>
> Key: IGNITE-20773
> URL: https://issues.apache.org/jira/browse/IGNITE-20773
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20773) Adjust log resolution procedure in order to have an ability to do tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)
Alexander Lapin created IGNITE-20773:


 Summary: Adjust log resolution procedure in order to have an 
ability to do tx coordinator liveness check
 Key: IGNITE-20773
 URL: https://issues.apache.org/jira/browse/IGNITE-20773
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Lapin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20771) Implement tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20771:
-
Description: 
h3. Motivation

In order to implement tx coordinator recovery, it is required to understand 
whether the coordinator is dead or not. Every data node has its own local 
volatile txn state map (txId -> org.apache.ignite.internal.tx.TxStateMeta) 
where, among other fields, we can find txCoordinatorId. The liveness check 
assumes that if a node with the given id is available in the physical topology 
then the coordinator is alive; otherwise it is considered dead. However, even 
though such a local check is fast enough, there is no sense in performing it 
too often, especially with subsequent sends of initialRecoveryRequests. Thus, 
it seems reasonable to add one more field to TxStateMeta that will store the 
last liveness check timestamp. Because the check is always local, it is valid 
to use System.currentTimeMillis or similar instead of HybridTimestamp in order 
to reduce contention on the clock. Note that the triggers that will initiate 
liveness checks will be implemented separately.
h3. Definition of Done
 * One more lastLivenessCheck timestamp is added to the TxStateMeta.
 * The aforementioned field is updated locally on each tx operation with 
currentTimeMillis.
 * A new cluster-wide tx liveness interval configuration property is introduced.
 * Within the liveness check:
 ** if (lastLivenessCheck >= currentTimeMillis - livenessInterval) - no-op
 ** else
 *** update lastLivenessCheck
 *** do the probe - check whether txCoordinatorId is still available in the 
physical topology; if it is available, no further actions are required; if it 
is not, then
  trigger the initiateRecovery procedure implemented in IGNITE-20685.
  if the commit partition is also unavailable (meaning that there is no primary 
replica), mark the transaction as abandoned.

> Implement tx coordinator liveness check
> ---
>
> Key: IGNITE-20771
> URL: https://issues.apache.org/jira/browse/IGNITE-20771
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Priority: Major
>
> h3. Motivation
> In order to implement tx coordinator recovery, it is required to understand 
> whether the coordinator is dead or not. Every data node has its own local 
> volatile txn state map (txId -> org.apache.ignite.internal.tx.TxStateMeta) 
> where, among other fields, we can find txCoordinatorId. The liveness check 
> assumes that if a node with the given id is available in the physical 
> topology then the coordinator is alive; otherwise it is considered dead. 
> However, even though such a local check is fast enough, there is no sense in 
> performing it too often, especially with subsequent sends of 
> initialRecoveryRequests. Thus, it seems reasonable to add one more field to 
> TxStateMeta that will store the last liveness check timestamp. Because the 
> check is always local, it is valid to use System.currentTimeMillis or similar 
> instead of HybridTimestamp in order to reduce contention on the clock. Note 
> that the triggers that will initiate liveness checks will be implemented 
> separately.
> h3. Definition of Done
>  * One more lastLivenessCheck timestamp is added to the TxStateMeta.
>  * The aforementioned field is updated locally on each tx operation with 
> currentTimeMillis.
>  * A new cluster-wide tx liveness interval configuration property is 
> introduced.
>  * Within the liveness check:
>  ** if (lastLivenessCheck >= currentTimeMillis - livenessInterval) - no-op
>  ** else
>  *** update lastLivenessCheck
>  *** do the probe - check whether txCoordinatorId is still available in the 
> physical topology; if it is available, no further actions are required; if 
> it is not, then
>   trigger the initiateRecovery procedure implemented in IGNITE-20685.
>   if the commit partition is also unavailable (meaning that there is no 
> primary replica), mark the transaction as abandoned.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20544) Implement the start of building indexes on node recovery

2023-11-01 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-20544:
-
Reviewer: Ivan Bessonov

> Implement the start of building indexes on node recovery
> 
>
> Key: IGNITE-20544
> URL: https://issues.apache.org/jira/browse/IGNITE-20544
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the node is restored, a change of leaseholders may not occur (a 
> prolongation will occur instead), so we may end up in a situation where the 
> index build will not be completed before the change of leaseholders.
> It is important to start building indexes on recovery only after the 
> partition has been restored (after applying the raft log).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20772) ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky in different branches

2023-11-01 Thread Sergey Chugunov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-20772:
-
Attachment: _Integration_Tests_Module_Runner_18692.log.zip

> ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky in different 
> branches
> ---
>
> Key: IGNITE-20772
> URL: https://issues.apache.org/jira/browse/IGNITE-20772
> Project: Ignite
>  Issue Type: Bug
>Reporter: Sergey Chugunov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_18692.log.zip
>
>
> The test fails from time to time in different branches; its success rate is 
> 98%.
> The latest failure in the main branch was caused by a timeout in the test 
> logic:
> {code:java}
> java.lang.RuntimeException: java.util.concurrent.TimeoutException
>   at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
>   at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
>   at 
> org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
>   at 
> org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
>   at 
> org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
>   at 
> org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
>   at 
> org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTableRaftSnapshotsTest.java:466)
> ...
> Caused by: java.util.concurrent.TimeoutException
>   at 
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
>   at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
>   at org.apache.ignite.internal.Cluster.startNode(Cluster.java:330)
>   ... 93 more {code}
> The test involves a node restart, so multiple connection-related errors in 
> the logs are expected:
> {code:java}
> [2023-10-27T08:24:44,662][WARN 
> ][%itrst_tsim_2%Raft-Group-Client-4][RaftGroupServiceImpl] Recoverable error 
> during the request type=GetLeaderRequestImpl occurred (will be retried on the 
> randomly selected node): 
> java.util.concurrent.CompletionException: java.net.ConnectException: Peer 
> itrst_tsim_0 is unavailable
>   at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235)
>  ~[?:?]
>   at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:523)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$40(RaftGroupServiceImpl.java:564)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.net.ConnectException: Peer itrst_tsim_0 is unavailable
>   at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.resolvePeer(RaftGroupServiceImpl.java:761)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:522)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   ... 7 more {code}
> At the same time, it is not clear from the logs what prevented the node from 
> starting.
> Suite run with failed test is available 
> [here|https://ci.ignite.apache.org/viewLog.html?buildId=7590927&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunner&tab=buildLog].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20772) ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky in different branches

2023-11-01 Thread Sergey Chugunov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-20772:
-
Description: 
The test fails from time to time in different branches; its success rate is 98%.

The latest failure in the main branch was caused by a timeout in the test logic:
{code:java}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTableRaftSnapshotsTest.java:466)
...
Caused by: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:330)
... 93 more {code}
The test involves a node restart, so multiple connection-related errors in the 
logs are expected:
{code:java}
[2023-10-27T08:24:44,662][WARN 
][%itrst_tsim_2%Raft-Group-Client-4][RaftGroupServiceImpl] Recoverable error 
during the request type=GetLeaderRequestImpl occurred (will be retried on the 
randomly selected node): 
java.util.concurrent.CompletionException: java.net.ConnectException: Peer 
itrst_tsim_0 is unavailable
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) 
~[?:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:523)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$40(RaftGroupServiceImpl.java:564)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.net.ConnectException: Peer itrst_tsim_0 is unavailable
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.resolvePeer(RaftGroupServiceImpl.java:761)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:522)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
... 7 more {code}
At the same time, it is not clear from the logs what prevented the node from 
starting.

Suite run with failed test is available 
[here|https://ci.ignite.apache.org/viewLog.html?buildId=7590927&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunner&tab=buildLog],
 logs are attached.

  was:
Test fails from time to time in different branches, success rate is 98%.

Latest failure in main branch was caused by timeout in test logic:
{code:java}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTab

[jira] [Updated] (IGNITE-20772) ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky in different branches

2023-11-01 Thread Sergey Chugunov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-20772:
-
Description: 
The test fails from time to time in different branches; its success rate is 98%.

The latest failure in the main branch was caused by a timeout in the test logic:
{code:java}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTableRaftSnapshotsTest.java:466)
...
Caused by: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:330)
... 93 more {code}
The test involves a node restart, so multiple connection-related errors in the 
logs are expected:
{code:java}
[2023-10-27T08:24:44,662][WARN 
][%itrst_tsim_2%Raft-Group-Client-4][RaftGroupServiceImpl] Recoverable error 
during the request type=GetLeaderRequestImpl occurred (will be retried on the 
randomly selected node): 
java.util.concurrent.CompletionException: java.net.ConnectException: Peer 
itrst_tsim_0 is unavailable
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) 
~[?:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:523)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$40(RaftGroupServiceImpl.java:564)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.net.ConnectException: Peer itrst_tsim_0 is unavailable
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.resolvePeer(RaftGroupServiceImpl.java:761)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:522)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
... 7 more {code}
At the same time, it is not clear from the logs what prevented the node from 
starting.

Suite run with failed test is available 
[here|https://ci.ignite.apache.org/viewLog.html?buildId=7590927&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunner&tab=buildLog].

  was:
Test fails from time to time in different branches, success rate is 98%.

Latest failure in main branch was caused by timeout in test logic:


{code:java}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTableRaftSnapshotsTes

[jira] [Created] (IGNITE-20772) ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky in different branches

2023-11-01 Thread Sergey Chugunov (Jira)
Sergey Chugunov created IGNITE-20772:


 Summary: ItTableRaftSnapshotsTest#txSemanticsIsMaintained is flaky 
in different branches
 Key: IGNITE-20772
 URL: https://issues.apache.org/jira/browse/IGNITE-20772
 Project: Ignite
  Issue Type: Bug
Reporter: Sergey Chugunov


The test fails from time to time in different branches; its success rate is 98%.

The latest failure in the main branch was caused by a timeout in the test logic:


{code:java}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:336)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:315)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode(ItTableRaftSnapshotsTest.java:453)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNodeAndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:435)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.reanimateNode2AndWaitForSnapshotInstalled(ItTableRaftSnapshotsTest.java:425)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintainedAfterInstallingSnapshot(ItTableRaftSnapshotsTest.java:494)
at 
org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.txSemanticsIsMaintained(ItTableRaftSnapshotsTest.java:466)
...
Caused by: java.util.concurrent.TimeoutException
at 
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
at 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
at org.apache.ignite.internal.Cluster.startNode(Cluster.java:330)
... 93 more {code}
The test involves a node restart, so multiple connection-related errors in the 
logs are expected:


{code:java}
[2023-10-27T08:24:44,662][WARN 
][%itrst_tsim_2%Raft-Group-Client-4][RaftGroupServiceImpl] Recoverable error 
during the request type=GetLeaderRequestImpl occurred (will be retried on the 
randomly selected node): 
java.util.concurrent.CompletionException: java.net.ConnectException: Peer 
itrst_tsim_0 is unavailable
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1099)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) 
~[?:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:523)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$40(RaftGroupServiceImpl.java:564)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.net.ConnectException: Peer itrst_tsim_0 is unavailable
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.resolvePeer(RaftGroupServiceImpl.java:761)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:522)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
... 7 more {code}
At the same time, it is not clear from the logs what prevented the node from 
starting.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20310) Meta storage invokes are not completed when DZM start is completed

2023-11-01 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20310:


Assignee: Mirza Aliev

> Meta storage invokes are not completed  when DZM start is completed
> ---
>
> Key: IGNITE-20310
> URL: https://issues.apache.org/jira/browse/IGNITE-20310
> Project: Ignite
>  Issue Type: Bug
>Reporter: Sergey Uttsel
>Assignee: Mirza Aliev
>Priority: Major
>  Labels: dzm-reviewed, ignite-3
>
> h3. *Motivation*
> There are meta storage invokes in the DistributionZoneManager start. 
> Currently it does these invokes in 
> DistributionZoneManager#createOrRestoreZoneState:
> # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to init 
> the default zone.
> # DistributionZoneManager#restoreTimers in the case when a filter update was 
> handled before the DZM stopped but did not update data nodes.
> The futures of these invokes are ignored, so after the start method 
> completes, not all start actions are actually completed. This can lead to the 
> following situation: 
> * Initialisation of the default zone is hung for some reason even after a 
> full restart of the cluster.
> * That means that all data-nodes-related keys in the metastorage haven't been 
> initialised.
> * For example, if a user adds a new node and the scale-up timer is immediate 
> (which leads to an immediate data nodes recalculation), this recalculation 
> won't happen, because the data nodes key has not been initialised. 
> h3. *Possible solutions*
> h4. Easier
> We just need to wait for all async logic to be completed within the 
> {{DistributionZoneManager#start}} with {{ms.invoke().join()}}.
> h4. Harder
> We can enhance {{IgniteComponent#start}} so that it returns a 
> {{CompletableFuture}}, and after that we need to change the flow of starting 
> components, so the node is not ready to work until all 
> {{IgniteComponent#start}} futures are completed. For example, we can chain 
> our futures on {{IgniteImpl#recoverComponentsStateOnStart}}, so components' 
> futures are completed before {{metaStorageMgr.deployWatches()}}.
>  In {{DistributionZoneManager#start}} we can return 
> {{CompletableFuture.allOf}} of the futures that need to be completed within 
> {{DistributionZoneManager#start}}.
> h3. *Definition of done*
> All asynchronous logic in the {{DistributionZoneManager#start}} is done 
> before a node is ready to work, in particular, ready to interact with zones.
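The "harder" option above (components returning start futures and node startup 
waiting for all of them) can be sketched as follows. This is illustrative only; 
the AsyncComponent interface and NodeStartup class are assumed names for the 
sketch, not the actual IgniteComponent API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: a component start method that returns a future,
// so node startup can wait for every component's async start logic.
interface AsyncComponent {
    CompletableFuture<Void> start();
}

class NodeStartup {
    // The node is considered "ready to work" only once every component's
    // start future (e.g. DistributionZoneManager's meta storage invokes)
    // has completed.
    static CompletableFuture<Void> startAll(List<AsyncComponent> components) {
        return CompletableFuture.allOf(
                components.stream()
                        .map(AsyncComponent::start)
                        .toArray(CompletableFuture[]::new));
    }
}
```

The "easier" option corresponds to each component calling join() on its own 
futures inside start(), at the cost of blocking the startup thread.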



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20760) Drop column error message get indexes by column name only

2023-11-01 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich reassigned IGNITE-20760:
--

Assignee: Yury Gerzhedovich

> Drop column error message get indexes by column name only 
> --
>
> Key: IGNITE-20760
> URL: https://issues.apache.org/jira/browse/IGNITE-20760
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>
> If there is an index preventing a column from being dropped, the error 
> message contains all indexes over any column with the same name, even from 
> completely different tables with identically named columns:
> {code:java}
> drop table tab1;
> drop table tab2;
> create table tab1(id integer not null primary key, f1 int);
> create index tab1_f1 on tab1(f1);
> create table tab2(id integer not null primary key, f1 int, f2 int);
> create index tab2_f1 on tab2(f1);
> create index tab2_f12 on tab2(f1,f2);
> alter table tab2 drop column f1;
> >> Fail with wrong error message: 
> >> [Code: 0, SQL State: 5]  Failed to validate query. Deleting column 
> >> 'F1' used by index(es) [TAB1_F1, TAB2_F1, TAB2_F12], it is not allowed
> >> Because it contains TAB1_F1 index
> drop index tab2_f12;
> drop index tab2_f1;
> alter table tab2 drop column f1
> >> Success, so the problem is only in the error message generation. {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20764) Sql. Improve script parsing to handle dynamic parameters for each statement

2023-11-01 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-20764:
---
Component/s: sql

> Sql. Improve script parsing to handle dynamic parameters for each statement
> ---
>
> Key: IGNITE-20764
> URL: https://issues.apache.org/jira/browse/IGNITE-20764
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Pavel Pereslegin
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At the moment, there is only one implementation ({{ParserServiceImpl}}) of 
> the {{ParserService}}, which is not entirely suitable for parsing scripts for 
> several reasons:
> 1. According to the {{ParserService}} interface, it returns the result of 
> parsing only a single statement.
> 2. The service only counts the total number of dynamic parameters per script 
> (and not per statement).
> 3. The cache implementation is designed to cache a single expression.
> To implement the script execution logic (IGNITE-20443), it is recommended to 
> add a new method (or a second implementation) that will return a list of 
> parsing results for each statement of the script (with the correct number of 
> dynamic parameters).
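A per-statement parse result of the kind described above could look roughly 
like this. Purely illustrative: the class names and the naive '?'-counting 
splitter are assumptions for the sketch (real SQL parsing must handle literals, 
comments, and quoted identifiers), not the actual ParserService implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-statement result: the statement text plus its own
// dynamic parameter count, rather than one total for the whole script.
class StatementParseResult {
    final String sql;
    final int dynamicParamsCount;

    StatementParseResult(String sql, int dynamicParamsCount) {
        this.sql = sql;
        this.dynamicParamsCount = dynamicParamsCount;
    }
}

class ScriptParser {
    // Naive splitter for illustration: splits on ';' and counts '?'
    // placeholders per statement. A real parser would do this while
    // building each statement's AST.
    static List<StatementParseResult> parseScript(String script) {
        List<StatementParseResult> results = new ArrayList<>();
        for (String stmt : script.split(";")) {
            String sql = stmt.trim();
            if (sql.isEmpty()) {
                continue;
            }
            int params = (int) sql.chars().filter(c -> c == '?').count();
            results.add(new StatementParseResult(sql, params));
        }
        return results;
    }
}
```

With such a result, the executor can bind the right slice of the supplied 
dynamic parameters to each statement instead of sharing one global count.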



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20771) Implement tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20771:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Implement tx coordinator liveness check
> ---
>
> Key: IGNITE-20771
> URL: https://issues.apache.org/jira/browse/IGNITE-20771
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20771) Implement tx coordinator liveness check

2023-11-01 Thread Alexander Lapin (Jira)
Alexander Lapin created IGNITE-20771:


 Summary: Implement tx coordinator liveness check
 Key: IGNITE-20771
 URL: https://issues.apache.org/jira/browse/IGNITE-20771
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Lapin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20735) Implement initiate recovery handling logic

2023-11-01 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20735:
-
Description: 
h3. Motivation

IGNITE-20685 will send an initiate-recovery replica request that should be 
properly handled in order to detect whether the transaction is finished, and to 
roll it back if it is abandoned. Abandoned means that the transaction is in the 
pending state but the tx coordinator is dead.
h3. Definition of Done
 * If the transaction state is either finished or aborted, then a cleanup 
request is sent in a common durable manner to the partition that initiated 
recovery.
 * If the transaction state is pending, then the transaction should be rolled 
back, meaning that its state is changed to aborted and a corresponding cleanup 
request is sent in a common durable manner to the partition that initiated 
recovery.
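The two outcomes in the Definition of Done can be sketched as follows. 
Illustrative only: the TxState enum values and the handler/cleanup method names 
are assumptions for this sketch, not the actual Ignite 3 recovery API.

```java
// Hypothetical transaction states for the sketch.
enum TxState { PENDING, COMMITTED, ABORTED }

class RecoveryHandler {
    // Handles an initiate-recovery request and returns the final tx state.
    TxState handleInitiateRecovery(TxState state) {
        switch (state) {
            case COMMITTED:
            case ABORTED:
                // Transaction already finished: just send the cleanup request
                // (durably) to the partition that initiated recovery.
                sendDurableCleanup(state);
                return state;
            case PENDING:
            default:
                // Coordinator is dead and the tx is still pending (abandoned):
                // roll it back, then send the corresponding cleanup request.
                sendDurableCleanup(TxState.ABORTED);
                return TxState.ABORTED;
        }
    }

    void sendDurableCleanup(TxState finalState) {
        // Placeholder for the durable cleanup-request send.
    }
}
```

The key point the sketch captures is that both branches end with the same 
durable cleanup message; only the pending case changes the transaction's state 
first.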

  was:
h3. Motivation

IGNITE-20685 will send initiate recovery replica request that should be 
properly handled in order to detect whether transaction is finished and 
rollback it if it's abandoned. Abandoned means that transaction is in pending 
state, however tx coordinator is dead.


> Implement initiate recovery handling logic
> --
>
> Key: IGNITE-20735
> URL: https://issues.apache.org/jira/browse/IGNITE-20735
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Priority: Major
>
> h3. Motivation
> IGNITE-20685 will send an initiate-recovery replica request that must be 
> properly handled in order to detect whether the transaction is finished, and 
> to roll it back if it is abandoned. Abandoned means that the transaction is 
> in the pending state while its tx coordinator is dead.
> h3. Definition of Done
>  * If the transaction state is either finished or aborted, then a cleanup 
> request is sent in the common durable manner to the partition that 
> initiated recovery.
>  * If the transaction state is pending, then the transaction should be rolled 
> back, meaning that its state is changed to aborted and a corresponding 
> cleanup request is sent in the common durable manner to the partition that 
> initiated recovery.
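
The decision flow from the Definition of Done above can be sketched as follows. All enum, class, and method names here are hypothetical illustrations for this ticket's description, not the actual Ignite 3 API:

```java
/** Hypothetical tx states; the real Ignite 3 enum differs. */
enum TxState { PENDING, FINISHED, ABORTED }

class RecoveryHandler {
    /**
     * Handles an initiate-recovery request and returns the final state
     * the recovery initiator should observe.
     */
    TxState handleInitiateRecovery(TxState state) {
        switch (state) {
            case FINISHED:
            case ABORTED:
                // Transaction already completed: just send the durable
                // cleanup request to the partition that initiated recovery.
                sendDurableCleanup(state);
                return state;
            case PENDING:
                // Coordinator is dead while the tx is still pending, i.e. the
                // tx is abandoned: roll it back (state becomes ABORTED), then
                // send the cleanup request.
                sendDurableCleanup(TxState.ABORTED);
                return TxState.ABORTED;
            default:
                throw new IllegalStateException("Unknown state: " + state);
        }
    }

    void sendDurableCleanup(TxState finalState) {
        // Placeholder for sending the cleanup request in the common
        // durable manner to the partition that initiated recovery.
    }
}
```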



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20638) Integration of distributed index building

2023-11-01 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko reassigned IGNITE-20638:


Assignee: Kirill Tkalenko

> Integration of distributed index building
> -
>
> Key: IGNITE-20638
> URL: https://issues.apache.org/jira/browse/IGNITE-20638
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> We need to integrate 
> *org.apache.ignite.internal.index.IndexAvailabilityController* into 
> *org.apache.ignite.internal.app.IgniteImpl*.
> But there are some nuances that I would like to address in this ticket:
> * We have created several components around the index-building mechanism that 
> would be convenient to keep in one place; I suggest creating an 
> *IndexBuildingManager* that will contain all the components we need for 
> building indexes.
> * *org.apache.ignite.internal.index.IndexBuildController* should not be an 
> *IgniteComponent* and should not close *IndexBuilder*, as the latter will be 
> used by two components.
> * *org.apache.ignite.internal.table.distributed.index.IndexBuilder* and 
> related classes should be moved to the *ignite-index* module.
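
The ownership arrangement proposed above, where a single manager owns the shared *IndexBuilder* and neither controller closes it, might look roughly like the sketch below. All class shapes are hypothetical; the real Ignite 3 classes have different constructors and responsibilities:

```java
/** Hypothetical: the shared builder, closed only by its single owner. */
class IndexBuilder implements AutoCloseable {
    boolean closed;
    @Override public void close() { closed = true; }
}

/** Hypothetical: uses the shared builder but deliberately never closes it. */
class IndexBuildController {
    private final IndexBuilder builder;
    IndexBuildController(IndexBuilder builder) { this.builder = builder; }
}

/** Hypothetical: the second consumer of the same shared builder. */
class IndexAvailabilityController {
    private final IndexBuilder builder;
    IndexAvailabilityController(IndexBuilder builder) { this.builder = builder; }
}

/** Hypothetical manager that keeps all index-building components together. */
class IndexBuildingManager implements AutoCloseable {
    final IndexBuilder builder = new IndexBuilder();
    final IndexBuildController buildController = new IndexBuildController(builder);
    final IndexAvailabilityController availabilityController =
            new IndexAvailabilityController(builder);

    // Sole owner of the builder, so only the manager closes it.
    @Override public void close() { builder.close(); }
}
```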



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-19813) Unable to execute TPCH Q2 query

2023-11-01 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich resolved IGNITE-19813.

Resolution: Cannot Reproduce

Seems to have been fixed under IGNITE-20714.

> Unable to execute TPCH Q2 query
> ---
>
> Key: IGNITE-19813
> URL: https://issues.apache.org/jira/browse/IGNITE-19813
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 3.0
>Reporter: Alexander Belyak
>Priority: Major
>  Labels: ignite-3
>
> Executing the standard TPC-H Q2 query takes several minutes. It works neither 
> as plain text nor as a PreparedStatement.
> {code:java}
> SELECT
>                 s_acctbal,
>                 s_name,
>                 n_name,
>                 p_partkey,
>                 p_mfgr,
>                 s_address,
>                 s_phone,
>                 s_comment
>              FROM
>                 part,
>                 supplier,
>                 partsupp,
>                 nation,
>                 region
>              WHERE
>                 p_partkey = ps_partkey
>                 AND s_suppkey = ps_suppkey
>                 AND p_size = 12
>                 AND p_type LIKE '%NICKEL'
>                 AND s_nationkey = n_nationkey
>                 AND n_regionkey = r_regionkey
>                 AND r_name = 'AFRICA'
>                 AND ps_supplycost =
>                 (
>                    SELECT
>                       MIN(ps_supplycost)
>                    FROM
>                       partsupp,
>                       supplier,
>                       nation,
>                       region
>                    WHERE
>                       p_partkey = ps_partkey
>                       AND s_suppkey = ps_suppkey
>                       AND s_nationkey = n_nationkey
>                       AND n_regionkey = r_regionkey
>                       AND r_name = 'AFRICA'
>                 )
>              ORDER BY
>                 s_acctbal DESC,
>                 n_name,
>                 s_name,
>                 p_partkey LIMIT 100 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20764) Sql. Improve script parsing to handle dynamic parameters for each statement

2023-11-01 Thread Pavel Pereslegin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781643#comment-17781643
 ] 

Pavel Pereslegin commented on IGNITE-20764:
---

[~zstan], [~korlov],
could you please review the proposed patch?

> Sql. Improve script parsing to handle dynamic parameters for each statement
> ---
>
> Key: IGNITE-20764
> URL: https://issues.apache.org/jira/browse/IGNITE-20764
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Pavel Pereslegin
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At the moment, there is only one implementation ({{ParserServiceImpl}}) of 
> the {{ParserService}}, and it is not entirely suitable for parsing scripts, 
> for several reasons:
> 1. According to the {{ParserService}} interface, it returns the result of 
> parsing only a single statement.
> 2. The service only counts the total number of dynamic parameters for the 
> whole script (not per statement).
> 3. The cache implementation is designed to cache a single expression.
> To implement the script execution logic (IGNITE-20443), it is recommended to 
> add a new method (or a second implementation) that will return a list of 
> parsing results for each statement of the script (with the correct number of 
> dynamic parameters).
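
The proposed per-statement result could look roughly like the sketch below. The type names are illustrative assumptions, and the naive splitting on ';' and counting of '?' markers stand in for real parsing; the actual service works on the Calcite AST and must handle literals and comments:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical result holder; the real ParserService types differ. */
record StatementParseResult(String sql, int dynamicParamCount) {}

class ScriptParser {
    /**
     * Splits a script into statements and counts dynamic parameters per
     * statement. Naive sketch: assumes ';' and '?' never occur inside
     * string literals or comments.
     */
    static List<StatementParseResult> parseScript(String script) {
        List<StatementParseResult> results = new ArrayList<>();
        for (String stmt : script.split(";")) {
            String sql = stmt.trim();
            if (sql.isEmpty()) {
                continue;
            }
            int params = 0;
            for (char c : sql.toCharArray()) {
                if (c == '?') {
                    params++;
                }
            }
            results.add(new StatementParseResult(sql, params));
        }
        return results;
    }
}
```

With this shape, each statement carries its own dynamic-parameter count, which is what the script execution logic in IGNITE-20443 needs for binding parameters statement by statement.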



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-16205) Calcite integration: Add JDBC tests when query cancelation is supported.

2023-11-01 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781627#comment-17781627
 ] 

Evgeny Stanilovsky commented on IGNITE-16205:
-

Seems cancellation is implemented now, so we can move forward? [~jooger] 

> Calcite integration: Add JDBC tests when query cancelation is supported.
> 
>
> Key: IGNITE-16205
> URL: https://issues.apache.org/jira/browse/IGNITE-16205
> Project: Ignite
>  Issue Type: New Feature
>  Components: jdbc
>Reporter: Vladimir Ermakov
>Priority: Major
>  Labels: ignite-3
>
> .



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-16205) Calcite integration: Add JDBC tests when query cancelation is supported.

2023-11-01 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated IGNITE-16205:

Ignite Flags:   (was: Docs Required,Release Notes Required)

> Calcite integration: Add JDBC tests when query cancelation is supported.
> 
>
> Key: IGNITE-16205
> URL: https://issues.apache.org/jira/browse/IGNITE-16205
> Project: Ignite
>  Issue Type: New Feature
>  Components: jdbc
>Reporter: Vladimir Ermakov
>Priority: Major
>  Labels: ignite-3
>
> .



--
This message was sent by Atlassian Jira
(v8.20.10#820010)