[jira] [Commented] (IGNITE-18879) Leaseholder candidates balancing

2024-02-04 Thread yexiaowei (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814069#comment-17814069
 ] 

yexiaowei commented on IGNITE-18879:


What is the reason PD's implementation does not work like TiDB, where heartbeats 
are sent to PD and PD then schedules operations, such as primary transfer or 
assignment changes, based on the status collected from those heartbeats?

> Leaseholder candidates balancing
> 
>
> Key: IGNITE-18879
> URL: https://issues.apache.org/jira/browse/IGNITE-18879
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> Primary replicas (leaseholders) should be evenly distributed across the cluster to 
> balance the transactional load between nodes. As the placement driver assigns 
> primary replicas, balancing them is also its responsibility. 
> A naive implementation of balancing should choose a node as a leaseholder 
> candidate so as to preserve an even lease distribution over all nodes. In a real 
> cluster, it may take into account slow nodes, hot table records, etc. If a 
> lease candidate declines the LeaseGrantMessage from the placement driver, the 
> balancer should decide whether to choose another candidate for the given primary 
> replica or enforce the previously chosen one. The balancing algorithm should 
> therefore be pluggable, so that we can improve, replace, or compare it with 
> others.
> *Definition of done*
> An interface for the lease candidates balancer is introduced, along with a simple 
> implementation that sustains an even lease distribution and is used by the placement 
> driver by default. No public or internal configuration is needed at this stage.
> *Implementation notes*
> Lease candidates balancer should have at least 2 methods:
>  - {_}get(group, ignoredNodes){_}: returns candidate for the given group, a 
> node from ignoredNodes set can't be chosen as a candidate
>  - {_}considerRedirectProposal(group, candidate, proposedCandidate){_}: 
> processes redirect proposal for given group provided by given candidate 
> (previously chosen using _get_ method), proposedCandidate is the alternative 
> candidate. Returns candidate that should be enforced by placement driver.
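
The two methods from the implementation notes can be sketched as follows. This is only an illustration, not the actual Ignite code: the names LeaseCandidatesBalancer and EvenDistributionBalancer, the use of plain strings for groups and nodes, and the acceptance rule for redirect proposals are all assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical pluggable balancer; method names follow the implementation notes. */
interface LeaseCandidatesBalancer {
    /** Returns a candidate for the group, never a node from ignoredNodes; null if none is eligible. */
    String get(String group, Set<String> ignoredNodes);

    /** Returns the candidate that the placement driver should enforce after a redirect proposal. */
    String considerRedirectProposal(String group, String candidate, String proposedCandidate);
}

/** Naive balancer (assumption): picks the eligible node that currently has the fewest leases. */
class EvenDistributionBalancer implements LeaseCandidatesBalancer {
    private final List<String> nodes;
    private final Map<String, Integer> leaseCounts = new HashMap<>();
    private final Map<String, String> assigned = new HashMap<>();

    EvenDistributionBalancer(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    @Override
    public String get(String group, Set<String> ignoredNodes) {
        String best = null;
        for (String node : nodes) {
            if (ignoredNodes.contains(node)) {
                continue;
            }
            if (best == null || count(node) < count(best)) {
                best = node;
            }
        }
        if (best != null) {
            assign(group, best);
        }
        return best;
    }

    @Override
    public String considerRedirectProposal(String group, String candidate, String proposedCandidate) {
        // Accept the proposal only if it names a known node and moving the lease
        // would not leave that node with more leases than the original candidate has now.
        if (nodes.contains(proposedCandidate) && count(proposedCandidate) + 1 <= count(candidate)) {
            assign(group, proposedCandidate);
            return proposedCandidate;
        }
        return candidate;
    }

    private int count(String node) {
        return leaseCounts.getOrDefault(node, 0);
    }

    /** Records the group's candidate, moving the lease count from the previous node if any. */
    private void assign(String group, String node) {
        String previous = assigned.put(group, node);
        if (previous != null) {
            leaseCounts.merge(previous, -1, Integer::sum);
        }
        leaseCounts.merge(node, 1, Integer::sum);
    }
}
```

Keeping the selection policy behind the interface is what makes it replaceable: a policy that accounts for slow nodes or hot records would only need another implementation of the same two methods.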



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (IGNITE-18879) Leaseholder candidates balancing

2024-02-04 Thread yexiaowei (Jira)


[ https://issues.apache.org/jira/browse/IGNITE-18879 ]


yexiaowei deleted comment on IGNITE-18879:


was (Author: JIRAUSER303976):
The reason why PD's implementation does not work like TIDB, which sends 
heartbeat to PD and then schedules based on the status collected by heartbeat, 
such as primary transfer or change assignments.

> Leaseholder candidates balancing
> 
>
> Key: IGNITE-18879
> URL: https://issues.apache.org/jira/browse/IGNITE-18879
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> Primary replicas (leaseholders) should be evenly distributed across the cluster to 
> balance the transactional load between nodes. As the placement driver assigns 
> primary replicas, balancing them is also its responsibility. 
> A naive implementation of balancing should choose a node as a leaseholder 
> candidate so as to preserve an even lease distribution over all nodes. In a real 
> cluster, it may take into account slow nodes, hot table records, etc. If a 
> lease candidate declines the LeaseGrantMessage from the placement driver, the 
> balancer should decide whether to choose another candidate for the given primary 
> replica or enforce the previously chosen one. The balancing algorithm should 
> therefore be pluggable, so that we can improve, replace, or compare it with 
> others.
> *Definition of done*
> An interface for the lease candidates balancer is introduced, along with a simple 
> implementation that sustains an even lease distribution and is used by the placement 
> driver by default. No public or internal configuration is needed at this stage.
> *Implementation notes*
> Lease candidates balancer should have at least 2 methods:
>  - {_}get(group, ignoredNodes){_}: returns candidate for the given group, a 
> node from ignoredNodes set can't be chosen as a candidate
>  - {_}considerRedirectProposal(group, candidate, proposedCandidate){_}: 
> processes redirect proposal for given group provided by given candidate 
> (previously chosen using _get_ method), proposedCandidate is the alternative 
> candidate. Returns candidate that should be enforced by placement driver.





[jira] [Commented] (IGNITE-21220) Worker node recovery

2024-02-04 Thread Dmitry Baranov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814163#comment-17814163
 ] 

Dmitry Baranov commented on IGNITE-21220:
-

In case of a network partition, the node is removed from the logical topology. To 
join the topology again, the node needs to be restarted, so its jobs are canceled 
during the restart. There is no need to process jobs on the initial worker node in 
case of an onNodeLeft event.

> Worker node recovery
> 
>
> Key: IGNITE-21220
> URL: https://issues.apache.org/jira/browse/IGNITE-21220
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Aleksandr
>Priority: Major
>  Labels: ignite-3
>
> There is a case when the worker node executing some job has left the logical 
> topology. This node has to identify this situation and stop all running jobs. 
> We must not have the situation when the worker left the topology, the 
> coordinator restarted the job on another node, then the first worker joined 
> the topology and two instances of the job are running on the cluster. 
> Maybe we do not allow this but we have a lack of tests here.





[jira] [Assigned] (IGNITE-21220) Worker node recovery

2024-02-04 Thread Dmitry Baranov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Baranov reassigned IGNITE-21220:
---

Assignee: Dmitry Baranov

> Worker node recovery
> 
>
> Key: IGNITE-21220
> URL: https://issues.apache.org/jira/browse/IGNITE-21220
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Aleksandr
>Assignee: Dmitry Baranov
>Priority: Major
>  Labels: ignite-3
>
> There is a case when the worker node executing some job has left the logical 
> topology. This node has to identify this situation and stop all running jobs. 
> We must not have the situation when the worker left the topology, the 
> coordinator restarted the job on another node, then the first worker joined 
> the topology and two instances of the job are running on the cluster. 
> Maybe we do not allow this but we have a lack of tests here.





[jira] [Resolved] (IGNITE-21220) Worker node recovery

2024-02-04 Thread Dmitry Baranov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Baranov resolved IGNITE-21220.
-
Resolution: Won't Fix

> Worker node recovery
> 
>
> Key: IGNITE-21220
> URL: https://issues.apache.org/jira/browse/IGNITE-21220
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Aleksandr
>Assignee: Dmitry Baranov
>Priority: Major
>  Labels: ignite-3
>
> There is a case when the worker node executing some job has left the logical 
> topology. This node has to identify this situation and stop all running jobs. 
> We must not have the situation when the worker left the topology, the 
> coordinator restarted the job on another node, then the first worker joined 
> the topology and two instances of the job are running on the cluster. 
> Maybe we do not allow this but we have a lack of tests here.





[jira] [Assigned] (IGNITE-21219) Write memory leak tests

2024-02-04 Thread Dmitry Baranov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Baranov reassigned IGNITE-21219:
---

Assignee: Dmitry Baranov

> Write memory leak tests
> ---
>
> Key: IGNITE-21219
> URL: https://issues.apache.org/jira/browse/IGNITE-21219
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute
>Reporter: Aleksandr
>Assignee: Dmitry Baranov
>Priority: Major
>  Labels: ignite-3
>
> The compute job handling logic is getting complex enough that it is hard to track 
> every reference we hold. It seems like we can easily introduce a memory leak. I 
> wonder if we have some microbenchmarks that prove the absence of memory leaks in 
> the compute component.
> Important note: several nodes (candidates, workers) leaving and joining the 
> cluster should be simulated as well.





[jira] [Updated] (IGNITE-20121) Start index destruction when it is removed from the Catalog

2024-02-04 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20121:
---
Epic Link: IGNITE-20473  (was: IGNITE-20782)

> Start index destruction when it is removed from the Catalog
> ---
>
> Key: IGNITE-20121
> URL: https://issues.apache.org/jira/browse/IGNITE-20121
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Index destruction, before starting, must wait until the partition’s SafeTime 
> becomes >= ‘Activation moment of index removal’ (aka ‘end time of STOPPING 
> state’ for this index). This is to avoid a race between operations on the 
> index (including writes and reads) and its destruction.
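
The wait described above can be sketched as follows. This is only an illustration under assumptions: SafeTimeTracker and IndexDestroyer are hypothetical names, SafeTime is modeled as a plain long, and the real partition SafeTime machinery in Ignite 3 is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

/** Hypothetical tracker of a partition's SafeTime, modeled as a monotonically growing long. */
class SafeTimeTracker {
    private long current;
    private final TreeMap<Long, CompletableFuture<Void>> waiters = new TreeMap<>();

    /** Returns a future that completes once SafeTime >= timestamp. */
    synchronized CompletableFuture<Void> waitFor(long timestamp) {
        if (current >= timestamp) {
            return CompletableFuture.completedFuture(null);
        }
        return waiters.computeIfAbsent(timestamp, t -> new CompletableFuture<>());
    }

    /** Advances SafeTime and releases every waiter whose timestamp has been reached. */
    void update(long newSafeTime) {
        List<CompletableFuture<Void>> ready;
        synchronized (this) {
            if (newSafeTime <= current) {
                return;
            }
            current = newSafeTime;
            NavigableMap<Long, CompletableFuture<Void>> reached = waiters.headMap(newSafeTime, true);
            ready = new ArrayList<>(reached.values());
            reached.clear();
        }
        // Complete outside the lock so dependent actions do not run while holding it.
        ready.forEach(f -> f.complete(null));
    }
}

class IndexDestroyer {
    /** Runs the destroy action only after SafeTime reaches the activation moment of index removal. */
    static CompletableFuture<Void> destroyWhenSafe(
            SafeTimeTracker safeTime, long removalActivationTime, Runnable destroy) {
        return safeTime.waitFor(removalActivationTime).thenRun(destroy);
    }
}
```

Because destruction is chained on the future, any operation with a timestamp below the activation moment is guaranteed to have been accounted for by SafeTime before the destroy action starts.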





[jira] [Updated] (IGNITE-19150) Implement proper destruction of indexes

2024-02-04 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-19150:
---
Epic Link: IGNITE-20473  (was: IGNITE-20782)

> Implement proper destruction of indexes
> ---
>
> Key: IGNITE-19150
> URL: https://issues.apache.org/jira/browse/IGNITE-19150
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> At the moment, we do not actually destroy indexes anywhere in the core code; 
> we simply remove them from the collection of indexes when inserting data.





[jira] [Updated] (IGNITE-21371) Fix ItDmlTest#testMerge

2024-02-04 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-21371:
-
Epic Link: IGNITE-21389

> Fix ItDmlTest#testMerge
> ---
>
> Key: IGNITE-21371
> URL: https://issues.apache.org/jira/browse/IGNITE-21371
> Project: Ignite
>  Issue Type: Bug
>Reporter: Kirill Tkalenko
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> It was discovered that in 
> *org.apache.ignite.internal.sql.engine.ItDmlTest#testMerge*, *MERGE* queries 
> are executed in an implicit transaction that does not decrease the RW 
> transactions counter in 
> *org.apache.ignite.internal.index.IndexNodeFinishedRwTransactionsChecker#decrementRwTxCount*.
> Because of this, tests in which indexes are created can freeze.
> If the test is enabled, 
> *org.apache.ignite.internal.sql.engine.ItDmlTest#rangeReadAndExclusiveInsert* 
> starts to fail.
> h3. What was found:
> For an implicit RW transaction, *txCoordinatorId* may change; because of this, 
> the counter cannot decrease.
> This happens because *org.apache.ignite.internal.tx.TxStateMeta* is 
> updated in 
> *org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener#processRequest*,
>  which changes *TxStateMeta#txCoordinatorId* to a different value even if the 
> transaction is processed on the node that created it.
> Presumably this is because fragments of the transaction are 
> executed on different nodes; adding *txCoordinatorId* to the messages may be 
> enough to fix it.





[jira] [Updated] (IGNITE-21289) .NET: Thin 3.0: implement job execution interface

2024-02-04 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-21289:

Summary: .NET: Thin 3.0: implement job execution interface  (was: .NET thin 
client: implement job execution interface)

> .NET: Thin 3.0: implement job execution interface
> -
>
> Key: IGNITE-21289
> URL: https://issues.apache.org/jira/browse/IGNITE-21289
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute, platforms, thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Vadim Pakhnushev
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET, ignite-3
> Fix For: 3.0.0-beta2
>
>
> Implement {{JobExecution}} interface on the client.





[jira] [Commented] (IGNITE-21102) Incorrect cluster state output for ACTIVE_READ_ONLY in --baseline

2024-02-04 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814196#comment-17814196
 ] 

Ignite TC Bot commented on IGNITE-21102:


{panel:title=Branch: [pull/11138/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET (Core Linux){color} [[tests 0 TIMEOUT , Exit Code 
, TC_SERVICE_MESSAGE 
|https://ci2.ignite.apache.org/viewLog.html?buildId=7732645]]

{panel}
{panel:title=Branch: [pull/11138/head] Base: [master] : New Tests 
(12)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Thin Client: Java{color} [[tests 
1|https://ci2.ignite.apache.org/viewLog.html?buildId=7725289]]
* {color:#013220}ClientTestSuite: 
ServiceAwarenessTest.testServiceAwarenessEnabled - PASSED{color}

{color:#8b}Calcite SQL{color} [[tests 
5|https://ci2.ignite.apache.org/viewLog.html?buildId=7725227]]
* {color:#013220}IgniteCalciteTestSuite: 
SearchSargOnIndexIntegrationTest.testNulls - PASSED{color}
* {color:#013220}IgniteCalciteTestSuite: 
SearchSargOnIndexIntegrationTest.testNot - PASSED{color}
* {color:#013220}IgniteCalciteTestSuite: 
SearchSargOnIndexIntegrationTest.testOrdering - PASSED{color}
* {color:#013220}IgniteCalciteTestSuite: 
SearchSargOnIndexIntegrationTest.testIn - PASSED{color}
* {color:#013220}IgniteCalciteTestSuite: 
SearchSargOnIndexIntegrationTest.testRange - PASSED{color}

{color:#8b}Control Utility 1{color} [[tests 
4|https://ci2.ignite.apache.org/viewLog.html?buildId=7725237]]
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testClusterStateInBaselineCommand[cmdHnd=jmx] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testClusterStateInBaselineCommand[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerWithSslTest.testClusterStateInBaselineCommand[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerWithSslFactoryTest.testClusterStateInBaselineCommand[cmdHnd=cli]
 - PASSED{color}

{color:#8b}Control Utility (Zookeeper){color} [[tests 
2|https://ci2.ignite.apache.org/viewLog.html?buildId=7725238]]
* {color:#013220}ZookeeperIgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testClusterStateInBaselineCommand[cmdHnd=jmx] - 
PASSED{color}
* {color:#013220}ZookeeperIgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testClusterStateInBaselineCommand[cmdHnd=cli] - 
PASSED{color}

{panel}
[TeamCity *--> Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7725304&buildTypeId=IgniteTests24Java8_RunAll]

> Incorrect cluster state output for ACTIVE_READ_ONLY in --baseline
> -
>
> Key: IGNITE-21102
> URL: https://issues.apache.org/jira/browse/IGNITE-21102
> Project: Ignite
>  Issue Type: Bug
>Reporter: Julia Bakulina
>Assignee: Oleg Valuyskiy
>Priority: Major
>  Labels: ise
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Incorrect cluster state output for ACTIVE_READ_ONLY in --baseline.
> org.apache.ignite.internal.commandline.BaselineCommand#baselinePrint0
> {code:java}
> logger.info("Cluster state: " + (res.isActive() ? "active" : 
> "inactive"));
> {code}
> org.apache.ignite.cluster.ClusterState#ACTIVE_READ_ONLY
>  
> An example of changing the cluster state:
> {code:java}
> Command [SET-STATE] started
> Arguments: ... --set-state ACTIVE_READ_ONLY
> 
> Cluster state changed to ACTIVE_READ_ONLY
> Command [SET-STATE] finished with code: 0 {code}
> Cluster state in control.sh --baseline
> {code:java}
> Command [BASELINE] started
> Arguments: ... --baseline
> 
> Cluster state: active
> Current topology version: 1
> Baseline auto adjustment disabled:...
> Current topology version: 1 (...)
> Baseline nodes:
>     ...
> 
> Number of baseline nodes: 1
> Other nodes not found.
> Command [BASELINE] finished with code: 0 {code}
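
The effect of the boolean check quoted above, and one possible alternative, can be sketched as follows. This is only an illustration: the enum is a local stand-in mirroring the values of org.apache.ignite.cluster.ClusterState, BaselineStatePrinter is a hypothetical helper, and the output format of the actual fix is an assumption.

```java
/** Local stand-in mirroring the values of org.apache.ignite.cluster.ClusterState. */
enum ClusterState { INACTIVE, ACTIVE, ACTIVE_READ_ONLY }

/** Hypothetical helper contrasting the reported output with a state-aware one. */
class BaselineStatePrinter {
    /** Reported behavior: the state is collapsed to a boolean, so ACTIVE_READ_ONLY prints as "active". */
    static String legacyLine(ClusterState state) {
        boolean active = state != ClusterState.INACTIVE;
        return "Cluster state: " + (active ? "active" : "inactive");
    }

    /** Possible alternative (assumed format): print the enum value itself so read-only mode stays visible. */
    static String stateLine(ClusterState state) {
        return "Cluster state: " + state;
    }
}
```

With this sketch, `BaselineStatePrinter.stateLine(ClusterState.ACTIVE_READ_ONLY)` yields `Cluster state: ACTIVE_READ_ONLY`, while the legacy variant yields `Cluster state: active` for the same input.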





[jira] [Updated] (IGNITE-21441) ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is timed out [replicaGrpId=6_part_5]

2024-02-04 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-21441:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is 
> timed out [replicaGrpId=6_part_5]
> ---
>
> Key: IGNITE-21441
> URL: https://issues.apache.org/jira/browse/IGNITE-21441
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Priority: Major
>
> Similar to IGNITE-21394 but with CancellationException as a root cause 
> instead of TimeoutException.
> {code:java}
> Replication is timed out [replicaGrpId=6_part_5]
> org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:47cb7cb4-3e8d-40ce-8a2f-55d13bb2c798 Replication is timed out 
> [replicaGrpId=6_part_5] {code}
> Possible root cause
> {code:java}
>       
> [2024-02-02T09:47:03,851][ERROR][%isctvt_tanc_3346%Raft-Group-Client-11][WatchProcessor]
>  Error occurred when processing a watch event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0].
>         at 
> org.apache.ignite.internal.table.distributed.TableManager.lambda$changePeersOnRebalance$96(TableManager.java:1844)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2398) 
> ~[?:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:543)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$41(RaftGroupServiceImpl.java:605)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>         at java.lang.Thread.run(Thread.java:834) [?:?]
>       Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
>  ~[?:?]
>         ... 10 more
>       Caused by: java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) 
> ~[?:?]
>         ... 8 more
>       [2024-02-02T09:47:03,852][WARN 
> ][%isctvt_tanc_3346%Raft-Group-Client-11][TableManager] Unable to process 
> pending assignments event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0]. {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820437?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&showLog=7820408_2572_91.2439.2498&logFilter=debug&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandCode+Inspection=true&logView=flowAware]
> Failed locally 1 out of 100.
> h3. Implementation Notes
> It seems that we should handle not only TimeoutException when retrieving the leader 
> within watch event processing.
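
One way to treat both exception types uniformly is sketched below. This is only an illustration under assumptions: LeaderRetrievalErrors and isRecoverable are hypothetical names, and what the caller should do for a recoverable failure (retry, skip the event, and so on) is not specified by the ticket.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for classifying failures of a leader-retrieval future. */
class LeaderRetrievalErrors {
    /** Strips CompletionException/ExecutionException wrappers to reach the underlying cause. */
    static Throwable unwrap(Throwable t) {
        while ((t instanceof CompletionException || t instanceof ExecutionException)
                && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    /** Treats a canceled future the same way as a timed-out one. */
    static boolean isRecoverable(Throwable t) {
        Throwable cause = unwrap(t);
        return cause instanceof TimeoutException || cause instanceof CancellationException;
    }
}
```

The unwrapping step matters because, as in the trace above, the CancellationException arrives wrapped in a CompletionException, so a direct instanceof check on the outer throwable would miss it.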
>  





[jira] [Updated] (IGNITE-21441) ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is timed out [replicaGrpId=6_part_5]

2024-02-04 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-21441:
-
Labels: ignite-3  (was: )

> ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is 
> timed out [replicaGrpId=6_part_5]
> ---
>
> Key: IGNITE-21441
> URL: https://issues.apache.org/jira/browse/IGNITE-21441
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>
> Similar to IGNITE-21394 but with CancellationException as a root cause 
> instead of TimeoutException.
> {code:java}
> Replication is timed out [replicaGrpId=6_part_5]
> org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:47cb7cb4-3e8d-40ce-8a2f-55d13bb2c798 Replication is timed out 
> [replicaGrpId=6_part_5] {code}
> Possible root cause
> {code:java}
>       
> [2024-02-02T09:47:03,851][ERROR][%isctvt_tanc_3346%Raft-Group-Client-11][WatchProcessor]
>  Error occurred when processing a watch event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0].
>         at 
> org.apache.ignite.internal.table.distributed.TableManager.lambda$changePeersOnRebalance$96(TableManager.java:1844)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2398) 
> ~[?:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:543)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$41(RaftGroupServiceImpl.java:605)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>         at java.lang.Thread.run(Thread.java:834) [?:?]
>       Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
>  ~[?:?]
>         ... 10 more
>       Caused by: java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) 
> ~[?:?]
>         ... 8 more
>       [2024-02-02T09:47:03,852][WARN 
> ][%isctvt_tanc_3346%Raft-Group-Client-11][TableManager] Unable to process 
> pending assignments event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0]. {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820437?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&showLog=7820408_2572_91.2439.2498&logFilter=debug&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandCode+Inspection=true&logView=flowAware]
> Failed locally 1 out of 100.
> h3. Implementation Notes
> It seems that we should handle not only TimeoutException when retrieving the leader 
> within watch event processing.
>  





[jira] [Created] (IGNITE-21441) ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is timed out [replicaGrpId=6_part_5]

2024-02-04 Thread Alexander Lapin (Jira)
Alexander Lapin created IGNITE-21441:


 Summary: ItSchemaChangeTableViewTest#testAddNewColumn is flaky 
with Replication is timed out [replicaGrpId=6_part_5]
 Key: IGNITE-21441
 URL: https://issues.apache.org/jira/browse/IGNITE-21441
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Lapin


Similar to IGNITE-21394 but with CancellationException as a root cause instead 
of TimeoutException.
{code:java}
Replication is timed out [replicaGrpId=6_part_5]
org.apache.ignite.tx.TransactionException: IGN-REP-3 
TraceId:47cb7cb4-3e8d-40ce-8a2f-55d13bb2c798 Replication is timed out 
[replicaGrpId=6_part_5] {code}
Possible root cause
{code:java}
      
[2024-02-02T09:47:03,851][ERROR][%isctvt_tanc_3346%Raft-Group-Client-11][WatchProcessor]
 Error occurred when processing a watch event
      org.apache.ignite.internal.lang.IgniteInternalException: Failed to get a 
leader for the RAFT replication group [get=6_part_0].
        at 
org.apache.ignite.internal.table.distributed.TableManager.lambda$changePeersOnRebalance$96(TableManager.java:1844)
 ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
        at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
 ~[?:?]
        at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
 ~[?:?]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
        at 
java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2398) 
~[?:?]
        at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:543)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
        at 
org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$41(RaftGroupServiceImpl.java:605)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
 [?:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
      Caused by: java.util.concurrent.CompletionException: 
java.util.concurrent.CancellationException
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
 ~[?:?]
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
 ~[?:?]
        at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
 ~[?:?]
        ... 10 more
      Caused by: java.util.concurrent.CancellationException
        at 
java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) 
~[?:?]
        ... 8 more
      [2024-02-02T09:47:03,852][WARN 
][%isctvt_tanc_3346%Raft-Group-Client-11][TableManager] Unable to process 
pending assignments event
      org.apache.ignite.internal.lang.IgniteInternalException: Failed to get a 
leader for the RAFT replication group [get=6_part_0]. {code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820437?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&showLog=7820408_2572_91.2439.2498&logFilter=debug&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandCode+Inspection=true&logView=flowAware]

Failed locally 1 out of 100.
h3. Implementation Notes

It seems that we should handle not only TimeoutException when retrieving the leader 
within watch event processing.

 





[jira] [Updated] (IGNITE-21441) ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is timed out [replicaGrpId=6_part_5]

2024-02-04 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-21441:
-
Epic Link: IGNITE-21389

> ItSchemaChangeTableViewTest#testAddNewColumn is flaky with Replication is 
> timed out [replicaGrpId=6_part_5]
> ---
>
> Key: IGNITE-21441
> URL: https://issues.apache.org/jira/browse/IGNITE-21441
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>
> Similar to IGNITE-21394 but with CancellationException as a root cause 
> instead of TimeoutException.
> {code:java}
> Replication is timed out [replicaGrpId=6_part_5]
> org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:47cb7cb4-3e8d-40ce-8a2f-55d13bb2c798 Replication is timed out 
> [replicaGrpId=6_part_5] {code}
> Possible root cause
> {code:java}
>       
> [2024-02-02T09:47:03,851][ERROR][%isctvt_tanc_3346%Raft-Group-Client-11][WatchProcessor]
>  Error occurred when processing a watch event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0].
>         at 
> org.apache.ignite.internal.table.distributed.TableManager.lambda$changePeersOnRebalance$96(TableManager.java:1844)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2398) 
> ~[?:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:543)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$handleThrowable$41(RaftGroupServiceImpl.java:605)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>         at java.lang.Thread.run(Thread.java:834) [?:?]
>       Caused by: java.util.concurrent.CompletionException: 
> java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
>  ~[?:?]
>         at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
>  ~[?:?]
>         ... 10 more
>       Caused by: java.util.concurrent.CancellationException
>         at 
> java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) 
> ~[?:?]
>         ... 8 more
>       [2024-02-02T09:47:03,852][WARN 
> ][%isctvt_tanc_3346%Raft-Group-Client-11][TableManager] Unable to process 
> pending assignments event
>       org.apache.ignite.internal.lang.IgniteInternalException: Failed to get 
> a leader for the RAFT replication group [get=6_part_0]. {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820437?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildTestsSection=true&showLog=7820408_2572_91.2439.2498&logFilter=debug&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandCode+Inspection=true&logView=flowAware]
> Failed locally in 1 out of 100 runs.
> h3. Implementation Notes
> It seems that we should handle not only TimeoutException but also other 
> failures, such as CancellationException, when retrieving the leader within 
> watch event processing.
>  





[jira] [Created] (IGNITE-21442) DefaultMessagingServiceTest.sendMessagesTwoChannels() hangs

2024-02-04 Thread Roman Puchkovskiy (Jira)
Roman Puchkovskiy created IGNITE-21442:
--

Summary: DefaultMessagingServiceTest.sendMessagesTwoChannels() hangs
Key: IGNITE-21442
URL: https://issues.apache.org/jira/browse/IGNITE-21442
Project: Ignite
Issue Type: Bug
Reporter: Roman Puchkovskiy
Assignee: Roman Puchkovskiy
Fix For: 3.0.0-beta2








[jira] [Updated] (IGNITE-21443) Ignite 3 can hangs on starting

2024-02-04 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-21443:
---
Fix Version/s: 3.0.0-beta2

> Ignite 3 can hangs on starting
> --
>
> Key: IGNITE-21443
> URL: https://issues.apache.org/jira/browse/IGNITE-21443
> Project: Ignite
> Issue Type: Bug
> Reporter: Yury Gerzhedovich
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> If something goes wrong, Ignite 3 startup just hangs instead of finishing and 
> returning control to the user. An example is below.
> {code:java}
> (base) 
> ygerzhedovich@ygerzhedovich-ThinkPad-P52s:~/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/bin$
>  ./ignite3db start
> Starting Ignite 3...
> SLF4J: No SLF4J providers were found.
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
> TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 Unable to start 
> [node=defaultNode]
>     at 
> org.apache.ignite.internal.app.IgniteImpl.handleStartException(IgniteImpl.java:1067)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:1056)
>     at 
> org.apache.ignite.internal.app.IgnitionImpl.doStart(IgnitionImpl.java:198)
>     at org.apache.ignite.internal.app.IgnitionImpl.start(IgnitionImpl.java:99)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:72)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:51)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:48)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:35)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
>     at picocli.CommandLine.access$1500(CommandLine.java:148)
>     at 
> picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
>     at 
> picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
>     at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
>     at picocli.CommandLine.execute(CommandLine.java:2170)
>     at org.apache.ignite.internal.app.IgniteRunner.start(IgniteRunner.java:60)
>     at org.apache.ignite.internal.app.IgniteRunner.main(IgniteRunner.java:74)
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
> IGN-CMN-65535 TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:111)
>     at 
> org.apache.ignite.internal.vault.VaultManager.start(VaultManager.java:52)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.startComponent(LifecycleManager.java:79)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:937)
>     ... 16 more
> Caused by: org.rocksdb.RocksDBException: While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at org.rocksdb.RocksDB.open(Native Method)
>     at org.rocksdb.RocksDB.open(RocksDB.java:249)
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:109)
>     ... 19 more
>  {code}





[jira] [Updated] (IGNITE-21443) Ignite 3 can hangs on starting

2024-02-04 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-21443:
---
Labels: ignite-3  (was: )

> Ignite 3 can hangs on starting
> --
>
> Key: IGNITE-21443
> URL: https://issues.apache.org/jira/browse/IGNITE-21443
> Project: Ignite
> Issue Type: Bug
> Reporter: Yury Gerzhedovich
> Priority: Major
> Labels: ignite-3
>
> If something goes wrong, Ignite 3 startup just hangs instead of finishing and 
> returning control to the user. An example is below.
> {code:java}
> (base) 
> ygerzhedovich@ygerzhedovich-ThinkPad-P52s:~/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/bin$
>  ./ignite3db start
> Starting Ignite 3...
> SLF4J: No SLF4J providers were found.
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
> TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 Unable to start 
> [node=defaultNode]
>     at 
> org.apache.ignite.internal.app.IgniteImpl.handleStartException(IgniteImpl.java:1067)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:1056)
>     at 
> org.apache.ignite.internal.app.IgnitionImpl.doStart(IgnitionImpl.java:198)
>     at org.apache.ignite.internal.app.IgnitionImpl.start(IgnitionImpl.java:99)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:72)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:51)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:48)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:35)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
>     at picocli.CommandLine.access$1500(CommandLine.java:148)
>     at 
> picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
>     at 
> picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
>     at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
>     at picocli.CommandLine.execute(CommandLine.java:2170)
>     at org.apache.ignite.internal.app.IgniteRunner.start(IgniteRunner.java:60)
>     at org.apache.ignite.internal.app.IgniteRunner.main(IgniteRunner.java:74)
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
> IGN-CMN-65535 TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:111)
>     at 
> org.apache.ignite.internal.vault.VaultManager.start(VaultManager.java:52)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.startComponent(LifecycleManager.java:79)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:937)
>     ... 16 more
> Caused by: org.rocksdb.RocksDBException: While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at org.rocksdb.RocksDB.open(Native Method)
>     at org.rocksdb.RocksDB.open(RocksDB.java:249)
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:109)
>     ... 19 more
>  {code}





[jira] [Created] (IGNITE-21443) Ignite 3 can hangs on starting

2024-02-04 Thread Yury Gerzhedovich (Jira)
Yury Gerzhedovich created IGNITE-21443:
--

Summary: Ignite 3 can hangs on starting
Key: IGNITE-21443
URL: https://issues.apache.org/jira/browse/IGNITE-21443
Project: Ignite
Issue Type: Bug
Reporter: Yury Gerzhedovich


If something goes wrong, Ignite 3 startup just hangs instead of finishing and 
returning control to the user. An example is below.
{code:java}
(base) 
ygerzhedovich@ygerzhedovich-ThinkPad-P52s:~/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/bin$
 ./ignite3db start
Starting Ignite 3...
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 Unable to start [node=defaultNode]
    at 
org.apache.ignite.internal.app.IgniteImpl.handleStartException(IgniteImpl.java:1067)
    at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:1056)
    at 
org.apache.ignite.internal.app.IgnitionImpl.doStart(IgnitionImpl.java:198)
    at org.apache.ignite.internal.app.IgnitionImpl.start(IgnitionImpl.java:99)
    at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:72)
    at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:51)
    at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:48)
    at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:35)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
    at picocli.CommandLine.access$1500(CommandLine.java:148)
    at 
picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
    at 
picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
    at picocli.CommandLine.execute(CommandLine.java:2170)
    at org.apache.ignite.internal.app.IgniteRunner.start(IgniteRunner.java:60)
    at org.apache.ignite.internal.app.IgniteRunner.main(IgniteRunner.java:74)
Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
IGN-CMN-65535 TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 While lock file: 
/home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
 Resource temporarily unavailable
    at 
org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:111)
    at org.apache.ignite.internal.vault.VaultManager.start(VaultManager.java:52)
    at 
org.apache.ignite.internal.app.LifecycleManager.startComponent(LifecycleManager.java:79)
    at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:937)
    ... 16 more
Caused by: org.rocksdb.RocksDBException: While lock file: 
/home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
 Resource temporarily unavailable
    at org.rocksdb.RocksDB.open(Native Method)
    at org.rocksdb.RocksDB.open(RocksDB.java:249)
    at 
org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:109)
    ... 19 more
 {code}
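
For illustration only, a minimal sketch of the expected behaviour, assuming a hypothetical launcher that starts components in order. The Component interface and startAll method are invented for this example and are not the Ignite API, and the sketch does not claim to identify the actual cause of the hang. When a component fails to start, the components started so far are stopped in reverse order and a non-zero code is returned, so the caller can exit instead of hanging.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class StartOrFail {
    /** Hypothetical component contract; not the Ignite API. */
    interface Component {
        void start() throws Exception;

        void stop();
    }

    /** Starts components in order; on failure stops the started ones in reverse order and returns 1. */
    static int startAll(List<Component> components) {
        Deque<Component> started = new ArrayDeque<>();
        try {
            for (Component c : components) {
                c.start();
                started.push(c);
            }
            return 0;
        } catch (Exception e) {
            System.err.println("Unable to start: " + e.getMessage());
            // Release whatever was already started so nothing keeps the process alive.
            while (!started.isEmpty()) {
                started.pop().stop();
            }
            return 1;
        }
    }

    public static void main(String[] args) {
        Component ok = new Component() {
            @Override public void start() { System.out.println("component started"); }
            @Override public void stop() { System.out.println("component stopped"); }
        };
        Component failing = new Component() {
            @Override public void start() throws Exception {
                throw new Exception("lock file: Resource temporarily unavailable");
            }
            @Override public void stop() { }
        };

        int code = startAll(List.of(ok, failing));
        System.out.println("exit code " + code);
    }
}
```

A real launcher would typically pass the returned code to System.exit so that the shell regains control even if some background threads are still running.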





[jira] [Updated] (IGNITE-21443) Ignite 3 can hangs on starting

2024-02-04 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-21443:
---
Priority: Critical  (was: Major)

> Ignite 3 can hangs on starting
> --
>
> Key: IGNITE-21443
> URL: https://issues.apache.org/jira/browse/IGNITE-21443
> Project: Ignite
> Issue Type: Bug
> Reporter: Yury Gerzhedovich
> Priority: Critical
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> If something goes wrong, Ignite 3 startup just hangs instead of finishing and 
> returning control to the user. An example is below.
> {code:java}
> (base) 
> ygerzhedovich@ygerzhedovich-ThinkPad-P52s:~/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/bin$
>  ./ignite3db start
> Starting Ignite 3...
> SLF4J: No SLF4J providers were found.
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
> TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 Unable to start 
> [node=defaultNode]
>     at 
> org.apache.ignite.internal.app.IgniteImpl.handleStartException(IgniteImpl.java:1067)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:1056)
>     at 
> org.apache.ignite.internal.app.IgnitionImpl.doStart(IgnitionImpl.java:198)
>     at org.apache.ignite.internal.app.IgnitionImpl.start(IgnitionImpl.java:99)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:72)
>     at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:51)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:48)
>     at org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:35)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
>     at picocli.CommandLine.access$1500(CommandLine.java:148)
>     at 
> picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
>     at 
> picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
>     at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
>     at picocli.CommandLine.execute(CommandLine.java:2170)
>     at org.apache.ignite.internal.app.IgniteRunner.start(IgniteRunner.java:60)
>     at org.apache.ignite.internal.app.IgniteRunner.main(IgniteRunner.java:74)
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
> IGN-CMN-65535 TraceId:2248ddcc-80d5-434a-b4c8-fefdcda633c1 While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:111)
>     at 
> org.apache.ignite.internal.vault.VaultManager.start(VaultManager.java:52)
>     at 
> org.apache.ignite.internal.app.LifecycleManager.startComponent(LifecycleManager.java:79)
>     at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:937)
>     ... 16 more
> Caused by: org.rocksdb.RocksDBException: While lock file: 
> /home/ygerzhedovich/Desktop/Ignite_3_distrib/gridgain-db-9.0.0-ea5/work/vault/LOCK:
>  Resource temporarily unavailable
>     at org.rocksdb.RocksDB.open(Native Method)
>     at org.rocksdb.RocksDB.open(RocksDB.java:249)
>     at 
> org.apache.ignite.internal.vault.persistence.PersistentVaultService.start(PersistentVaultService.java:109)
>     ... 19 more
>  {code}


