[ 
https://issues.apache.org/jira/browse/IGNITE-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411795#comment-17411795
 ] 

Pavel Pereslegin edited comment on IGNITE-15300 at 9/8/21, 8:50 AM:
--------------------------------------------------------------------

The test hangs when the restore process is initiated from node 1, whose 
communication is later blocked (and cannot be unblocked).
The test fails intermittently due to a state sync issue: we cancel the process 
on two nodes, but wait for completion only on the initiator (this has been 
fixed in IGNITE-14794).

It looks like the patch proposed in IGNITE-14794 fixes this completely.
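
To illustrate the state sync issue described above, here is a minimal sketch 
using only java.util.concurrent. It is not the IGNITE-14794 patch and does not 
use Ignite API: the class name, both method names, and the convention that 
index 0 is the initiator are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of a per-node cancel operation (not Ignite API). */
class CancelSyncSketch {
    /** Flaky pattern: considered done once the initiator alone has finished. */
    static CompletableFuture<Void> awaitInitiatorOnly(List<CompletableFuture<Void>> perNode) {
        // Assumption for this sketch: index 0 is the initiator node.
        return perNode.get(0);
    }

    /** Stable pattern: considered done only after every node has finished. */
    static CompletableFuture<Void> awaitAllNodes(List<CompletableFuture<Void>> perNode) {
        return CompletableFuture.allOf(perNode.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Void> initiator = new CompletableFuture<>();
        CompletableFuture<Void> other = new CompletableFuture<>();
        List<CompletableFuture<Void>> nodes = List.of(initiator, other);

        // The initiator finishes cancelling first; the other node is still busy.
        initiator.complete(null);

        System.out.println(awaitInitiatorOnly(nodes).isDone()); // true
        System.out.println(awaitAllNodes(nodes).isDone());      // false

        // Only after the second node finishes is the cluster state consistent.
        other.complete(null);

        // A bounded wait fails with TimeoutException instead of parking forever.
        awaitAllNodes(nodes).get(5, TimeUnit.SECONDS);
        System.out.println(awaitAllNodes(nodes).isDone());      // true
    }
}
```

An assertion made right after the initiator-only wait can observe the second 
node still cancelling, which would match the intermittent failure.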

Checked it on TeamCity (the problem is hard to reproduce locally); the 
[suite was started 80+ times|https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_ControlUtilityZookeeper&tab=buildTypeHistoryList&branch_IgniteTests24Java8=pull%2F9186%2Fhead].

Execution timeouts (not related to this issue) - 2 times.
testBaselineCollectCrd - 6 failures.
testBaselineCollect - 1 failure.
testSnapshotRestoreCancelAndStatus - *0* failures.



> Test testSnapshotRestoreCancelAndStatus flaky in ZooKeeper SPI environment
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-15300
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15300
>             Project: Ignite
>          Issue Type: Test
>            Reporter: Maxim Muzafarov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-43
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://ci.ignite.apache.org/viewLog.html?buildId=6123288&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ControlUtilityZookeeper#testNameId-4389213602152674112
> {code}
> [2021-08-09 22:59:49,757][ERROR][main][root] Test failed [test=GridCommandHandlerTest#testSnapshotRestoreCancelAndStatus, duration=16514]
> java.lang.AssertionError
>       at org.junit.Assert.fail(Assert.java:86)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.junit.Assert.assertTrue(Assert.java:52)
>       at org.apache.ignite.testframework.GridTestUtils.assertContains(GridTestUtils.java:391)
>       at org.apache.ignite.util.GridCommandHandlerTest.testSnapshotRestoreCancelAndStatus(GridCommandHandlerTest.java:3312)
>       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2432)
> {code}
> Sometimes the ZooKeeper suite hangs on this test ([execution 
> timeout|https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_ControlUtilityZookeeper&tab=buildTypeHistoryList&branch_IgniteTests24Java8=%3Cdefault%3E&state=failed])
> with the following stack trace.
> {noformat}
> "rest-#15365%gridCommandHandlerTest0%" #16591 prio=5 os_prio=0 tid=0x00007f7e7842b800 nid=0x1a79 waiting on condition [0x00007f7e30416000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>       at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>       at org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:152)
>       at org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotRestoreCancelTask$1.execute(SnapshotRestoreCancelTask.java:43)
>       at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:601)
>       at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7270)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:595)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:522)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1305)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1435)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:665)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:535)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:834)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:448)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:427)
>       at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.executeRestoreManagementTask(IgniteSnapshotManager.java:1743)
>       at org.apache.ignite.internal.processors.cache.persistence.snapshot.IgniteSnapshotManager.cancelSnapshotRestore(IgniteSnapshotManager.java:1008)
>       at org.apache.ignite.internal.visor.snapshot.VisorSnapshotRestoreTask$VisorSnapshotRestoreCancelJob.run(VisorSnapshotRestoreTask.java:93)
>       at org.apache.ignite.internal.visor.snapshot.VisorSnapshotRestoreTask$VisorSnapshotRestoreCancelJob.run(VisorSnapshotRestoreTask.java:79)
>       at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
>       at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:601)
>       at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7270)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:595)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:522)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1305)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1435)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:665)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:535)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:834)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:568)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:548)
>       at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:223)
>       at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsync(GridTaskCommandHandler.java:162)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest0(GridRestProcessor.java:316)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest(GridRestProcessor.java:302)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.access$000(GridRestProcessor.java:107)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor$2.body(GridRestProcessor.java:188)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
