[ 
https://issues.apache.org/jira/browse/IGNITE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713064#comment-17713064
 ] 

Julia Bakulina commented on IGNITE-17383:
-----------------------------------------

Two checks are needed on the cluster state and the persistent data region.

What is still to be done:
 * 1st check - add a check for an inactive cluster in 
IdleVerify.execute(), i.e. before executeTaskByNameOnNode(), in order to fail 
fast;
 * 2nd check - already done;
 * if somebody changes the cluster state after the 1st check, handle it the same 
way as other errors, i.e. return code OK and write the error into idle_verify.txt.
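
The list above can be sketched roughly as follows. This is only an illustration with stand-in types, not the actual IdleVerify code: the class IdleVerifySketch, the State enum, the Outcome class, the message strings and the choice of an exception for the fast failure are all hypothetical, and the two suppliers merely simulate reading the cluster state before the task starts and again inside the task.

```java
import java.util.function.Supplier;

/** Sketch only: models the two checks with stand-in types, not Ignite's classes. */
class IdleVerifySketch {
    /** Hypothetical stand-in for the cluster state. */
    enum State { ACTIVE, INACTIVE }

    /** Hypothetical result: exit code plus the text that would go to idle_verify.txt. */
    static final class Outcome {
        final int exitCode;
        final String report;

        Outcome(int exitCode, String report) {
            this.exitCode = exitCode;
            this.report = report;
        }
    }

    /** Assumed value of the OK return code. */
    static final int EXIT_OK = 0;

    static Outcome execute(Supplier<State> stateBeforeTask, Supplier<State> stateInsideTask) {
        // 1st check: fail fast before the task is sent to a node.
        // Throwing here is an assumption; the comment only asks for a fast failure.
        if (stateBeforeTask.get() == State.INACTIVE)
            throw new IllegalStateException("idle_verify cannot run on an inactive cluster");

        // 2nd check: the state may have changed after the 1st check.
        // Treated like any other error: return code OK, error text goes to the report.
        if (stateInsideTask.get() == State.INACTIVE)
            return new Outcome(EXIT_OK, "Error: cluster became inactive during idle_verify");

        // Placeholder report text for the successful case.
        return new Outcome(EXIT_OK, "verification finished");
    }

    public static void main(String[] args) {
        // State changes between the two checks: OK code, error in the report.
        Outcome raced = execute(() -> State.ACTIVE, () -> State.INACTIVE);
        System.out.println(raced.exitCode + " " + raced.report);

        // Cluster is inactive from the start: the 1st check fails fast.
        try {
            execute(() -> State.INACTIVE, () -> State.ACTIVE);
        }
        catch (IllegalStateException e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```

The point of keeping both checks is that the 1st one avoids the hang in the common case, while the 2nd one covers a state change that happens after the 1st check has already passed.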

> IdleVerify hangs when called on inactive cluster with persistence
> -----------------------------------------------------------------
>
>                 Key: IGNITE-17383
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17383
>             Project: Ignite
>          Issue Type: Bug
>          Components: control.sh
>            Reporter: Ilya Shishkov
>            Assignee: Julia Bakulina
>            Priority: Minor
>              Labels: ise
>             Fix For: 2.16
>
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> When you call {{control.sh --cache idle_verify}} on an inactive cluster with 
> persistence, the control script hangs and no actions are performed. As you can 
> see below in the 'rest' thread dump, {{VerifyBackupPartitionsTaskV2}} waits for 
> a checkpoint to start in {{GridCacheDatabaseSharedManager#waitForCheckpoint}}.
> It seems that we can interrupt the task execution and print a message in the 
> control script output saying that IdleVerify can't work on an inactive cluster.
> {code:title=Thread dump}
> "rest-#82%ignite-server%" #146 prio=5 os_prio=31 tid=0x00007fe0cf97c000 nid=0x3607 waiting on condition [0x0000700010149000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>       at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>       at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.waitForCheckpoint(GridCacheDatabaseSharedManager.java:1869)
>       at org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager.waitForCheckpoint(IgniteCacheDatabaseSharedManager.java:1107)
>       at org.apache.ignite.internal.processors.cache.verify.VerifyBackupPartitionsTaskV2$VerifyBackupPartitionsJobV2.execute(VerifyBackupPartitionsTaskV2.java:199)
>       at org.apache.ignite.internal.processors.cache.verify.VerifyBackupPartitionsTaskV2$VerifyBackupPartitionsJobV2.execute(VerifyBackupPartitionsTaskV2.java:171)
>       at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:620)
>       at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7366)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:614)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:539)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1343)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1444)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:674)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:540)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:860)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:470)
>       at org.apache.ignite.internal.IgniteComputeImpl.executeAsync0(IgniteComputeImpl.java:514)
>       at org.apache.ignite.internal.IgniteComputeImpl.executeAsync(IgniteComputeImpl.java:496)
>       at org.apache.ignite.internal.visor.verify.VisorIdleVerifyJob.run(VisorIdleVerifyJob.java:70)
>       at org.apache.ignite.internal.visor.verify.VisorIdleVerifyJob.run(VisorIdleVerifyJob.java:35)
>       at org.apache.ignite.internal.visor.VisorJob.execute(VisorJob.java:69)
>       at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:620)
>       at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7366)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:614)
>       at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:539)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1343)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1444)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:674)
>       at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:540)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:860)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:590)
>       at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:570)
>       at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:223)
>       at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsync(GridTaskCommandHandler.java:162)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest0(GridRestProcessor.java:317)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest(GridRestProcessor.java:303)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor.access$000(GridRestProcessor.java:108)
>       at org.apache.ignite.internal.processors.rest.GridRestProcessor$2.body(GridRestProcessor.java:189)
>       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:750)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
