[ 
https://issues.apache.org/jira/browse/IGNITE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Semen Boikov closed IGNITE-1843.
--------------------------------
    Assignee:     (was: Semen Boikov)

> Hang in GridJobProcessor on node stop
> -------------------------------------
>
>                 Key: IGNITE-1843
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1843
>             Project: Ignite
>          Issue Type: Bug
>          Components: compute
>            Reporter: Semen Boikov
>            Priority: Blocker
>             Fix For: 1.5
>
>
> A hang was observed in GridTaskFailoverAffinityRunTest.testNodeRestart.
> In onKernalStop, GridJobProcessor tries to block operations and waits to
> acquire the write lock:
> {noformat}
> [11:42:06] :           [org.apache.ignite:ignite-core] Thread 
> [name="restart-thread-2", id=27562, state=TIMED_WAITING, blockCnt=885, 
> waitCnt=27131]
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.lang.Thread.sleep(Native Method)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.util.GridSpinReadWriteLock.writeLock(GridSpinReadWriteLock.java:210)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.job.GridJobProcessor.onKernalStop(GridJobProcessor.java:277)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.IgniteKernal.stop0(IgniteKernal.java:1824)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.IgniteKernal.stop(IgniteKernal.java:1770)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2133)
> [11:42:06] :           [org.apache.ignite:ignite-core]         - locked 
> o.a.i.i.IgnitionEx$IgniteNamedInstance@6602227a
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2096)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.IgnitionEx.stop(IgnitionEx.java:314)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.Ignition.stop(Ignition.java:223)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:802)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1060)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.GridTaskFailoverAffinityRunTest.access$000(GridTaskFailoverAffinityRunTest.java:44)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.GridTaskFailoverAffinityRunTest$1.call(GridTaskFailoverAffinityRunTest.java:121)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.testframework.GridTestThread.run(GridTestThread.java:86)
> {noformat}
> The discovery listener thread is blocked trying to acquire the read lock:
> {noformat}
> [11:42:06] :           [org.apache.ignite:ignite-core] Thread 
> [name="disco-event-worker-#26595%internal.GridTaskFailoverAffinityRunTest2%", 
> id=32093, state=TIMED_WAITING, blockCnt=0, waitCnt=25843]
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.lang.Thread.sleep(Native Method)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.util.GridSpinReadWriteLock.readLock(GridSpinReadWriteLock.java:101)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.job.GridJobProcessor$JobDiscoveryListener.onEvent(GridJobProcessor.java:1854)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:770)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:755)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:295)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:1949)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2156)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:1989)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.lang.Thread.run(Thread.java:745)
> {noformat}
> A 'get' from the marshaller cache hangs while its thread holds the
> GridJobProcessor read lock; the 'get' probably depends on a discovery event:
> {noformat}
> [11:42:06] :           [org.apache.ignite:ignite-core] Thread 
> [name="ignite-#26553%pub-internal.GridTaskFailoverAffinityRunTest2%", 
> id=32037, state=WAITING, blockCnt=2, waitCnt=3]
> [11:42:06] :           [org.apache.ignite:ignite-core]     Lock 
> [object=o.a.i.i.processors.cache.distributed.dht.GridPartitionedGetFuture@30ae29ee,
>  ownerName=null, ownerId=-1]
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> sun.misc.Unsafe.park(Native Method)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:157)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:115)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.cache.GridCacheAdapter.getTopologySafe(GridCacheAdapter.java:1312)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.MarshallerContextImpl.className(MarshallerContextImpl.java:151)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.MarshallerContextAdapter.getClass(MarshallerContextAdapter.java:174)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedMarshallerUtils.classDescriptor(OptimizedMarshallerUtils.java:257)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:309)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:364)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.closure.GridClosureProcessor$C2.readExternal(GridClosureProcessor.java:1808)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedObjectInputStream.readExternalizable(OptimizedObjectInputStream.java:514)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:803)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:315)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:364)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.marshaller.optimized.OptimizedMarshaller.unmarshal(OptimizedMarshaller.java:248)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.job.GridJobWorker.initialize(GridJobWorker.java:409)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1094)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1776)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:811)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.communication.GridIoManager.access$1500(GridIoManager.java:106)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> o.a.i.i.managers.communication.GridIoManager$5.run(GridIoManager.java:774)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [11:42:06] :           [org.apache.ignite:ignite-core]         at 
> java.lang.Thread.run(Thread.java:745)
> {noformat}
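
The cycle suggested by the three dumps above can be sketched in isolation. This is only an illustration under stated assumptions, not Ignite code: a fair java.util.concurrent ReentrantReadWriteLock stands in for GridSpinReadWriteLock, a CompletableFuture stands in for the marshaller cache 'get', and the class and method names (StopHangSketch, reproduceCycle) are hypothetical. It also takes the description's guess at face value, namely that the 'get' can only finish after the discovery thread has processed its event.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StopHangSketch {
    /**
     * Recreates the three roles from the thread dumps and returns true if the
     * "discovery" reader could not get the read lock while a writer was queued.
     */
    public static boolean reproduceCycle() throws Exception {
        // Assumption: a fair ReentrantReadWriteLock stands in for GridSpinReadWriteLock.
        ReentrantReadWriteLock busyLock = new ReentrantReadWriteLock(true);
        // Assumption: this future stands in for the marshaller cache 'get'.
        CompletableFuture<String> cacheGet = new CompletableFuture<>();
        CountDownLatch readHeld = new CountDownLatch(1);

        // "Job" thread: waits for the 'get' while holding the read lock.
        Thread job = new Thread(() -> {
            busyLock.readLock().lock();
            try {
                readHeld.countDown();
                cacheGet.join();
            } finally {
                busyLock.readLock().unlock();
            }
        }, "job");
        job.start();
        readHeld.await();

        // "Stop" thread: like onKernalStop, waits for the write lock.
        Thread stop = new Thread(() -> {
            busyLock.writeLock().lock();
            busyLock.writeLock().unlock();
        }, "stop");
        stop.start();
        while (!busyLock.hasQueuedThread(stop)) {
            Thread.sleep(10); // wait until the writer is actually queued
        }

        // "Discovery" role (played by the calling thread): needs the read lock
        // before it could do the work that would let the 'get' complete.
        boolean acquired = busyLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            busyLock.readLock().unlock();
        }

        // Break the cycle by hand so the sketch terminates.
        cacheGet.complete("className");
        job.join();
        stop.join();
        return !acquired;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("discovery reader blocked: " + reproduceCycle());
    }
}
```

In the sketch the cycle is broken by completing the future by hand. In the reported hang nothing plays that role, so all three threads wait indefinitely.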



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
