[ https://issues.apache.org/jira/browse/IGNITE-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515664#comment-16515664 ]
ASF GitHub Bot commented on IGNITE-8752: ---------------------------------------- GitHub user ilantukh opened a pull request: https://github.com/apache/ignite/pull/4215 IGNITE-8752 : Removed topology readlock acquisition when reading local update counters. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-8752 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4215.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4215 ---- commit 19fadabd0e0bcced6ab285d8a9f86679ab5a4a01 Author: Ilya Lantukh <ilantukh@...> Date: 2018-06-18T12:59:06Z IGNITE-8752 : Removed topology readlock acquisition when reading local update counters. ---- > Deadlock when registering binary metadata while holding topology read lock > -------------------------------------------------------------------------- > > Key: IGNITE-8752 > URL: https://issues.apache.org/jira/browse/IGNITE-8752 > Project: Ignite > Issue Type: Bug > Affects Versions: 2.1 > Reporter: Alexey Goncharuk > Assignee: Ilya Lantukh > Priority: Critical > Fix For: 2.6 > > Attachments: DeadlockTest.java > > > The following deadlock was reproduced on ignite-2.4 version: > {code} > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) > at > org.apache.ignite.internal.MarshallerContextImpl.registerClassName(MarshallerContextImpl.java:284) > at > org.apache.ignite.internal.binary.BinaryContext.registerUserClassName(BinaryContext.java:1191) > at > org.apache.ignite.internal.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:773) > at > org.apache.ignite.internal.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:751) > at > org.apache.ignite.internal.binary.BinaryContext.descriptorForClass(BinaryContext.java:622) > at > org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:164) > at > org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:147) > at > org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:134) > at > org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:251) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:396) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:381) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toBinary(CacheObjectBinaryProcessorImpl.java:875) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toCacheObject(CacheObjectBinaryProcessorImpl.java:825) > at > org.apache.ignite.internal.processors.cache.GridCacheContext.toCacheObject(GridCacheContext.java:1783) > at > org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.runEntryProcessor(GridCacheMapEntry.java:5264) > at > org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4667) > at > org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4484) > at > org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3083) > at > org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2977) > at > org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1732) > at > org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1610) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1270) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:370) > at > org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:1769) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2420) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1883) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1736) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1628) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1117) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke0(GridDhtAtomicCache.java:826) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke(GridDhtAtomicCache.java:784) > at > org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1359) > at > org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.invoke(GatewayProtectedCacheProxy.java:1172) > {code} > {code} > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000000008d108960> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:154) > at > org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:123) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.update(GridDhtPartitionTopologyImpl.java:1314) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processFullPartitionUpdate(GridCachePartitionExchangeManager.java:1463) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1100(GridCachePartitionExchangeManager.java:134) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:340) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$3.onMessage(GridCachePartitionExchangeManager.java:338) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:2696) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:2675) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555) > at > org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126) > at > org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090) > {code} > {code} > "tcp-disco-msg-worker-#2%grid-name%" #127 prio=10 os_prio=0 > tid=0x00007f9539745000 nid=0x6149 waiting on condition [0x00007f941aff3000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000000008d188980> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.localUpdateCounters(GridDhtPartitionTopologyImpl.java:2503) > at > org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartAckRequest(GridContinuousProcessor.java:996) > at > org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$500(GridContinuousProcessor.java:103) > at > org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$3.onCustomEvent(GridContinuousProcessor.java:194) > at > org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$3.onCustomEvent(GridContinuousProcessor.java:187) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:691) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:577) > - locked <0x000000008d188d08> (a java.lang.Object) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:5453) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5340) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2739) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2531) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6730) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2614) > at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > Locked ownable synchronizers: > - None > {code} > A system thread holds the topology read lock and waits for binary meta > registration, while another user thread is registering a continuous query. As > a result, discovery worker is also attempting to acquire read lock, but this > read lock is blocked because exchange worker is already enqueued for write > lock. > One of the solutions would be an ability to read partition update counters > without acquiring topology read lock, however I am not sure if this is > possible. > Another solution is never synchronously wait for metadata update inside > topology read lock. If we need to register a new type, we should acquire the > class to register, release the lock, register the class and then retry to > apply an entry processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)