Hi all,

We have observed a peculiar case in which Ignite shutdown gets stuck
indefinitely. We have deployed two Ignite nodes in embedded mode.

1. On Node1, some cache operations are performed, which triggers a
   BinaryMetadata transfer/sync with the other node (Node2).
   1. The calling thread waits for Node2's response. This wait is
      indefinite; there is no timeout.
   2. As this is a cache put, a write lock is also acquired.
2. While Node1 is waiting for the response from Node2, Node1 gets segmented
   and loses connectivity with Node2.
   1. Because of this, Node1 never receives the expected response from
      Node2 and never leaves the wait.
3. On node segmentation we close Ignite, but because some threads are
   still performing cache operations and are stuck at the binary metadata
   transfer, Ignite is not able to close.
   1. Ignite close is waiting for the write lock acquired in step 1 to
      be released, which will not happen in this case.
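
To make the suspected sequence concrete, here is a minimal plain-JDK sketch
of it. It uses no Ignite APIs. The class name, the `gatewayLock` (a stand-in
for whatever lock Ignite takes internally during a put) and the
`metadataAck` future (a stand-in for Node2's response) are all hypothetical
names of mine, and the waits are bounded only so that the sketch terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the hang; not Ignite code.
class ShutdownHangSketch {
    /**
     * Simulates one cache put followed by a stop attempt.
     * Returns true if the stopping thread got the lock within waitMillis.
     */
    static boolean stopCanProceed(boolean responseArrives, long waitMillis)
            throws InterruptedException {
        ReentrantLock gatewayLock = new ReentrantLock();                 // assumed lock held during the put
        CompletableFuture<Void> metadataAck = new CompletableFuture<>(); // assumed reply from Node2
        CountDownLatch lockTaken = new CountDownLatch(1);

        Thread putThread = new Thread(() -> {
            gatewayLock.lock();
            lockTaken.countDown();
            try {
                metadataAck.join(); // unbounded wait, as in step 1
            } finally {
                gatewayLock.unlock();
            }
        }, "cache-put");
        putThread.setDaemon(true);
        putThread.start();
        lockTaken.await(); // make sure the put holds the lock before stopping

        if (responseArrives)
            metadataAck.complete(null); // healthy cluster: Node2 answers

        // The stopping thread needs the same lock (step 3).
        boolean acquired = gatewayLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired)
            gatewayLock.unlock();

        metadataAck.complete(null); // release the simulated put so the sketch ends
        putThread.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("response arrives: stop proceeds = " + stopCanProceed(true, 2000));
        System.out.println("node segmented:   stop proceeds = " + stopCanProceed(false, 300));
    }
}
```

When the response arrives, the stop attempt gets the lock; when it never
arrives (the segmented case), the stop attempt times out. In our real case
the stopping side keeps retrying, so it never returns.
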
The thread dump follows:
java.lang.Thread.State: WAITING (parking) -> <<THREAD WAITING FOR BINARY METADATA TRANSFER>>
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:323)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:179)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:142)
    at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport.putAndWaitPendingUpdate(BinaryMetadataTransport.java:281)
    at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport.requestMetadataUpdate(BinaryMetadataTransport.java:221)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:638)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:292)
    at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1337)
    at org.apache.ignite.internal.binary.BinaryClassDescriptor.write(BinaryClassDescriptor.java:862)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:232)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:165)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:152)
    at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:251)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:583)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toBinary(CacheObjectBinaryProcessorImpl.java:1492)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toCacheObject(CacheObjectBinaryProcessorImpl.java:1329)
    at org.apache.ignite.internal.processors.cache.GridCacheContext.toCacheObject(GridCacheContext.java:1822)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.enlistWriteEntry(GridNearTxLocal.java:1546)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.enlistWrite(GridNearTxLocal.java:1083)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync0(GridNearTxLocal.java:635)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync(GridNearTxLocal.java:484)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2511)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2509)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4284)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2509)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2487)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2466)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1332)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:867)
java.lang.Thread.State: TIMED_WAITING (sleeping) -> <<IGNITE WAITING TO BE STOPPED>>
    at java.lang.Thread.sleep([email protected]/Native Method)
    at org.apache.ignite.internal.util.IgniteUtils.sleep(IgniteUtils.java:8270)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.onStopped(GridCacheGateway.java:324)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.blockGateways(GridCacheProcessor.java:806)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1916)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1806)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2340)
    - locked <0x0000000685a014c0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2163)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:351)
    at org.apache.ignite.Ignition.stop(Ignition.java:230)
    at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:2776)
    at org.apache.ignite.cache.CacheManager.close(CacheManager.java:411)
*I would like to know whether this analysis is correct, and which
alternatives, workarounds, or configuration options I can use to avoid
this issue.*
--
Thanks and regards,
Atul Dhatrak