Hi All,
We have observed a peculiar case in which Ignite shutdown gets stuck indefinitely. We have deployed two Ignite nodes in embedded mode.

1. On Node1, some cache operations are performed, which triggers a BinaryMetadata transfer/sync with the other node.
   1. The wait for this sync is indefinite; there is no timeout here.
   2. As this is a cache put, one write lock also gets acquired.
2. While Node1 is waiting for the response from Node2, Node1 gets segmented and loses connectivity with Node2.
   1. Due to this, Node1 never gets the expected response from Node2 and never comes out of the wait.
3. On node segmentation we close Ignite, but because some threads are still performing cache operations and are stuck at the binary metadata transfer, Ignite is not able to close.
   1. Ignite close is waiting for the write lock acquired in step 1 to be released, which will not happen in this case.

Following is the thread dump:

java.lang.Thread.State: WAITING (parking) -> <<THREAD WAITING FOR BINARY METADATA TRANSFER>>
    at jdk.internal.misc.Unsafe.park(java.base@11.0.19/Native Method)
    at java.util.concurrent.locks.LockSupport.park(java.base@11.0.19/LockSupport.java:323)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:179)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:142)
    at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport.putAndWaitPendingUpdate(BinaryMetadataTransport.java:281)
    at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport.requestMetadataUpdate(BinaryMetadataTransport.java:221)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:638)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:292)
    at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1337)
    at org.apache.ignite.internal.binary.BinaryClassDescriptor.write(BinaryClassDescriptor.java:862)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:232)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:165)
    at org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:152)
    at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:251)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.marshalToBinary(CacheObjectBinaryProcessorImpl.java:583)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toBinary(CacheObjectBinaryProcessorImpl.java:1492)
    at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toCacheObject(CacheObjectBinaryProcessorImpl.java:1329)
    at org.apache.ignite.internal.processors.cache.GridCacheContext.toCacheObject(GridCacheContext.java:1822)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.enlistWriteEntry(GridNearTxLocal.java:1546)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.enlistWrite(GridNearTxLocal.java:1083)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync0(GridNearTxLocal.java:635)
    at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.putAsync(GridNearTxLocal.java:484)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2511)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter$20.op(GridCacheAdapter.java:2509)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4284)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2509)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2487)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2466)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1332)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:867)

java.lang.Thread.State: TIMED_WAITING (sleeping) <<IGNITE WAITING TO BE STOPPED>>
    at java.lang.Thread.sleep(java.base@11.0.19/Native Method)
    at org.apache.ignite.internal.util.IgniteUtils.sleep(IgniteUtils.java:8270)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.onStopped(GridCacheGateway.java:324)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.blockGateways(GridCacheProcessor.java:806)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1916)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1806)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2340)
    - locked <0x0000000685a014c0> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2163)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:351)
    at org.apache.ignite.Ignition.stop(Ignition.java:230)
    at org.apache.ignite.internal.IgniteKernal.close(IgniteKernal.java:2776)
    at org.apache.ignite.cache.CacheManager.close(CacheManager.java:411)

*I would like to know whether this analysis is correct, and what alternatives, workarounds, or configuration options I can use to avoid this issue.*

--
Thanks and Regards,
Atul Dhatrak