Hi Sumit,
Which Ignite version are you using?
AFAIK partition map exchange (PME) is a kind of "stop the world" activity
for the cluster, so any other actions on the cluster (like cache creation)
are suspended until PME ends.
If all of your clients concurrently try to create the same cache, it's OK
that some transactions are rolled back. But if your cluster becomes
unresponsive after that, or you experience a deadlock, this looks like a
bug, so you can submit a JIRA ticket with steps to reproduce the issue.
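
If the failures are only the expected rollbacks, one way to handle them on
the client side is to retry the failed call with a backoff. Below is a
minimal sketch, assuming the operation is safe to repeat. The class and
method names (RetryWithBackoff, retry) are made up for illustration and are
not part of Ignite, and the attempt count and delay are arbitrary.

```java
import java.util.concurrent.Callable;

public final class RetryWithBackoff {
    private RetryWithBackoff() {}

    /**
     * Runs op and retries it on failure, doubling the delay after each attempt.
     * Rethrows the last failure once maxAttempts is exhausted.
     */
    public static <T> T retry(Callable<T> op, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        long delay = initialDelayMs;
        Exception last = null;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                // Do not retry if the calling thread was interrupted.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

With a thin client this could wrap the cache creation call, for example
retry(() -> client.getOrCreateCache("myCache"), 5, 200), where client is
your IgniteClient and "myCache" is a placeholder name. A retry like this
will not help with a real deadlock, which should still be reported.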
On 2021/11/02 13:45:49 Sumit Deshinge wrote:
> Hi,
>
> I have apache ignite cluster of 3 ignite server and more than 20 ignite
> thin clients (each thin client being on separate VM). These thin clients
> are trying to create caches at approximately the same time parallely and
> also starting with cache CRUD operations after that.
>
> Looks like partition map exchange process and cache CRUD operations in
> parallel are causing deadlock or lock acquire failures.
>
> *What should be the strategy to handle this scenario ?*
>
> Ignite server has below errors:
>
> *Exception stack trace 1:*
>
> WARNING: Dumping the near node thread that started transaction
> [xidVer=GridCacheVersion [topVer=247332659, order=1635852705217,
> nodeOrder=1], nodeId=2735bef0-7404-41e3-843f-7043490c9d84]
> Stack trace of the transaction owner thread:
> Thread [name="client-connector-#56%perf-dn1%", id=93, state=WAITING,
> blockCnt=5023, waitCnt=36165]
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
> at o.a.i.i.processors.cache.GridCacheAdapter$41.op(GridCacheAdapter.java:3430)
> at o.a.i.i.processors.cache.GridCacheAdapter$41.op(GridCacheAdapter.java:3423)
> at o.a.i.i.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4480)
> at o.a.i.i.processors.cache.GridCacheAdapter.remove0(GridCacheAdapter.java:3423)
> at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3405)
> at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3388)
> at o.a.i.i.processors.cache.IgniteCacheProxyImpl.remove(IgniteCacheProxyImpl.java:1438)
> at o.a.i.i.processors.cache.GatewayProtectedCacheProxy.remove(GatewayProtectedCacheProxy.java:964)
> at o.a.i.i.processors.platform.client.cache.ClientCacheRemoveKeyRequest.process(ClientCacheRemoveKeyRequest.java:41)
> at o.a.i.i.processors.platform.client.ClientRequestHandler.handle(ClientRequestHandler.java:77)
> at o.a.i.i.processors.odbc.ClientListenerNioListener.onMessage(ClientListenerNioListener.java:204)
> at o.a.i.i.processors.odbc.ClientListenerNioListener.onMessage(ClientListenerNioListener.java:55)
> at o.a.i.i.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
> at o.a.i.i.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
> at o.a.i.i.util.nio.GridNioAsyncNotifyFilter$3.body(GridNioAsyncNotifyFilter.java:97)
> at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120)
> at o.a.i.i.util.worker.GridWorkerPool$1.run(GridWorkerPool.java:70)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> *Exception stack trace 2:*
>
> WARNING: >>> Transaction [startTime=11:39:27.214,
> curTime=11:40:36.277, systemTime=0, userTime=69063, tx=GridNearTxLocal
> [mappings=IgniteTxMappingsImpl [], nearLocallyMapped=false,
> colocatedLocallyMapped=false, needCheckBackup=null,
> hasRemoteLocks=false, trackTimeout=false, systemTime=44700,
> systemStartTime=0, prepareStartTime=0, prepareTime=0,
> commitOrRollbackStartTime=0, commitOrRollbackTime=0, lb=null,
> mvccOp=null, qryId=-1, crdVer=0,
> thread=client-connector-#57%perf-dn1%, mappings=IgniteTxMappingsImpl
> [], super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false,
> span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
> nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false,
> super=IgniteTxLocalAdapter [completedBase=null,
> sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl
> [activeCacheIds=[], recovery=null, mvccEnabled=null,
> mvccCachingCacheIds=[], txMap=EmptySet []], super=IgniteTxAdapter
> [xidVer=GridCacheVersion [topVer=247332659, order=1635852705226,
> nodeOrder=1], writeVer=null, implicit=false, loc=true, threadId=95,
> startTime=1635853167214, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
> sysInvalidate=false, sys=false, plc=2, commitVer=null,
> finalizing=NONE, invalidParts=null, state=SUSPENDED, timedOut=false,
> topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0],
> mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
> duration=69079ms, onePhaseCommit=false], size=0]]]]
> Nov 2, 2021 11:40:36 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: First 10 long running cache futures [total=16]
> Nov 2, 2021 11:40:36 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: >>> Future [startTime=11:39:27.324, curTime=11:40:36.277,
> fut=GridDhtLockFuture
> [span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
> nearNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> nearLockVer=GridCacheVersion [topVer=247332659, order=1635852705208,
> nodeOrder=1], topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=162], threadId=124,
> futId=be58a60ec71-1d64903c-c700-4deb-bace-cc5158713120,
> lockVer=GridCacheVersion [topVer=247332659, order=1635852705208,
> nodeOrder=1], read=false, err=null, timedOut=false, timeout=0,
> tx=GridNearTxLocal [mappings=IgniteTxMappingsImpl []dNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null]], rmts=null]], flags=3]]], prepared=0,
> locked=false, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2,
> partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
> [topVer=247332659, order=1635852705208, nodeOrder=1]]]],
> super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=247332659,
> order=1635852705208, nodeOrder=1], writeVer=null, implicit=false,
> loc=true, threadId=124, startTime=1635853167214,
> nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
> sysInvalidate=false, sys=false, plc=2, commitVer=null,
> finalizing=NONE, invalidParts=null, state=ACTIVE, timedOut=false,
> topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
> duration=69094ms, onePhaseCommit=false], size=1]],
> nearLocallyMapped=false, colocatedLocallyMapped=true,
> needCheckBackup=null, hasRemoteLocks=false, trackTimeout=false,
> systemTime=75000, systemStartTime=971108549857700, prepareStartTime=0,
> prepareTime=0, commitOrRollbackStartTime=0, commitOrRollbackTime=0,
> lb=null, mvccOp=null, qryId=-1, crdVer=0,
> thread=client-connector-#84%perf-dn1%, mappings=IgniteTxMappingsImpl
> []dNearNodes=null, ownerVer=GridCacheVersion [topVer=247332659,
> order=1635852705211, nodeOrder=1], serOrder=null,
> key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null]], rmts=null]], flags=3]]], prepared=0,
> locked=false, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2,
> partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
> [topVer=247332659, order=1635852705208, nodeOrder=1]]]],
> super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=247332659,
> order=1635852705208, nodeOrder=1], writeVer=null, implicit=false,
> loc=true, threadId=124, startTime=1635853167214,
> nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
> sysInvalidate=false, sys=false, plc=2, commitVer=null,
> finalizing=NONE, invalidParts=null, state=ACTIVE, timedOut=false,
> topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
> duration=69094ms, onePhaseCommit=false], size=1]],
> super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false,
> span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
> nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false,
> super=IgniteTxLocalAdapter [completedBase=null,
> sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl
> [activeCacheIds=[585748697], recovery=false, mvccEnabled=false,
> mvccCachingCacheIds=[], txMap=ArrayList [IgniteTxEntry
> [txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true], cacheId=585748697], val=TxEntryValueHolder
> [val=null, op=DELETE], prevVal=TxEntryValueHolder [val=null, op=NOOP],
> oldVal=TxEntryValueHolder [val=null, op=NOOP],
> entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1,
> conflictVer=null, explicitVer=null, dhtVer=null,
> filters=CacheEntryPredicate[] [], filtersPassed=false,
> filtersSet=true, entry=GridDhtCacheEntry [rdrs=ReaderId[] [],
> part=244, super=GridDistributedCacheEntry [super=GridCacheMapEntry
> [key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true], val=null, ver=GridCacheVersion [topVer=247332659,
> order=1635852705229, nodeOrder=1], hash=1085684290,
> extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=LinkedList
> [GridCacheMvccCandidate [nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> ver=GridCacheVersion [topVer=247332659, order=1635852705207,
> nodeOrder=1], threadId=122, id=2104, topVer=AffinityTopologyVersion
> [topVer=1, minorTopVer=162], reentry=null,
> otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705207,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705208, nodeOrder=1], threadId=124,
> id=2102, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705208,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705213, nodeOrder=1], threadId=122,
> id=2120, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705213,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705207,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705214, nodeOrder=1], threadId=123,
> id=2118, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705214,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705207,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705217, nodeOrder=1], threadId=93,
> id=2108, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705217,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705218, nodeOrder=1], threadId=115,
> id=2106, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705218,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705222, nodeOrder=1], threadId=95,
> id=2110, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705222,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
> prevVer=null, nextVer=null], GridCacheMvccCandidate
> [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
> [topVer=247332659, order=1635852705223, nodeOrder=1], threadId=120,
> id=2112, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
> reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
> otherVer=GridCacheVersion [topVer=247332659, order=1635852705223,
> nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
> ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
> nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
> val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
> hasValBytes=true],
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_loca
[message truncated...]