[ https://issues.apache.org/jira/browse/IGNITE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Mashenkov updated IGNITE-11148:
--------------------------------------
    Fix Version/s: 2.8

> PartitionCountersNeighborcastFuture blocks partition map exchange
> ------------------------------------------------------------------
>
>                 Key: IGNITE-11148
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11148
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Stepachev Maksim
>            Priority: Major
>              Labels: mvcc_stabilization_stage_1
>             Fix For: 2.8
>
>
> We investigated an "execution timeout" failure in Continuous Query 2 for
> *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*.
> The investigation showed an MVCC problem: the test blocks at *getAndPut*
> because, at some point, the following incorrect behavior occurs:
> {code:java}
> [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO
> ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager]
> Finishing prepared transaction [commit=false, tx=GridDhtTxRemote
> [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004,
> rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,
> nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060,
> nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter
> [explicitVers=null, started=true, commitAllowed=0,
> txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {},
> writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter
> [xidVer=GridCacheVersion [topVer=160333378, order=1548853376061,
> nodeOrder=3], writeVer=GridCacheVersion [topVer=160333378,
> order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21,
> startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002,
> startVer=GridCacheVersion [topVer=160333378, order=1548853376060,
> nodeOrder=1], endVer=null, isolation=REPEATABLE_READ,
> concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2,
> commitVer=GridCacheVersion [topVer=160333378, order=1548853376061,
> nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED,
> timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0],
> mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207,
> cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null,
> duration=191ms, onePhaseCommit=false]]]]{code}
> and after that:
> {code:java}
> [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,931][INFO
> ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery]
> Starting delivery partition countres to remote nodes [txId=GridCacheVersion
> [topVer=160333378, order=1548853376060, nodeOrder=5],
> futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code}
> _!IMPORTANT - we are dealing with PartitionCountersNeighborcastFuture, which
> *doesn't provide status information* (monitoring)._
> One possible location of the problem:
> PartitionCountersNeighborcastFuture.onNodeLeft
> As a result, the transaction stays in *state=PREPARED* with
> *completionTime=0* and never completes:
>
> {code:java}
> [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30
> 13:03:16,776][WARN
> ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic]
> Failed to wait for partition release future [topVer=AffinityTopologyVersion
> [topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f64900000]
> LocalTxReleaseFuture [
> topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0],
> futures=[
> TxFinishFuture [
> tx=GridDhtTxRemote [
> nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004,
> rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,
> nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060,
> nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter
> [explicitVers=null, started=true, commitAllowed=0,
> txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {},
> writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [
> xidVer=GridCacheVersion [topVer=160333378, order=1548853376061,
> nodeOrder=3],
> writeVer=GridCacheVersion [topVer=160333378, order=1548853376062,
> nodeOrder=3], implicit=false, loc=false, threadId=21,
> startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002,
> startVer=GridCacheVersion [topVer=160333378, order=1548853376060,
> nodeOrder=1], endVer=null, isolation=REPEATABLE_READ,
> concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2,
> commitVer=GridCacheVersion [topVer=160333378, order=1548853376061,
> nodeOrder=3], finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED,
> timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0],
> mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207,
> cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null,
> duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048]
> {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)