Stepachev Maksim created IGNITE-11148:
-----------------------------------------
Summary: PartitionCountersNeighborcastFuture blocks partition map
exchange
Key: IGNITE-11148
URL: https://issues.apache.org/jira/browse/IGNITE-11148
Project: Ignite
Issue Type: Bug
Components: mvcc
Reporter: Stepachev Maksim
We investigated the "execution timeout" failure in Continuous Query 2 for
*CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*.
The investigation pointed to an MVCC problem: the test blocks at getAndPut
because, at some point, the following incorrect behavior occurs:
{code:java}
[16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO
][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager]
Finishing prepared transaction [commit=false, tx=GridDhtTxRemote
[nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004,
rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,
nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060,
nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter
[explicitVers=null, started=true, commitAllowed=0,
txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {},
writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter
[xidVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3],
writeVer=GridCacheVersion [topVer=160333378, order=1548853376062, nodeOrder=3],
implicit=false, loc=false, threadId=21, startTime=1548853376731,
nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002, startVer=GridCacheVersion
[topVer=160333378, order=1548853376060, nodeOrder=1], endVer=null,
isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion
[topVer=160333378, order=1548853376061, nodeOrder=3], finalizing=NONE,
invalidParts=null, state=PREPARED, timedOut=false,
topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0],
mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207,
cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null,
duration=191ms, onePhaseCommit=false]]]]{code}
and after that:
{code:java}
[16:02:56] : [Step 4/5] [2019-01-30 13:02:56,931][INFO
][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery]
Starting delivery partition countres to remote nodes [txId=GridCacheVersion
[topVer=160333378, order=1548853376060, nodeOrder=5],
futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code}
_IMPORTANT: this path goes through PartitionCountersNeighborcastFuture, which
does not expose any status information for monitoring._
One possible location of the problem is
PartitionCountersNeighborcastFuture.onNodeLeft.
As a result, the transaction stays in state=PREPARED with completionTime=0 and
never completes:
{code:java}
[16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN
][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic]
Failed to wait for partition release future [topVer=AffinityTopologyVersion
[topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f64900000]
LocalTxReleaseFuture [
topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0],
futures=[
TxFinishFuture [
tx=GridDhtTxRemote [
nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b900004,
rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4,
nearXidVer=GridCacheVersion [topVer=160333378, order=1548853376060,
nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter
[explicitVers=null, started=true, commitAllowed=0,
txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {},
writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [
xidVer=GridCacheVersion [topVer=160333378, order=1548853376061, nodeOrder=3],
writeVer=GridCacheVersion [topVer=160333378, order=1548853376062,
nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731,
nodeId=3e6881c0-1e96-42a9-8bd1-55d344c00002, startVer=GridCacheVersion
[topVer=160333378, order=1548853376060, nodeOrder=1], endVer=null,
isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion
[topVer=160333378, order=1548853376061, nodeOrder=3],
finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED, timedOut=false,
topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0],
mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207,
cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null,
duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048]
{code}
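The suspected failure mode can be illustrated with a simplified model. This is a minimal sketch and not Ignite code: AckFutureSketch, its methods, and the handleNodeLeft flag are hypothetical names introduced for illustration only. It assumes a future that completes once every target node has acknowledged, and shows that such a future never completes if a node leaves before acknowledging and the node-left event is not handled. Whether PartitionCountersNeighborcastFuture.onNodeLeft actually behaves this way is still unconfirmed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a future that waits for acks from a set of nodes. */
class AckFutureSketch {
    /** Nodes that have not acknowledged yet. */
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    /** Completes when no pending nodes remain. */
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    /** Assumption for illustration: toggles whether node-left events are handled. */
    private final boolean handleNodeLeft;

    AckFutureSketch(Set<String> nodeIds, boolean handleNodeLeft) {
        this.pending.addAll(nodeIds);
        this.handleNodeLeft = handleNodeLeft;
        completeIfEmpty();
    }

    /** A node acknowledged delivery. */
    void onAck(String nodeId) {
        if (pending.remove(nodeId))
            completeIfEmpty();
    }

    /** A node left the topology; it will never send an ack. */
    void onNodeLeft(String nodeId) {
        if (!handleNodeLeft)
            return; // Faulty variant: the departed node stays in the pending set forever.

        if (pending.remove(nodeId))
            completeIfEmpty();
    }

    boolean isDone() {
        return done.isDone();
    }

    /** Snapshot of pending nodes, the kind of status a diagnostic dump could print. */
    Set<String> pendingNodes() {
        return new HashSet<>(pending);
    }

    private void completeIfEmpty() {
        if (pending.isEmpty())
            done.complete(null);
    }

    public static void main(String[] args) {
        for (boolean handle : new boolean[] {false, true}) {
            AckFutureSketch fut =
                new AckFutureSketch(new HashSet<>(Arrays.asList("n1", "n2")), handle);

            fut.onAck("n1");      // n1 acknowledges normally.
            fut.onNodeLeft("n2"); // n2 fails before acknowledging.

            System.out.println("handleNodeLeft=" + handle + ", done=" + fut.isDone()
                + ", pending=" + fut.pendingNodes());
        }
    }
}
```

In this model, anything that waits on the never-completed future (here, the transaction finish that the partition release future depends on) would wait indefinitely, which matches the symptom in the log above. Exposing the pending set, as pendingNodes() does in the sketch, is one way such a future could report its status.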
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)