[ https://issues.apache.org/jira/browse/IGNITE-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422805#comment-16422805 ]
Andrey Gura commented on IGNITE-6827: ------------------------------------- [~ascherbakov] I've reviewed changes and have some comments. First of all, it seems strange that {{rollbackOnTopologyChangeTimeout}} parameter is part of {{TransactionConfiguration}}. Actually PME uses this value and it is PME's responsibility to rollback transactions, so timeout should be parameter of exchange. {{GridDistributedTxFinishRequest}} contains new field {{timedout}}. I believe that we can use bit from {{flags}} field instead of new boolean field. In this case message format will n't be changed. Some minor comments: * Some TODO's added but isn't resolved in classes: {{IgniteTxManager}}, {{GridCacheAdapter}}, {{GridCAcheMvccManager}}. * {{IgniteTxAdapter.remainingTime}}: {{timedOut}} flag value checking is redundant because the same work will be done in couple lines below. * {{GridCachePartitionExchangeManager.body0()}}: two calls of {{cfg.getTransactionConfiguration().getRollbackOnTopologyChangeTimeout()}} can be rewritten in more concise and convenient form: Before: {code:java} boolean rollbackEnabled = cfg.getTransactionConfiguration().getRollbackOnTopologyChangeTimeout() > 0; final long dumpTimeout = 2 * cctx.gridConfig().getNetworkTimeout(); long nextDumpTime = 0; while (true) { try { resVer = exchFut.get(rollbackEnabled ? cfg.getTransactionConfiguration(). getRollbackOnTopologyChangeTimeout() : dumpTimeout, TimeUnit.MILLISECONDS); break; } {code} After: {code:java} long rollbackTimeout = cfg.getTransactionConfiguration().getRollbackOnTopologyChangeTimeout(); final long dumpTimeout = 2 * cctx.gridConfig().getNetworkTimeout(); long nextDumpTime = 0; while (true) { try { resVer = exchFut.get(rollbackTimeout > 0 ? rollbackTimeout : dumpTimeout, TimeUnit.MILLISECONDS); break; } {code} > Configurable rollback for long running transactions before partition exchange > ----------------------------------------------------------------------------- > > Key: IGNITE-6827 > URL: https://issues.apache.org/jira/browse/IGNITE-6827 > Project: Ignite > Issue Type: Improvement > Affects Versions: 2.0 > Reporter: Alexei Scherbakov > Assignee: Alexei Scherbakov > Priority: Major > Fix For: 2.5 > > > Currently long running / buggy user transactions force partition exchange > block on waiting for > org.apache.ignite.internal.processors.cache.GridCacheSharedContext#partitionReleaseFuture, > preventing all grid progress. > I suggest introducing new global flag in TransactionConfiguration, like > {{txRollbackTimeoutOnTopologyChange}} > which will rollback exchange blocking transaction after the timeout. > Still need to think what to do with other topology locking activities. -- This message was sent by Atlassian JIRA (v7.6.3#76005)