[ 
https://issues.apache.org/jira/browse/IGNITE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Gura updated IGNITE-6527:
--------------------------------
    Fix Version/s:     (was: 2.7)
                   2.8

> Deadlock detection works incorrectly with some timeouts that weren't caused 
> by deadlocks.
> -----------------------------------------------------------------------------------------
>
>                 Key: IGNITE-6527
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6527
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.3
>            Reporter: Vitaliy Biryukov
>            Assignee: Andrey Gura
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: TxOptimisticDeadlockDetectionIncorrectMessageTest.java
>
>
> Deadlock detection works incorrectly with timeouts that were not caused by 
> deadlocks. It can report a deadlock that would only occur in the future, or 
> detect another deadlock that was not the cause of the timeout.
> *requested keys:* keys that are primary on the same node and block in 
> sequential order during the timeout (or, with a near cache, all keys that 
> have not been locked by an optimistic transaction).
> *candidates:* keys that are candidates to be locked on a primary node 
> (entries contained in GridDhtTxLocal). 
> While the Wait-For-Graph is updated, requested keys are used as 
> candidates. But the "TxDeadlock.toString" method uses the candidates that 
> were received from messages. 
> 1) It causes an incorrect error message.
> Example: 
> K1: TX1 holds the lock, TX2 waits for the lock.
> K2: TX3 holds the lock, TX1 waits for the lock.
> Transactions:
> TX1 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, 
> nodeOrder=1], nodeId=f03b1ae3-a100-479c-9671-11d5cef00000, threadId=455]
> TX2 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, 
> nodeOrder=2], nodeId=2c0c0e78-cab2-4b23-a985-4965e4200001, threadId=456]
> TX3 [txId=GridCacheVersion [topVer=118090802, order=1506610794980, 
> nodeOrder=3], nodeId=3340dc48-f1a1-4ea8-8742-19b314300002, threadId=457]
> Keys:
> K1 [key=6, cache=cache]
> K2 [key=1, cache=cache]
> 2) Deadlock detection (DD) can detect another deadlock that was not the 
> cause of the timeout, but that would have been the cause if the current 
> deadlock had not happened.
> These situations are very rare, but they can happen.
> I see several solutions:
> * Just produce a correct message.
> * Log a warning and continue detecting.
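
For illustration only: read at face value, the holds/waits relations in the example above (TX2 waits for TX1 on K1, TX1 waits for TX3 on K2) form a chain rather than a cycle. The sketch below is a minimal, generic wait-for graph cycle check in plain Java. It is not Ignite's implementation, and the class name `WaitForGraphSketch` and its methods are hypothetical names chosen for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Ignite code: a wait-for graph maps each
// transaction to the set of transactions whose locks it is waiting for.
class WaitForGraphSketch {
    /** Returns true if the wait-for graph contains at least one cycle. */
    static boolean hasCycle(Map<String, Set<String>> waitsFor) {
        Set<String> done = new HashSet<>();   // transactions already visited
        Set<String> onPath = new HashSet<>(); // transactions on the current DFS path
        for (String tx : waitsFor.keySet())
            if (dfs(tx, waitsFor, done, onPath))
                return true;
        return false;
    }

    private static boolean dfs(String tx, Map<String, Set<String>> waitsFor,
                               Set<String> done, Set<String> onPath) {
        if (onPath.contains(tx))
            return true;                      // back edge: a cycle, i.e. a deadlock
        if (!done.add(tx))
            return false;                     // explored earlier without finding a cycle
        onPath.add(tx);
        for (String holder : waitsFor.getOrDefault(tx, Collections.emptySet()))
            if (dfs(holder, waitsFor, done, onPath))
                return true;
        onPath.remove(tx);
        return false;
    }

    /** Graph for the example in the description: TX2 -> TX1 (K1), TX1 -> TX3 (K2). */
    static Map<String, Set<String>> exampleGraph() {
        Map<String, Set<String>> g = new HashMap<>();
        g.put("TX2", new HashSet<>(Collections.singleton("TX1")));
        g.put("TX1", new HashSet<>(Collections.singleton("TX3")));
        return g;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> g = exampleGraph();
        System.out.println(hasCycle(g));      // chain only, no cycle

        // Assumption for illustration: if TX3 also waited for TX2, the chain would close.
        g.put("TX3", new HashSet<>(Collections.singleton("TX2")));
        System.out.println(hasCycle(g));
    }
}
```

Under these assumptions, a timeout on the graph from the example would not be explained by a cycle among TX1, TX2 and TX3, which is consistent with the report that the generated message is incorrect.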



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
