[ 
https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222229#comment-15222229
 ] 

Andrey Gura edited comment on IGNITE-2854 at 4/4/16 12:17 AM:
--------------------------------------------------------------

The algorithm described in the previous comment has drawbacks.

It can't detect a deadlock for a transaction that has timed out while involved 
in the deadlock, or it can report an invalid deadlock due to race conditions.

For example, take transactions {{TX1}} and {{TX2}} with the same timeout and 
start time. {{TX1}} holds the lock on key {{K1}} and requests the lock for 
{{K2}}, while {{TX2}} holds the lock on key {{K2}} and requests the lock for 
{{K1}}, so this is a deadlock. {{K1}} and {{K2}} have different primary nodes, 
so both transactions are distributed.

When {{TX1}} and {{TX2}} time out, all {{GridDhtColocatedLockFuture}}s and 
blocked {{GridDhtLockFuture}}s time out as well. {{GridDhtLockFuture.onTimeout}} 
initiates deadlock detection, while {{GridDhtColocatedLockFuture.onTimeout}} 
releases locks and then rolls back the corresponding transaction. So we have 
incomplete information about the transactions' state and either can't detect 
the deadlock or detect something invalid like {{TX1 <-> TX1}}.
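
To make the example concrete, here is a minimal sketch of a wait-for walk over 
lock holders and lock requests. It is a simplified single-process model, not 
Ignite code: the class {{WaitForGraphSketch}} and its methods are hypothetical 
names, and each transaction is assumed to wait for at most one key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; none of these names exist in Ignite. */
final class WaitForGraphSketch {
    /** Key -> transaction that currently holds the lock on that key. */
    private final Map<String, String> holders = new HashMap<>();

    /** Transaction -> key it is blocked on (at most one per transaction here). */
    private final Map<String, String> waits = new HashMap<>();

    void holds(String tx, String key) {
        holders.put(key, tx);
    }

    void requests(String tx, String key) {
        waits.put(tx, key);
    }

    /** Models a timed-out transaction releasing its locks and rolling back. */
    void release(String tx) {
        holders.values().removeIf(tx::equals);
        waits.remove(tx);
    }

    /**
     * Follows "tx -> requested key -> holder of that key" edges from start.
     * Returns the transactions forming a cycle, or an empty list if the
     * chain ends before any transaction repeats.
     */
    List<String> findCycle(String start) {
        List<String> path = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String tx = start;

        while (tx != null && seen.add(tx)) {
            path.add(tx);
            String key = waits.get(tx);
            tx = key == null ? null : holders.get(key);
        }

        if (tx == null)
            return new ArrayList<>(); // Chain ended: no deadlock is visible.

        // tx is the first repeated transaction, so the cycle starts there.
        return new ArrayList<>(path.subList(path.indexOf(tx), path.size()));
    }
}
```

With {{TX1}} holding {{K1}} and requesting {{K2}}, and {{TX2}} holding {{K2}} 
and requesting {{K1}}, {{findCycle("TX1")}} finds the cycle through both 
transactions. If {{release("TX2")}} runs first, which is how this sketch models 
a near future releasing locks on timeout, the same call returns an empty list: 
the deadlock existed but is no longer visible in the collected state.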

The second problem is that in the current implementation remote nodes do not 
send a response to the near node when a {{GridDhtLockFuture}} times out. So we 
can't print deadlock information in the user thread.

Suggested solution:

Deadlock detection is initiated by the near node when 
{{GridDhtColocatedNearFuture.onTimeout}} is invoked. At the same time, all 
{{GridDhtLockFuture}}s register futures in the transaction manager. These 
futures will be completed when a special request signalling that detection has 
finished is received from the near node.
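
The registration step could look roughly like the sketch below. It only shows 
the idea of parking a future per transaction until the near node reports that 
detection has finished. The class {{DetectionFutureRegistrySketch}}, its method 
names and the use of a plain string as the report are all assumptions made for 
illustration, not the Ignite transaction manager API.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not the Ignite transaction manager. */
final class DetectionFutureRegistrySketch {
    /** Transaction id -> future that stays open while detection is running. */
    private final Map<String, CompletableFuture<String>> pending =
        new ConcurrentHashMap<>();

    /**
     * Called where a remote lock future times out: instead of finishing
     * immediately, it parks a future keyed by the transaction id.
     */
    CompletableFuture<String> register(String txId) {
        return pending.computeIfAbsent(txId, id -> new CompletableFuture<>());
    }

    /**
     * Called when the "detection finished" request arrives from the near node.
     * Returns true if a parked future was found and completed by this call.
     */
    boolean onDetectionFinished(String txId, String report) {
        CompletableFuture<String> fut = pending.remove(txId);

        return fut != null && fut.complete(report);
    }
}
```

Whatever should wait for detection (for example, releasing the remote locks) 
can then be chained on the returned future with {{thenAccept}}, so the lock 
state stays intact while the near node collects it.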

Race conditions are still possible, because a concurrent deadlock detection 
process will be started for each timed-out transaction.



was (Author: agura):
The algorithm described in the previous comment has one drawback: it can't 
detect a deadlock for a transaction that has timed out while involved in the 
deadlock, or it can report an invalid deadlock due to race conditions.

For example, take transactions {{TX1}} and {{TX2}} with the same timeout and 
start time. {{TX1}} holds the lock on key {{K1}} and requests the lock for 
{{K2}}, while {{TX2}} holds the lock on key {{K2}} and requests the lock for 
{{K1}}, so this is a deadlock. {{K1}} and {{K2}} have different primary nodes, 
so both transactions are distributed.

When {{TX1}} and {{TX2}} time out, all {{GridDhtColocatedLockFuture}}s and 
blocked {{GridDhtLockFuture}}s time out as well. {{GridDhtLockFuture.onTimeout}} 
initiates deadlock detection, while {{GridDhtColocatedLockFuture.onTimeout}} 
releases locks and then rolls back the corresponding transaction. So we have 
incomplete information about the transactions' state and either can't detect 
the deadlock or detect something invalid like {{TX1 <-> TX1}}.

> Need to implement deadlock detection
> ------------------------------------
>
>                 Key: IGNITE-2854
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2854
>             Project: Ignite
>          Issue Type: New Feature
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Assignee: Andrey Gura
>             Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find 
> out which locks were reordered.
> We need to add a mechanism that will collect information about awaiting 
> candidates, analyze it and show the guilty keys. Most likely this should be 
> implemented with the help of a custom discovery message.
> In addition, we should automatically execute this mechanism if a transaction 
> times out and add the information to the timeout exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
