[ 
https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210403#comment-15210403
 ] 

Andrey Gura commented on IGNITE-2854:
-------------------------------------

*Suggested solution:*

# The Tx manager runs the deadlock detection procedure (if enabled) in the 
following cases:
** On Tx timeout (if enabled)
** On request (e.g. MBean)
** Periodically (if period configured)
# Choosing the Tx candidate for deadlock detection:
** If deadlock detection was triggered by a Tx timeout, the timed-out Tx is 
chosen.
** Otherwise, the oldest Tx is chosen.
# Deadlock detection
** For the chosen Tx, get all cache entries ({{IgniteTxEntry}} instances) that 
are involved in the Tx.
** Each Tx entry contains the Id of the node that the lock request was sent to, 
so the deadlock detector knows the Ids of all nodes involved in the Tx.
** Request Tx snapshots from all involved remote nodes, if there are any (a Tx 
can involve only the local node), and merge them with the local Tx snapshot. 
_*Note:*_ _It makes sense to retrieve a Tx snapshot that contains only Txs that 
involve known Tx entries, in order to reduce network traffic._
** If the resulting Tx snapshot involves other nodes, request Tx snapshots from 
those nodes and merge them into the resulting snapshot. _*Note:*_ _Actually, 
we can first try to detect a deadlock on the current Tx snapshot and request Tx 
snapshots from other remote nodes only if no deadlock is found._
** Build a _resource allocation graph_ from {{GridCacheMvcc}}'s local candidates 
(owner candidate: lock acquired; non-owner candidate: lock requested) and 
reduce it to a _wait-for graph (WFG)_.
** Try to find a cycle in the WFG.
** If a cycle is found, a deadlock is detected.
** Otherwise, choose another Tx candidate that has not yet been involved in the 
deadlock detection procedure and repeat the whole procedure. _*Note:*_ 
_Repetition is redundant when deadlock detection was triggered by a Tx timeout._
# Deadlock resolution (if configured)
It makes sense to roll back the oldest Tx involved in the detected deadlock, 
especially when a Tx timeout isn't defined.
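
The graph steps of item 3 and the victim choice of item 4 could look roughly 
like the sketch below. It is only an illustration under stated assumptions: 
{{KeyLocks}}, {{buildWfg}}, {{findCycle}} and {{oldestTx}} are hypothetical 
names, not Ignite APIs, Txs are identified by plain {{long}} ids, and the 
merged Tx snapshot is assumed to be already flattened into one owner plus a 
list of waiters per key. A real implementation would derive this from the 
{{GridCacheMvcc}} candidates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these types are Ignite APIs. */
final class WaitForGraphSketch {
    /** Assumed lock state of one key: the owning Tx and the Txs queued behind it. */
    static final class KeyLocks {
        final long owner;
        final List<Long> waiters;

        KeyLocks(long owner, List<Long> waiters) {
            this.owner = owner;
            this.waiters = waiters;
        }
    }

    /** Reduces the resource allocation graph to a WFG: waiting Tx -> Txs it waits for. */
    static Map<Long, Set<Long>> buildWfg(Map<String, KeyLocks> locksByKey) {
        Map<Long, Set<Long>> wfg = new HashMap<>();
        for (KeyLocks locks : locksByKey.values())
            for (long waiter : locks.waiters)
                if (waiter != locks.owner)
                    wfg.computeIfAbsent(waiter, k -> new HashSet<>()).add(locks.owner);
        return wfg;
    }

    /** Depth-first search from startTx; returns the Txs forming a cycle, or an empty list. */
    static List<Long> findCycle(Map<Long, Set<Long>> wfg, long startTx) {
        List<Long> path = new ArrayList<>();
        Set<Long> done = new HashSet<>();
        return dfs(wfg, startTx, path, done) ? path : Collections.<Long>emptyList();
    }

    private static boolean dfs(Map<Long, Set<Long>> wfg, long tx, List<Long> path, Set<Long> done) {
        int idx = path.indexOf(tx);
        if (idx >= 0) {
            // Tx is already on the current path: keep only the cycle itself.
            path.subList(0, idx).clear();
            return true;
        }
        if (done.contains(tx))
            return false; // Fully explored earlier, no cycle through this Tx.
        path.add(tx);
        for (long next : wfg.getOrDefault(tx, Collections.<Long>emptySet()))
            if (dfs(wfg, next, path, done))
                return true;
        path.remove(path.size() - 1);
        done.add(tx);
        return false;
    }

    /** Picks the rollback victim: the Tx in the cycle with the earliest start time. */
    static long oldestTx(List<Long> cycle, Map<Long, Long> startTimeByTx) {
        long oldest = cycle.get(0);
        for (long tx : cycle)
            if (startTimeByTx.get(tx) < startTimeByTx.get(oldest))
                oldest = tx;
        return oldest;
    }
}
```

With two Txs that each own one key and wait for the other's key, 
{{findCycle}} started from either Tx returns both of them, and {{oldestTx}} 
then names the one to roll back. Starting the search from the timed-out Tx 
matches the note that repetition is redundant in the timeout case.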

> Need to implement deadlock detection
> ------------------------------------
>
>                 Key: IGNITE-2854
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2854
>             Project: Ignite
>          Issue Type: New Feature
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Assignee: Andrey Gura
>             Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find 
> out which locks were reordered.
> We need to add a mechanism that will collect information about awaiting 
> candidates, analyze it and show the guilty keys. Most likely this should be 
> implemented with the help of a custom discovery message.
> In addition, we should automatically execute this mechanism if a transaction 
> times out and add the information to the timeout exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
