[ https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210403#comment-15210403 ]
Andrey Gura commented on IGNITE-2854:
-------------------------------------

*Suggested solution:*
# The Tx manager executes the deadlock detection procedure (if enabled) in the following cases:
** On Tx timeout (if enabled)
** On request (e.g. via MBean)
** Periodically (if a period is configured)
# Choosing the Tx candidate for deadlock detection:
** If deadlock detection was triggered by a Tx timeout, the timed-out Tx is chosen.
** Otherwise, the oldest Tx is chosen.
# Deadlock detection:
** For the chosen Tx, get all cache entries ({{IgniteTxEntry}} instances) involved in the Tx.
** Each Tx entry contains the ID of the node to which the lock request was sent, so the deadlock detector has the IDs of all nodes involved in the Tx.
** Request Tx snapshots from all involved nodes if needed (a Tx can involve only the local node) and merge them with the local Tx snapshot. _*Note:*_ _It makes sense to retrieve a Tx snapshot that contains only Txs involving known Tx entries, in order to reduce network traffic._
** If the resulting Tx snapshot involves other nodes, request Tx snapshots from those nodes and merge them into the resulting snapshot. _*Note:*_ _Actually, we can first try to detect a deadlock on the current Tx snapshot and request Tx snapshots from other remote nodes only if no deadlock is found._
** Build a _resource allocation graph_ from {{GridCacheMvcc}}'s local candidates (owner candidate: lock acquired; non-owner candidate: lock requested) and reduce it to a _wait-for graph (WFG)_.
** Try to find a cycle in the WFG.
** If a cycle is found, a deadlock is detected.
** Otherwise, choose another Tx candidate that has not yet been involved in the deadlock detection procedure and repeat the whole procedure. _*Note:*_ _Repetition is redundant when deadlock detection was triggered by a Tx timeout._
# Deadlock resolution (if configured): it makes sense to roll back the oldest Tx involved in the detected deadlock, especially when a Tx timeout isn't defined.
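
To make the graph steps of the proposal concrete, here is a minimal sketch in plain Java of reducing a resource allocation graph to a WFG, searching it for a cycle, and picking the oldest Tx in the cycle as the rollback candidate. It is an illustration under stated assumptions, not Ignite code: the class {{WaitForGraphSketch}}, its methods, the use of {{long}} Tx IDs and {{String}} keys, and the {{startTimes}} map are all hypothetical, and the merged Tx snapshot is assumed to be already available as two maps (lock owners and lock waiters per key).

```java
import java.util.*;

/** Hypothetical sketch of the WFG steps; none of these names are Ignite API. */
final class WaitForGraphSketch {
    /**
     * Reduces a resource allocation graph to a wait-for graph.
     * owners:  key -> Tx that holds the lock (owner candidate).
     * waiters: key -> Txs that requested the lock (non-owner candidates).
     * Result:  waiting Tx -> set of Txs it waits for.
     */
    static Map<Long, Set<Long>> toWaitForGraph(Map<String, Long> owners,
                                               Map<String, List<Long>> waiters) {
        Map<Long, Set<Long>> wfg = new HashMap<>();
        for (Map.Entry<String, List<Long>> e : waiters.entrySet()) {
            Long owner = owners.get(e.getKey());
            if (owner == null)
                continue; // Nobody holds the key, so nobody is blocked on it.
            for (Long waiter : e.getValue())
                if (!waiter.equals(owner))
                    wfg.computeIfAbsent(waiter, k -> new HashSet<>()).add(owner);
        }
        return wfg;
    }

    /**
     * Depth-first search from the candidate Tx. Returns the Txs forming a cycle
     * reachable from it (the cycle may not contain the candidate itself),
     * or an empty list if no cycle is reachable.
     */
    static List<Long> findCycle(Map<Long, Set<Long>> wfg, long start) {
        List<Long> path = new ArrayList<>();
        return dfs(wfg, start, path, new HashSet<>()) ? path : Collections.emptyList();
    }

    private static boolean dfs(Map<Long, Set<Long>> wfg, Long tx,
                               List<Long> path, Set<Long> visited) {
        int idx = path.indexOf(tx);
        if (idx >= 0) {
            path.subList(0, idx).clear(); // Keep only the cyclic part of the path.
            return true;
        }
        if (!visited.add(tx))
            return false; // Already fully explored without finding a cycle.
        path.add(tx);
        for (Long next : wfg.getOrDefault(tx, Collections.emptySet()))
            if (dfs(wfg, next, path, visited))
                return true;
        path.remove(path.size() - 1);
        return false;
    }

    /** Picks the oldest Tx in the cycle; startTimes (Tx -> start timestamp) is assumed. */
    static long chooseVictim(List<Long> cycle, Map<Long, Long> startTimes) {
        return Collections.min(cycle, Comparator.comparingLong((Long tx) -> startTimes.get(tx)));
    }

    public static void main(String[] args) {
        // Tx 1 holds "a" and waits for "b"; Tx 2 holds "b" and waits for "a".
        Map<String, Long> owners = new HashMap<>();
        owners.put("a", 1L);
        owners.put("b", 2L);
        Map<String, List<Long>> waiters = new HashMap<>();
        waiters.put("a", Arrays.asList(2L));
        waiters.put("b", Arrays.asList(1L));

        Map<Long, Long> startTimes = new HashMap<>();
        startTimes.put(1L, 100L);
        startTimes.put(2L, 200L);

        List<Long> cycle = findCycle(toWaitForGraph(owners, waiters), 1L);
        System.out.println("Cycle: " + cycle);                            // Cycle: [1, 2]
        System.out.println("Victim: " + chooseVictim(cycle, startTimes)); // Victim: 1
    }
}
```

The sketch starts the search from a single candidate Tx, matching the "choose a candidate, then repeat with another" flow above. How the owner and waiter maps would actually be populated from {{GridCacheMvcc}} candidates and remote snapshots is left open here.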

> Need to implement deadlock detection
> ------------------------------------
>
>                 Key: IGNITE-2854
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2854
>             Project: Ignite
>          Issue Type: New Feature
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Assignee: Andrey Gura
>             Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find
> out which locks were reordered.
> We need to add a mechanism that will collect information about awaiting
> candidates, analyze it and show the guilty keys. Most likely this should be
> implemented with the help of a custom discovery message.
> In addition, we should automatically execute this mechanism if a transaction
> times out and add the information to the timeout exception.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)