[ 
https://issues.apache.org/jira/browse/IGNITE-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220558#comment-15220558
 ] 

Andrey Gura commented on IGNITE-2854:
-------------------------------------

The algorithm has been changed to limit the amount of requested information on 
a per-key basis:

# When {{GridDhtLockFuture}} times out, we run deadlock detection. The input 
is the near transaction ID and the pending keys that this transaction could 
not lock.
# The deadlock detector maps the pending keys to primary nodes (on the first 
step this is always the current node). The result is a set of candidates 
represented by pairs {{UUID -> List<IgniteTxKey>}}.
# For each candidate (if any), the deadlock detector sends a request to the 
node identified by its {{UUID}}. The request contains the keys from the 
candidate pair. If there is no candidate, the process finishes.
# The selected candidate is removed from the candidate set; the node and all 
of its keys are marked as handled.
# The node processes the request and returns all MVCC candidates that hold or 
wait for the *passed keys*, plus all other keys involved in the transactions 
associated with the found MVCC candidates.
# The deadlock detector builds the wait-for-graph (or updates it) and tries to 
find a cycle in it, using the input transaction ID as the first vertex of the 
graph.
# If a cycle is found, deadlock detection stops (deadlock found).
# If no cycle is found, the deadlock detector maps the obtained keys to 
primary nodes and near nodes. The candidate set is updated.
# The process continues from step 3.
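
The candidate bookkeeping in steps 2-4 and 8-9 could be sketched in Java roughly 
as follows. Everything here is an assumption made for illustration: the class 
{{CandidateLoopSketch}}, the use of plain strings for node IDs and keys, and the 
{{primaryNode}}, {{request}} and {{cycleFound}} callbacks, which stand in for 
affinity mapping, the remote request (step 5) and the wait-for-graph check 
(steps 6-7). The sketch maps keys to primary nodes only and tracks handled keys 
but not handled nodes, so it is a simplification and not the Ignite code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.BooleanSupplier;
import java.util.function.Function;

/** Hypothetical sketch of the candidate loop; all names are illustrative. */
final class CandidateLoopSketch {
    /** Number of requests sent so far (one per processed candidate). */
    int requestsSent;

    /**
     * @param pendingKeys keys the timed-out transaction could not lock
     * @param primaryNode assumed mapping from a key to its primary node ID
     * @param request assumed remote call: (node, keys) -> other keys of the
     *                transactions found on that node
     * @param cycleFound assumed wait-for-graph check, run after each reply
     * @return true if a deadlock was found
     */
    boolean detect(Set<String> pendingKeys,
                   Function<String, String> primaryNode,
                   BiFunction<String, Set<String>, Set<String>> request,
                   BooleanSupplier cycleFound) {
        Map<String, Set<String>> candidates = new HashMap<>();
        Set<String> handledKeys = new HashSet<>();

        // Step 2: map the pending keys to nodes.
        addCandidates(candidates, pendingKeys, handledKeys, primaryNode);

        // Step 3: stop when no candidate is left.
        while (!candidates.isEmpty()) {
            // Step 4: remove one candidate and mark its keys as handled.
            Iterator<Map.Entry<String, Set<String>>> it =
                candidates.entrySet().iterator();
            Map.Entry<String, Set<String>> cand = it.next();
            String node = cand.getKey();
            Set<String> keys = cand.getValue();
            it.remove();
            handledKeys.addAll(keys);

            // Step 5: ask the node only about these keys.
            Set<String> discovered = request.apply(node, keys);
            requestsSent++;

            // Steps 6-7: stop as soon as the graph contains a cycle.
            if (cycleFound.getAsBoolean())
                return true;

            // Step 8: map newly learned keys; step 9: loop again.
            addCandidates(candidates, discovered, handledKeys, primaryNode);
        }

        return false;
    }

    /** Groups keys that were not handled yet by their node. */
    private static void addCandidates(Map<String, Set<String>> candidates,
                                      Set<String> keys,
                                      Set<String> handledKeys,
                                      Function<String, String> primaryNode) {
        for (String key : keys) {
            if (!handledKeys.contains(key))
                candidates
                    .computeIfAbsent(primaryNode.apply(key), n -> new HashSet<>())
                    .add(key);
        }
    }
}
```

Because handled keys are never requested again, the loop terminates once every 
reachable key has been asked about, even if no cycle is ever reported.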

Properties of this implementation:

* It always finds at most one deadlock for a given timed-out transaction.
* It always detects the deadlock that caused the user transaction timeout (if 
one exists). See step 6.
* Detection finishes as early as possible because a deadlock can be found 
after each update of the wait-for-graph.
* Detection minimizes network utilisation. See step 5.
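
The cycle check that these properties rely on (step 6) can be illustrated with 
a depth-first search that starts at the timed-out transaction. This is only a 
sketch under assumptions: {{WaitForGraphSketch}}, {{addEdge}} and {{findCycle}} 
are hypothetical names, transaction IDs are plain strings, and the search 
reports the first cycle reachable from the start vertex, which may or may not 
match what the real detector reports.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical wait-for-graph sketch; not Ignite's actual data structure. */
final class WaitForGraphSketch {
    /** Edges: transaction ID -> IDs of the transactions it waits for. */
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /** Records that 'waiter' waits for a lock held by 'holder'. */
    void addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /**
     * Searches for a cycle reachable from 'start'.
     *
     * @return the transactions on the first cycle found, or an empty list
     */
    List<String> findCycle(String start) {
        return dfs(start, new ArrayList<>(), new HashSet<>());
    }

    private List<String> dfs(String tx, List<String> path, Set<String> explored) {
        // A vertex already on the current path closes a cycle.
        int idx = path.indexOf(tx);
        if (idx >= 0)
            return new ArrayList<>(path.subList(idx, path.size()));

        // A fully explored vertex cannot lead to a new cycle.
        if (explored.contains(tx))
            return Collections.emptyList();

        path.add(tx);

        for (String next : waitsFor.getOrDefault(tx, Collections.emptySet())) {
            List<String> cycle = dfs(next, path, explored);
            if (!cycle.isEmpty())
                return cycle;
        }

        path.remove(path.size() - 1);
        explored.add(tx);

        return Collections.emptyList();
    }
}
```

Since the search returns on the first cycle it meets, rerunning it after every 
graph update yields at most one deadlock per detection run.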

The implementation requires test coverage for the following cases:

* Different nodes that start the deadlocked transactions (all from one node 
(client/server), all from different nodes (client/server), mixed)
* Different nodes that start the transaction with a timeout (server/client near 
node, server/client non-near node)
* More than one cycle (waiting for each other or independent)
* Transitive transactions waiting for each other and eventually waiting for a 
deadlocked transaction.

Problems to be solved:

* Deadlock detector behaviour in case of topology changes and transaction 
remapping.
* Deadlock detector behaviour when a remote request fails.

> Need to implement deadlock detection
> ------------------------------------
>
>                 Key: IGNITE-2854
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2854
>             Project: Ignite
>          Issue Type: New Feature
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Valentin Kulichenko
>            Assignee: Andrey Gura
>             Fix For: 1.6
>
>
> Currently, if a transactional deadlock occurs, there is no easy way to find 
> out which locks were reordered.
> We need to add a mechanism that will collect information about awaiting 
> candidates, analyze it and show the guilty keys. Most likely this should be 
> implemented with the help of a custom discovery message.
> In addition, we should automatically execute this mechanism if a transaction 
> times out and add the information to the timeout exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
