[ 
https://issues.apache.org/jira/browse/IGNITE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pavlov updated IGNITE-17731:
-----------------------------------
    Priority: Minor  (was: Major)

> Possible LRT in case of postponed GridDhtLockRequest
> ----------------------------------------------------
>
>                 Key: IGNITE-17731
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17731
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Priority: Minor
>              Labels: IEP-89, ise
>
> Let's assume the following scenario:
> 1. The TX coordinator starts a transaction and sends GridDhtLockRequest to
> the "near" nodes.
> 2. Some GridDhtLockRequest messages are delayed by the network.
> 3. Not all "near" nodes receive GridDhtLockRequest, and as a result not all
> of them respond to the TX coordinator.
> 4. The TX coordinator aborts the TX on timeout.
> 5. Completed TX ID is stored in IgniteTxManager#completedVersHashMap.
> 6. TX load continues (assume puts into a TX cache) and the record about the
> completed TX described above is evicted from the map.
> 7. The GridDhtLockRequest from step 2 is finally received by the "near"
> nodes. They lock keys, start the local TX, and respond to the TX coordinator.
> But currently the TX coordinator ignores the GridDhtLockResponse, because
> info about the initial TX was evicted, and does nothing.
> As a result, the near nodes keep holding key locks and waiting for the next
> steps of the TX protocol, which will never happen because the TX was already
> completed.
> As a workaround, the TX can be explicitly killed on the near node.
> It is proposed to handle this situation and not acquire locks on the near
> node if the TX coordinator or other cluster nodes have no knowledge of the
> TX to which the current lock request belongs.
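
The scenario above can be modelled with a small, self-contained sketch. It is
only an illustration under stated assumptions, not Ignite code: the class
LateLockRequestSketch and its nested types CompletedTxLog, Coordinator and
NearNode are hypothetical, the bounded map merely stands in for
IgniteTxManager#completedVersHashMap, and the "ask the coordinator whether the
TX is still active" check is one possible reading of the proposal.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario; none of these types are Ignite classes.
final class LateLockRequestSketch {
    /** Bounded record of completed TX IDs; the eldest entry is evicted once maxSize is exceeded. */
    static final class CompletedTxLog {
        private final Map<Long, Boolean> ids;

        CompletedTxLog(final int maxSize) {
            ids = new LinkedHashMap<Long, Boolean>() {
                @Override protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
                    return size() > maxSize;
                }
            };
        }

        void add(long txId) { ids.put(txId, Boolean.TRUE); }

        boolean contains(long txId) { return ids.containsKey(txId); }
    }

    /** Coordinator-side state: active TXs plus a bounded log of completed ones. */
    static final class Coordinator {
        final Set<Long> active = new HashSet<>();
        final CompletedTxLog completed;

        Coordinator(int completedCapacity) { completed = new CompletedTxLog(completedCapacity); }

        void start(long txId) { active.add(txId); }

        void complete(long txId) { active.remove(txId); completed.add(txId); }

        /** After eviction, "completed" and "never seen" look the same, so only "active" is reliable. */
        boolean isActive(long txId) { return active.contains(txId); }
    }

    /** Near-node side: the set of keys it currently holds locks on. */
    static final class NearNode {
        final Set<String> lockedKeys = new HashSet<>();

        /** Behaviour described in the issue: lock as soon as the request arrives. */
        void onLockRequestCurrent(long txId, String key) { lockedKeys.add(key); }

        /** Assumed variant of the proposal: lock only if the coordinator still knows the TX. */
        boolean onLockRequestProposed(Coordinator crd, long txId, String key) {
            if (!crd.isActive(txId))
                return false; // Stale request: do not acquire the lock.
            lockedKeys.add(key);
            return true;
        }
    }

    /** Replays steps 1-7 and returns how many locks the near node is left holding. */
    static int leakedLocks(boolean checkFirst) {
        Coordinator crd = new Coordinator(2);
        NearNode near = new NearNode();

        crd.start(1L);    // Steps 1-3: TX 1 starts, its lock request is delayed.
        crd.complete(1L); // Steps 4-5: TX 1 is aborted on timeout and recorded as completed.

        for (long tx = 2; tx <= 4; tx++) { // Step 6: further load evicts the record of TX 1.
            crd.start(tx);
            crd.complete(tx);
        }
        if (crd.completed.contains(1L))
            throw new IllegalStateException("TX 1 should have been evicted");

        // Step 7: the delayed lock request for TX 1 finally reaches the near node.
        if (checkFirst)
            near.onLockRequestProposed(crd, 1L, "key");
        else
            near.onLockRequestCurrent(1L, "key");

        return near.lockedKeys.size();
    }

    public static void main(String[] args) {
        System.out.println("current behaviour, leaked locks:  " + leakedLocks(false));
        System.out.println("proposed behaviour, leaked locks: " + leakedLocks(true));
    }
}
```

In this model the lock leaks only when the near node locks unconditionally;
checking with the coordinator first leaves nothing locked. A real fix would
also have to account for the cost of the extra round trip and for the
coordinator having left the cluster, which this sketch ignores.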



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
