[ https://issues.apache.org/jira/browse/IGNITE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Pavlov updated IGNITE-17731:
-----------------------------------
    Priority: Minor  (was: Major)

> Possible LRT in case of postponed GridDhtLockRequest
> ----------------------------------------------------
>
>                 Key: IGNITE-17731
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17731
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Priority: Minor
>              Labels: IEP-89, ise
>
> Let's assume the following scenario:
> 1. The TX coordinator starts a transaction and sends GridDhtLockRequest
> messages to the "near" nodes.
> 2. Some of the GridDhtLockRequest messages are delayed by the network.
> 3. Not all "near" nodes receive a GridDhtLockRequest, and as a result not all
> of them respond to the TX coordinator.
> 4. The TX coordinator aborts the TX by timeout.
> 5. The ID of the completed TX is stored in
> IgniteTxManager#completedVersHashMap.
> 6. The TX load continues (assume puts into a TX cache), and the record about
> the completed TX described above is evicted from the map.
> 7. The GridDhtLockRequest messages from step 2 are finally received by the
> "near" nodes. They lock the keys, start a local TX, and respond to the TX
> coordinator.
> Currently the TX coordinator ignores the GridDhtLockResponse and does nothing,
> because the info about the initial TX has been evicted.
> As a result, the near nodes keep holding the key locks and keep waiting for
> the next steps of the TX protocol, which will never happen because the TX has
> already completed.
> As a workaround, the TX can be explicitly KILLED on the near node.
> It is proposed to handle this situation and not acquire locks on the near
> node if the TX coordinator or other cluster nodes have no notion of the TX to
> which the current lock request belongs.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
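
The scenario and the proposed check can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions, not Ignite code: the classes Coordinator and NearNode, their methods, and the capacity of the completed-TX map are all hypothetical, and the direct call from the near node to the coordinator stands in for whatever message exchange a real fix would use. It shows that a bounded "completed" map forgets the aborted TX once enough later transactions finish, and that a near node which first asks whether the TX is still active refuses the late lock request instead of holding the key lock forever.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the late GridDhtLockRequest scenario (not Ignite API). */
final class LateLockRequestSketch {
    /** Stand-in for the TX coordinator's bookkeeping. */
    static final class Coordinator {
        private final Set<Long> active = new HashSet<>();
        private final Map<Long, Boolean> completed;

        Coordinator(final int maxCompleted) {
            // Bounded map: the eldest completed TX is evicted once capacity is exceeded.
            this.completed = new LinkedHashMap<Long, Boolean>() {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
                    return size() > maxCompleted;
                }
            };
        }

        void start(long txId) {
            active.add(txId);
        }

        /** Commit, rollback, or timeout abort: the TX is no longer active. */
        void finish(long txId) {
            active.remove(txId);
            completed.put(txId, Boolean.TRUE);
        }

        boolean isActive(long txId) {
            return active.contains(txId);
        }

        boolean remembersCompleted(long txId) {
            return completed.containsKey(txId);
        }
    }

    /** Stand-in for a "near" node that receives lock requests. */
    static final class NearNode {
        private final Map<String, Long> locks = new HashMap<>();

        /** Proposed behaviour: acquire the lock only if the TX is still known as active. */
        boolean onLockRequest(Coordinator crd, long txId, String key) {
            if (!crd.isActive(txId)) {
                return false; // Late request for a TX nobody knows about: do not lock.
            }
            locks.put(key, txId);
            return true;
        }

        boolean holdsLock(String key) {
            return locks.containsKey(key);
        }
    }

    public static void main(String[] args) {
        Coordinator crd = new Coordinator(2);
        NearNode near = new NearNode();

        crd.start(1L);
        crd.finish(1L); // Step 4: aborted by timeout before the request arrives.

        // Step 6: further TX load evicts the record about TX 1.
        for (long tx = 2; tx <= 4; tx++) {
            crd.start(tx);
            crd.finish(tx);
        }

        System.out.println("completed record kept: " + crd.remembersCompleted(1L));
        // Step 7: the delayed request finally arrives.
        System.out.println("lock acquired: " + near.onLockRequest(crd, 1L, "k1"));
    }
}
```

The sketch checks the set of active transactions rather than the completed-TX map, because the latter is exactly the structure that loses the record in step 6.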