Mikhail Petrov created IGNITE-28662:
---------------------------------------

             Summary: Node stop may be infinitely blocked by atomic cache 
operations invoked from thin client
                 Key: IGNITE-28662
                 URL: https://issues.apache.org/jira/browse/IGNITE-28662
             Project: Ignite
          Issue Type: Bug
            Reporter: Mikhail Petrov


Consider a cluster of 3 nodes: node0, node1, and node2.

1. node1 and node2 each receive a put request from a thin client. node1 is
"primary" for the cache keys received by node2, and node2 is "primary" for the
cache keys received by node1. Both nodes start executing the operations and
wait for them to complete.
2. node1 and node2 receive a stop signal (Ignite#close). The stop procedure on
both nodes blocks in GridNioAsyncNotifyFilter#stop, which waits for the thin
client operations to complete.
3. node1 and node2 fail to process the cache requests for some reason (e.g., a
cache interceptor raises an exception).
4. node1 and node2 do not send GridNearAtomicUpdateResponse with the failed
keys to each other because both are stopping (see GridCacheIoManager#onSend).
This message indicates to the "near" node that some keys could not be
processed and that the operation should be terminated with an exception.
5. node1 and node2 are unable to complete the cache operations received from
the thin client (neither of them will ever receive a
GridNearAtomicUpdateResponse or a NODE_LEFT event for the primary node) ->
they are unable to complete the stop procedure.
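
The circular wait in the steps above can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not Ignite code: the Node
class, its fields, and the method names are hypothetical, the real
GridNearAtomicUpdateResponse is replaced by directly failing the peer's
future, the steps run sequentially in one thread, and a bounded timeout
stands in for the unbounded wait in the stop procedure.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of the scenario. None of these types are Ignite classes. */
class StopDeadlockSketch {
    /** Hypothetical stand-in for a server node. */
    static final class Node {
        final String name;
        /** Models the "do not send while stopping" behaviour from step 4. */
        final boolean dropResponsesWhileStopping;
        Node peer;
        volatile boolean stopping;
        /** Thin-client put whose keys are primary on the peer (step 1). */
        final CompletableFuture<Void> clientPut = new CompletableFuture<>();

        Node(String name, boolean dropResponsesWhileStopping) {
            this.name = name;
            this.dropResponsesWhileStopping = dropResponsesWhileStopping;
        }

        /** Steps 3-4: the primary fails the update and tries to notify the near node. */
        boolean sendFailedKeysResponse() {
            if (stopping && dropResponsesWhileStopping)
                return false; // Response is silently dropped.

            peer.clientPut.completeExceptionally(
                new RuntimeException("Update failed on primary " + name));
            return true;
        }

        /** Steps 2 and 5: stop waits for the client operation. True if it finished in time. */
        boolean awaitClientOps(long timeoutMs) {
            try {
                clientPut.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            }
            catch (ExecutionException e) {
                return true; // Failed, but finished: stop can proceed.
            }
            catch (TimeoutException e) {
                return false; // Still pending: the real stop would block forever.
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    /** Runs the scenario and reports whether both nodes were able to stop. */
    static boolean bothNodesStop(boolean dropResponsesWhileStopping, long timeoutMs) {
        Node node1 = new Node("node1", dropResponsesWhileStopping);
        Node node2 = new Node("node2", dropResponsesWhileStopping);
        node1.peer = node2;
        node2.peer = node1;

        // Step 2: both nodes start stopping.
        node1.stopping = true;
        node2.stopping = true;

        // Steps 3-4: both primaries fail the update and try to respond.
        node1.sendFailedKeysResponse();
        node2.sendFailedKeysResponse();

        // Step 5: each stop procedure waits for its own client operation.
        boolean node1Stopped = node1.awaitClientOps(timeoutMs);
        boolean node2Stopped = node2.awaitClientOps(timeoutMs);
        return node1Stopped && node2Stopped;
    }

    public static void main(String[] args) {
        System.out.println("responses dropped -> both stop: " + bothNodesStop(true, 200));
        System.out.println("responses sent    -> both stop: " + bothNodesStop(false, 200));
    }
}
```

In this model the first run times out on both nodes, while the second run
completes because each pending operation is failed by its primary. It only
demonstrates the dependency cycle and does not suggest a particular fix.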


