Mikhail Petrov created IGNITE-28662:
---------------------------------------
Summary: Node stop may be infinitely blocked by atomic cache
operations invoked from thin client
Key: IGNITE-28662
URL: https://issues.apache.org/jira/browse/IGNITE-28662
Project: Ignite
Issue Type: Bug
Reporter: Mikhail Petrov
Consider a cluster of 3 nodes: node0, node1, node2.
1. node1 and node2 each receive a put request from a thin client. node1 is
"primary" for the cache keys received by node2, and node2 is "primary" for the
cache keys received by node1. Both nodes begin executing the operations and
wait for them to complete.
2. node1 and node2 receive a stop signal (Ignite#close). The stop procedure on
both nodes blocks in GridNioAsyncNotifyFilter#stop, which waits for the thin
client operations to complete.
3. node1 and node2 fail to process the cache request for some reason (e.g. a
cache interceptor throws an exception).
4. node1 and node2 do not send GridNearAtomicUpdateResponse with the failed keys
to each other because both are stopping (see GridCacheIoManager#onSend).
This message tells the "near" node that some keys could not be processed and
that the operation should be terminated with an exception.
5. node1 and node2 are unable to complete the cache operations received from
the thin client (neither of them will ever receive GridNearAtomicUpdateResponse
or a NODE_LEFT event for the primary node), so they are unable to complete the
stop procedure.
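
The circular wait in steps 1-5 can be illustrated with a small model in plain Java. This is only a sketch and does not use any Ignite classes: Node, sendFailureResponse, awaitStop and bothNodesStop are hypothetical names, the "stopping" check merely stands in for the behaviour attributed to GridCacheIoManager#onSend above, and the 200 ms timeout exists only so that the sketch terminates (the real stop procedure waits without a bound).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StopDeadlockSketch {
    /** Hypothetical stand-in for a cluster node; not an Ignite class. */
    static final class Node {
        final String name;
        volatile boolean stopping;

        /** Completes when the primary node answers this ("near") node's update request. */
        final CompletableFuture<Void> pendingClientOp = new CompletableFuture<>();

        Node(String name) {
            this.name = name;
        }

        /** Models step 4: a stopping node drops the failure response instead of sending it. */
        boolean sendFailureResponse(Node near, boolean dropWhenStopping) {
            if (stopping && dropWhenStopping)
                return false;

            near.pendingClientOp.completeExceptionally(
                new IllegalStateException("Failed keys reported by " + name));

            return true;
        }

        /** Models step 2: stop waits for the thin client operation. Returns true if it completed. */
        boolean awaitStop(long timeoutMs) {
            try {
                pendingClientOp.get(timeoutMs, TimeUnit.MILLISECONDS);
            }
            catch (ExecutionException e) {
                // The operation failed, which still counts as completed.
            }
            catch (TimeoutException e) {
                return false; // Nothing will ever complete the operation.
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();

                return false;
            }

            return true;
        }
    }

    /** Runs steps 2-5 for node1 and node2 and reports whether both stop procedures finish. */
    public static boolean bothNodesStop(boolean dropWhenStopping) {
        Node node1 = new Node("node1");
        Node node2 = new Node("node2");

        // Step 2: both nodes receive the stop signal.
        node1.stopping = true;
        node2.stopping = true;

        // Steps 3-4: each primary fails the request and tries to notify the near node.
        node1.sendFailureResponse(node2, dropWhenStopping);
        node2.sendFailureResponse(node1, dropWhenStopping);

        // Step 5: each stop procedure waits for its pending client operation.
        return node1.awaitStop(200) && node2.awaitStop(200);
    }

    public static void main(String[] args) {
        System.out.println("Responses dropped, stop completes: " + bothNodesStop(true));
        System.out.println("Responses delivered, stop completes: " + bothNodesStop(false));
    }
}
```

In this model the stop procedure only finishes when the failure response is still delivered while the sender is stopping. That is one way to read the problem, not a proposed fix.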
--
This message was sent by Atlassian Jira
(v8.20.10#820010)