Vladislav Pyatkov created IGNITE-26396:
------------------------------------------

             Summary: A node restart might not send a message to the initiator.
                 Key: IGNITE-26396
                 URL: https://issues.apache.org/jira/browse/IGNITE-26396
             Project: Ignite
          Issue Type: Improvement
            Reporter: Vladislav Pyatkov


h3. Motivation
Since the ticket about acknowledgment batching was closed, completion of the 
messaging service future no longer guarantees that the message has been 
delivered. As a result, the restart can begin before the other side has 
received the success response. The handler below initiates the restart as 
soon as the future returned by respond completes:
{code}
private void handleResetClusterMessage(ResetClusterMessage message, ClusterNode sender, long correlationId) {
    restartExecutor.execute(() -> {
        storage.saveResetClusterMessage(message);

        messagingService.respond(sender, successResponseMessage(), correlationId)
                .thenRunAsync(() -> {
                    if (!thisNodeName.equals(sender.name())) {
                        restarter.initiateRestart();
                    }
                }, restartExecutor)
                .whenComplete((res, ex) -> {
                    if (ex != null) {
                        LOG.error("Error when handling a ResetClusterMessage", ex);
                    }
                });
    });
}
{code}

h3. Definition of done
The initiator must explicitly acknowledge that it has received the success 
response, and the node must wait for that acknowledgment before it starts 
the restart process.
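
One possible shape of the required ordering, as a minimal self-contained sketch rather than a proposed patch. It uses only java.util.concurrent and does not touch Ignite classes. The names AckBeforeRestartSketch, onInitiatorConfirmation and responseReceivedByInitiator are hypothetical, and respond() is only a stand-in for messagingService.respond(...), whose future is assumed to complete before delivery is confirmed.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: the restart starts only after the initiator confirms receipt. */
final class AckBeforeRestartSketch {
    /** Records events in the order they happen so the ordering can be inspected. */
    final List<String> events = new CopyOnWriteArrayList<>();

    /** Completed when the initiator's explicit confirmation arrives (hypothetical). */
    final CompletableFuture<Void> responseReceivedByInitiator = new CompletableFuture<>();

    /**
     * Stand-in for messagingService.respond(...). Assumption: the returned future
     * completes when the response is handed over for sending, not when it is delivered.
     */
    CompletableFuture<Void> respond() {
        events.add("response sent");
        return CompletableFuture.completedFuture(null);
    }

    /** Would be called by the handler of the initiator's confirmation message. */
    void onInitiatorConfirmation() {
        events.add("confirmation received");
        responseReceivedByInitiator.complete(null);
    }

    /** Sends the response, then waits for the confirmation before restarting. */
    CompletableFuture<Void> handleReset() {
        return respond()
                // Chain on the explicit confirmation, not on the send future alone.
                .thenCompose(unused -> responseReceivedByInitiator)
                .thenRun(() -> events.add("restart initiated"));
    }

    public static void main(String[] args) {
        AckBeforeRestartSketch node = new AckBeforeRestartSketch();
        CompletableFuture<Void> done = node.handleReset();

        // No restart yet: only the send has happened at this point.
        System.out.println(node.events);

        node.onInitiatorConfirmation();
        done.join();
        System.out.println(node.events);
    }
}
```

In this sketch the restart step is chained on the confirmation future, so completion of the send future alone never triggers it.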



