[
https://issues.apache.org/jira/browse/IGNITE-26396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Pyatkov updated IGNITE-26396:
---------------------------------------
Description:
h3. Motivation
Since the acknowledgment batching ticket (IGNITE-25945) was closed, completion
of the future returned by the messaging service no longer guarantees that the
message has been delivered. As a result, the node can start restarting before
the other side has received the success response.
{code}
private void handleResetClusterMessage(ResetClusterMessage message, ClusterNode sender, long correlationId) {
    restartExecutor.execute(() -> {
        storage.saveResetClusterMessage(message);

        messagingService.respond(sender, successResponseMessage(), correlationId)
                .thenRunAsync(() -> {
                    if (!thisNodeName.equals(sender.name())) {
                        restarter.initiateRestart();
                    }
                }, restartExecutor)
                .whenComplete((res, ex) -> {
                    if (ex != null) {
                        LOG.error("Error when handling a ResetClusterMessage", ex);
                    }
                });
    });
}
{code}
h3. Definition of done.
The initiator must explicitly confirm that it has received the success
response.
The node must wait for that confirmation before it starts the restart process.
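One possible shape of this, as a minimal self-contained sketch rather than the actual Ignite code: the restart is chained on a future that completes only when the initiator's confirmation arrives, not on the future returned by respond. The class AckBeforeRestart, the method onInitiatorAck and the event names are hypothetical and exist only for illustration; timeout and failure handling are left out.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: restart only after the initiator confirms receipt of the response. */
class AckBeforeRestart {
    /** Records events in order so the sequence can be inspected. */
    public final List<String> events = new CopyOnWriteArrayList<>();

    /** Completed when the initiator's explicit confirmation arrives (hypothetical message). */
    private final CompletableFuture<Void> initiatorAck = new CompletableFuture<>();

    /** Stand-in for messagingService.respond(...): completes on hand-off, not on delivery. */
    public CompletableFuture<Void> respond() {
        events.add("response-sent");
        return CompletableFuture.completedFuture(null);
    }

    /** Would be called by the handler of the initiator's confirmation message. */
    public void onInitiatorAck() {
        events.add("ack-received");
        initiatorAck.complete(null);
    }

    /** Sends the response, then defers the restart until the confirmation has arrived. */
    public CompletableFuture<Void> handleReset() {
        return respond()
                .thenCompose(ignored -> initiatorAck)
                .thenRun(() -> events.add("restart-initiated"));
    }

    public static void main(String[] args) {
        AckBeforeRestart node = new AckBeforeRestart();
        CompletableFuture<Void> restart = node.handleReset();

        // The response has been handed off, but the restart is still pending.
        System.out.println(restart.isDone()); // prints "false"

        node.onInitiatorAck();
        restart.join();
        System.out.println(node.events); // prints "[response-sent, ack-received, restart-initiated]"
    }
}
```

In this sketch the restart step cannot run before the confirmation, regardless of when the response future completes.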
was:
h3. Motivation
Since the acknowledgment batching ticket was closed, completion of the future
returned by the messaging service no longer guarantees that the message has
been delivered. As a result, the node can start restarting before the other
side has received the success response.
{code}
private void handleResetClusterMessage(ResetClusterMessage message, ClusterNode sender, long correlationId) {
    restartExecutor.execute(() -> {
        storage.saveResetClusterMessage(message);

        messagingService.respond(sender, successResponseMessage(), correlationId)
                .thenRunAsync(() -> {
                    if (!thisNodeName.equals(sender.name())) {
                        restarter.initiateRestart();
                    }
                }, restartExecutor)
                .whenComplete((res, ex) -> {
                    if (ex != null) {
                        LOG.error("Error when handling a ResetClusterMessage", ex);
                    }
                });
    });
}
{code}
h3. Definition of done.
The initiator must explicitly confirm that it has received the success
response.
The node must wait for that confirmation before it starts the restart process.
> A node restart might not send a message to the initiator.
> ---------------------------------------------------------
>
> Key: IGNITE-26396
> URL: https://issues.apache.org/jira/browse/IGNITE-26396
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3