Anilkumar Gingade created GEODE-8527:
----------------------------------------
Summary: A distributed message can continue to wait for a member
for which it failed to send the message
Key: GEODE-8527
URL: https://issues.apache.org/jira/browse/GEODE-8527
Project: Geode
Issue Type: Bug
Components: messaging
Affects Versions: 1.14.0
Reporter: Anilkumar Gingade
While sending/replicating a message (cache operation) by calling
DistributedCacheOperationMessage._distribute(), if an exception occurs while
sending the message to one of the recipients, the reply processor created to
wait for the replies can end up waiting for a reply from the failed member.
This was observed during a code walkthrough.
The _distribute() method does keep track of the nodes to which it fails to send
the message, but it does not use that list to update the reply processor it created.
Probable solution:
1. Update the reply processor to remove the failed member from its list of
members to wait for.
2. Handle the cache operation so that any data replication issue caused by the
failed send is addressed.
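
A minimal sketch of step 1, only to illustrate the idea. SketchReplyProcessor,
removeFailedMembers() and the member names are hypothetical stand-ins and are
not Geode's actual reply processor classes or API; members are plain strings
here for brevity.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a reply processor; not Geode's actual class. */
class SketchReplyProcessor {
  // Members we are still expecting a reply from.
  private final Set<String> pending = new HashSet<>();

  SketchReplyProcessor(Collection<String> recipients) {
    pending.addAll(recipients);
  }

  /** Called when a reply arrives from a member. */
  synchronized void replyReceived(String member) {
    pending.remove(member);
    notifyAll();
  }

  /** Proposed fix: stop waiting for members the message was never sent to. */
  synchronized void removeFailedMembers(Collection<String> failed) {
    pending.removeAll(failed);
    notifyAll();
  }

  /** Returns true if all replies arrived, false on timeout or interrupt. */
  synchronized boolean waitForReplies(long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!pending.isEmpty()) {
      long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remaining <= 0) {
        return false; // timed out with replies still outstanding
      }
      try {
        wait(remaining);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return true;
  }

  /** Snapshot of the members still being waited on. */
  synchronized Set<String> pendingMembers() {
    return new HashSet<>(pending);
  }
}
```

In this sketch, if the send to "m3" fails and nothing removes it from the
pending set, waitForReplies() can only end by timing out. Passing the failed
members collected during distribution to removeFailedMembers() lets the wait
complete once the remaining members have replied. Step 2 (repairing any missed
replication) is not covered here.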
--
This message was sent by Atlassian Jira
(v8.3.4#803005)