Hi,

The issue could happen like this:
1) RMIConnector.RMINotifClient.fetchNotifs got an IOException.
2) communicatorAdmin.gotIOException(ioe) was called to check the connection; it did not close the connection because the connection was OK by then.
3) RMIConnector.RMINotifClient.fetchNotifs analyzed the original exception, found that it was not a deserialization exception, and re-threw the original IOException.
4) The caller, ClientNotifForwarder, did not know how to treat this exception and decided to end silently.

The fix is to modify RMIConnector.RMINotifClient.fetchNotifs:

If the fetchNotifs request gets an IOException, we examine the chain of exceptions to determine whether this is a deserialization issue. If so, we propagate the appropriate exception to the caller, which will then proceed with fetching notifications one by one. Otherwise we call communicatorAdmin.gotIOException(ioe), which has 2 kinds of response:
1) the call returns normally, which means the connection has been re-established; we call fetchNotifs again;
2) the call throws an IOException; we check the connection status:
   2-1) "terminated": the connection is closed; we re-throw the original IOException and the caller will end silently.
   2-2) not "terminated": we add a flag "retried" for this situation. If the flag is false, we set it to true and redo the fetchNotifs request, which is useful for a transient network problem. Otherwise we close the connection and re-throw the original IOException; this is where we fix the bug.
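
To make the control flow above easier to review, here is a minimal, self-contained sketch of it. It is not the code in the webrev: the class FetchRetrySketch, the interfaces Fetcher and Admin, and the helper isDeserializationIssue are hypothetical stand-ins, and the deserialization check (walking the cause chain for ClassNotFoundException or NotSerializableException) is only an assumption about what such a check might look like.

```java
import java.io.IOException;
import java.io.NotSerializableException;

public class FetchRetrySketch {
    /** Hypothetical stand-in for the remote fetchNotifs request. */
    interface Fetcher {
        String fetch() throws IOException;
    }

    /** Hypothetical stand-in for communicatorAdmin plus the connection state. */
    interface Admin {
        void gotIOException(IOException ioe) throws IOException;
        boolean isTerminated();
        void close();
    }

    /** Assumed check: walk the cause chain looking for a deserialization problem. */
    static boolean isDeserializationIssue(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof ClassNotFoundException
                    || c instanceof NotSerializableException) {
                return true;
            }
        }
        return false;
    }

    static String fetchWithRecovery(Fetcher fetcher, Admin admin) throws IOException {
        boolean retried = false;
        while (true) {
            try {
                return fetcher.fetch();
            } catch (IOException ioe) {
                if (isDeserializationIssue(ioe)) {
                    throw ioe; // the caller falls back to fetching one by one
                }
                try {
                    admin.gotIOException(ioe);
                    // 1) connection re-established: loop and fetch again
                } catch (IOException checkFailed) {
                    if (admin.isTerminated()) {
                        throw ioe; // 2-1) connection already closed
                    }
                    if (retried) {
                        // 2-2) one retry did not help: close, then re-throw
                        admin.close();
                        throw ioe;
                    }
                    retried = true; // 2-2) possible transient problem: retry once
                }
            }
        }
    }
}
```

The "retried" flag is local to one fetchWithRecovery call, so each fetch request gets at most one extra attempt in case 2-2 before the connection is closed.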

We do not modify communicatorAdmin.gotIOException(ioe), because it is also called by all other remote requests.

It is not easy to write a test that reproduces the bug.

Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
webrev: http://cr.openjdk.java.net/~sjiang/JDK-8049303/00/

Thanks,
Shanliang
