YCharanGowda opened a new pull request, #968:
URL: https://github.com/apache/tomcat/pull/968
## Problem
In Apache Tomcat's clustering feature, the `ReplicationValve` handles
sending session replication messages to cluster nodes. Previously, all send
operations (invalid sessions, session replication, and cross-context sessions)
were wrapped in a single `try-catch` block. If any send failed (e.g., due to
network issues or node unavailability), control left the `try` body at that
point and all remaining sends were skipped, reducing cluster reliability and
potentially leaving session data inconsistent across nodes.
This was noted in a FIXME comment: "we have a lot of sends, but the trouble
with one node stops the correct replication to other nodes!"
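The failure mode can be reproduced outside Tomcat. The following is a minimal sketch, not the actual `ReplicationValve` code: the `Send` interface, the `sendAll` helper, and the class name are hypothetical stand-ins for the three send operations.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SingleTryCatchSketch {

    /** Hypothetical stand-in for one send operation; not a Tomcat type. */
    interface Send {
        void run() throws Exception;
    }

    /**
     * Runs all three sends inside one try block and returns the names of
     * the sends that completed, in order.
     */
    static List<String> sendAll(Send invalid, Send replication, Send crossContext) {
        List<String> completed = new ArrayList<>();
        try {
            invalid.run();
            completed.add("invalid");
            replication.run();
            completed.add("replication");
            crossContext.run();
            completed.add("crossContext");
        } catch (Exception x) {
            // The first failure ends the whole block; later sends never run.
            System.err.println("send failed: " + x.getMessage());
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> completed = sendAll(
                () -> { throw new IOException("node unreachable"); },
                () -> { },
                () -> { });
        System.out.println(completed); // prints []
    }
}
```

Because the first send throws, neither of the two later sends is attempted, even though they would have succeeded.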
## Solution
- Split the single `try-catch` block in `sendReplicationMessage()` into
individual `try-catch` blocks for each send operation (`sendInvalidSessions`,
`sendSessionReplicationMessage`, and `sendCrossContextSession`).
- This ensures that if one send fails, the others are still attempted,
improving fault tolerance without changing behavior when every send succeeds.
- Added specific error messages in `LocalStrings.properties` for better
logging and diagnostics:
  - `ReplicationValve.send.replication.failure`
  - `ReplicationValve.send.crosscontext.failure`
- Removed the FIXME comments since the issue is now addressed.
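The shape of the change can be sketched in isolation. This is an illustrative example under stated assumptions, not the patch itself: the `Send` interface, the `sendAll` helper, and the log text are hypothetical, and the real method logs through Tomcat's message keys rather than `System.err`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class IsolatedSendsSketch {

    /** Hypothetical stand-in for one send operation; not a Tomcat type. */
    interface Send {
        void run() throws Exception;
    }

    /**
     * Runs each send in its own try-catch and returns the names of the
     * sends that completed, in order.
     */
    static List<String> sendAll(Send invalid, Send replication, Send crossContext) {
        List<String> completed = new ArrayList<>();
        try {
            invalid.run();
            completed.add("invalid");
        } catch (Exception x) {
            // A failure here is logged and does not stop the next send.
            System.err.println("invalid sessions send failed: " + x.getMessage());
        }
        try {
            replication.run();
            completed.add("replication");
        } catch (Exception x) {
            System.err.println("replication send failed: " + x.getMessage());
        }
        try {
            crossContext.run();
            completed.add("crossContext");
        } catch (Exception x) {
            System.err.println("cross-context send failed: " + x.getMessage());
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> completed = sendAll(
                () -> { throw new IOException("node unreachable"); },
                () -> { },
                () -> { });
        System.out.println(completed); // prints [replication, crossContext]
    }
}
```

With the same failing first send, the two later sends now complete, and each failure is reported with its own message. The trade-off is more repetitive code in exchange for failures that stay local to one operation.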
## Files Changed
- `java/org/apache/catalina/ha/tcp/ReplicationValve.java`: Modified
`sendReplicationMessage()` method
- `java/org/apache/catalina/ha/tcp/LocalStrings.properties`: Added new error
message keys
## Impact
- **Positive**: Enhances robustness in high-availability setups where
network failures are common.
- **Risk**: Low – no functional changes for successful sends; only improves
error handling.
- **Backward Compatible**: Yes, no breaking changes.
## Testing
- Verified compilation without errors
- Changes are minimal and isolated to error handling paths
- Recommend testing in a clustered environment to confirm sends continue on
failure
## Related
- Resolves FIXME in `ReplicationValve.java` (line 373)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]