vamshipasunuru1 opened a new pull request, #18887: URL: https://github.com/apache/hudi/pull/18887
## Change Description

When `MARKERS.type` is absent, `MarkerBasedRollbackUtils.getAllMarkerPaths()` first tries `DIRECT` markers and catches `IOException | IllegalArgumentException` to fall back to `TIMELINE_SERVER_BASED`. This catch is too broad.

**The problem:** A transient HDFS error (e.g., "Server too busy" / `RetriableException`) is also an `IOException`. When it is caught, the code falls back to the timeline server marker path, which looks in a different location and finds **0 markers**. The rollback then skips deleting data files and leaves **orphan files** on the table.

## Fix

Split the exception handling:

- **`IOException`** → propagate, so the rollback fails and is retried (transient HDFS failures should not silently produce an incorrect rollback)
- **`IllegalArgumentException`** → keep the fallback (this indicates a marker path format mismatch, which is the existing intended behavior)

## Testing

Added `TestMarkerBasedRollbackUtils` with unit tests covering:

- `IOException` is propagated (no fallback)
- `IllegalArgumentException` still falls back to timeline server markers

## Risk

Low: the change only affects the `MARKERS.type`-absent code path, and only for transient IO failures. The `IllegalArgumentException` fallback behavior is preserved.
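
For reviewers, the split exception handling described under Fix can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual diff: `MarkerFallbackSketch`, the `MarkerSource` interface, and the two-argument `getAllMarkerPaths` signature are hypothetical stand-ins for the real `DIRECT` and `TIMELINE_SERVER_BASED` marker readers.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

public class MarkerFallbackSketch {

  /** Hypothetical stand-in for a marker reader; not a Hudi interface. */
  interface MarkerSource {
    List<String> allMarkerPaths() throws IOException;
  }

  /**
   * Tries the direct markers first. Only an IllegalArgumentException
   * (assumed here to signal a marker path format mismatch) triggers the
   * fallback; an IOException is not caught and reaches the caller, so the
   * rollback can fail and be retried.
   */
  static List<String> getAllMarkerPaths(MarkerSource direct, MarkerSource timelineServerBased)
      throws IOException {
    try {
      return direct.allMarkerPaths();
    } catch (IllegalArgumentException e) {
      // Format mismatch: fall back to the timeline-server-based markers.
      return timelineServerBased.allMarkerPaths();
    }
  }

  public static void main(String[] args) throws IOException {
    MarkerSource timeline = () -> Collections.singletonList("timeline-marker");

    // Case 1: format mismatch on the direct path, so the fallback is used.
    MarkerSource mismatched = () -> {
      throw new IllegalArgumentException("unexpected marker path format");
    };
    System.out.println("fallback result: " + getAllMarkerPaths(mismatched, timeline));

    // Case 2: transient IO failure on the direct path, so the error propagates.
    MarkerSource busy = () -> {
      throw new IOException("Server too busy");
    };
    try {
      getAllMarkerPaths(busy, timeline);
      System.out.println("unexpected: fallback was used");
    } catch (IOException e) {
      System.out.println("propagated: " + e.getMessage());
    }
  }
}
```

The point of the narrower catch is that an empty marker list from the fallback location is indistinguishable from "nothing to roll back", so failing loudly on IO errors is the safer default.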
