suryaprasanna opened a new pull request, #19023:
URL: https://github.com/apache/hudi/pull/19023

   ### Describe the issue this Pull Request addresses
   
   During Flink job restart, `StreamWriteOperatorCoordinator.restoreEvents()` calls
   `recommitInstant()` without initializing `lastCompletedTxnAndMetadata`. As a result,
   optimistic concurrency control (OCC) conflict resolution uses `INIT_INSTANT_TS` as
   the baseline and treats all completed instants as conflict candidates. Since
   streaming upserts write to the same file groups, this always throws a false
   `HoodieWriteConflictException`, preventing the job from restarting.
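
   The failure mode can be illustrated with a simplified model. This is only a
   sketch and not Hudi's conflict resolution code: `FalseConflictSketch`, `Commit`,
   `conflicts`, and the `INIT_TS` value are hypothetical names and assumptions made
   for illustration. It assumes that candidates are the commits completed after the
   baseline, and that a conflict means overlapping file groups.

   ```java
   import java.util.List;
   import java.util.Set;
   import java.util.stream.Collectors;

   final class FalseConflictSketch {

       // Placeholder for the timeline's initial timestamp; the exact value is an assumption.
       static final String INIT_TS = "00000000000000";

       /** Hypothetical completed commit: its completion time and the file groups it wrote. */
       static final class Commit {
           final String completionTime;
           final Set<String> fileGroups;

           Commit(String completionTime, Set<String> fileGroups) {
               this.completionTime = completionTime;
               this.fileGroups = fileGroups;
           }
       }

       /**
        * Returns the commits completed after the baseline that touch any file group
        * written by the inflight instant. Timestamps are fixed-width strings, so
        * string order matches time order.
        */
       static List<Commit> conflicts(List<Commit> completed, String baseline,
                                     Set<String> inflightFileGroups) {
           return completed.stream()
               .filter(c -> c.completionTime.compareTo(baseline) > 0)
               .filter(c -> c.fileGroups.stream().anyMatch(inflightFileGroups::contains))
               .collect(Collectors.toList());
       }

       public static void main(String[] args) {
           List<Commit> completed = List.of(
               new Commit("20240101000000", Set.of("fg-1")),
               new Commit("20240102000000", Set.of("fg-1")));
           Set<String> inflight = Set.of("fg-1");

           // Uninitialized state: every earlier commit on the same file group is a candidate.
           System.out.println(conflicts(completed, INIT_TS, inflight).size());
           // Initialized baseline: commits finished before the inflight instant are skipped.
           System.out.println(conflicts(completed, "20240102000000", inflight).size());
       }
   }
   ```

   In this model, the earlier sequential commits on the same file group are reported
   as conflicts only when the baseline is left at the initial timestamp.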
   
   ### Summary and Changelog
   
   Added `preTxnForRecommit()` to initialize OCC conflict resolution state before
   recommitting inflight instants during Flink job recovery. The method finds the
   last completed instant whose requested time and completion time are both before
   the inflight instant, so only genuinely concurrent commits are checked for
   conflicts.
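
   The baseline selection described above might look roughly like the following
   sketch. It is not the code in this PR: `RecommitBaselineSketch`,
   `CompletedInstant`, and `findBaseline` are hypothetical names, and two details
   are assumptions: "before the inflight instant" is read as before its requested
   time, and "last" is read as latest by completion time.

   ```java
   import java.util.Comparator;
   import java.util.List;
   import java.util.Optional;

   final class RecommitBaselineSketch {

       /** Hypothetical stand-in for a completed timeline instant (not Hudi's HoodieInstant). */
       static final class CompletedInstant {
           final String requestedTime;
           final String completionTime;

           CompletedInstant(String requestedTime, String completionTime) {
               this.requestedTime = requestedTime;
               this.completionTime = completionTime;
           }
       }

       /**
        * Picks the latest completed instant (by completion time) whose requested and
        * completion times both sort before the inflight instant's requested time.
        * Timestamps are assumed to be fixed-width strings, so string order is time order.
        */
       static Optional<CompletedInstant> findBaseline(List<CompletedInstant> completed,
                                                      String inflightRequestedTime) {
           return completed.stream()
               .filter(i -> i.requestedTime.compareTo(inflightRequestedTime) < 0)
               .filter(i -> i.completionTime.compareTo(inflightRequestedTime) < 0)
               .max(Comparator.comparing((CompletedInstant i) -> i.completionTime));
       }

       public static void main(String[] args) {
           List<CompletedInstant> completed = List.of(
               new CompletedInstant("20240101000000", "20240101000500"),
               new CompletedInstant("20240102000000", "20240102000500"),
               // Requested before the inflight instant but completed after it:
               // a genuinely concurrent commit, so it cannot serve as the baseline.
               new CompletedInstant("20240102120000", "20240103000500"));

           findBaseline(completed, "20240103000000")
               .ifPresent(b -> System.out.println("baseline requested at " + b.requestedTime));
       }
   }
   ```

   Returning an empty `Optional` when no instant qualifies leaves the fallback to
   the caller; how the actual change handles that case is not stated here.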
   
   ### Impact
   
   No public API or config changes. Fixes a restart failure for Flink streaming jobs
   with OCC enabled.
   
   ### Risk Level
   
   low: the change only adds proper initialization of an existing field that was
   previously left uninitialized in the recommit path. Normal (non-recommit) write
   paths are unaffected.
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable

