voonhous opened a new issue, #18898:
URL: https://github.com/apache/hudi/issues/18898

   Follow-up from the RFC-105 Trino plugin PR #18837 (review thread on 
`HudiTrinoReaderContext.getRecordMerger`).
   
   `HudiTrinoReaderContext.getRecordMerger` now dispatches on 
`RecordMergeMode`, mirroring `HoodieAvroReaderContext`:
   - `EVENT_TIME_ORDERING` -> `HoodieAvroRecordMerger`
   - `COMMIT_TIME_ORDERING` -> `OverwriteWithLatestMerger`
   - `CUSTOM` -> `HoodieRecordUtils.createValidRecordMerger(...)`
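
As a rough illustration of the mapping listed above, here is a minimal, dependency-free sketch. It is not the plugin's code: the `RecordMergeMode` enum below is a local stand-in limited to the three constants named in this issue, `mergerFor` is a hypothetical helper, and merger names are returned as plain strings so the example runs without Hudi on the classpath.

```java
public class MergeModeDispatchSketch {

    // Local stand-in for Hudi's RecordMergeMode (assumption: only the
    // three constants mentioned in this issue are modelled here).
    enum RecordMergeMode { EVENT_TIME_ORDERING, COMMIT_TIME_ORDERING, CUSTOM }

    // Hypothetical helper: returns the name of the merger that the issue
    // says each merge mode dispatches to.
    static String mergerFor(RecordMergeMode mode) {
        switch (mode) {
            case EVENT_TIME_ORDERING:
                return "HoodieAvroRecordMerger";
            case COMMIT_TIME_ORDERING:
                return "OverwriteWithLatestMerger";
            case CUSTOM:
                // In the real code the merger is resolved at runtime; the
                // arguments are elided in the issue and left elided here.
                return "HoodieRecordUtils.createValidRecordMerger(...)";
            default:
                throw new IllegalArgumentException("Unsupported merge mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Print the mapping for every mode so an unhandled constant is obvious.
        for (RecordMergeMode mode : RecordMergeMode.values()) {
            System.out.println(mode + " -> " + mergerFor(mode));
        }
    }
}
```

A table-driven test over `RecordMergeMode.values()` in this style would at least catch a newly added mode that the dispatch does not handle.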
   
   The existing MoR snapshot / `_rt` tests all use the default payload, for 
which event-time merge results are identical regardless of which merger is 
selected, so they do not exercise the paths where the merger choice actually 
matters:
   
   1. **Delete markers** (`_hoodie_is_deleted` / delete records in log files) 
on MoR snapshot reads - `combineAndGetUpdateValue` propagates deletes, 
`preCombine` does not.
   2. **Custom record payloads** (e.g. `PartialUpdateAvroPayload`, 
`OverwriteNonDefaultsWithLatestAvroPayload`, `AWSDmsAvroPayload`) where 
`combineAndGetUpdateValue` differs from `preCombine`.
   3. **COMMIT_TIME_ORDERING** MoR tables - verify `OverwriteWithLatestMerger` 
semantics (latest write wins).
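
To make the difference between the ordering modes concrete, the following toy model may help when designing test data. It is a sketch under stated assumptions, not Hudi's merger logic: `Rec`, `mergeCommitTime`, `mergeEventTime`, and `visibleValue` are all hypothetical, ties in event-time ordering are assumed to go to the newer record, and a winning delete marker is assumed to simply hide the row. Real Hudi delete and payload handling has more cases than this.

```java
public class MergeSemanticsSketch {

    // Toy record, not a Hudi class. orderingValue plays the role of the
    // event-time (ordering) field; deleted models a delete marker.
    static final class Rec {
        final String key;
        final long orderingValue;
        final String value;
        final boolean deleted;

        Rec(String key, long orderingValue, String value, boolean deleted) {
            this.key = key;
            this.orderingValue = orderingValue;
            this.value = value;
            this.deleted = deleted;
        }
    }

    // Commit-time ordering (toy): the record written later always wins.
    static Rec mergeCommitTime(Rec older, Rec newer) {
        return newer;
    }

    // Event-time ordering (toy): the higher ordering value wins.
    // Assumption: on a tie the newer record wins.
    static Rec mergeEventTime(Rec older, Rec newer) {
        return newer.orderingValue >= older.orderingValue ? newer : older;
    }

    // Snapshot view (toy): a winning delete marker hides the row (null).
    static String visibleValue(Rec winner) {
        return winner.deleted ? null : winner.value;
    }

    public static void main(String[] args) {
        Rec base = new Rec("k1", 10L, "v1", false);        // row in the base file
        Rec lateUpdate = new Rec("k1", 5L, "v2", false);   // log record with an older event time
        Rec lateDelete = new Rec("k1", 5L, null, true);    // delete marker with an older event time

        System.out.println("commit-time, late update: " + visibleValue(mergeCommitTime(base, lateUpdate)));
        System.out.println("event-time,  late update: " + visibleValue(mergeEventTime(base, lateUpdate)));
        System.out.println("commit-time, late delete: " + visibleValue(mergeCommitTime(base, lateDelete)));
        System.out.println("event-time,  late delete: " + visibleValue(mergeEventTime(base, lateDelete)));
    }
}
```

In this model the two modes only disagree when a log record carries a lower ordering value than the base row, so test tables whose updates and deletes always arrive in event-time order would not tell the mergers apart.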
   
   Add functional tests (preferably Trino smoke-test level against registered 
MoR tables) covering these so the merge-mode dispatch is verified end-to-end.
   
   Tracked inline via a TODO in `HudiTrinoReaderContext.getRecordMerger`.

