ad1happy2go opened a new pull request, #19005:
URL: https://github.com/apache/hudi/pull/19005

   ### Describe the issue this Pull Request addresses
   
   Closes #18943.
   
   Incremental queries on a MOR table can return incorrect rows. Two 
manifestations:
   
   - **Partial updates**: only the changed columns come back (the other columns 
are null/garbled), because a partial-update log block holds only the changed 
columns and the base-file row is dropped before the runtime merge.
   - **EVENT_TIME_ORDERING (even without partial updates)**: a write inside the 
window that carries a lower ordering value can be returned, even though the 
existing version with the higher ordering value should win.
   
   Snapshot and read-optimized queries are correct; only the incremental path 
is affected. Because SQL `MERGE INTO` on MOR writes partial log blocks by 
default (`hoodie.spark.sql.merge.into.partial.updates`), this hits the common 
case.
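   
   A minimal Java sketch of the partial-update manifestation, not Hudi code: 
`PartialUpdateSketch`, `merge`, and `project` are hypothetical names, and rows 
are modelled as plain maps. It illustrates why merging a partial log record 
without its base row leaves the untouched columns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartialUpdateSketch {
    // Overlay the columns present in a partial-update log record onto the base row.
    // A null baseRow models the bug: the base-file row was dropped before the merge.
    static Map<String, Object> merge(Map<String, Object> baseRow, Map<String, Object> partialLog) {
        Map<String, Object> merged = new LinkedHashMap<>();
        if (baseRow != null) {
            merged.putAll(baseRow);
        }
        merged.putAll(partialLog);
        return merged;
    }

    // Project a record onto the full table schema; columns it lacks become null.
    static Map<String, Object> project(String[] schema, Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String col : schema) {
            out.put(col, record.get(col));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] schema = {"id", "name", "price"};

        Map<String, Object> base = new LinkedHashMap<>();
        base.put("id", 1);
        base.put("name", "a1");
        base.put("price", 10.0);

        // Partial-update log record: only the key and the changed column.
        Map<String, Object> log = new LinkedHashMap<>();
        log.put("id", 1);
        log.put("price", 12.0);

        System.out.println(project(schema, merge(null, log))); // {id=1, name=null, price=12.0}
        System.out.println(project(schema, merge(base, log))); // {id=1, name=a1, price=12.0}
    }
}
```

   With the base row present the `name` column survives; without it, only the 
columns carried by the log record are populated.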
   
   ### Summary and Changelog
   
   For each file group touched in the incremental window, the read now loads 
the **full file slice** (base file + log files) and runs the standard 
file-group reader merge, then filters the merged output to the window by commit 
time. The fix is unconditional — no new config and no write-path change.
   
   - `MergeOnReadIncrementalRelationV2`: build the incremental file-system view 
from a metadata-table-aware listing of the modified partitions, so each slice 
carries its base file, then scope it back to the file groups actually touched 
in the window; bound the view timeline to the window end and close the view 
after use.
   - `HoodieFileGroupReaderBasedFileFormat`: for file groups that need merging 
in an incremental read, do **not** push the commit-time span filter into the 
file reads; apply it on the merged output instead. Also set an `InstantRange` 
bounded to the window end on the reader context, so base records and log blocks 
committed after the window are not merged in (a record updated again after the 
window must be returned with its value as of the window end, not its latest 
value).
   - `HoodieReaderContext`: add `setInstantRange` so the read path can bound 
the merge inputs to the query window.
   - Tests in `TestPartialUpdateForMergeInto`.
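   
   The ordering of the steps above (bound the inputs to the window end, merge 
the full slice, then filter the merged output by commit time) can be sketched 
with a toy model. This is an illustration under stated assumptions, not the 
Hudi implementation: `IncrementalMergeSketch`, `Version`, and every method 
below are hypothetical, instants are plain strings, the window is assumed to 
be `(start, end]`, and the merge is a simplified event-time ordering where the 
highest ordering value wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IncrementalMergeSketch {
    // Hypothetical stand-in for one version of a record inside a file slice.
    static final class Version {
        final String key;
        final String commitTime; // instant time, compared lexicographically
        final long ordering;     // event-time ordering value
        final String value;

        Version(String key, String commitTime, long ordering, String value) {
            this.key = key;
            this.commitTime = commitTime;
            this.ordering = ordering;
            this.value = value;
        }
    }

    // Simplified event-time merge: highest ordering wins, ties go to the later commit.
    // Returns null when there is nothing to merge.
    static Version merge(List<Version> versions) {
        Version winner = null;
        for (Version v : versions) {
            if (winner == null
                    || v.ordering > winner.ordering
                    || (v.ordering == winner.ordering
                        && v.commitTime.compareTo(winner.commitTime) > 0)) {
                winner = v;
            }
        }
        return winner;
    }

    // Assumed window semantics for this sketch: (startExclusive, endInclusive].
    static boolean inWindow(String commitTime, String startExclusive, String endInclusive) {
        return commitTime.compareTo(startExclusive) > 0
            && commitTime.compareTo(endInclusive) <= 0;
    }

    // Old shape: restrict the merge inputs to the window, then merge.
    static Version filterThenMerge(List<Version> slice, String start, String end) {
        List<Version> inputs = new ArrayList<>();
        for (Version v : slice) {
            if (inWindow(v.commitTime, start, end)) {
                inputs.add(v);
            }
        }
        return merge(inputs);
    }

    // New shape: bound inputs to the window end, merge, then filter the merged output.
    static Version mergeThenFilter(List<Version> slice, String start, String end) {
        List<Version> inputs = new ArrayList<>();
        for (Version v : slice) {
            if (v.commitTime.compareTo(end) <= 0) {
                inputs.add(v);
            }
        }
        Version merged = merge(inputs);
        return (merged != null && inWindow(merged.commitTime, start, end)) ? merged : null;
    }

    public static void main(String[] args) {
        // k1: the write in the window (commit 002) has a lower ordering value than the base.
        List<Version> k1 = Arrays.asList(
            new Version("k1", "001", 100, "high"),
            new Version("k1", "002", 50, "low"));
        System.out.println(filterThenMerge(k1, "001", "002").value); // low
        System.out.println(mergeThenFilter(k1, "001", "002"));       // null

        // k2: updated again after the window end (commit 003).
        List<Version> k2 = Arrays.asList(
            new Version("k2", "001", 1, "a"),
            new Version("k2", "002", 2, "b"),
            new Version("k2", "003", 3, "c"));
        System.out.println(mergeThenFilter(k2, "001", "002").value); // b
    }
}
```

   In this toy model the lower-ordering write for `k1` no longer surfaces once 
the base version takes part in the merge, and `k2` comes back with its value 
as of the window end rather than its latest value.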
   
   Scope: only the V2 relation listing changed (table version 8+); the 
format-level gate covers both V1 and V2 reads. No code was copied.
   
   ### Impact
   
   Incremental queries on MOR tables now return correct, fully-merged rows. 
Performance: the incremental file listing now builds a metadata-table-aware 
file-system view over the modified partitions (instead of a window-files-only 
view) and scopes it to the touched file groups. This can enumerate more file 
groups on wide partitions, but only the touched groups are read and merged.
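   
   The listing trade-off can be pictured with a small sketch, assuming a 
hypothetical `Slice` type and a set of touched file-group ids derived from the 
window's commits; none of these names come from Hudi. The listing enumerates 
every slice in the modified partition, but only the touched groups are kept 
for reading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TouchedFileGroupsSketch {
    // Hypothetical stand-in for a file slice: one base file plus its log files.
    static final class Slice {
        final String partition;
        final String fileId;
        final String baseFile;
        final List<String> logFiles;

        Slice(String partition, String fileId, String baseFile, List<String> logFiles) {
            this.partition = partition;
            this.fileId = fileId;
            this.baseFile = baseFile;
            this.logFiles = logFiles;
        }

        String groupId() {
            return partition + "/" + fileId;
        }
    }

    // Keep only the slices whose file group was written to inside the window.
    static List<Slice> scopeToTouched(List<Slice> listed, Set<String> touchedGroupIds) {
        List<Slice> scoped = new ArrayList<>();
        for (Slice s : listed) {
            if (touchedGroupIds.contains(s.groupId())) {
                scoped.add(s);
            }
        }
        return scoped;
    }

    public static void main(String[] args) {
        // Full listing of the modified partition: three file groups.
        List<Slice> listed = Arrays.asList(
            new Slice("p1", "fg-1", "fg-1_001.parquet", new ArrayList<>()),
            new Slice("p1", "fg-2", "fg-2_001.parquet", Arrays.asList("fg-2.log.1")),
            new Slice("p1", "fg-3", "fg-3_001.parquet", new ArrayList<>()));

        // Only fg-2 received a write inside the window.
        Set<String> touched = new HashSet<>(Arrays.asList("p1/fg-2"));

        List<Slice> toRead = scopeToTouched(listed, touched);
        System.out.println("listed=" + listed.size() + " read=" + toRead.size()
            + " base=" + toRead.get(0).baseFile); // listed=3 read=1 base=fg-2_001.parquet
    }
}
```

   The kept slice still carries its base file, which is what the merge needs; 
the extra cost is confined to enumerating the untouched groups.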
   
   ### Risk Level
   
   medium — this changes the MOR incremental read path. Mitigated by:
   
   - Snapshot and read-optimized read paths are unchanged; the change is gated 
on incremental + merging file groups.
   - Broad new test coverage in `TestPartialUpdateForMergeInto`: partial-update 
incremental (commit/event-time ordering, avro/parquet, partitioned), 
non-partial event-time ordering, window bounds, insert+update in one window, 
multiple partial updates to one key, exclusion of commits after the window end, 
post-compaction merge, and a COW non-regression.
   - Verified locally on Spark 4.
   
   ### Documentation Update
   
   none — bug fix, no new config or user-facing API change.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable
   

