[ 
https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo closed HUDI-6798.
---------------------------
    Resolution: Fixed

> Implement event-time-based merging mode in FileGroupReader
> ----------------------------------------------------------
>
>                 Key: HUDI-6798
>                 URL: https://issues.apache.org/jira/browse/HUDI-6798
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: hudi-1.0.0-beta2, pull-request-available
>             Fix For: 1.0.0
>
>
> To achieve this, we should add a new table config 
> {{hoodie.record.merge.mode}} to control the record merging mode and behavior 
> in the new file group reader ({{HoodieFileGroupReader}}) and implement 
> event-time ordering in it. The table config {{hoodie.record.merge.mode}} is 
> going to be the single config that determines how record merging happens 
> in release 1.0 and beyond.
>  
> Three merging modes are to be defined:
>  * {{OVERWRITE_WITH_LATEST}}: uses transaction time to merge records, 
> i.e., the record from the later transaction overwrites the earlier record 
> with the same key. This corresponds to the behavior of the existing payload 
> class {{OverwriteWithLatestAvroPayload}}.
>  * {{EVENT_TIME_ORDERING}}: uses event time as the ordering to merge 
> records, i.e., the record with the larger event time overwrites the record 
> with the smaller event time on the same key, regardless of transaction time. 
> The event time or preCombine field needs to be specified by the user. This 
> corresponds to the behavior of the existing payload class 
> {{DefaultHoodieRecordPayload}}.
>  * {{CUSTOM}}: uses custom merging logic specified by the user. When a 
> user specifies a custom record merger strategy, or a payload class with the 
> Avro record merger, this mode is to be set so that record merging follows 
> the user-defined logic as before.
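
The three modes described above can be illustrated with a minimal, standalone sketch. This is not Hudi code: the names MergeMode, Record, and merge are hypothetical and exist only for illustration, and the tie-breaking rule (the incoming record wins when event times are equal) is an assumption that the issue text does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class MergeMode(Enum):
    """Hypothetical mirror of the three values of hoodie.record.merge.mode."""
    OVERWRITE_WITH_LATEST = "OVERWRITE_WITH_LATEST"
    EVENT_TIME_ORDERING = "EVENT_TIME_ORDERING"
    CUSTOM = "CUSTOM"


@dataclass(frozen=True)
class Record:
    """Illustrative record; not a Hudi class."""
    key: str
    commit_time: int  # transaction time of the write
    event_time: int   # user-specified ordering (preCombine) value
    value: str


def merge(
    existing: Record,
    incoming: Record,
    mode: MergeMode,
    custom: Optional[Callable[[Record, Record], Record]] = None,
) -> Record:
    """Merge two records with the same key.

    `incoming` is assumed to come from the later transaction.
    """
    if existing.key != incoming.key:
        raise ValueError("records must share the same key")
    if mode is MergeMode.OVERWRITE_WITH_LATEST:
        # The later transaction always wins.
        return incoming
    if mode is MergeMode.EVENT_TIME_ORDERING:
        # The larger event time wins, regardless of transaction time.
        # Assumption: the incoming record wins on a tie.
        return incoming if incoming.event_time >= existing.event_time else existing
    if custom is None:
        raise ValueError("CUSTOM mode requires a user-supplied merge function")
    # CUSTOM: defer entirely to user-defined logic.
    return custom(existing, incoming)


# A late-arriving record: written later, but with an older event time.
first = Record("k1", commit_time=1, event_time=200, value="written first")
late = Record("k1", commit_time=2, event_time=100, value="written second")

print(merge(first, late, MergeMode.OVERWRITE_WITH_LATEST).value)
print(merge(first, late, MergeMode.EVENT_TIME_ORDERING).value)
```

With a late-arriving record like the one above, the two built-in modes disagree: transaction-time merging keeps the record written second, while event-time ordering keeps the record written first because its event time is larger.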



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
