Thank you, Nishith.

On Sat, Jan 30, 2021 at 7:29 PM nishith agarwal <n3.nas...@gmail.com> wrote:

> Anton,
>
> Yes, you can achieve this with Hudi. Hudi uses a HoodieRecordPayload
> implementation to merge old and new records. You can define a
> source ordering field (here "sort_key") to govern which record should be
> chosen as the latest one. The DefaultHoodieRecordPayload supports this ->
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java
>
> You just need to set the correct source ordering field name; take a look at
> an example here ->
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/test/java/org/apache/hudi/common/model/TestDefaultHoodieRecordPayload.java#L44
>
> Please create a GitHub issue or post in the general Slack channel for further
> collaboration if needed.
>
> Thanks,
> Nishith
>
> On Sat, Jan 30, 2021 at 6:59 PM Anton Zuyeu <anton.zu...@gmail.com> wrote:
>
> > Hi Hudi team,
> >
> > We are replicating a database table by reading its change logs and applying
> > them to a Hudi table. We would like to implement our pipeline so that it can
> > process records out of order. Essentially, we want to introduce a "sort_key"
> > column and update an existing record in the Hudi table only if the new
> > record's sort_key is greater than the existing record's sort_key.
> > Initially we thought we just needed to set the
> > hoodie.datasource.write.precombine.field parameter to "sort_key". However,
> > that does not appear to be enough, since
> > hoodie.datasource.write.precombine.field comes into play only when
> > pre-combining records prior to writing. Is there a way to implement our use
> > case using Hudi's primitives?
> >
> > Thank you,
> > Anton
> >
>
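
The ordering-field merge discussed in the thread above can be sketched in a few lines. This is only an illustration of the semantics, not Hudi's implementation: merge_by_ordering_field and apply_change_log are hypothetical helpers, the "id" key field is made up for the example, and letting the incoming record win on a tie is an assumption (the original question asks for strictly greater). The option keys in hudi_options are written from memory and should be checked against the configuration reference for the Hudi version in use.

```python
from typing import Any, Dict, Iterable, Optional

Record = Dict[str, Any]

# Assumed writer options (verify the key names against your Hudi version).
hudi_options = {
    "hoodie.datasource.write.payload.class":
        "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
    "hoodie.payload.ordering.field": "sort_key",
    "hoodie.datasource.write.precombine.field": "sort_key",
}


def merge_by_ordering_field(
    existing: Optional[Record],
    incoming: Record,
    ordering_field: str = "sort_key",
) -> Record:
    """Return the record that should be stored after the merge.

    Hypothetical helper: keeps the stored record unless the incoming one
    has an ordering value that is greater than or equal to it.
    """
    if existing is None:
        return incoming
    # Assumption: ties go to the incoming record.
    if incoming[ordering_field] >= existing[ordering_field]:
        return incoming
    return existing


def apply_change_log(
    table: Dict[Any, Record],
    changes: Iterable[Record],
    key_field: str = "id",
    ordering_field: str = "sort_key",
) -> Dict[Any, Record]:
    """Apply change records in arrival order, tolerating out-of-order input."""
    for change in changes:
        key = change[key_field]
        table[key] = merge_by_ordering_field(table.get(key), change, ordering_field)
    return table


# Changes for the same key arrive out of order; the highest sort_key survives.
changes = [
    {"id": 1, "sort_key": 3, "value": "c"},
    {"id": 1, "sort_key": 1, "value": "a"},
    {"id": 1, "sort_key": 2, "value": "b"},
    {"id": 2, "sort_key": 5, "value": "x"},
]
table = apply_change_log({}, changes)
```

With a Spark writer, a dictionary like hudi_options would typically be passed through the writer's options alongside the usual record key and table name settings; the comparison against the stored record is then done by the payload class rather than by pipeline code.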
