Thank you, Nishith.

On Sat, Jan 30, 2021 at 7:29 PM nishith agarwal <n3.nas...@gmail.com> wrote:
> Anton,
>
> Yes, you can achieve this with Hudi. Hudi uses a HoodieRecordPayload
> implementation to be able to merge old and new records. You can define a
> source ordering field (here "sort_key") to govern which record should be
> chosen as the latest one. The DefaultHoodieRecordPayload supports this ->
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java
>
> You just need to set the correct source ordering field name, take a look at
> an example here ->
>
> https://github.com/apache/hudi/blob/master/hudi-common/src/test/java/org/apache/hudi/common/model/TestDefaultHoodieRecordPayload.java#L44
>
> Please create a GH issue or post in the general slack channel for further
> collaboration if needed.
>
> Thanks,
> Nishith
>
> On Sat, Jan 30, 2021 at 6:59 PM Anton Zuyeu <anton.zu...@gmail.com> wrote:
>
> > Hi Hudi team,
> >
> > We are replicating database table by reading table change logs and
> > applying them to Hudi table, we would like to implement our pipeline so
> > it can process records out of order. Pretty much we want to introduce
> > column "sort_key" and only update existing records in the hudi table if
> > a new record's sort_key is greater than the sort_key value of an
> > existing record.
> > Initially we thought that we just need to assign to
> > hoodie.datasource.write.precombine.field parameter value= "sort_key",
> > however it looks like it is not the case as
> > hoodie.datasource.write.precombine.field comes to play only when pre
> > combining records prior to writing. Is there a way to implement our use
> > case using hudi's primitives ?
> >
> > Thank you,
> > Anton
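For readers following the thread, here is a minimal Python sketch of the approach discussed above. It shows two things: a set of writer options that selects DefaultHoodieRecordPayload with "sort_key" as the ordering field, and a plain-Python model of the merge rule Anton describes (replace the stored record only when the incoming sort_key is greater). The option keys, in particular `hoodie.datasource.write.payload.class` and `hoodie.payload.ordering.field`, are assumptions not stated in the thread and should be checked against the Hudi version in use. The `merge` and `apply_changes` helpers and the `id` key field are hypothetical names for illustration only, not Hudi APIs, and how Hudi itself treats equal ordering values is not modelled here.

```python
from typing import Any, Dict, Iterable, Optional

Record = Dict[str, Any]

# Writer options (assumed key names; verify against your Hudi version).
# Other required options (table name, record key, etc.) are omitted.
hudi_options: Dict[str, str] = {
    "hoodie.datasource.write.payload.class":
        "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
    # Used when de-duplicating records within one incoming batch.
    "hoodie.datasource.write.precombine.field": "sort_key",
    # Assumed key for the field compared against the record already stored.
    "hoodie.payload.ordering.field": "sort_key",
}


def merge(existing: Optional[Record], incoming: Record,
          ordering_field: str = "sort_key") -> Record:
    """Illustrative model of the rule from the thread, not Hudi code.

    The incoming record wins only if its ordering value is strictly
    greater than the stored one; otherwise the stored record is kept.
    """
    if existing is None:
        return incoming
    if incoming[ordering_field] > existing[ordering_field]:
        return incoming
    return existing


def apply_changes(table: Dict[Any, Record], changes: Iterable[Record],
                  key_field: str = "id") -> Dict[Any, Record]:
    """Apply change-log records in arrival order, which may be out of order."""
    for change in changes:
        key = change[key_field]
        table[key] = merge(table.get(key), change)
    return table


if __name__ == "__main__":
    # The late-arriving record with the lower sort_key must not overwrite.
    changes = [
        {"id": 1, "sort_key": 5, "value": "newer"},
        {"id": 1, "sort_key": 3, "value": "older"},
    ]
    print(apply_changes({}, changes))
```

With PySpark, a dictionary like this would typically be passed to the writer as `df.write.format("hudi").options(**hudi_options)`, alongside the table name, record key, and other settings the write requires.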