Hey Gary,

Thanks for replying so quickly!
Could you please point me to the documentation of this feature? I would love to take a closer look at it, thanks!

Best regards,
Bill

On Thu, Sep 10, 2020 at 12:20 AM Gary Li <[email protected]> wrote:

> Hello.
> Yes this feature was supported by Hudi. You can write your own payload
> class to handle precombine(dedup within delta) and
> updateHistoryRecord(delta merge with history). The default payload is
> updateWithLatestRecord.
>
> Gary Li
> ________________________________
> From: Jialun Liu <[email protected]>
> Sent: Thursday, September 10, 2020 1:28:09 PM
> To: [email protected] <[email protected]>
> Subject: Apache Hudi Data Reconciliation
>
> Hey guys,
>
> I want to confirm if Apache Hudi has the capability of handling data
> reconciliation for use cases like late record, out of order records, retry
> etc.
>
> A simple example:
> @11:00
> RecordA, updatedAt = 11:00 (failed to update)
>
> @11:30
> RecordA, updatedAt = 11:30 (success)
>
> @12:00 (Retry the failed update)
> RecordA, updatedAt = 11:00 (should drop the record since it is stale)
>
> I know delta lake can update based on conditions so that I can use the
> updatedAt timestamp as the key. But how does Hudi do data reconciliation?
>
> Best regards,
> Bill
>
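
For illustration only, the reconciliation rule discussed in this thread (dedup within the incoming delta, then merge the delta with history and drop stale records) might be sketched in plain Python as below. This is not Hudi's API: `Record`, `pre_combine`, `merge_with_stored`, and `upsert` are hypothetical names, and the in-memory dict standing in for the table is an assumption made to keep the sketch self-contained.

```python
from dataclasses import dataclass
from datetime import time
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape: a key, an ordering field, and a payload value.
    key: str
    updated_at: time
    value: str


def pre_combine(batch: Iterable[Record]) -> Dict[str, Record]:
    """Dedup within one incoming batch: keep the newest version of each key."""
    latest: Dict[str, Record] = {}
    for rec in batch:
        current = latest.get(rec.key)
        if current is None or rec.updated_at >= current.updated_at:
            latest[rec.key] = rec
    return latest


def merge_with_stored(stored: Optional[Record], incoming: Record) -> Record:
    """Merge an incoming record with history: a stale incoming record is dropped."""
    if stored is not None and incoming.updated_at < stored.updated_at:
        return stored
    return incoming


def upsert(table: Dict[str, Record], batch: Iterable[Record]) -> None:
    """Apply a batch to the (assumed in-memory) table using the two steps above."""
    for key, rec in pre_combine(batch).items():
        table[key] = merge_with_stored(table.get(key), rec)


# The example from the thread: the 11:00 write failed, so nothing was stored then.
table: Dict[str, Record] = {}
upsert(table, [Record("RecordA", time(11, 30), "written at 11:30")])
# 12:00 retry of the failed update, still carrying updatedAt = 11:00.
upsert(table, [Record("RecordA", time(11, 0), "retried at 12:00")])
print(table["RecordA"].value)  # written at 11:30
```

Under these assumptions the retried record is discarded because its `updated_at` is older than the stored one, which is the behavior asked about in the original question.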
