Hey Gary,

Thanks for replying so quickly!

Could you please point me to the documentation of this feature? I would
love to take a closer look at it, thanks!

Best regards,
Bill

On Thu, Sep 10, 2020 at 12:20 AM Gary Li <[email protected]> wrote:

> Hello.
> Yes, this feature is supported by Hudi. You can write your own payload
> class to handle precombine (dedup within the delta) and
> updateHistoryRecord (merging the delta with history). The default payload
> is updateWithLatestRecord.
>
> Gary Li
> ________________________________
> From: Jialun Liu <[email protected]>
> Sent: Thursday, September 10, 2020 1:28:09 PM
> To: [email protected] <[email protected]>
> Subject: Apache Hudi Data Reconciliation
>
> Hey guys,
>
> I want to confirm whether Apache Hudi can handle data reconciliation
> for use cases like late records, out-of-order records, retries,
> etc.
>
> A simple example:
> @11:00
> RecordA, updatedAt = 11:00 (failed to update)
>
> @11:30
> RecordA, updatedAt = 11:30 (success)
>
> @12:00 (Retry the failed update)
> RecordA, updatedAt = 11:00 (should drop the record since it is stale)
>
> I know Delta Lake can update based on conditions, so that I can use the
> updatedAt timestamp as the key. But how does Hudi do data reconciliation?
>
> Best regards,
> Bill
>
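
For illustration, the two steps described in the thread (dedup within the incoming delta, then merging the delta with the stored history) can be sketched in plain Python. This is only a sketch of the semantics and does not use Hudi's API: `Record`, `precombine`, and `merge_with_history` are hypothetical names, and it assumes `updatedAt` is the ordering field and that ties go to the record that arrives later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    updated_at: str  # "HH:MM", zero-padded so string order matches time order
    value: str


def precombine(batch):
    """Dedup within one incoming batch: keep the newest record per key."""
    latest = {}
    for rec in batch:
        kept = latest.get(rec.key)
        if kept is None or rec.updated_at >= kept.updated_at:
            latest[rec.key] = rec
    return latest


def merge_with_history(table, batch):
    """Merge a deduplicated batch into stored state, dropping stale records."""
    for key, incoming in precombine(batch).items():
        stored = table.get(key)
        if stored is None or incoming.updated_at >= stored.updated_at:
            table[key] = incoming
    return table


table = {}
# The 11:00 write failed, so nothing was stored at that point.
merge_with_history(table, [Record("RecordA", "11:30", "v2")])  # 11:30 write succeeds
merge_with_history(table, [Record("RecordA", "11:00", "v1")])  # 12:00 retry is stale
print(table["RecordA"])
```

In this sketch the retried 11:00 record is discarded because its ordering value is older than the stored one, which is the behavior asked about in the original question.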
