Hey Balaji,

Thanks for your help!

I am new to Apache Hudi and am slowly exploring things. I will reach out to
you if I have something that I could contribute back to the community.

Best regards,
Bill

On Sat, Sep 12, 2020 at 3:51 PM Balaji Varadarajan
<[email protected]> wrote:

>
> Hi Jialun,
> There is no outside documentation for this case except Javadocs (
> https://issues.apache.org/jira/browse/HUDI-1277). The payload interface
> is itself a first-class citizen of Hudi (
> https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java
> ).
> We will add generic support for this case (
> https://issues.apache.org/jira/browse/HUDI-1278). You can write an
> implementation specific to your case, or you can contribute to
> HUDI-1278 and I can work with you to get this landed.
> Thanks,
> Balaji.V
>
> On Thursday, September 10, 2020, 11:05:44 AM PDT, Jialun Liu
> <[email protected]> wrote:
>
> Hey Gary,
>
> Thanks for replying so quickly!
>
> Could you please point me to the documentation of this feature? I would
> love to take a closer look at it, thanks!
>
> Best regards,
> Bill
>
> On Thu, Sep 10, 2020 at 12:20 AM Gary Li <[email protected]> wrote:
>
> > Hello.
> > Yes, this feature is supported by Hudi. You can write your own payload
> > class to handle preCombine (dedup within the delta) and
> > combineAndGetUpdateValue (merging the delta with history). The default
> > payload is OverwriteWithLatestAvroPayload.
> >
> > Gary Li
> > ________________________________
> > From: Jialun Liu <[email protected]>
> > Sent: Thursday, September 10, 2020 1:28:09 PM
> > To: [email protected] <[email protected]>
> > Subject: Apache Hudi Data Reconciliation
> >
> > Hey guys,
> >
> > I want to confirm whether Apache Hudi can handle data reconciliation
> > for use cases like late records, out-of-order records, retries, etc.
> >
> > A simple example:
> > @11:00
> > RecordA, updatedAt = 11:00 (failed to update)
> >
> > @11:30
> > RecordA, updatedAt = 11:30 (success)
> >
> > @12:00 (Retry the failed update)
> > RecordA, updatedAt = 11:00 (should drop the record since it is stale)
> >
> > I know Delta Lake can update based on conditions, so I could use the
> > updatedAt timestamp as the key. But how does Hudi do data reconciliation?
> >
> > Best regards,
> > Bill
> >
>
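
The RecordA example from the original question, together with the two steps
described in the reply (dedup within the incoming delta, then merge of the
delta with the stored history), can be sketched roughly as below. This is
plain Python and does not use Hudi's actual API: `Record`, `pre_combine`,
`combine_with_stored` and `upsert` are hypothetical names chosen for
illustration, and the "table" is just an in-memory dict. It only shows the
idea of letting an ordering field such as updatedAt decide which version wins.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Record:
    key: str
    updated_at: str  # "HH:MM", zero-padded so string order matches time order
    value: str


def pre_combine(a: Record, b: Record) -> Record:
    """Dedup within one incoming batch: keep the record with the later updated_at."""
    return b if b.updated_at > a.updated_at else a


def combine_with_stored(stored: Record, incoming: Record) -> Record:
    """Merge an incoming record with the stored one; a stale incoming record is dropped."""
    return incoming if incoming.updated_at >= stored.updated_at else stored


def upsert(table: Dict[str, Record], batch: Iterable[Record]) -> None:
    # Step 1: collapse duplicates of the same key inside the batch.
    deduped: Dict[str, Record] = {}
    for rec in batch:
        deduped[rec.key] = pre_combine(deduped[rec.key], rec) if rec.key in deduped else rec
    # Step 2: merge each surviving record with what is already stored.
    for key, rec in deduped.items():
        table[key] = combine_with_stored(table[key], rec) if key in table else rec


# Hypothetical in-memory "table" replaying the timeline from the question.
table: Dict[str, Record] = {}
upsert(table, [Record("RecordA", "11:30", "second write")])  # @11:30 succeeds
upsert(table, [Record("RecordA", "11:00", "first write")])   # @12:00 retry of the failed 11:00 update
print(table["RecordA"])
```

With this ordering rule the retried 11:00 update is ignored and the stored
record stays at updatedAt = 11:30. In Hudi itself the equivalent logic would
live in a custom payload class, as discussed in the thread.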
