Nice one, Balaji. I have left a few comments. Overall looks good :)

On Sun, Dec 15, 2019 at 9:30 AM Balaji Varadarajan <[email protected]> wrote:
> Hi Nicholas,
> Once I get high-level comments on the RFC, we can have concrete subtasks
> around this.
> Balaji.V
>
> On Saturday, December 14, 2019, 07:04:52 PM PST, 蒋晓峰 <
> [email protected]> wrote:
>
> Hi Balaji,
> About the plan for "Efficient migration of large parquet tables to Apache
> Hudi", have you split the plan into multiple subtasks?
> Thanks,
> Nicholas
>
>
> At 2019-12-14 00:18:12, "Vinoth Chandar" <[email protected]> wrote:
> >+1 (per ASF policy)
> >
> >+100 per my own excitement :) .. Happy to review this!
> >
> >On Fri, Dec 13, 2019 at 3:07 AM Balaji Varadarajan <[email protected]>
> >wrote:
> >
> >> With Apache Hudi growing in popularity, one of the fundamental challenges
> >> for users has been efficiently migrating their historical datasets to
> >> Apache Hudi. Apache Hudi maintains per-record metadata to perform core
> >> operations such as upserts and incremental pull. To take advantage of
> >> Hudi's upsert and incremental processing support, users would need to
> >> rewrite their whole dataset to make it a Hudi table. This RFC provides a
> >> mechanism to efficiently migrate their datasets without the need to
> >> rewrite the entire dataset.
> >>
> >> Please find the link to the RFC below.
> >>
> >> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+%3A+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi
> >>
> >> Please review and let me know your thoughts.
> >>
> >> Thanks,
> >> Balaji.V
> >>

--
Regards,
-Sivabalan
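
To make the rewrite cost in Balaji's summary above concrete, here is a minimal
sketch of the full-rewrite path the RFC aims to avoid, assuming Spark with the
Hudi datasource. The table name, paths, and field names are hypothetical, and
the exact option keys can differ between Hudi versions:

    import org.apache.spark.sql.SaveMode

    // Hypothetical source: an existing, non-Hudi parquet table.
    val existing = spark.read.parquet("/warehouse/events_parquet")

    // Current approach: bulk_insert rewrites every record as Hudi files,
    // stamping the per-record metadata mentioned in the announcement.
    existing.write
      .format("hudi") // older releases may need the full name "org.apache.hudi"
      .option("hoodie.table.name", "events_hudi")
      .option("hoodie.datasource.write.recordkey.field", "event_id")
      .option("hoodie.datasource.write.precombine.field", "event_ts")
      .option("hoodie.datasource.write.partitionpath.field", "event_date")
      .option("hoodie.datasource.write.operation", "bulk_insert")
      .mode(SaveMode.Overwrite)
      .save("/warehouse/events_hudi")

Once migrated, the same table can be written with the "upsert" operation and
consumed via Hudi's incremental pull, which is what the per-record metadata
enables; the RFC's point is to get there without rewriting the full dataset.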
