Nice one, Balaji. I have left a few comments. Overall looks good :)

On Sun, Dec 15, 2019 at 9:30 AM Balaji Varadarajan <[email protected]> wrote:
> Hi Nicholas,
> Once I get high-level comments on the RFC, we can have concrete subtasks
> around this.
> Balaji.V
>
> On Saturday, December 14, 2019, 07:04:52 PM PST, 蒋晓峰 <
> [email protected]> wrote:
>
> Hi Balaji,
> About the plan for "Efficient migration of large parquet tables to Apache
> Hudi", have you split the plan into multiple subtasks?
> Thanks,
> Nicholas
>
>
> At 2019-12-14 00:18:12, "Vinoth Chandar" <[email protected]> wrote:
> >+1 (per ASF policy)
> >
> >+100 per my own excitement :) .. Happy to review this!
> >
> >On Fri, Dec 13, 2019 at 3:07 AM Balaji Varadarajan <[email protected]>
> >wrote:
> >
> >> With Apache Hudi growing in popularity, one of the fundamental challenges
> >> for users has been efficiently migrating their historical datasets to
> >> Apache Hudi. Apache Hudi maintains per-record metadata to perform core
> >> operations such as upserts and incremental pull. To take advantage of
> >> Hudi's upsert and incremental processing support, users would need to
> >> rewrite their whole dataset to make it a Hudi table. This RFC provides a
> >> mechanism to efficiently migrate their datasets without the need to
> >> rewrite the entire dataset.
> >>
> >> Please find the link to the RFC below.
> >>
> >> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+%3A+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi
> >>
> >> Please review and let me know your thoughts.
> >>
> >> Thanks,
> >> Balaji.V
> >>

--
Regards,
-Sivabalan
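
To make the rewrite cost in Balaji's summary above concrete, here is a minimal
sketch of the full-rewrite path the RFC aims to avoid, assuming Spark with the
Hudi datasource. The table name, paths, and field names are hypothetical, and
the exact option keys can differ between Hudi versions:

    import org.apache.spark.sql.SaveMode

    // Hypothetical source: an existing, non-Hudi parquet table.
    val existing = spark.read.parquet("/warehouse/events_parquet")

    // Current approach: bulk_insert rewrites every record as Hudi files,
    // stamping the per-record metadata mentioned in the announcement.
    existing.write
      .format("hudi") // older releases may need the full name "org.apache.hudi"
      .option("hoodie.table.name", "events_hudi")
      .option("hoodie.datasource.write.recordkey.field", "event_id")
      .option("hoodie.datasource.write.precombine.field", "event_ts")
      .option("hoodie.datasource.write.partitionpath.field", "event_date")
      .option("hoodie.datasource.write.operation", "bulk_insert")
      .mode(SaveMode.Overwrite)
      .save("/warehouse/events_hudi")

Once migrated, the same table can be written with the "upsert" operation and
consumed via Hudi's incremental pull, which is what the per-record metadata
enables; the RFC's point is to get there without rewriting the full dataset.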
