@Ken yes, that's the eventual plan: to rely on the execution (query)
engines to provide their own record representation, which Hudi will handle
without any intermediate conversions. If you're curious to learn more, I'd
encourage you (and everyone) to check out RFC-46 itself.

In Phase 1, though, we're focusing only on the Spark integration (as a
proof of concept). Once the new infrastructure is hardened enough, we can
expand it to other engines as well.

On Tue, Sep 27, 2022 at 7:07 PM Gary Li <[email protected]> wrote:

> Great work! Really excited about this feature. Kudos to RFC-46 team.
>
> Best,
> Gary
>
> On Wed, Sep 28, 2022 at 7:22 AM Ken Krugler <[email protected]> wrote:
>
> > Hi Alexey,
> >
> > Thanks for the update!
> >
> > So for maximum performance when writing to Hudi from a low-level
> > (DataStream, not Table) Flink workflow, we’d be creating RowData records?
> >
> > — Ken
> >
> >
> > > On Sep 27, 2022, at 2:08 PM, Alexey Kudinkin <[email protected]> wrote:
> > >
> > > Hello, everyone!
> > >
> > > > As you might be aware, community has been very busy at work on RFC-46
> > > > aiming to bring long-awaited cutting edge level of performance to Hudi
> > > > by avoiding using Avro as an intermediate representation, instead
> > > > relying on individual engines to host data in their own formats
> > > > (InternalRow for Spark, RowData for Flink, etc)
> > >
> > > > We wanted to share an update in terms of where we are and what are the
> > > > next steps from here:
> > >
> > > >   - We're very close to completing the work and are already preparing
> > > >   to be landing complete implementation of the Phase 1 of the RFC-46
> > > >   currently being developed in a feature branch
> > > >   <https://github.com/apache/hudi/tree/release-feature-rfc46>
> > > >   - To be able to successfully merge the change of such scale, we will
> > > >   have to do a *code freeze* for the master branch barring any changes
> > > >   to land before we're able to merge the feature-branch.
> > > >   - To make sure that this activity doesn't interrupt the 0.12.1
> > > >   release that is currently in progress we're tentatively planning to
> > > >   schedule this code-freeze *after* successful finalization of the
> > > >   release process with RC branch being cut and validated for release.
> > > >   As of now, provided RC candidate will be cut tomorrow on 09/28 we're
> > > >   aiming to schedule a merge attempt somewhere mid to late next week.
> > > >   - We will follow-up on this thread separately at least *24h* before
> > > >   the scheduled code-freeze with an exact date and time frame for it.
> > > >   Stay tuned.
> > >
> > >
> > > Alexey, on behalf of the RFC-46 group
> >
> > --------------------------
> > Ken Krugler
> > http://www.scaleunlimited.com
> > Custom big data solutions
> > Flink, Pinot, Solr, Elasticsearch
> >
> >
> >
> >
>
