Hi Alexey,

Thanks for the update!

So for maximum performance when writing to Hudi from a low-level (DataStream, 
not Table) Flink workflow, we’d be creating RowData records?
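If so, a minimal sketch of what that might look like, assuming Flink's flink-table-common is on the classpath. The field names, order, and types here are hypothetical placeholders and would need to match the target Hudi table's schema (its RowType):

```java
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;

public class RowDataExample {

    // Hypothetical schema: (uuid STRING, rider STRING, fare INT).
    // In a real job this must line up with the Hudi table's declared RowType.
    static RowData toRowData(String uuid, String rider, int fare) {
        GenericRowData row = new GenericRowData(3);
        // String columns must be wrapped as StringData, not stored as java.lang.String.
        row.setField(0, StringData.fromString(uuid));
        row.setField(1, StringData.fromString(rider));
        // Primitive/boxed ints are stored directly.
        row.setField(2, fare);
        return row;
    }

    public static void main(String[] args) {
        RowData row = toRowData("id-1", "driver-42", 27);
        System.out.println(row.getArity() + " fields, fare=" + row.getInt(2));
    }
}
```

The idea being that a DataStream of domain objects would be mapped to DataStream<RowData> with a converter like the above before being handed to the Hudi Flink sink.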

— Ken


> On Sep 27, 2022, at 2:08 PM, Alexey Kudinkin <[email protected]> wrote:
> 
> Hello, everyone!
> 
> As you might be aware, the community has been hard at work on RFC-46,
> which aims to bring a long-awaited, cutting-edge level of performance to
> Hudi by avoiding Avro as an intermediate representation, instead relying
> on individual engines to host data in their own native formats
> (InternalRow for Spark, RowData for Flink, etc.)
> 
> We wanted to share an update on where we are and what the next steps are:
> 
>   - We're very close to completing the work and are already preparing to
>   be landing complete implementation of the Phase 1 of the RFC-46 currently
>   being developed in a feature branch
>   <https://github.com/apache/hudi/tree/release-feature-rfc46>
>   - To be able to successfully merge the change of such scale, we will
>   have to do a *code freeze* for the master branch barring any changes to
>   land before we're able to merge the feature-branch.
>   - To make sure that this activity doesn't interrupt the 0.12.1 release
>   that is currently in progress we're tentatively planning to schedule this
>   code-freeze *after* successful finalization of the release process with
>   RC branch being cut and validated for release. As of now, provided RC
>   candidate will be cut tomorrow on 09/28 we're aiming to schedule a merge
>   attempt somewhere mid to late next week.
>   - We will follow-up on this thread separately at least *24h* before the
>   scheduled code-freeze with an exact date and time frame for it. Stay tuned.
> 
> 
> Alexey, on behalf of the RFC-46 group

--------------------------
Ken Krugler
http://www.scaleunlimited.com
Custom big data solutions
Flink, Pinot, Solr, Elasticsearch
