Yes, sounds good. Appreciate that. My GitHub profile is
https://github.com/nsivabalan


On Fri, 12 Aug 2022 at 01:25, 田昕峣 (Xinyao Tian) <xinyaot...@yeah.net> wrote:

>
>
>
> Hi Sivabalan,
>
>
>
>
> Hope you are doing well. As promised, we have finished writing the RFC
> proposal and are now ready to submit it as a PR with confidence.
>
> According to the RFC Process, we need to add at least two PMC members as
> reviewers to examine our carefully designed RFC proposal. Therefore, we
> would sincerely like to invite you to be one of the reviewers, to check our
> RFC proposal and give us your comments and feedback.
>
> Since we really put a lot of effort into writing this RFC proposal, and
> you did not hesitate to give your time to help us land our first PR and
> offer valuable comments, we sincerely hope that you will accept our
> invitation so that I can put your GitHub account in the RFC.
>
> Likewise, if you have other candidates to suggest, we'd be happy to invite
> them as reviewers, since there is no limit on the number of reviewers.
>
> Wishing you all the best. We look forward to receiving your reply.
>
>
>
>
> Sincerely,
>
> Xinyao Tian
>
> On 08/9/2022 21:46,Sivabalan<n.siv...@gmail.com> wrote:
> Eagerly looking forward to the RFC, Xinyao. I definitely see a lot of folks
> benefitting from this.
>
> On Sun, 7 Aug 2022 at 20:00, 田昕峣 (Xinyao Tian) <xinyaot...@yeah.net>
> wrote:
>
> Hi Shiyan,
>
>
> Thanks so much for your feedback as well as your kind encouragement! It’s
> always our honor to contribute our effort to everyone and make Hudi even
> more awesome :)
>
>
> We are now carefully preparing materials for the new RFC. Once we have
> finished, we will strictly follow the RFC process shown in the official
> Hudi documentation to propose the new RFC and share all details of the new
> feature, as well as the related code, with everyone. Since we benefit from
> the Hudi community, we would like to give back to the community and make
> Hudi benefit more people!
>
>
> As always, please stay healthy and keep safe.
>
>
> Kind regards,
> Xinyao Tian
> On 08/6/2022 10:11,Shiyan Xu<xu.shiyan.raym...@gmail.com> wrote:
> Hi Xinyao, awesome achievement! And really appreciate your keenness in
> contributing to Hudi. Certainly we'd love to see an RFC for this.
>
> On Fri, Aug 5, 2022 at 4:21 AM 田昕峣 (Xinyao Tian) <xinyaot...@yeah.net>
> wrote:
>
> Greetings everyone,
>
>
> My name is Xinyao and I'm currently working for an insurance company. We
> found that Apache Hudi is an extremely awesome utility, and when it
> cooperates with Apache Flink it can be even more powerful. Thus, we have
> been using it for months and still keep benefiting from it.
>
>
> However, there is one feature that we really want but Hudi doesn't
> currently have: "multiple event_time fields verification". In the insurance
> industry, data is often distributed across dozens of tables that are
> conceptually connected by the same primary keys. When the data is used, we
> often need to associate several or even dozens of tables through Join
> operations, and stitch all the partial columns into one complete record
> with dozens or even hundreds of columns for downstream services to use.
>
>
> Here comes the problem. If we want to guarantee that every part of the
> joined data is up to date, Hudi must be able to compare multiple event_time
> timestamps in a table and keep the most recent records. In this scenario,
> the single event_time filtering field provided by Hudi (i.e. the option
> 'write.precombine.field' in Hudi 0.10.0) is inadequate. To cope with use
> cases involving complex Join operations like the one above, and to give
> Hudi the potential to support more application scenarios and reach more
> industries, we believe Hudi needs to support filtering on multiple
> event_time timestamps in a single table.
>
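
[Illustration of the idea described above, not the authors' implementation.
A minimal Python sketch, assuming that each event_time field guards its own
group of columns and that, for each group, the version with the newer
timestamp wins. The names merge_by_event_times, GROUPS, policy_ts and
claim_ts are hypothetical, and the rule that the incoming record wins on
equal timestamps is also an assumption; none of this is Hudi API.]

```python
from typing import Any, Dict, List, Mapping

# Hypothetical layout: each event_time field guards one group of columns
# that originally came from one upstream table.
GROUPS: Dict[str, List[str]] = {
    "policy_ts": ["policy_status", "premium"],
    "claim_ts": ["claim_status"],
}


def merge_by_event_times(
    current: Mapping[str, Any],
    incoming: Mapping[str, Any],
    groups: Mapping[str, List[str]],
) -> Dict[str, Any]:
    """Merge two versions of the same wide record, group by group.

    For every (event_time field, columns) group, the columns are taken from
    whichever version carries the newer event_time for that group.
    """
    merged = dict(current)
    for ts_field, columns in groups.items():
        new_ts = incoming.get(ts_field)
        old_ts = current.get(ts_field)
        if new_ts is None:
            # The incoming record carries no data for this group.
            continue
        # Assumption: on equal timestamps the incoming record wins.
        if old_ts is None or new_ts >= old_ts:
            merged[ts_field] = new_ts
            for column in columns:
                merged[column] = incoming.get(column)
    return merged


current = {"id": 1, "policy_ts": 10, "policy_status": "active",
           "premium": 100, "claim_ts": 5, "claim_status": "open"}
incoming = {"id": 1, "policy_ts": 8, "policy_status": "lapsed",
            "premium": 90, "claim_ts": 7, "claim_status": "closed"}

merged = merge_by_event_times(current, incoming, GROUPS)
```

[With a single precombine field, the whole incoming row would either replace
the stored row or be dropped. In this sketch the policy columns keep the
stored values (timestamp 10 is newer than 8), while the claim columns take
the incoming values (timestamp 7 is newer than 5).]
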
>
> The good news is that, after more than two months of development, my
> colleagues and I have made some changes in the hudi-flink and hudi-common
> modules based on hudi-0.10.0 and have basically achieved this feature.
> Currently, my team is using the enhanced source code with Kafka and Flink
> 1.13.2 to conduct end-to-end testing on a dataset of more than 140 million
> real-world insurance records and to verify the accuracy of the data. The
> result is quite good: based on our continuous observations over these
> weeks, every part of the extremely wide records has been updated to the
> latest status. We're very keen to make this new feature available to
> everyone. We benefit from the Hudi community, so we really want to give
> back to the community with our efforts.
>
>
> The only problem is that we are not sure whether we need to create an RFC
> to illustrate our design and implementation in detail. According to the
> "RFC Process" in the official Hudi documentation, we have to confirm that
> this feature does not already exist before we create a new RFC to share the
> concept and code and explain them in detail. Thus, we would really like to
> create a new RFC that explains our implementation in detail with theory and
> code, and makes it easier for everyone to understand it and build
> improvements on top of it.
>
>
> We look forward to receiving your feedback on whether we should create a
> new RFC and make Hudi better and better to benefit everyone.
>
>
> Kind regards,
> Xinyao Tian
>
>
>
> --
> Best,
> Shiyan
>
>
>
> --
> Regards,
> -Sivabalan
>


-- 
Regards,
-Sivabalan
