Hi Nicholas,

Sorry for the late reply. Thanksgiving :)

>>Now Hudi's design, in order to highlight its core components, is a
patchwork of the Spark RDD API mixed with business logic scattered in
multiple modules and various types of methods.
Agree that the hudi-client module needs to be refactored and generalized.
vinoyang is already tracking some of this work for the Flink integration. I
personally would not go so far as to call it a patchwork. There are clear
layers of abstraction in the code :). We have not done a good job of
documenting them. All of it was written with Spark in mind, which is why
we are where we are.

>>refactor the Hudi integration module through plug-inization to facilitate
the subsequent integration of Spark and Flink.
I think a lot of this could be achieved with just a better class hierarchy, IMO.
But I am open to exploring other means as well, once there is an RFC. I would
really love to do the refactoring with a clear goal in mind; otherwise,
there is a lot of low-hanging fruit we can pick to improve understanding
in place without revamping everything.
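
To make the class-hierarchy idea concrete, here is a rough sketch only. Every
name in it (EngineDataset, LocalDataset, AbstractWriteClient, LocalWriteClient)
is hypothetical and not an existing Hudi class, and the "upsert" body is a
placeholder rather than real write logic. The assumption is that shared logic
is written once against an engine-neutral interface, and each engine supplies
a thin subclass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// All names below are hypothetical, for illustration only.

/** Engine-neutral handle on a collection of records. */
interface EngineDataset<T> {
    <R> EngineDataset<R> map(Function<T, R> fn);
    List<T> collect();
}

/** In-memory stand-in; an engine-specific version would wrap that engine's own dataset type. */
final class LocalDataset<T> implements EngineDataset<T> {
    private final List<T> items;

    LocalDataset(List<T> items) {
        this.items = new ArrayList<>(items);
    }

    @Override
    public <R> EngineDataset<R> map(Function<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(fn.apply(item));
        }
        return new LocalDataset<>(out);
    }

    @Override
    public List<T> collect() {
        return new ArrayList<>(items);
    }
}

/** Shared logic lives in the base class and only sees EngineDataset. */
abstract class AbstractWriteClient {
    // The only engine-specific hook: how a batch of records becomes a dataset.
    protected abstract EngineDataset<String> toDataset(List<String> records);

    // Placeholder "business logic": tag each record with the commit time.
    public List<String> upsert(List<String> records, String commitTime) {
        return toDataset(records).map(r -> commitTime + ":" + r).collect();
    }
}

/** Thin engine-specific subclass; here the "engine" is plain in-memory Java. */
final class LocalWriteClient extends AbstractWriteClient {
    @Override
    protected EngineDataset<String> toDataset(List<String> records) {
        return new LocalDataset<>(records);
    }
}

class EngineSketch {
    public static void main(String[] args) {
        List<String> result = new LocalWriteClient().upsert(Arrays.asList("a", "b"), "001");
        System.out.println(result); // prints [001:a, 001:b]
    }
}
```

With this shape, adding another engine would mean adding one dataset wrapper
and one small client subclass, while the base class stays untouched.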

just my 2c

thanks
vinoth





On Tue, Nov 26, 2019 at 9:59 PM 蒋晓峰 <programg...@163.com> wrote:

> Hi guys,
>
>
> Having felt the pain of supporting the Flink engine in Hudi, I think it is
> necessary to discuss a high-cohesion, low-coupling, pluggable design for
> the compute engine module here.
>
>
> Now Hudi's design, in order to highlight its core components, is a
> patchwork of the Spark RDD API mixed with business logic scattered in
> multiple modules and various types of methods. As a result, developers with
> a background in compute engines have difficulty following the main flow of
> a Spark job, and plugging in another compute engine is also difficult
> without large-scale restructuring, because the general interfaces carry
> RDD and Spark context.
>
>
> In my opinion, it is necessary to refactor the Hudi integration module
> through plug-inization to facilitate the subsequent integration of Spark
> and Flink.
>
>
> Best,
> Nicholas
