Great! HUDI-572 filed!

On Sun, Jan 19, 2020 at 6:09 PM vino yang <[email protected]> wrote:

> Hi Vinoth,
>
> OK, I find that our main difference lies in our understanding of
> "integration".
>
> In general, my understanding is that the integration module is a separate
> module. But it seems that "hudi-hadoop-mr" just contains some basic class
> files, and the module will be used as a public library by other modules.
>
> But on the other hand, Hudi itself exists as a library, so there seems to
> be no problem in understanding it this way. My notion of "integration" is
> still tied to the model where the main framework can function on its own
> (for example Flink, Presto and so on).
>
> In short, I have no objection now: +1 to renaming "hudi-hadoop-mr" to
> "hudi-hive".
>
> Best,
> Vino
>
> On Sun, Jan 19, 2020 at 1:41 AM Vinoth Chandar <[email protected]> wrote:
>
> > >I suggest that we use the "hudi-{another bigdata framework}" naming
> > pattern more carefully
> >
> > Fully understand your concern. But that's exactly what hudi-hadoop-mr is
> > doing, is it not? :)  InputFormats are how you integrate with Hive.
> >
> > hudi-spark, hudi-presto etc. have their own integrations, but they can
> > both fall back to the Hive integration.
> >
> > On Thu, Jan 16, 2020 at 8:36 PM vino yang <[email protected]> wrote:
> >
> > > Hi Vinoth, Bhavani,
> > >
> > > +1 for renaming hudi-hive -> hudi-hive-sync
> > >
> > > About "hudi-hadoop-mr -> hudi-hive", I suggest that we use the
> > > "hudi-{another bigdata framework}" naming pattern more carefully. At a
> > > superficial level, it is very easy for users to mistakenly assume that
> > > the module is doing ecosystem integration, especially those who have
> > > seen the source code of mainstream projects such as Presto.[1]
> > >
> > > When we check out hudi-hadoop-mr, it actually just contains some
> > > InputFormat implementations.
> > >
> > > If we do want to mention other frameworks without leading users to
> > > assume that we are doing ecosystem integration, then we need to add
> > > additional information, for example: "hudi-{another bigdata framework}-xxx"
> > > or "hudi-xxx-{another bigdata framework}".
> > >
> > > [1]: https://github.com/prestodb/presto
> > >
> > > Best,
> > > Vino
> > >
> > > On Fri, Jan 17, 2020 at 5:42 AM Bhavani Sudha <[email protected]> wrote:
> > >
> > > > Thanks @vinoth for giving an overall picture. I think I can relate
> > > > better to the name changes you proposed.
> > > >
> > > > +1 for renaming hudi-hive -> hudi-hive-sync and hudi-hadoop-mr ->
> > > > hudi-hive
> > > >
> > > > On Thu, Jan 16, 2020 at 1:33 PM Vinoth Chandar <[email protected]> wrote:
> > > >
> > > > > First let me share the context for the existing name. We saw how
> > > > > Parquet hands out the InputFormat and named it similarly to parquet-mr.
> > > > > InputFormat is indeed a MapReduce class. I know we live in the age of
> > > > > Flink and Spark, but it's true :)
> > > > >
> > > > > I think this is the crux of the "understandability" issue..
> > > > >
> > > > > Here are my thoughts..
> > > > >
> > > > >  - +0 (neutral) on the rename to hudi-query-common (whatever we decide,
> > > > > we need to rename the bundle accordingly)
> > > > >  - On hudi-query-bundle being confused with the hive/spark/presto
> > > > > bundles, I don't feel it's more confusing than it is today
> > > > >
> > > > > The real issue IMO is hudi-hive, which is really about syncing to Hive,
> > > > > not querying Hive.
> > > > > Then, maybe we can rename
> > > > > - hudi-hadoop-mr to hudi-hive (more understandable, Hive does use
> > > > > InputFormat as the abstraction)
> > > > > - current hudi-hive to hudi-hive-sync
> > > > > (bundles renamed accordingly)
> > > > >
> > > > > I know this hijacks the conversation. Apologies :). But I thought I'd
> > > > > present a broader take
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jan 16, 2020 at 11:26 AM Bhavani Sudha Saktheeswaran
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > +1 to generally renaming the packages. Since this renaming is for
> > > > > > the purpose of making it user friendly, I am concerned that if we
> > > > > > name this hudi-query-bundle, users might confuse it with other
> > > > > > modules like hudi-hive and hudi-spark. And inside the packaging
> > > > > > module, we further have bundles specific to spark, hive and presto.
> > > > > >
> > > > > > Any suggestions on how to rename broadly to avoid this confusion?
> > > > > > Let me also think and get back.
> > > > > >
> > > > > > Thanks,
> > > > > > Sudha
> > > > > >
> > > > > > On Wed, Jan 15, 2020 at 9:56 PM vino yang <[email protected]> wrote:
> > > > > >
> > > > > > > Hi guys,
> > > > > > >
> > > > > > > I want to start a proposal about refactoring the naming of the
> > > > > > > "hudi-hadoop-mr" module.
> > > > > > >
> > > > > > > IMHO, this module name is not user-friendly and may confuse
> > > > > > > users, because it looks like it is about integrating with
> > > > > > > MapReduce (although I know it references the parquet-mr[1]
> > > > > > > project).
> > > > > > >
> > > > > > > Based on the purpose of this module (it contains InputFormat
> > > > > > > implementations for the ReadOptimized, Incremental, and Realtime
> > > > > > > views), I suggest that we rename it to "*hudi-query-common*".
> > > > > > > Then we can also rename "hudi-hadoop-mr-bundle" to
> > > > > > > "*hudi-query-bundle*".
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Any thoughts and suggestions are welcome and appreciated.
> > > > > > >
> > > > > > > Best,
> > > > > > > Vino
> > > > > > >
> > > > > > > [1]: https://github.com/apache/parquet-mr
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
