> I suggest that we use the "hudi-{another bigdata framework}" naming
> pattern more carefully

Fully understand your concern. But that's exactly what hudi-hadoop-mr is
doing, is it not? :) InputFormats are how you integrate with Hive.
hudi-spark, hudi-presto etc. have their own integrations, but they can both
fall back to the Hive integration.

On Thu, Jan 16, 2020 at 8:36 PM vino yang <[email protected]> wrote:

> Hi Vinoth, Bhavani,
>
> +1 for renaming hudi-hive -> hudi-hive-sync
>
> About "hudi-hadoop-mr -> hudi-hive", I suggest that we use the
> "hudi-{another bigdata framework}" naming pattern more carefully. On a
> superficial reading, it is very easy for users to mistakenly assume that
> the module is doing ecosystem integration, especially those who have seen
> the source code of mainstream projects such as Presto. [1]
>
> When we go to check out hudi-hadoop-mr, it actually just contains some
> InputFormat implementations.
>
> If we do want to mention other frameworks without letting users
> mistakenly assume that we are doing ecosystem integration, then we need
> to add additional information, for example:
> "hudi-{another bigdata framework}-xxx" or
> "hudi-xxx-{another bigdata framework}".
>
> [1]: https://github.com/prestodb/presto
>
> Best,
> Vino
>
> Bhavani Sudha <[email protected]> wrote on Fri, Jan 17, 2020 at 5:42 AM:
>
> > Thanks @vinoth for giving an overall picture. I think I can relate better
> > with the name changes you proposed.
> >
> > +1 for renaming hudi-hive -> hudi-hive-sync and hudi-hadoop-mr ->
> > hudi-hive
> >
> > On Thu, Jan 16, 2020 at 1:33 PM Vinoth Chandar <[email protected]> wrote:
> >
> > > First let me share the context for the existing name.. We saw how
> > > Parquet hands out the InputFormat and named it similar to parquet-mr.
> > > InputFormat is indeed a MapReduce class.. I know we live in the age of
> > > Flink and Spark.. But it's true :)
> > >
> > > I think this is the crux of the "understandability" issue..
> > >
> > > Here are my thoughts..
> > >
> > > - +0 (neutral) on the rename to hudi-query-common (whatever we decide,
> > >   we need to rename the bundle accordingly)
> > > - On hudi-query-bundle being confusing with hive/spark/presto bundles, I
> > >   don't feel it's more confusing than it is today
> > >
> > > The real issue, IMO, is hudi-hive, which is really about syncing to Hive,
> > > not querying Hive.
> > > Then maybe we can rename
> > > - hudi-hadoop-mr to hudi-hive (more understandable, Hive does use
> > >   InputFormat as the abstraction)
> > > - current hudi-hive to hudi-hive-sync
> > > (bundles renamed accordingly)
> > >
> > > I know this hijacks the conversation. Apologies :). But thought I'd
> > > present a broader take
> > >
> > > On Thu, Jan 16, 2020 at 11:26 AM Bhavani Sudha Saktheeswaran
> > > <[email protected]> wrote:
> > >
> > > > +1 to generally renaming the packages. Since this is about renaming
> > > > for the purpose of making it user friendly, I am concerned that if we
> > > > name this hudi-query-bundle, users might confuse it with other
> > > > modules like hudi-hive and hudi-spark. And inside the packaging
> > > > module, we further have bundles specific to spark, hive and presto.
> > > >
> > > > Any suggestions on how to rename broadly to avoid these confusions?
> > > > Let me also think and get back.
> > > >
> > > > Thanks,
> > > > Sudha
> > > >
> > > > On Wed, Jan 15, 2020 at 9:56 PM vino yang <[email protected]> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I want to start a proposal about refactoring the naming of the
> > > > > "hudi-hadoop-mr" module.
> > > > >
> > > > > IMHO, this module name is not user-friendly. It may confuse users,
> > > > > because it looks like it's about integrating with MapReduce
> > > > > (although I know it referenced the parquet-mr [1] project).
> > > > >
> > > > > Based on the purpose of this module (it contains InputFormat
> > > > > implementations for the ReadOptimized, Incremental, and Realtime
> > > > > views), I suggest that we rename it to "*hudi-query-common*". Then,
> > > > > we can also rename "hudi-hadoop-mr-bundle" to "*hudi-query-bundle*".
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Any thoughts and suggestions are welcome and appreciated.
> > > > >
> > > > > Best,
> > > > > Vino
> > > > >
> > > > > [1]: https://github.com/apache/parquet-mr
