I added myself to the proposal. Thank you JB,
Kent

On 2024/07/31 10:27:41 Jean-Baptiste Onofré wrote:
> Hi Kent
>
> Happy to add you as mentor! Do you want to update the proposal wiki, or
> shall I?
>
> Thanks!
>
> Regards
> JB
>
> On Wed, Jul 31, 2024 at 09:31, Kent Yao <y...@apache.org> wrote:
>
> > +1 on Polaris entering Apache. I'm interested in helping as a mentor if
> > needed.
> >
> > Kent Yao
> >
> > On 2024/07/31 07:02:05 ConradJam wrote:
> > > As members of the Amoro project, our team is thrilled to see the growing
> > > attention towards Amoro.
> > >
> > > We are excited about Polaris becoming open source, as it opens up greater
> > > possibilities for future collaboration with the Amoro community.
> > >
> > > Amoro focuses on data lake formats and aims to provide optimization
> > > services and enhancements for the lake. Our primary goal is to offer
> > > optimization services that support multiple table formats (though
> > > currently, Iceberg is the most supported), such as small file optimization,
> > > Z-order sorting optimization, and future index optimization.
> > >
> > > Amoro provides both Internal Catalog and External Catalog methods to
> > > optimize lake tables. To gather optimization information, we have conducted
> > > some catalog management work.
> > >
> > > I often hear people comparing Gravitino and Polaris as potential
> > > competitors to Amoro, which I think is a misconception (I noticed that some
> > > previous discussions about Amoro's positioning seemed unclear, so I wanted
> > > to clarify this).
> > >
> > > While there might be some overlap between Amoro, Gravitino, and Polaris:
> > >
> > > - Gravitino focuses on unified metadata management across various areas,
> > >   including Kafka and AI, not just on data lakes.
> > > - Polaris is an interoperable, open-source catalog for Apache Iceberg.
> > >
> > > If there are any errors, please correct them.
> > >
> > > Amoro plans to support both Polaris and Gravitino in the future.
> > > Additionally, the Amoro community will continue to engage with the
> > > Gravitino and Polaris communities to foster more collaborative efforts in
> > > lake optimization.
> > >
> > > [1] Amoro docs: https://amoro.apache.org/docs/latest/
> > > [2] Gravitino docs: https://datastrato.ai/docs/0.5.1/
> > > [3] Polaris docs: https://polaris.io/
> > >
> > > On Wed, Jul 31, 2024 at 13:22, Jack Ye <yezhao...@gmail.com> wrote:
> > >
> > > > > What's the difference between this project and Amoro
> > > >
> > > > Here is my $0.01, please correct me if I am wrong, especially for people
> > > > working on Amoro and Gravitino.
> > > >
> > > > I think Apache Amoro is focused more on being a self-contained complete
> > > > data lakehouse management and ingestion system. It is a complete solution
> > > > with its own connectors in engines like Spark [1], and customized
> > > > mixed-format integrations in engines like Trino [2]. Polaris is mostly
> > > > focused on the data catalog aspect of a data lakehouse, and offers an open
> > > > source vendor-neutral Iceberg catalog with additional governance support.
> > > > By integrating with the Iceberg REST catalog interface, the intention is
> > > > for it to leverage Iceberg for all the engine integrations to begin with.
> > > > Similarly, any table management or ingestion system that works with Iceberg
> > > > REST API will be able to be plugged in to directly work with Polaris. So
> > > > you could imagine it could be possible for an Iceberg table to be ingested
> > > > and managed by Amoro, but cataloged using Polaris.
> > > >
> > > > This does make Polaris more similar to Apache Gravitino. However, I think
> > > > the key difference between them is that the emphasis of Gravitino is more
> > > > breadth-first on aspects like multi-format, multi-catalog, multi-datasource,
> > > > different data catalog objects in AI [3], etc.
> > > > It exposes different sets of
> > > > APIs for different purposes, with Iceberg REST API being a part of it for
> > > > the Iceberg tables, and other APIs for other data sources [4]. Polaris is
> > > > more depth-first on Iceberg at this moment. Our future plan does say that
> > > > it could extend to non-Iceberg data lakes, and there could be some overlap
> > > > at that time. But even then, there could be different ways to achieve such
> > > > support. For example, we could surface Hive Parquet tables as Iceberg
> > > > tables, if the Iceberg REST catalog standard can be updated to accommodate
> > > > that. There could also be potential collaborations between Polaris and
> > > > Gravitino to achieve the goal together, and I am personally pretty excited
> > > > about that opportunity.
> > > >
> > > > Best,
> > > > Jack Ye
> > > >
> > > > [1] https://amoro.apache.org/docs/latest/spark-configuration/
> > > > [2] https://amoro.apache.org/docs/latest/trino/#mixed-format
> > > > [3] https://github.com/apache/gravitino-site/blob/10a967f18730c28018e064f3ee1ddd3cc32aa506/src/components/HomepageFeatures/index.tsx#L74
> > > > [4] https://github.com/apache/gravitino/tree/main/catalogs
> > > >
> > > > On Tue, Jul 30, 2024 at 10:06 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > >
> > > > > Hi Manu
> > > > >
> > > > > Thanks for the details!
> > > > > I agree with you. As mentor on Gravitino, I would be more than happy
> > > > > to connect the two podlings.
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On Wed, Jul 31, 2024 at 7:00 AM Manu Zhang <owenzhang1...@gmail.com> wrote:
> > > > > >
> > > > > > AFAIK, Amoro is a management system with optimization service, catalog
> > > > > > service, etc. It has a built-in catalog but can also work with other
> > > > > > catalogs like Polaris.
> > > > > > I think Polaris is more comparable to Gravitino, which entered the
> > > > > > incubator recently. It would be interesting to see how these two
> > > > > > communities can collaborate.
> > > > > >
> > > > > > Regards,
> > > > > > Manu
> > > > > >
> > > > > > On Wed, Jul 31, 2024 at 12:36 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > The proposal is more generic: today it's Apache Iceberg, but after the
> > > > > > > discussions with the initial community we agreed it could make sense
> > > > > > > to address other use cases.
> > > > > > >
> > > > > > > I don't know Amoro in detail, but I am happy to bridge the
> > > > > > > communities to work together.
> > > > > > >
> > > > > > > Regards
> > > > > > > JB
> > > > > > >
> > > > > > > On Wed, Jul 31, 2024 at 5:16 AM Xuanwo <xua...@apache.org> wrote:
> > > > > > > >
> > > > > > > > Hi, JB
> > > > > > > >
> > > > > > > > Thank you for starting this thread; it's great to see an increasing
> > > > > > > > number of projects being developed around Iceberg.
> > > > > > > >
> > > > > > > > I have two questions:
> > > > > > > >
> > > > > > > > - The Polaris GitHub repo says it's "an open source catalog for Apache
> > > > > > > >   Iceberg", but the proposal changed this to "a catalog for data lakes".
> > > > > > > >   Does this mean Polaris's scope has changed?
> > > > > > > > - What's the difference between this project and Amoro:
> > > > > > > >   https://github.com/apache/amoro? How do these two communities
> > > > > > > >   collaborate?
> > > > > > > >
> > > > > > > > On Wed, Jul 31, 2024, at 04:19, Dave Fisher wrote:
> > > > > > > > >> On Jul 30, 2024, at 11:34 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > > > > >>
> > > > > > > > >> Hi Dave,
> > > > > > > > >>
> > > > > > > > >> That's a good question. The main reason is because we wanted people
> > > > > > > > >> with Apache experience in the PPMC to mentor the committers and
> > > > > > > > >> contributors heading to PPMC as well.
> > > > > > > > >> Also, the initial committers worked closely with PPMC guidance
> > > > > > > > >> (explaining the ICLA, good practice, etc).
> > > > > > > > >> So, we wanted to have PPMC acting more as mentor (both technically but
> > > > > > > > >> also with their Apache experience) with committers.
> > > > > > > > >
> > > > > > > > > That makes sense. Are any of the proposed PPMC members also ASF Members
> > > > > > > > > and/or potentially future Mentors?
> > > > > > > > >
> > > > > > > > >> If it's problematic, we can start only with the PPMC group and invite
> > > > > > > > >> new committers/PPMC members during incubation period.
> > > > > > > > >
> > > > > > > > > No problem. It will actually provide the Mentors and later the IPMC
> > > > > > > > > additional data to see if the PPMC is properly growing the PPMC and
> > > > > > > > > Committer base.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > > >>
> > > > > > > > >> Regards
> > > > > > > > >> JB
> > > > > > > > >>
> > > > > > > > >> On Tue, Jul 30, 2024 at 8:19 PM Dave Fisher <w...@apache.org> wrote:
> > > > > > > > >>>
> > > > > > > > >>> Hi JB,
> > > > > > > > >>>
> > > > > > > > >>> An interesting project that looks pretty mature.
> > > > > > > > >>>
> > > > > > > > >>> I’m curious about the split between Initial PPMC and initial
> > > > > > > > >>> Committer. In the usual case a new podling will have all of the
> > > > > > > > >>> Initial Committers on the PPMC. Can you tell us why this is not the
> > > > > > > > >>> case with Polaris?
> > > > > > > > >>>
> > > > > > > > >>> Best,
> > > > > > > > >>> Dave
> > > > > > > > >>>
> > > > > > > > >>>> On Jul 30, 2024, at 10:33 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > > > > >>>>
> > > > > > > > >>>> Hi folks,
> > > > > > > > >>>>
> > > > > > > > >>>> We would like to propose a new project to the ASF incubator: Polaris.
> > > > > > > > >>>>
> > > > > > > > >>>> Polaris is a catalog for data lakes. It provides new levels of choice,
> > > > > > > > >>>> flexibility and control over data, with full enterprise security and
> > > > > > > > >>>> Apache Iceberg interoperability across a multitude of engines and
> > > > > > > > >>>> infrastructure. Polaris builds on standards such as those created by
> > > > > > > > >>>> Apache Iceberg, providing the following benefits for the ecosystem:
> > > > > > > > >>>> * Multi-engine interoperability over a single copy of data,
> > > > > > > > >>>> eliminating the need for moving and copying data across different
> > > > > > > > >>>> engines and catalogs.
> > > > > > > > >>>> * An interoperable security model providing a unified authorization
> > > > > > > > >>>> layer independent from the engines processing analytical tables.
> > > > > > > > >>>> * For multi-catalog scenarios, a unified catalog level view of data
> > > > > > > > >>>> across multiple catalogs via catalog notification integrations.
> > > > > > > > >>>> * The ability to host Polaris Catalog on the infrastructure of your
> > > > > > > > >>>> choice.
> > > > > > > > >>>>
> > > > > > > > >>>> Here is the proposal:
> > > > > > > > >>>> https://cwiki.apache.org/confluence/display/INCUBATOR/PolarisProposal
> > > > > > > > >>>>
> > > > > > > > >>>> Comments and feedback are welcome.
> > > > > > > > >>>>
> > > > > > > > >>>> Thanks!
> > > > > > > > >>>> Regards
> > > > > > > > >>>> JB
> > > > > > > > >>>>
> > > > > > > > >>>> ---------------------------------------------------------------------
> > > > > > > > >>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > > > > > > >>>> For additional commands, e-mail: general-h...@incubator.apache.org
> > > > > > > >
> > > > > > > > --
> > > > > > > > Xuanwo
> > > > > > > >
> > > > > > > > https://xuanwo.io/
> > > > > > > >
> > > > > > > > ---------------------------------------------------------------------
> > > > > > > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > > > > > > For additional commands, e-mail: general-h...@incubator.apache.org
> > >
> > > --
> > > Best
> > >
> > > ConradJam