RE: Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread nathan ma
Hi JB, as a co-creator of this project, I’d love to explain more about the positioning of a lakehouse management system. When discussing databases or traditional data warehouses, we often use the term DBMS (Database Management System) to describe them. Traditional databases, including MPP

Re: [DISCUSS] Graduate Apache SDAP (Incubating) as a Top Level Project

2024-02-23 Thread Riley Kuttruff
Thank you for finding those issues. I've updated the site (sdap.a.o shows the changes but sdap.i.a.o still hasn't updated at the time I'm writing this). I believe I've fixed the issues you've found. Please let me know if this is not the case. On 2024/02/23 15:59:07 sebb wrote: > The Downloads page

Re: [DISCUSS] Graduate Apache SDAP (Incubating) as a Top Level Project

2024-02-23 Thread Riley Kuttruff
Thank you very much. We have actually found an issue with that release (GPL licensed dependency) and have been working on preparing another release candidate. It seems we have forgotten to cancel that vote. On 2024/02/23 14:27:52 PJ Fanning wrote: > +1 (binding) > > I had a look at the mailing

[NOTICE] Incubation Report for February 2024

2024-02-23 Thread tison
Hi, I'm trying to create an Incubation Report page for February 2024 at [1], including the following podlings, which according to the report group info should report this month: * answer * fury * horaedb * streampark But it still lacks some information, and I need help with the following: 1. What is the

Re: [DISCUSS] Graduate Apache SDAP (Incubating) as a Top Level Project

2024-02-23 Thread sebb
The Downloads page has some issues: https://sdap.apache.org/downloads No link to KEYS file Links for older releases are broken Copyright page is 2023 On Fri, 23 Feb 2024 at 14:30, PJ Fanning wrote: > > +1 (binding) > > I had a look at the mailing lists and the community seems in a pretty good

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread 周劲松
Hi JB, Yes, you can say it is an abstraction layer on top of data lake table formats and query engines and we often call it the service layer in Lakehouse architecture. The service layer primarily provides unified metadata and access control, as well as common audit services, and so on. Of

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread 周劲松
Hi Ayush, I am Jinsong from the Amoro community. Thank you very much for your attention and feedback on Amoro. Amoro aims to support multiple versions of Hadoop and Hive clusters as much as possible, allowing users to specify versions at build time, but just as you said, our default version

Re: [DISCUSS] Graduate Apache SDAP (Incubating) as a Top Level Project

2024-02-23 Thread PJ Fanning
+1 (binding) I had a look at the mailing lists and the community seems in a pretty good state. As a matter of interest, are you still looking at completing the v1.2.0 release [1]? If so, I could have a look at the RC over the weekend. [1]

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread Jean-Baptiste Onofré
Hi Justin Even if it looks interesting, I'm not sure I understand exactly the purpose of the proposal. What does lakehouse management system mean exactly? Is it an abstraction layer on top of Iceberg, Paimon + query engine powered by Flink, Spark, Trino? Please let me know if you want an

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread Ayush Saxena
+1, I remember exploring this while looking for a way to compact Iceberg tables for a Hive use case, and got some good pointers for cleaning up orphan files. I think it was using a pretty old version of Hive (3.1.1, I believe), so I couldn't pull it in as a dependency in the Hive master branch itself,

[DISCUSS] Apache Amoro proposal

2024-02-23 Thread Justin Mclean
Hi, I would like to propose a new project to the ASF incubator - Apache Amoro. I’m one of the mentors, but there are a lot of other people involved who have done all of the hard work. Amoro is a Lakehouse management system built on open data lake formats like Apache Iceberg and Apache Paimon