Re: [VOTE] Release Apache Iceberg Rust 0.2.0 RC1

2024-02-20 Thread Fokko Driesprong
Hey everyone, Thanks for voting! The 72 hours have passed, and a minimum of 3 binding votes have been cast: +1 Xuanwo (non-binding) +1 Jan Kaul (non-binding) +1 NOTME ZE (non-binding) +1 Chojan Shang (non-binding) +1 Amogh Jahagirdar (non-binding) +1 Renjie Liu (non-binding) +1 Daniel Weeks (bind

Community Over Code Asia 2024 Travel Assistance Applications now open!

2024-02-20 Thread Gavin McDonald
Hello to all users, contributors and Committers! The Travel Assistance Committee (TAC) are pleased to announce that travel assistance applications for Community over Code Asia 2024 are now open! We will be supporting Community over Code Asia, Hangzhou, China July 26th - 28th, 2024. TAC exists to

Re: Improve Change Data Capture Use Case for Iceberg

2024-02-20 Thread Manu Zhang
Bump up this thread again. Are we actively working on any proposed approaches? Manu On Fri, May 5, 2023 at 9:14 AM Ryan Blue wrote: > Thanks for taking the time to write this up, Jack! It definitely overlaps > my own thinking, which is a good confirmation that we're on the right > track. There

Re: Table Portability Proposal

2024-02-20 Thread Manu Zhang
Do we still want to move forward with this feature? It's on the roadmap for Spec V3 but it hasn't appeared in our discussion for a while. Manu On Sat, Aug 26, 2023 at 2:43 AM Mohit Garg wrote: > hi > > Please review the approach captured here Iceberg Table

Re: Table Portability Proposal

2024-02-20 Thread Jean-Baptiste Onofré
Hi Manu Thanks for the reminder. It sounds like a good feature and worth discussing it :). It was my intention to define what we plan to include (or not) in Spec v3 / Iceberg 2.0.0 (I sent a message about that last week). Regards JB On Tue, Feb 20, 2024 at 10:36 AM Manu Zhang wrote: > > Do we

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Ajantha Bhat
Thanks Eduard, I will share a new RC info with the fix. - Ajantha On Tue, Feb 20, 2024 at 12:17 PM Jean-Baptiste Onofré wrote: > Hi Ryan, > > If it's "used" section is not strictly required in NOTICE from a legal > perspective, the embedded dependencies should be mentioned (either > under the

[ANNOUNCE] Release Apache Iceberg Rust 0.2.0

2024-02-20 Thread Driesprong, Fokko
Hi all, The Apache Iceberg Rust community is pleased to announce that Apache Iceberg Rust 0.2.0 has been released! Iceberg is a data access layer that allows users to easily and efficiently retrieve data from various storage services in a unified way. This first release provides integration with

[VOTE] Release Apache Iceberg 1.5.0 RC1

2024-02-20 Thread Ajantha Bhat
Hi Everyone, I propose that we release the following RC as the official Apache Iceberg 1.5.0 release. The commit ID is 5b84f34a5386fc61b17bfe7dc7c1cbe565550958 * This corresponds to the tag: apache-iceberg-1.5.0-rc1 * https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc1 * https://gi

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Fokko Driesprong
Just using this thread to come back to the NOTICE discussion. This also came up with the latest Python release, and I spent quite a bit of time on it. If it's "used" section is not strictly required in NOTICE from a legal > perspective, the embedded dependencies should be mentioned (either > under

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Jean-Baptiste Onofré
OK, no problem, let's keep it as it is if you prefer (as I said, it's not a blocker). I still consider that it's not complete (I don't see the value of NOTICE if it's just to say that we use ASF projects, it's not a point for Iceberg but generally speaking, I already had disagreement with ASF members

Re: [VOTE] Release Apache Iceberg 1.5.0 RC1

2024-02-20 Thread Jean-Baptiste Onofré
Unfortunately, we identified an issue with Trino and the JDBC catalog (see https://github.com/apache/iceberg/issues/9764 for details). I'm working on a fix right now (PR will be available soon). Sorry, but we will need an RC2 :/ Regards JB On Tue, Feb 20, 2024 at 11:33 AM Ajantha Bhat wrote: > > Hi

Re: [VOTE] Release Apache Iceberg 1.5.0 RC1

2024-02-20 Thread Jean-Baptiste Onofré
I created https://github.com/apache/iceberg/pull/9765 to fix JDBC Catalog schema management. Regards JB On Tue, Feb 20, 2024 at 5:17 PM Jean-Baptiste Onofré wrote: > > Unfortunately, we identified an issue with Trino and JDBC catalog (see > https://github.com/apache/iceberg/issues/9764 for detai

Re: Table Portability Proposal

2024-02-20 Thread Ryan Blue
JB, The spec and the reference implementation are released separately so v3 and 2.0 are independent. There's no requirement that v3 is completed for Iceberg Java 2.0 and the goal of a 2.0 is to have an opportunity to deprecate and remove things so that we don't continue to carry forward and mainta

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Ryan Blue
JB, Iceberg documents the licenses and copyright of bundled projects in the LICENSE file. You can see that gradlew is documented here: https://github.com/apache/iceberg/blob/main/LICENSE#L204-L213 This is based on the how-to guide that Fokko linked: Bundling permissively-licensed dependencies.

Re: Table Portability Proposal

2024-02-20 Thread Jean-Baptiste Onofré
Hi Ryan Ah ok, I thought that an Iceberg release is "based on"/implements a spec (I assumed the opposite is wrong). Thanks for the explanation! Regards JB On Tue, Feb 20, 2024 at 6:04 PM Ryan Blue wrote: > > JB, > > The spec and the reference implementation are released separately so v3 and > 2.0

Re: [VOTE] Release Apache Iceberg 1.5.0 RC0

2024-02-20 Thread Jean-Baptiste Onofré
Thanks Ryan. As I said, we have a different "view"/"read" on that (we already had long discussions about that, especially in the Incubator :)). As I said to Fokko, that's OK (even if I'm strongly convinced that NOTICE should mention the dependencies we ship like gradlew, as a user and legal standp

Table Schema History Pruning

2024-02-20 Thread Barron Wei
Hi folks, I have a few questions regarding the schema history of an Iceberg table. The table metadata file keeps track of every table schema version (at least in v2). Depending on the size of the schema, this history can become large in terms of byte size. 1. Is removing a schema from the

Re: Table Schema History Pruning

2024-02-20 Thread Sung Yun (BLOOMBERG/ 120 PARK)
Hi Barron, we've noticed the same issue since this PR was merged to introduce schema versions: https://github.com/apache/iceberg/pull/2096 There's a closed issue where folks were discussing options for remediating this problem, which also has links to other related PRs and issues: htt

Re: [ANNOUNCE] Release Apache Iceberg Rust 0.2.0

2024-02-20 Thread Jack Ye
Congratulations on the first release! -Jack On Tue, Feb 20, 2024 at 2:32 AM Driesprong, Fokko wrote: > Hi all, > > The Apache Iceberg Rust community is pleased to announce that Apache > Iceberg Rust 0.2.0 has been released! > > Iceberg is a data access layer that allows users to easily and effi

Re: Materialized view integration with REST spec

2024-02-20 Thread Jack Ye
Thanks for the response from everyone! Before proceeding further, I see a few people referring back to the current design from Jan. I specifically raised this thread based on the information in the doc and a few latest discussions we had there. Because there are many threads in the doc, and each t

Re: Materialized view integration with REST spec

2024-02-20 Thread Walaa Eldin Moustafa
I would vote to keep a log in the doc with open questions, and keep the doc updated with open questions as they arise/get resolved. On Tue, Feb 20, 2024 at 11:37 AM Jack Ye wrote: > Thanks for the response from everyone! > > Before proceeding further, I see a few people referring back to the > c

Re: Table Portability Proposal

2024-02-20 Thread Jack Ye
Just to put another alternative solution on the table. In S3FileIO, we implemented the support for S3 access point and bucket alias, which actually accidentally enabled "relative path" if you are just switching bucket name. At read time, you can supply a catalog property "s3.access-points.=" indic
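The bucket-switching idea in this message can be illustrated with a small sketch. It is not the S3FileIO implementation: the function name `remap_bucket` and the `bucket_overrides` mapping are hypothetical stand-ins for whatever the catalog property supplies, and the sketch only handles plain bucket names or aliases, not access point ARNs.

```python
from urllib.parse import urlsplit, urlunsplit


def remap_bucket(location: str, bucket_overrides: dict[str, str]) -> str:
    """Swap the bucket in an S3 URI if an override is configured for it."""
    parts = urlsplit(location)
    if parts.scheme not in ("s3", "s3a", "s3n"):
        return location  # Not an S3 location; leave it untouched.
    replacement = bucket_overrides.get(parts.netloc)
    if replacement is None:
        return location  # No override configured for this bucket.
    return urlunsplit(parts._replace(netloc=replacement))


# Hypothetical mapping: metadata written against "prod-bucket" is read from
# a replica bucket (or alias) named "dr-bucket".
overrides = {"prod-bucket": "dr-bucket"}
print(remap_bucket("s3://prod-bucket/warehouse/db/tbl/data/00000.parquet", overrides))
```

Because only the bucket portion changes, the absolute paths stored in table metadata behave like paths relative to the bucket, which is the effect the message describes.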

Re: Materialized view integration with REST spec

2024-02-20 Thread Manish Malhotra
Very excited for MV to be in Iceberg :) Keeping it in the same doc would be helpful, to have the trail. But I also agree: if there are too many directions/threads, then keep closing the old ones once there are no more questions. And put down the assumptions for the initial version to move forward. On

Re: Table Schema History Pruning

2024-02-20 Thread Jack Ye
The feature sounds reasonable to me. If a schema or partition spec is no longer referenced and used for any time travel purpose, then it seems to me that it could be safely pruned through some utility action. If the schema changes frequently and there are many columns, it might be helpful in reducing m
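The pruning condition described here can be sketched as follows. This is a minimal illustration, not Iceberg's implementation: it assumes the table metadata JSON has been loaded into a dict carrying the spec's `schemas`, `current-schema-id`, and `snapshots` fields, and the function name `unreferenced_schema_ids` is hypothetical.

```python
def unreferenced_schema_ids(metadata: dict) -> set[int]:
    """Return schema IDs referenced by neither the current schema nor any snapshot."""
    all_ids = {schema["schema-id"] for schema in metadata.get("schemas", [])}
    referenced = {metadata["current-schema-id"]}
    for snapshot in metadata.get("snapshots", []):
        # "schema-id" is optional on a snapshot, so skip snapshots without it.
        schema_id = snapshot.get("schema-id")
        if schema_id is not None:
            referenced.add(schema_id)
    return all_ids - referenced


# Made-up metadata: four schema versions, two of them still referenced.
metadata = {
    "current-schema-id": 3,
    "schemas": [{"schema-id": i} for i in range(4)],
    "snapshots": [
        {"snapshot-id": 10, "schema-id": 1},
        {"snapshot-id": 11, "schema-id": 3},
    ],
}
print(sorted(unreferenced_schema_ids(metadata)))  # [0, 2]
```

Snapshots written without a `schema-id` give no information in this sketch, so a real utility would need to treat them conservatively before dropping anything.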

Re: Support permission concepts in REST spec

2024-02-20 Thread Jack Ye
Thanks for the response JB & Micah. > Is this intended to be information only? I would expect the engine to honor it to some extent. Consider the case of writing to a table, LoadTableRequest needs to be able to express this intent of requesting write access, such that the credentials vended back

Re: Proposal for RESTful Data Operations

2024-02-20 Thread Drew
Hi everyone, We are discussing the REST spec changes to add support for DataFiles and DeleteFiles in both the append and scan planning APIs (PR: https://github.com/apache/iceberg/pull/9717). One thing that came up for appends was that this logic shouldn’t be in the table update API but instead it

Re: Proposal for RESTful Data Operations

2024-02-20 Thread Jack Ye
I think there is also a point we were discussing but never closed regarding AppendDeleteFiles: whether that should be supported. The recent development in Kafka, and vendor products like Upsolver Zero-ETL

[DISCUSS] Iceberg Summit proposal

2024-02-20 Thread Ryan Blue
Hi everyone, JB and I have been working to update the Iceberg Summit proposal. The proposal doc that JB already sent out (here) is now up to date with the committee that was selected by th

Re: Process for creating new Proposals

2024-02-20 Thread Renjie Liu
> > In my mind there is a distinction between a voting and discussion. I > agree that discussion is probably best served on the document. I see > voting as a final notice that the feature is officially finalized. > +1 for having a voting phase once we have the discussion finalized. Also I real

Re: Process for creating new Proposals

2024-02-20 Thread Manu Zhang
I think discussions can happen everywhere by nature. It's the proposal and summary of different ideas that should have a central place and can easily be retrieved. Regards, Manu On Wed, Feb 21, 2024 at 9:30 AM Renjie Liu wrote: > In my mind there is a distinction between a voting and discussion

Re: Process for creating new Proposals

2024-02-20 Thread Renjie Liu
> > I think discussions can happen everywhere by nature. In fact, I want to say that discussions for the same proposal should happen in one place, either all in docs or all in PRs. It's quite easy for people to lose context if they happen in different places. It's the proposal and summary of different

Re: Process for creating new Proposals

2024-02-20 Thread Manu Zhang
> > discussions are also quite valuable to help people to understand the > summary, people could understand the context of the final summary and > decision > Sometimes, they can also be distracting and have a lower signal-to-noise ratio. We can link discussions in the final summary if possible. On

Re: Table Portability Proposal

2024-02-20 Thread Manu Zhang
Hi Jack, Thanks for sharing this idea. Our typical usage of "relative path" is distcp between two HDFS clusters for disaster recovery. It looks to me that by extending this feature, we should always take the authority and scheme from HDFS configurations in that cluster for any path. The downside

Re: [ANNOUNCE] Release Apache Iceberg Rust 0.2.0

2024-02-20 Thread John Zhuge
Congratulations! On Tue, Feb 20, 2024 at 11:07 AM Jack Ye wrote: > Congratulations on the first release! > > -Jack > > On Tue, Feb 20, 2024 at 2:32 AM Driesprong, Fokko > wrote: > >> Hi all, >> >> The Apache Iceberg Rust community is pleased to announce that Apache >> Iceberg Rust 0.2.0 has bee

Re: [DISCUSS] Iceberg Summit proposal

2024-02-20 Thread Ajantha Bhat
Thanks for the proposal. Looking forward to the first official Iceberg summit. I think the event time is odd for people in Asia to attend. Suggestions are welcome. - Ajantha On Wed, Feb 21, 2024 at 6:01 AM Ryan Blue wrote: > Hi everyone, > > JB and I have been working to update the Iceberg Sum