Re: Table and Snapshot Level Configs and Metadata

2021-05-17 Thread Jack Ye
Yeah, I agree that this use case is quite common. I added some thoughts to the issue; we can continue the discussion there. -Jack On Mon, May 17, 2021 at 3:22 AM Peter Vary wrote: > Hi Qinhua, Jack, > > We are also trying to explore the possibilities for users to share a > specific version of a t

Usage of parquet field_id

2021-05-17 Thread Weston Pace
Hello Iceberg devs, I'm Weston, I've been working on the Arrow project lately and I am reviewing how we handle the parquet field_id (and also adding support for specifying a field_id at write time) from parquet[1][2]. This has brought up two questions. 1. The original PR adding field_id suppor

Re: Stableness of V2 Spec/API

2021-05-17 Thread OpenInx
PR-2303 defines how the batch job does the compaction work, and PR-2308 decides the behavior when a compaction txn and a row-delta txn commit at the same time. They shouldn't block each other, but we will need to resolve both of them. On Tue, May 18, 2021 at 9:36 AM Huadong Liu wrot

Re: Stableness of V2 Spec/API

2021-05-17 Thread Huadong Liu
Thanks. Compaction is https://github.com/apache/iceberg/pull/2303 and it is currently blocked by https://github.com/apache/iceberg/issues/2308? On Mon, May 17, 2021 at 6:17 PM OpenInx wrote: > Hi Huadong > > From the perspective of iceberg developers, we don't expose the format v2 > to end users

Re: Stableness of V2 Spec/API

2021-05-17 Thread OpenInx
Hi Huadong, From the perspective of iceberg developers, we don't expose the format v2 to end users because we think there is still other work that needs to be done. As you can see there are still some unfinished issues from your link. As for whether v2 will cause data loss, from my perspective as

Re: Table and Snapshot Level Configs and Metadata

2021-05-17 Thread Peter Vary
Hi Qinhua, Jack, We are also trying to explore the possibilities for users to share a specific version of a table easily. The use case is that we have a quite frequently updated working table, but along the way we identify specific snapshots which are good working copies to share. Other users do