Here is a PR to change the default format version in the library:

https://github.com/apache/iceberg/pull/8381

There are some failing REST catalog tests, which look like bugs. I'd appreciate
it if someone could take a look. I will also check the remaining tests later today.

- Anton

On 2023/05/29 17:25:41 Ryan Blue wrote:
> Since the last time we discussed this, we've also updated our default
> version to v2. I definitely like the idea we settled on last time, that
> this is an administrator setting and it can be controlled already by
> catalog deployments. However, I'm coming around on updating the library
> default to v2.
> 
> My rationale is that we want people that are setting up Iceberg data
> platforms (administrator roles) to be as successful as possible without
> knowing all the internal details. While you _can_ set this at the catalog
> level, those new platform administrators don't know to do that. So I'd
> probably opt to make this v2 now.
> 
> Ryan
> 
> On Thu, May 25, 2023 at 2:51 PM Steven Wu <stevenz...@gmail.com> wrote:
> 
> > +1. Anton made a good case with the new perspective.
> >
> > On Thu, May 25, 2023 at 2:29 PM Anton Okolnychyi
> > <aokolnyc...@apple.com.invalid> wrote:
> >
> >> Oh, I missed the earlier discussion. Thanks for sharing it, Gabor!
> >>
> >> I am approaching this from a slightly different perspective. Defaulting
> >> to v2 does not mean supporting delete files. My primary concern is that our
> >> default behavior may be either confusing or inefficient. For instance,
> >> using always null transforms in v1 spec evolution is hard to explain to
> >> users. Not enabling snapshot ID inheritance means rewriting manifests in
> >> huge tables can take hours. Managed catalogs or teams that run forks have
> >> more control over tables and can make better choices, but I also worry about
> >> folks who are just starting with Iceberg and use built-in catalogs.
> >>
> >> Can we think of potential issues with having a v2 table with no delete
> >> files vs a v1 table?
> >>
> >> - Anton
> >>
> >> On May 24, 2023, at 10:43 PM, Szehon Ho <szehon.apa...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I'm +1 to making v2 the default, say after this release.
> >>
> >> It seems most of the features brought up as concerns on Spark side in the
> >> thread Gabor linked have been implemented (like position delete lifecycle).
> >>
> >> But Anton's point is also good.  Even if some delete file features are
> >> missing, V2 is not only about delete files, which are not produced by
> >> default in Spark, and Flink(?), but rather the fixes for partition spec
> >> evolution / snapshot id inheritance.  Hence it makes sense to me, from that
> >> angle.
> >>
> >> Thanks
> >> Szehon
> >>
> >> On Wed, May 24, 2023 at 12:34 AM Gabor Kaszab <
> >> gaborkas...@cloudera.com.invalid> wrote:
> >>
> >>> Hey Anton,
> >>>
> >>> Just adding a note that back around January the same topic was brought
> >>> up on this mailing list. There, the conclusion was to use the 'table-default.'
> >>> catalog-level property to create V2 tables by default.
> >>> https://lists.apache.org/thread/9ct0p817qxqqdnv7nb35kghsfygjkqdf
> >>>
> >>> I'm not saying that we shouldn't default to V2, just drawing attention to
> >>> this previous conversation.
> >>>
> >>> Cheers,
> >>> Gabor
> >>>
> >>> On Wed, May 24, 2023 at 12:04 AM Anton Okolnychyi <
> >>> aokolnyc...@apple.com.invalid> wrote:
> >>>
> >>>> Hi folks,
> >>>>
> >>>> Would it be appropriate for us to consider changing the default table
> >>>> format version for new tables from v1 to v2?
> >>>>
> >>>> I don’t think defaulting to v2 tables means all readers have to support
> >>>> delete files. DELETE, UPDATE, MERGE operations will only produce delete
> >>>> files if configured explicitly.
> >>>>
> >>>> The primary reason I am starting this thread is to avoid our
> >>>> workarounds in v1 spec evolution and to enable snapshot ID inheritance.
> >>>> The latter is critical for the performance of rewriting manifests.
> >>>>
> >>>> Any thoughts?
> >>>>
> >>>> - Anton
> >>>
> >>>
> >>
> 
> -- 
> Ryan Blue
> Tabular
> 
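
For readers following the thread, the catalog-level workaround Gabor mentions
(the 'table-default.' property prefix) can be sketched roughly as below. This
is not taken from the thread. It assumes a Spark deployment where catalog
properties are passed as spark.sql.catalog.<name>.<property>, and that the
table property controlling the version is format-version. The helper name
table_default_conf and the catalog name my_catalog are hypothetical.

```python
def table_default_conf(catalog_name: str, defaults: dict) -> dict:
    """Build Spark conf entries that set catalog-level table defaults.

    Assumption: the 'table-default.' prefix from the thread is exposed in
    Spark as spark.sql.catalog.<catalog>.table-default.<property>.
    """
    prefix = f"spark.sql.catalog.{catalog_name}.table-default."
    return {prefix + key: str(value) for key, value in defaults.items()}


# Hypothetical catalog name; new tables would default to format version 2.
conf = table_default_conf("my_catalog", {"format-version": 2})
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

With a setting along these lines an administrator opts in per catalog, which
is the step Ryan notes new platform administrators may not know to take.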
