Also, unit tests such as TestCreateTableAsSelect#testCreateRTASWithPartitionSpecChanging need to be updated to match the different logic and results for partition spec updates in v2.
Regards,
Manu

On Mon, Mar 20, 2023 at 8:57 PM Manu Zhang <[email protected]> wrote:

> Thanks Gabor, I realized it's already done after sending out the last
> reply. The setting is actually "table-default.<TABLE_PARAM>".
> In case someone else needs a back-port as well, the related PR is
> https://github.com/apache/iceberg/pull/4011
>
> Regards,
> Manu
>
> On Mon, Mar 20, 2023 at 6:09 PM Gabor Kaszab <[email protected]> wrote:
>
>> I believe the conclusion here was that there is already a catalog level
>> property with the purpose of adding table defaults. This could be used to
>> make the default table format to v2 on a particular catalog. See my last
>> email on this thread. One thing I haven't checked is if this property works
>> for all the catalog types or just a subset of them. But I think it's worth
>> a try to see if it works in your environment.
>> It's "table.default.<TABLE_PARAM>" setting
>>
>> On Mon, Mar 20, 2023 at 5:41 AM Manu Zhang <[email protected]> wrote:
>>
>>> Is there any progress to make default format version a catalog property?
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Wed, Jan 18, 2023 at 5:43 PM Gabor Kaszab <[email protected]> wrote:
>>>
>>>> I also ran into this "table-default." setting
>>>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L30>
>>>> prefix. For me it seems that it's a catalog level config so it's enough to
>>>> provide e.g. "table-default.format-version" = "2" to each catalog as a
>>>> startup flag. For me it seems that catalogs derived from
>>>> BaseMetastoreCatalog use this table default prefix
>>>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java#L148>
>>>> .
>>>>
>>>> Gabor
>>>>
>>>> On Wed, Jan 18, 2023 at 12:00 AM Yufei Gu <[email protected]> wrote:
>>>>
>>>>> The functionality has been there if we are talking about setting the
>>>>> default format at the Iceberg catalog. For example, we can set a catalog
>>>>> like this. All tables created will be v2 tables.
>>>>> spark.sql.catalog.hive_prod.table-default.format-version = "2"
>>>>>
>>>>> Of course, we need to set it for each Spark App. Setting Trino
>>>>> would be easier. It would be one catalog level change.
>>>>>
>>>>> Best,
>>>>>
>>>>> Yufei
>>>>>
>>>>> `This is not a contribution`
>>>>>
>>>>>
>>>>> On Mon, Jan 16, 2023 at 1:34 AM Gabor Kaszab <[email protected]> wrote:
>>>>>
>>>>>> It seems we have a consensus on the approach. I can take a look at
>>>>>> implementing this if no one has any objections.
>>>>>>
>>>>>> Gabor
>>>>>>
>>>>>> On Fri, Jan 13, 2023 at 11:28 PM Ryan Blue <[email protected]> wrote:
>>>>>>
>>>>>>> That sounds like a good idea to me.
>>>>>>>
>>>>>>> On Fri, Jan 13, 2023 at 11:04 AM Jack Ye <[email protected]> wrote:
>>>>>>>
>>>>>>>> > I think the issue is that all of the built-in catalogs currently call the version of `newTableMetadata` that defaults to v1.
>>>>>>>>
>>>>>>>> Yes I think this seems like the key issue for the catalogs that
>>>>>>>> extend BaseMetastoreCatalog. Looks like we should make changes to make
>>>>>>>> the default format version a catalog property, instead of hard-coded in
>>>>>>>> TableMetadata?
>>>>>>>>
>>>>>>>> -Jack
>>>>>>>>
>>>>>>>> On Thu, Jan 12, 2023 at 11:47 PM Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Gabor,
>>>>>>>>>
>>>>>>>>> It makes sense to me. AFAIK, as the tables creation comes from catalog
>>>>>>>>> "controller", they can "decide" the version. So, it would be each
>>>>>>>>> catalog to deal with the way/version they want to create tables.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> JB
>>>>>>>>>
>>>>>>>>> On Wed, Jan 11, 2023 at 11:11 PM Gabor Kaszab <[email protected]> wrote:
>>>>>>>>> >
>>>>>>>>> > Naively asking, can't we add some property to tell Iceberg which version to use as default when creating tables? (If there is no such setting currently)
>>>>>>>>> >
>>>>>>>>> > Gabor
>>>>>>>>> >
>>>>>>>>> > Jack Ye <[email protected]> wrote (on Wed, Jan 11, 2023 at 20:04):
>>>>>>>>> >>
>>>>>>>>> >> Should we start a community vote on this?
>>>>>>>>> >>
>>>>>>>>> >> I remember in today's community sync meeting Russell briefly discussed about some compaction supports that are not there yet and some users are struggled with small delete files issue, and it was to some extent why Spark is still defaulting v1.
>>>>>>>>> >>
>>>>>>>>> >> Regarding feature side, changelog scan is mostly there in Spark, and there will also likely be movements on Trino side for it very soon.
>>>>>>>>> >>
>>>>>>>>> >> Overall, I think it would be beneficial to move default to v2, which could incentivize the completion of those missing parts across engines.
>>>>>>>>> >>
>>>>>>>>> >> Best,
>>>>>>>>> >> Jack Ye
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> On Wed, Jan 11, 2023 at 5:47 AM Piotr Findeisen <[email protected]> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> Hi,
>>>>>>>>> >>>
>>>>>>>>> >>> FWIW Trino already creates v2 tables by default.
>>>>>>>>> >>> Thought it's worth sharing for context.
>>>>>>>>> >>>
>>>>>>>>> >>> Best
>>>>>>>>> >>> PF
>>>>>>>>> >>>
>>>>>>>>> >>>
>>>>>>>>> >>> On Tue, Jan 10, 2023 at 10:09 AM Manu Zhang <[email protected]> wrote:
>>>>>>>>> >>>>
>>>>>>>>> >>>> Hi all,
>>>>>>>>> >>>>
>>>>>>>>> >>>> We've maintained a forked Iceberg internally and all our use cases involve v2 tables with row-level updates and deletes. Our users need to remember to create table with the `'format-version'='2'` option or alter table afterwards.
>>>>>>>>> >>>>
>>>>>>>>> >>>> I'm thinking about changing the default format-version of our forked Iceberg to v2 . Is there any concern for this change? Any hidden issues I've missed?
>>>>>>>>> >>>>
>>>>>>>>> >>>> Thanks,
>>>>>>>>> >>>> Manu
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Tabular
>>>>>>>
>>>>>>
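
For anyone landing on this thread later, the catalog-level default discussed above can be illustrated with a small sketch. This is not Iceberg's code: the helper names (`catalog_properties`, `table_defaults`, `new_table_properties`) are hypothetical, and the precedence shown (an explicit table property wins over the catalog default) is an assumption about how such defaults would normally be merged. Only the `table-default.` prefix and the `spark.sql.catalog.hive_prod.table-default.format-version` key come from the messages above.

```python
# Sketch only: mimics the idea of catalog-level table defaults.
# Helper names are hypothetical; this is not Iceberg's implementation.
SPARK_CATALOG_PREFIX = "spark.sql.catalog."
TABLE_DEFAULT_PREFIX = "table-default."


def catalog_properties(spark_conf, catalog_name):
    """Collect the properties that belong to one named catalog."""
    prefix = SPARK_CATALOG_PREFIX + catalog_name + "."
    return {
        key[len(prefix):]: value
        for key, value in spark_conf.items()
        if key.startswith(prefix)
    }


def table_defaults(catalog_props):
    """Strip the 'table-default.' prefix to get default table properties."""
    return {
        key[len(TABLE_DEFAULT_PREFIX):]: value
        for key, value in catalog_props.items()
        if key.startswith(TABLE_DEFAULT_PREFIX)
    }


def new_table_properties(catalog_props, requested=None):
    """Start from the catalog defaults, then apply explicit table properties.

    Assumption: properties given at CREATE TABLE time take precedence.
    """
    props = table_defaults(catalog_props)
    props.update(requested or {})
    return props


spark_conf = {
    "spark.sql.catalog.hive_prod": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.hive_prod.type": "hive",
    "spark.sql.catalog.hive_prod.table-default.format-version": "2",
}

catalog_props = catalog_properties(spark_conf, "hive_prod")
print(new_table_properties(catalog_props))
print(new_table_properties(catalog_props, {"format-version": "1"}))
```

With the configuration above, a table created without an explicit `format-version` picks up "2" from the catalog, while a table that sets the property itself keeps its own value.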
