Thanks Gabor, I realized it's already done after sending out the last
reply. The setting is actually "table-default.<TABLE_PARAM>".
In case someone else needs a back-port as well, the related PR is
https://github.com/apache/iceberg/pull/4011
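
For anyone wiring this up from Spark, here is a minimal sketch of how the keys fit together. It only builds the configuration strings and does not call Spark or Iceberg. The helper name `spark_catalog_table_defaults` is hypothetical, and the catalog name `hive_prod` is borrowed from Yufei's example further down the thread. Whether your catalog type honors the prefix is still something to verify in your environment.

```python
from typing import Dict

# Catalog-level prefix discussed in this thread.
TABLE_DEFAULT_PREFIX = "table-default."


def spark_catalog_table_defaults(catalog_name: str, table_params: Dict[str, str]) -> Dict[str, str]:
    """Build Spark conf entries that set table defaults on one Iceberg catalog.

    Hypothetical helper for illustration only; it formats keys and does not
    check that the catalog implementation actually applies them.
    """
    if not catalog_name:
        raise ValueError("catalog_name must be non-empty")
    conf_prefix = f"spark.sql.catalog.{catalog_name}.{TABLE_DEFAULT_PREFIX}"
    return {conf_prefix + param: str(value) for param, value in table_params.items()}


if __name__ == "__main__":
    # Print the entries as spark-submit style --conf arguments.
    defaults = spark_catalog_table_defaults("hive_prod", {"format-version": "2"})
    for key, value in defaults.items():
        print(f"--conf {key}={value}")
```

The resulting entries would need to be passed to each Spark application, as noted below in the thread.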

Regards,
Manu

On Mon, Mar 20, 2023 at 6:09 PM Gabor Kaszab
<[email protected]> wrote:

> I believe the conclusion here was that there is already a catalog-level
> property for adding table defaults. This could be used to
> set the default table format version to v2 on a particular catalog. See my last
> email on this thread. One thing I haven't checked is if this property works
> for all the catalog types or just a subset of them. But I think it's worth
> a try to see if it works in your environment.
> It's "table.default.<TABLE_PARAM>" setting
>
> On Mon, Mar 20, 2023 at 5:41 AM Manu Zhang <[email protected]>
> wrote:
>
>> Is there any progress to make default format version a catalog property?
>>
>> Thanks,
>> Manu
>>
>> On Wed, Jan 18, 2023 at 5:43 PM Gabor Kaszab
>> <[email protected]> wrote:
>>
>>> I also ran into this "table-default." setting
>>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L30>
>>> prefix. It seems to me that it's a catalog-level config, so it's enough to
>>> provide e.g. "table-default.format-version" = "2" to each catalog as a
>>> startup flag. It also seems that catalogs derived from
>>> BaseMetastoreCatalog use this table default prefix
>>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java#L148>
>>> .
>>>
>>> Gabor
>>>
>>> On Wed, Jan 18, 2023 at 12:00 AM Yufei Gu <[email protected]> wrote:
>>>
>>>> The functionality has been there if we are talking about setting the
>>>> default format at the Iceberg catalog.  For example, we can set a catalog
>>>> like this. All tables created will be v2 tables.
>>>> spark.sql.catalog.hive_prod.table-default.format-version = "2"
>>>>
>>>> Of course, we need to set it for each Spark app. Setting it in Trino would be
>>>> easier, since it would be a single catalog-level change.
>>>>
>>>> Best,
>>>>
>>>> Yufei
>>>>
>>>> `This is not a contribution`
>>>>
>>>>
>>>> On Mon, Jan 16, 2023 at 1:34 AM Gabor Kaszab
>>>> <[email protected]> wrote:
>>>>
>>>>> It seems we have a consensus on the approach. I can take a look at
>>>>> implementing this if no one has any objections.
>>>>>
>>>>> Gabor
>>>>>
>>>>> On Fri, Jan 13, 2023 at 11:28 PM Ryan Blue <[email protected]> wrote:
>>>>>
>>>>>> That sounds like a good idea to me.
>>>>>>
>>>>>> On Fri, Jan 13, 2023 at 11:04 AM Jack Ye <[email protected]> wrote:
>>>>>>
>>>>>>> > I think the issue is that all of the built-in catalogs currently
>>>>>>> call the version of `newTableMetadata` that defaults to v1.
>>>>>>>
>>>>>>> Yes, I think this is the key issue for the catalogs that
>>>>>>> extend BaseMetastoreCatalog. It looks like we should make the
>>>>>>> default format version a catalog property instead of hard-coding it in
>>>>>>> TableMetadata?
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>> On Thu, Jan 12, 2023 at 11:47 PM Jean-Baptiste Onofré <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Gabor,
>>>>>>>>
>>>>>>>> It makes sense to me. AFAIK, since table creation goes through the
>>>>>>>> catalog "controller", the catalog can "decide" the version. So it
>>>>>>>> would be up to each catalog to choose the version it creates tables with.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>> On Wed, Jan 11, 2023 at 11:11 PM Gabor Kaszab <
>>>>>>>> [email protected]> wrote:
>>>>>>>> >
>>>>>>>> > Naively asking, can't we add some property to tell Iceberg which
>>>>>>>> version to use as default when creating tables? (If there is no such
>>>>>>>> setting currently)
>>>>>>>> >
>>>>>>>> > Gabor
>>>>>>>> >
>>>>>>>> > On Wed, Jan 11, 2023 at 8:04 PM Jack Ye <[email protected]> wrote:
>>>>>>>> >>
>>>>>>>> >> Should we start a community vote on this?
>>>>>>>> >>
>>>>>>>> >> I remember that in today's community sync meeting Russell briefly
>>>>>>>> discussed some compaction support that is not there yet, and that some
>>>>>>>> users are struggling with the small delete files issue, which is to some
>>>>>>>> extent why Spark still defaults to v1.
>>>>>>>> >>
>>>>>>>> >> On the feature side, changelog scan is mostly there in Spark,
>>>>>>>> and there will likely be movement on the Trino side very soon.
>>>>>>>> >>
>>>>>>>> >> Overall, I think it would be beneficial to move the default to v2,
>>>>>>>> which could incentivize the completion of those missing parts across
>>>>>>>> engines.
>>>>>>>> >>
>>>>>>>> >> Best,
>>>>>>>> >> Jack Ye
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> On Wed, Jan 11, 2023 at 5:47 AM Piotr Findeisen <
>>>>>>>> [email protected]> wrote:
>>>>>>>> >>>
>>>>>>>> >>> Hi,
>>>>>>>> >>>
>>>>>>>> >>> FWIW Trino already creates v2 tables by default.
>>>>>>>> >>> Thought it's worth sharing for context.
>>>>>>>> >>>
>>>>>>>> >>> Best
>>>>>>>> >>> PF
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> On Tue, Jan 10, 2023 at 10:09 AM Manu Zhang <
>>>>>>>> [email protected]> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>> Hi all,
>>>>>>>> >>>>
>>>>>>>> >>>> We've maintained a forked Iceberg internally and all our use
>>>>>>>> cases involve v2 tables with row-level updates and deletes. Our users need
>>>>>>>> to remember to create tables with the `'format-version'='2'` option or alter
>>>>>>>> the table afterwards.
>>>>>>>> >>>>
>>>>>>>> >>>> I'm thinking about changing the default format-version of our
>>>>>>>> forked Iceberg to v2. Are there any concerns about this change? Any hidden
>>>>>>>> issues I've missed?
>>>>>>>> >>>>
>>>>>>>> >>>> Thanks,
>>>>>>>> >>>> Manu
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ryan Blue
>>>>>> Tabular
>>>>>>
>>>>>
