This functionality already exists if we are talking about setting the
default format version at the Iceberg catalog level. For example, we can
configure a catalog like this, and all tables created through it will be
v2 tables.
spark.sql.catalog.hive_prod.table-default.format-version = "2"

Of course, we need to set it for each Spark application. Setting it in
Trino would be easier, since it would be a single catalog-level change.
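
Since the property has to be supplied per Spark application, one option
is to generate the flag in a shared launcher script. Below is a rough
sketch of that idea, not something Iceberg ships: the helper names are
hypothetical, and only the `table-default.format-version` key and the
`hive_prod` catalog name are taken from the example above.

```python
from typing import Dict, List


def table_default_conf(catalog: str, prop: str, value: str) -> str:
    """Build the Spark conf entry that sets a table-default property
    on the named Iceberg catalog (hypothetical helper)."""
    return f"spark.sql.catalog.{catalog}.table-default.{prop}={value}"


def spark_submit_flags(catalog: str, defaults: Dict[str, str]) -> List[str]:
    """Turn a dict of table defaults into spark-submit --conf arguments."""
    flags: List[str] = []
    for prop, value in defaults.items():
        flags += ["--conf", table_default_conf(catalog, prop, value)]
    return flags


if __name__ == "__main__":
    # Prints: --conf spark.sql.catalog.hive_prod.table-default.format-version=2
    print(" ".join(spark_submit_flags("hive_prod", {"format-version": "2"})))
```

The resulting arguments would be appended to each spark-submit
invocation, so every application sees the same catalog default.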

Best,

Yufei

`This is not a contribution`


On Mon, Jan 16, 2023 at 1:34 AM Gabor Kaszab
<[email protected]> wrote:

> It seems we have a consensus on the approach. I can take a look at
> implementing this if no one has any objections.
>
> Gabor
>
> On Fri, Jan 13, 2023 at 11:28 PM Ryan Blue <[email protected]> wrote:
>
>> That sounds like a good idea to me.
>>
>> On Fri, Jan 13, 2023 at 11:04 AM Jack Ye <[email protected]> wrote:
>>
>>> > I think the issue is that all of the built-in catalogs currently call
>>> the version of `newTableMetadata` that defaults to v1.
>>>
>>> Yes I think this seems like the key issue for the catalogs that extend
>>> BaseMetastoreCatalog. Looks like we should make changes to make the default
>>> format version a catalog property, instead of hard-coded in TableMetadata?
>>>
>>> -Jack
>>>
>>> On Thu, Jan 12, 2023 at 11:47 PM Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> Hi Gabor,
>>>>
>>>> It makes sense to me. AFAIK, since table creation goes through the
>>>> catalog "controller", the catalog can "decide" the version. So it would
>>>> be up to each catalog to deal with the way/version it wants to create
>>>> tables.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On Wed, Jan 11, 2023 at 11:11 PM Gabor Kaszab <[email protected]>
>>>> wrote:
>>>> >
>>>> > Naively asking, can't we add some property to tell Iceberg which
>>>> version to use as default when creating tables? (If there is no such
>>>> setting currently)
>>>> >
>>>> > Gabor
>>>> >
>>>> > On Wed, Jan 11, 2023 at 8:04 PM Jack Ye <[email protected]> wrote:
>>>> >>
>>>> >> Should we start a community vote on this?
>>>> >>
>>>> >> I remember in today's community sync meeting Russell briefly
>>>> discussed some compaction support that is not there yet and some users
>>>> struggling with the small delete files issue, which was to some extent
>>>> why Spark still defaults to v1.
>>>> >>
>>>> >> On the feature side, changelog scan is mostly there in Spark, and
>>>> there will likely be movement on the Trino side for it very soon.
>>>> >>
>>>> >> Overall, I think it would be beneficial to move default to v2, which
>>>> could incentivize the completion of those missing parts across engines.
>>>> >>
>>>> >> Best,
>>>> >> Jack Ye
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Jan 11, 2023 at 5:47 AM Piotr Findeisen <
>>>> [email protected]> wrote:
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> FWIW Trino already creates v2 tables by default.
>>>> >>> Thought it's worth sharing for context.
>>>> >>>
>>>> >>> Best
>>>> >>> PF
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Tue, Jan 10, 2023 at 10:09 AM Manu Zhang <
>>>> [email protected]> wrote:
>>>> >>>>
>>>> >>>> Hi all,
>>>> >>>>
>>>> >>>> We've maintained a forked Iceberg internally and all our use cases
>>>> involve v2 tables with row-level updates and deletes. Our users need to
>>>> remember to create tables with the `'format-version'='2'` option or alter
>>>> the table afterwards.
>>>> >>>>
>>>> >>>> I'm thinking about changing the default format-version of our
>>>> forked Iceberg to v2. Are there any concerns about this change? Any hidden
>>>> issues I've missed?
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>> Manu
>>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>
