There is already a Catalog#registerTable method:
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/catalog/Catalog.java#L346-L348

On Wed, Jul 9, 2025 at 2:41 PM Steve <hongyue.apa...@gmail.com> wrote:

> Thanks Ryan and Russell:
>
>   That's a wonderful suggestion, I don't imagine we need such
> "set function" at other places so let me see how I can achieve this
> directly in registerTable at the catalog.
>
> Cheers,
> Hongyue
>
> On Wed, Jul 9, 2025 at 2:07 PM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> I think that was my request before as well :) I want a Catalog api for
>> "register" directly then each implementation can decide how that gets
>> applied.
>>
>> On Wed, Jul 9, 2025 at 4:13 PM Ryan Blue <rdb...@gmail.com> wrote:
>>
>>> Thanks for bringing this up, Hongyue. I think the logic here makes sense
>>> and that `commit(base, new)` probably isn't a good API to use for
>>> `registerTable`. But my main objection is that I don't think that it makes
>>> sense to use `TableOperations` for this. Adding a `set` method is awkward
>>> because the table may not already exist.
>>>
>>> Why not have catalogs implement `registerTable` directly? For instance,
>>> JDBC could run an INSERT query.
>>>
>>> Ryan
>>>
>>> On Tue, Jul 8, 2025 at 4:42 PM Steve <hongyue.apa...@gmail.com> wrote:
>>>
>>>> Hey Iceberg devs:
>>>>
>>>>   While implementing the overwrite option for registering an external
>>>> table (see PR12228), I realized we might want to evaluate the option to add
>>>> a new method *set(metadata)* on TableOperations interfaces for
>>>> unconditionally set latest table metadata. After some discussions with
>>>> Steven and Russell, I want to seek opinions from the community.
>>>>
>>>> Today, the register-table under the hood use the commit API from
>>>> TableOperations, but it might not work well with overwrite registration due
>>>> to given assumptions
>>>>
>>>>    1.
>>>>
>>>>    commit (base, new) is designed to avoid overwriting updates and
>>>>    mandate the provided base metadata is the same as the current metadata 
>>>> of
>>>>    the table. However for overwrite registration, the end goal is to reset
>>>>    table metadata to a desired state, so we do not want to retry on failure
>>>>    even if the base table state changes.
>>>>    2.
>>>>
>>>>    commit(base, new) will always write new metadata.json files in case
>>>>    of successful commit. For a successful registration. The file that
>>>>    is being passed is the file that should be used for registration.
>>>>    Previous workarounds (e.g., PR6591) reused user-provided metadata only 
>>>> for
>>>>    new tables, but cannot generalize to the overwrite case as it cannot
>>>>    differentiate normal update and overwrite registration at the
>>>>    TableOperations layer.
>>>>    3.
>>>>
>>>>    When committing fails, the system attempts to delete the new
>>>>    metadata file, presuming it was authored by the committer. This is not
>>>>    appropriate for registration scenarios where the metadata file might 
>>>> have
>>>>    been generated elsewhere (see PR13169)
>>>>
>>>>
>>>> One alternative—dropping and recreating the table—raises concerns about
>>>> atomicity, since it can leave the table in an invalid intermediate state if
>>>> a failure occurs between drop and creation.
>>>>
>>>> Given this, I would like to share proposed new API to be add:
>>>>
>>>> *  /***
>>>>
>>>> *   * *Atomically set the provided table metadata as current,
>>>> bypassing base state checks.
>>>>
>>>> *   * @param metadata *the new table metadata to make current
>>>>
>>>> *   */*
>>>>
>>>> *  void set(TableMetadata metadata);*
>>>>
>>>>
>>>> PR12228 Add overwrite option when register external table to catalog:
>>>> https://github.com/apache/iceberg/pull/12228
>>>>
>>>> PR6591 Avoid creating new metadata file when registerTable API is used:
>>>> https://github.com/apache/iceberg/pull/6591
>>>> PR13169 Only remove metadata files if TableOp created a new one:
>>>> github.com/apache/iceberg/pull/13169
>>>>
>>>> Thanks,
>>>>
>>>> Hongyue Zhang
>>>>
>>>>
>>>>

Reply via email to