Hey folks,

The undelete column PR has been updated to account for required columns. If
data has been written since a required column was deleted, the column cannot
be restored as-is; instead, a set_nullable flag forces the column to be
nullable when it is undeleted.
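
For anyone skimming the thread, the rule above can be sketched roughly as
follows. This is an illustrative Python model only, not the code in the PR:
DeletedColumn, restored_as_required, and data_written_since_delete are
hypothetical names, and the sketch assumes that set_nullable forces the column
to optional even when no data has been written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeletedColumn:
    """Hypothetical record of a deleted column; not an Iceberg class."""
    name: str
    required: bool


def restored_as_required(column, data_written_since_delete, set_nullable=False):
    """Return True if the column comes back required, False if optional.

    Raises ValueError when a required column cannot be restored without
    the caller explicitly opting in via set_nullable.
    """
    if not column.required:
        # An optional column can always be restored as optional.
        return False
    if set_nullable:
        # Assumption: the flag always forces the restored column to optional.
        return False
    if data_written_since_delete:
        # Rows written after the delete have no value for this column,
        # so it cannot come back as required.
        raise ValueError(
            f"Cannot undelete required column '{column.name}': data has been "
            "written since it was deleted; pass set_nullable=True to restore "
            "it as optional"
        )
    # Nothing was written since the delete, so the requirement still holds.
    return True
```

Raising by default keeps the relaxation of a required column an explicit user
choice rather than a silent side effect.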

https://github.com/apache/iceberg/pull/16089

Please let me know if there is any further feedback on naming as well.

Frank

On Thu, Mar 5, 2026, 9:45 PM Frank Bertsch <[email protected]> wrote:

> Thanks for the feedback, Anton!
>
> Naming: We could call it a `column_restore`. I opted for `undelete`
> because it undoes a `delete`, which is a specific schema operation. I
> wanted to use the terminology that most clearly shows that this function
> does not touch any data files. Would be interested to hear any other
> perspectives on naming!
>
> > Also, it seems like we should follow the spec wording correctly by
> > default. This means we should throw an exception if the column was required
> > and new data has been written to the table. Restoring the column back as
> > nullable is reasonable but must be explicit user intention (flag or
> > argument).
>
> Let me take a look at doing it this way. I defaulted to making it nullable
> for the simplicity of the solution (no need to check data).
>
> On Tue, Mar 3, 2026 at 8:33 PM Anton Okolnychyi <[email protected]>
> wrote:
>
>> Is undelete the right word here? Is this more like restoring the column?
>>
>> Also, it seems like we should follow the spec wording correctly by
>> default. This means we should throw an exception if the column was required
>> and new data has been written to the table. Restoring the column back as
>> nullable is reasonable but must be explicit user intention (flag or
>> argument).
>>
>> On Tue, Mar 3, 2026 at 5:27 PM Frank Bertsch <[email protected]>
>> wrote:
>>
>>> Hi all, we have 3 positive reviews on the PR. Any chance a committer
>>> could take a look?
>>>
>>> Thanks,
>>> Frank
>>>
>>> On Wed, Feb 18, 2026 at 10:11 PM Frank Bertsch <[email protected]>
>>> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> I have a (relatively) small change here
>>>> <https://github.com/apache/iceberg/pull/15084> to add the capability
>>>> to undelete columns, and a corresponding Spark function. I was hoping to
>>>> get some feedback from the list on this improvement.
>>>>
>>>> I'll note that the spec specifically says column undeletion is allowed,
>>>> but only for these two situations: "Field deletion cannot be rolled back
>>>> unless the field was nullable or if the current snapshot has not changed".
>>>> The PR I've proposed allows for undeleting of non-nullable columns, but
>>>> forces them to be optional upon rollback. I believe this is a reasonable
>>>> tradeoff.
>>>>
>>>> Looking forward to any feedback on the change.
>>>> Frank
>>>>
>>>
