Re: [DISCUSS] Offloading Snapshots from Metadata.json

Innocent Djiofack Thu, 16 Apr 2026 00:13:04 -0700

Hi all,

Thank you for the replies. Steven, the change is scoped to offloading only
the snapshot history. Yufei, yes, this is a large change. I agree that
removing the requirement for a metadata.json file per commit in storage
would address most of the concerns. If there is already a design doc for
that direction, please share it with me. If not, I can start one along
those lines.


Thanks.

On Tue, Apr 14, 2026 at 4:09 PM Yufei Gu <[email protected]> wrote:

> Separating snapshot history from table metadata feels like a large,
> invasive change since it would require updates across all clients and
> engines. If we instead remove the requirement for a metadata.json file per
> commit in storage, many of the current concerns could be addressed. This
> seems like a more practical path forward. There are already
> multiple discussions on that topic. I'd suggest moving forward in that
> direction.
>
> Yufei
>
>
> On Tue, Apr 14, 2026 at 8:44 AM Steven Wu <[email protected]> wrote:
>
>> I understand the problem we are trying to solve here. But the actual
>> proposed solution is unclear to me. The proposal seems to lack some
>> details about the actual design/solution.
>>
>> How do the proposed snapshot read and write APIs differ from the current
>> APIs? I can't tell the difference.
>>
>> > Once defined, this interface could be implemented by various backing
>> stores, such as another file or even a Catalog.
>>
>> To support offloading, we probably have to update the table metadata in
>> the table spec <https://iceberg.apache.org/spec/#table-metadata-fields>.
>> Does this depend on making the metadata.json file optional? Or is this
>> limited to just externalizing the snapshot list?
>>
>> On Tue, Apr 14, 2026 at 2:53 AM Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>>> Hi Innocent
>>>
>>> Maybe it's somewhat redundant with the V4 initiative?
>>> What are your thoughts on this?
>>>
>>> Thanks!
>>>
>>> Regards
>>> JB
>>>
>>> On Tue, Apr 14, 2026 at 6:44 AM Innocent Djiofack <[email protected]>
>>> wrote:
>>>
>>>> Hello Everyone,
>>>>
>>>> My name is Innocent. I have enjoyed working on the Apache Iceberg
>>>> project so far and have learned a lot from people in the group.
>>>> I wanted to follow up on a concern raised by Anton around the growing
>>>> size of metadata.json and the problems it brings. Before going ahead and
>>>> doing the implementation work, I wanted to share the high level thinking
>>>> with the community and get feedback. You will find the link to the proposal
>>>> here
>>>> <https://docs.google.com/document/d/1xpzpsA9BGSkxo58yUhSdDQaSu7_ITQLFmGarEOyM8P0/edit?tab=t.0#heading=h.7g59t9p9o1xi>.
>>>> I would appreciate comments and feedback on it.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>>
>>>> *DJIOFACK INNOCENT*
>>>> *"Be better than the day before!" -*
>>>> *+1 404 751 8024*
>>>>
>>>

-- 

*DJIOFACK INNOCENT*
*"Be better than the day before!" -*
*+1 404 751 8024*
