Can we add an abstraction to the spec, such as a root metadata (or snapshot
history) manager, with the default implementation being metadata.json?
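
A minimal sketch of what such an interface could look like, assuming a manager
that owns how table metadata is persisted and loaded; all names below are
hypothetical and nothing here is from the spec:

import org.apache.iceberg.TableMetadata;

// Hypothetical abstraction for tracking the root metadata / snapshot history.
// The default implementation would keep today's behavior of writing a new
// metadata.json on every commit; a catalog-backed implementation could skip
// the file write and serve the metadata directly.
public interface RootMetadataManager {

  // Persist the metadata produced by a commit and return a pointer (a file
  // location or a catalog-managed identifier) that readers can resolve later.
  String commit(TableMetadata base, TableMetadata updated);

  // Resolve a pointer back into full table metadata, possibly with extra IO
  // if parts of it (such as old snapshot history) are stored externally.
  TableMetadata load(String pointer);

  // Materialize the complete table state into a single portable file on
  // demand, so static tables and debugging keep working even when commits
  // skip the metadata.json write.
  String export(TableMetadata metadata, String targetLocation);
}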


On Wed, Feb 11, 2026 at 9:07 AM Prashant Singh <[email protected]>
wrote:

> +1. In my past experience, snapshot summary bloat was a major factor in
> metadata.json bloat as well, especially for streaming writers. Since we did
> not want to propose a spec change, one workaround was to enforce a strict
> limit on how many snapshots to keep and let remove orphans do the cleanup. We
> also removed the snapshot summaries, since they are optional anyway and
> streaming mode creates a large number of snapshots, not all of which were
> needed.
> I believe there have been a lot of interesting discussions about optimizing
> reads [1] as well as writes [2]. If we are open to relaxing the spec a bit,
> it would be nice to move the tracking of the metadata to the catalog, along
> with a protocol to retrieve it back without compromising portability. Maybe
> we can have a dedicated API that exports this to a file; in the intermediate
> state we operate on what is stored in the catalog and only materialize the
> file when and if asked. We are having a similar discussion in IRC.
>
> All in all, I think we acknowledge this is a real problem for streaming writers :)
>
> Past discussions :
> [1] https://lists.apache.org/thread/pwdd7qmdsfcrzjtsll53d3m9f74d03l8
> [2] https://github.com/apache/iceberg/issues/2723
>
> Best,
> Prashant Singh
>
> On Tue, Feb 10, 2026 at 4:45 PM Anton Okolnychyi <[email protected]>
> wrote:
>
>> I think Yufei is right and the snapshot history is the main contributor.
>> Streaming jobs that write every minute would generate over 10K snapshot
>> entries per week. We had a similar problem with the list of manifests that
>> kept growing (until we added manifest lists) and with references to
>> previous metadata files (we only keep the last 100 now). So we can
>> definitely come up with something for snapshot entries. We will have to
>> ensure the entire set of snapshots is reachable from the latest root file,
>> even if it requires multiple IO operations.
>>
>> The main question is whether we still want to require writing root JSON
>> files during commits. If so, our commits will never be single file commits.
>> In V4, we will have to write the root manifest as well as the root metadata
>> file. I would prefer the latter to be optional, but we will need to think
>> about static tables and how to incorporate that in the spec.
>>
>>
>>
>> On Tue, Feb 10, 2026 at 3:58 PM Yufei Gu <[email protected]> wrote:
>>
>>> AFAIK, the snapshot history is the main, if not the only, reason for the
>>> large metadata.json file. Moving the extra snapshot history to an additional
>>> file and keeping a reference to it in the root one may resolve the issue.
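>>>
>>> A rough sketch of the shape this could take, with purely illustrative names
>>> (none of this is in any spec):
>>>
>>> import java.util.List;
>>> import org.apache.iceberg.Snapshot;
>>>
>>> // Hypothetical pointer kept in the root metadata that references a separate
>>> // file holding older snapshot entries, so the root file stays small.
>>> record SnapshotLogRef(
>>>     String location,        // e.g. a snapshot-log file under the metadata dir
>>>     long minTimestampMs,    // commit timestamp of the oldest snapshot in that file
>>>     long maxTimestampMs,    // commit timestamp of the newest snapshot in that file
>>>     int snapshotCount) {}   // number of entries offloaded to that file
>>>
>>> // The root metadata would keep only a bounded window of recent snapshots
>>> // inline, plus a list of these references; readers follow a reference only
>>> // when they actually need older history (e.g. for time travel).
>>> record SnapshotHistory(
>>>     List<Snapshot> recentSnapshots,
>>>     List<SnapshotLogRef> offloadedRefs) {}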
>>>
>>> Yufei
>>>
>>>
>>> On Tue, Feb 10, 2026 at 3:27 PM huaxin gao <[email protected]>
>>> wrote:
>>>
>>>> +1, I think this is a real problem, especially for streaming / frequent
>>>> appends where commit latency matters and metadata.json keeps getting
>>>> bigger.
>>>>
>>>> I also agree we probably shouldn’t remove the root metadata file
>>>> completely. Having one file that describes the whole table is really useful
>>>> for portability and debugging.
>>>>
>>>> Of the options you listed, I like “offload pieces to external files” as
>>>> a first step. We still write the root file every commit, but it won’t grow
>>>> as fast. The downside is extra maintenance/GC complexity.
>>>>
>>>> A couple questions/ideas:
>>>>
>>>>    - Do we have any data on what parts of metadata.json grow the most
>>>>    (snapshots / history / refs)? Even a rough breakdown could help decide
>>>>    what to move out first.
>>>>    - Could we do a hybrid: still write the root file every commit, but
>>>>    only keep a “recent window” in it, and move older history to referenced
>>>>    files? (portable, but bounded growth)
>>>>    - For “optional on commit”, maybe make it a catalog capability
>>>>    (fast commits if the catalog can serve metadata), but still support an
>>>>    export/materialize step when portability is needed.
>>>>
>>>> Thanks,
>>>> Huaxin
>>>>
>>>> On Tue, Feb 10, 2026 at 2:58 PM Anton Okolnychyi <[email protected]>
>>>> wrote:
>>>>
>>>>> I don't think we have any consensus or concrete plan. In fact, I don't
>>>>> know what my personal preference is at this point. The intention of this
>>>>> thread is to gain that clarity. I don't think removing the root metadata
>>>>> file entirely is a good idea. It is great to have a way to describe the
>>>>> entire state of a table in a file. We just need to find a solution for
>>>>> streaming appends that suffer from the increasing size of the root
>>>>> metadata file.
>>>>>
>>>>> Like I said, making the generation of the JSON file on commit optional
>>>>> is one way to solve this problem. We can also think about offloading
>>>>> pieces of it to external files (say, old snapshots). This would mean we
>>>>> still have to write the root file on each commit, but it would be smaller.
>>>>> One clear downside is more complicated maintenance.
>>>>>
>>>>> Any other ideas/thoughts/feedback? Do people see this as a problem?
>>>>>
>>>>>
>>>>> On Tue, Feb 10, 2026 at 2:18 PM Yufei Gu <[email protected]> wrote:
>>>>>
>>>>>> Hi Anton, thanks for raising this. I would really like to make this
>>>>>> optional and then build additional use cases on top of it. For example, a
>>>>>> catalog like IRC could completely eliminate storage IO during commit and
>>>>>> load, which is a big win. It could also provide better protection for
>>>>>> encrypted Iceberg tables, since metadata.json files are plain text today.
>>>>>>
>>>>>> That said, do we have consensus that metadata.json can be optional?
>>>>>> There are real portability concerns, and engine-side work also needs
>>>>>> consideration. For example, static tables and the Spark driver still
>>>>>> expect to read this file directly from storage. It feels like the first
>>>>>> step here is aligning on whether metadata.json can be optional at all,
>>>>>> before we go deeper into how we get rid of it. What do you think?
>>>>>>
>>>>>> Yufei
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 10, 2026 at 11:23 AM Anton Okolnychyi <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> While it may be common knowledge among Iceberg devs that writing the
>>>>>>> root JSON file on commit is somewhat optional with the right catalog,
>>>>>>> what can we do in V4 to solve this problem for all? My problem is the
>>>>>>> suboptimal behavior that new users get by default with HMS or Hadoop
>>>>>>> catalogs and how this impacts their perception of Iceberg. We are doing
>>>>>>> a bunch of work for streaming (e.g. changelog scans, single file
>>>>>>> commits, etc.), but the need to write the root JSON file may cancel all
>>>>>>> of that.
>>>>>>>
>>>>>>> Let me throw some ideas out there.
>>>>>>>
>>>>>>> - Describe how catalogs can make the generation of the root metadata
>>>>>>> file optional in the spec. Ideally, implement that in a built-in
>>>>>>> catalog of choice as a reference implementation.
>>>>>>> - Offload portions of the root metadata file to external files and
>>>>>>> keep references to them.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> - Anton
>>>>>>>
>>>>>>>
>>>>>>>
