+1. In my past experience, snapshot summary bloat was a major factor in metadata.json growth as well, especially for streaming writers. Since we didn't want to propose a spec change, one approach was to enforce a strict limit on how many snapshots to keep and let remove-orphan-files do the cleanup. We also removed the snapshot summaries, since they are optional anyway and streaming mode creates a large number of snapshots (not all of which were needed).

There have already been a lot of interesting discussions about optimizing reads [1] as well as writes [2]. If we are open to relaxing the spec a bit, it would be nice to move metadata tracking into the catalog, along with a protocol to retrieve it without compromising portability. Maybe we could have a dedicated API that exports the metadata to a file: in an intermediate stage we would operate only on what is stored in the catalog, and materialize the file when and if asked. We are having a similar discussion in IRC.
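For concreteness, here is a minimal sketch of that retention approach using Iceberg's existing Java API. The property names are standard Iceberg table properties; the specific values and the catalog/table wiring are illustrative assumptions, not recommendations:

    // Bound snapshot history for a streaming table. Property names are
    // real Iceberg table properties; the values here are illustrative.
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.Catalog;
    import org.apache.iceberg.catalog.TableIdentifier;

    public class BoundSnapshotHistory {
      public static void trim(Catalog catalog, TableIdentifier ident) {
        Table table = catalog.loadTable(ident);

        table.updateProperties()
            // Cap how much history automatic snapshot expiration keeps.
            .set("history.expire.max-snapshot-age-ms", "86400000") // 1 day
            .set("history.expire.min-snapshots-to-keep", "20")
            // Bound the list of previous metadata.json files the root
            // file references (Iceberg keeps the last 100 by default).
            .set("write.metadata.previous-versions-max", "25")
            .set("write.metadata.delete-after-commit.enabled", "true")
            .commit();

        // Explicitly expire all but a recent window of snapshots; the
        // separate remove-orphan-files maintenance action can then clean
        // up any files no longer reachable from live metadata.
        long cutoff = System.currentTimeMillis() - 24L * 60 * 60 * 1000;
        table.expireSnapshots()
            .expireOlderThan(cutoff)
            .retainLast(20)
            .commit();
      }
    }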
In any case, I think we all acknowledge this being a real problem for streaming writers :)!

Past discussions:
[1] https://lists.apache.org/thread/pwdd7qmdsfcrzjtsll53d3m9f74d03l8
[2] https://github.com/apache/iceberg/issues/2723

Best,
Prashant Singh

On Tue, Feb 10, 2026 at 4:45 PM Anton Okolnychyi <[email protected]> wrote:

> I think Yufei is right and the snapshot history is the main contributor.
> Streaming jobs that write every minute would generate over 10K snapshot
> entries per week. We had a similar problem with the list of manifests
> that kept growing (until we added manifest lists) and with references to
> previous metadata files (we only keep the last 100 now). So we can
> definitely come up with something for snapshot entries. We will have to
> ensure the entire set of snapshots is reachable from the latest root
> file, even if it requires multiple IO operations.
>
> The main question is whether we still want to require writing root JSON
> files during commits. If so, our commits will never be single-file
> commits. In V4, we will have to write the root manifest as well as the
> root metadata file. I would prefer the second to be optional, but we
> will need to think about static tables and how to incorporate that in
> the spec.
>
> On Tue, Feb 10, 2026 at 15:58, Yufei Gu <[email protected]> wrote:
>
>> AFAIK, the snapshot history is the main, if not the only, reason for
>> the large metadata.json file. Moving the older snapshot history to an
>> additional file and keeping it referenced in the root one may well
>> resolve the issue.
>>
>> Yufei
>>
>> On Tue, Feb 10, 2026 at 3:27 PM huaxin gao <[email protected]> wrote:
>>
>>> +1, I think this is a real problem, especially for streaming /
>>> frequent appends where commit latency matters and metadata.json keeps
>>> getting bigger.
>>>
>>> I also agree we probably shouldn't remove the root metadata file
>>> completely. Having one file that describes the whole table is really
>>> useful for portability and debugging.
>>>
>>> Of the options you listed, I like "offload pieces to external files"
>>> as a first step. We would still write the root file every commit, but
>>> it wouldn't grow as fast. The downside is extra maintenance/GC
>>> complexity.
>>>
>>> A couple of questions/ideas:
>>>
>>> - Do we have any data on which parts of metadata.json grow the most
>>>   (snapshots / history / refs)? Even a rough breakdown could help
>>>   decide what to move out first.
>>> - Could we do a hybrid: still write the root file every commit, but
>>>   only keep a "recent window" in it, and move older history to
>>>   referenced files? (portable, but with bounded growth)
>>> - For "optional on commit", maybe make it a catalog capability (fast
>>>   commits if the catalog can serve metadata), but still support an
>>>   export/materialize step when portability is needed.
>>>
>>> Thanks,
>>> Huaxin
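Huaxin's first question can be answered empirically, without any spec work. Below is a minimal sketch that reports how many bytes each top-level section of a metadata.json contributes, using Jackson; the file path argument is an assumption:

    // Print the serialized size of each top-level metadata.json section,
    // e.g. "snapshots", "snapshot-log", "metadata-log", "refs", "schemas".
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.File;
    import java.util.Iterator;
    import java.util.Map;

    public class MetadataSizeBreakdown {
      public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // args[0]: local path to a copy of the table's metadata.json
        JsonNode root = mapper.readTree(new File(args[0]));

        Iterator<Map.Entry<String, JsonNode>> fields = root.fields();
        while (fields.hasNext()) {
          Map.Entry<String, JsonNode> field = fields.next();
          int bytes = mapper.writeValueAsBytes(field.getValue()).length;
          System.out.printf("%-22s %,12d bytes%n", field.getKey(), bytes);
        }
      }
    }

If the observations in this thread hold, the "snapshots" and "snapshot-log" sections should dominate for long-running streaming tables.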
>>> On Tue, Feb 10, 2026 at 2:58 PM Anton Okolnychyi <[email protected]> wrote:
>>>
>>>> I don't think we have any consensus or a concrete plan. In fact, I
>>>> don't know what my personal preference is at this point. The
>>>> intention of this thread is to gain that clarity. I don't think
>>>> removing the root metadata file entirely is a good idea. It is great
>>>> to have a way to describe the entire state of a table in a single
>>>> file. We just need to find a solution for streaming appends that
>>>> suffer from the increasing size of the root metadata file.
>>>>
>>>> Like I said, making the generation of the JSON file on commit
>>>> optional is one way to solve this problem. We can also think about
>>>> offloading pieces of it to external files (say, old snapshots). This
>>>> would mean we still have to write the root file on each commit, but
>>>> it would be smaller. One clear downside is more complicated
>>>> maintenance.
>>>>
>>>> Any other ideas/thoughts/feedback? Do people see this as a problem?
>>>>
>>>> On Tue, Feb 10, 2026 at 14:18, Yufei Gu <[email protected]> wrote:
>>>>
>>>>> Hi Anton, thanks for raising this. I would really like to make this
>>>>> optional and then build additional use cases on top of it. For
>>>>> example, a catalog like IRC could completely eliminate storage IO
>>>>> during commit and load, which is a big win. It could also provide
>>>>> better protection for encrypted Iceberg tables, since metadata.json
>>>>> files are plain text today.
>>>>>
>>>>> That said, do we have consensus that metadata.json can be optional?
>>>>> There are real portability concerns, and engine-side work also needs
>>>>> consideration. For example, static tables and the Spark driver still
>>>>> expect to read this file directly from storage. It feels like the
>>>>> first step here is aligning on whether metadata.json can be optional
>>>>> at all, before we go deeper into how to get rid of it. What do you
>>>>> think?
>>>>>
>>>>> Yufei
>>>>>
>>>>> On Tue, Feb 10, 2026 at 11:23 AM Anton Okolnychyi <[email protected]> wrote:
>>>>>
>>>>>> While it may be common knowledge among Iceberg devs that writing
>>>>>> the root JSON file on commit is somewhat optional with the right
>>>>>> catalog, what can we do in V4 to solve this problem for everyone?
>>>>>> My concern is the suboptimal behavior that new users get by default
>>>>>> with HMS or Hadoop catalogs and how this impacts their perception
>>>>>> of Iceberg. We are doing a bunch of work for streaming (e.g.
>>>>>> changelog scans, single file commits, etc.), but the need to write
>>>>>> the root JSON file may cancel all of that out.
>>>>>>
>>>>>> Let me throw some ideas out there:
>>>>>>
>>>>>> - Describe in the spec how catalogs can make the generation of the
>>>>>>   root metadata file optional. Ideally, implement that in a
>>>>>>   built-in catalog of choice as a reference implementation.
>>>>>> - Offload portions of the root metadata file to external files and
>>>>>>   keep references to them.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> - Anton
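To make the recurring "catalog capability plus export/materialize" idea from this thread concrete, here is a purely hypothetical sketch: no such interface exists in Iceberg today, and the interface and method names are invented for illustration only.

    // HYPOTHETICAL: not part of Iceberg. Sketches a catalog that can
    // serve table metadata directly (no root metadata.json written on
    // commit) while still materializing a portable file on demand for
    // static tables, debugging, or migration.
    import org.apache.iceberg.TableMetadata;
    import org.apache.iceberg.catalog.TableIdentifier;

    public interface MetadataPortability {
      // True if this catalog can serve metadata without a root file.
      boolean canServeMetadata(TableIdentifier ident);

      // Serve the current table metadata straight from the catalog store.
      TableMetadata currentMetadata(TableIdentifier ident);

      // Materialize a portable root metadata file at the given location
      // and return its full path.
      String materializeMetadataFile(TableIdentifier ident, String location);
    }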
