Can we add an abstraction to the spec, like a root metadata manager (or snapshot history manager), with the default implementation being metadata.json?
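To make the suggestion concrete, here is a minimal sketch of what such an abstraction could look like. None of these names (`RootMetadataStore`, `MetadataJsonStore`, `commit`, `load`) exist in Iceberg today; this is purely illustrative. The point is that commits would go through an interface, with a metadata.json-style document as the default implementation rather than the only option:

```python
# Hypothetical sketch of the abstraction suggested above; these class and
# method names are made up, not part of Iceberg.
import json
from abc import ABC, abstractmethod


class RootMetadataStore(ABC):
    """Pluggable store for table root metadata (hypothetical)."""

    @abstractmethod
    def commit(self, metadata: dict) -> str:
        """Persist new root metadata; return an opaque location/token."""

    @abstractmethod
    def load(self, token: str) -> dict:
        """Load the root metadata identified by the token."""


class MetadataJsonStore(RootMetadataStore):
    """Default implementation: serialize the whole state to one JSON doc."""

    def __init__(self):
        self._files: dict[str, str] = {}  # path -> contents (stand-in for storage)
        self._version = 0

    def commit(self, metadata: dict) -> str:
        self._version += 1
        path = f"metadata/v{self._version}.metadata.json"
        self._files[path] = json.dumps(metadata)
        return path

    def load(self, token: str) -> dict:
        return json.loads(self._files[token])


store = MetadataJsonStore()
path = store.commit({"table-uuid": "t-1", "snapshots": [{"snapshot-id": 1}]})
print(path)  # metadata/v1.metadata.json
```

A catalog-backed implementation of the same interface could then skip the file write entirely on commit and materialize the JSON document only on export.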
On Wed, Feb 11, 2026 at 9:07 AM Prashant Singh <[email protected]> wrote:

> +1. I think snapshot summary bloat was a major factor in bloating metadata.json too, especially for streaming writers, based on my past experience. One other approach, since we didn't want to propose a spec change, was to enforce a strict limit on how many snapshots to keep and let remove-orphan-files do the cleanup. We also dropped the snapshot summaries, since they are optional anyway, and in streaming mode we create a large number of snapshots (not all of which were needed anyway).
>
> I believe there has been a lot of interesting discussion on optimizing reads [1] as well as writes [2]. If we are open to relaxing the spec a bit, it would be nice to move the tracking of the metadata to the catalog, plus a protocol to retrieve it back without compromising portability. Maybe we could have a dedicated API that exports it to a file: in an intermediate stage we operate only on what is stored in the catalog, and materialize the file when and if asked. We are having a similar discussion in IRC.
>
> All in all, I think we acknowledge this is a real problem for streaming writers :)
>
> Past discussions:
> [1] https://lists.apache.org/thread/pwdd7qmdsfcrzjtsll53d3m9f74d03l8
> [2] https://github.com/apache/iceberg/issues/2723
>
> Best,
> Prashant Singh
>
> On Tue, Feb 10, 2026 at 4:45 PM Anton Okolnychyi <[email protected]> wrote:
>
>> I think Yufei is right and the snapshot history is the main contributor. Streaming jobs that write every minute would generate over 10K snapshot entries per week. We had a similar problem with the list of manifests that kept growing (until we added manifest lists) and with references to previous metadata files (we only keep the last 100 now). So we can definitely come up with something for snapshot entries.
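The workaround Prashant describes, capping the number of retained snapshots and stripping the optional snapshot summaries, can be sketched roughly as below. The dict structure only loosely follows metadata.json field names and is an illustration, not the spec:

```python
# Rough sketch of the retention workaround described above: keep only the
# newest N snapshots and drop the optional "summary" field from each.
# Field names loosely mirror metadata.json but this is illustrative only.
def prune_snapshot_history(metadata: dict, max_snapshots: int) -> dict:
    snapshots = sorted(metadata["snapshots"], key=lambda s: s["timestamp-ms"])
    kept = snapshots[-max_snapshots:]  # newest N; older files left to orphan cleanup
    metadata["snapshots"] = [
        {k: v for k, v in s.items() if k != "summary"}  # summaries are optional
        for s in kept
    ]
    return metadata


meta = {
    "snapshots": [
        {"snapshot-id": i, "timestamp-ms": i * 1000, "summary": {"op": "append"}}
        for i in range(10)
    ]
}
pruned = prune_snapshot_history(meta, max_snapshots=3)
print([s["snapshot-id"] for s in pruned["snapshots"]])  # [7, 8, 9]
```

As the message notes, this only bounds growth by policy; it does not change the spec or the requirement to rewrite the root file on every commit.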
>> We will have to ensure the entire set of snapshots is reachable from the latest root file, even if it requires multiple IO operations.
>>
>> The main question is whether we still want to require writing root JSON files during commits. If so, our commits will never be single-file commits. In V4, we would have to write the root manifest as well as the root metadata file. I would prefer the second to be optional, but we will need to think about static tables and how to incorporate that into the spec.
>>
>> On Tue, Feb 10, 2026 at 15:58 Yufei Gu <[email protected]> wrote:
>>
>>> AFAIK, the snapshot history is the main, if not the only, reason for the large metadata.json file. Moving the older snapshot history to an additional file and keeping it referenced from the root one may resolve the issue.
>>>
>>> Yufei
>>>
>>> On Tue, Feb 10, 2026 at 3:27 PM huaxin gao <[email protected]> wrote:
>>>
>>>> +1, I think this is a real problem, especially for streaming / frequent appends where commit latency matters and metadata.json keeps getting bigger.
>>>>
>>>> I also agree we probably shouldn't remove the root metadata file completely. Having one file that describes the whole table is really useful for portability and debugging.
>>>>
>>>> Of the options you listed, I like "offload pieces to external files" as a first step. We still write the root file every commit, but it won't grow as fast. The downside is extra maintenance/GC complexity.
>>>>
>>>> A couple of questions/ideas:
>>>>
>>>> - Do we have any data on which parts of metadata.json grow the most (snapshots / history / refs)? Even a rough breakdown could help decide what to move out first.
>>>> - Could we do a hybrid: still write the root file every commit, but only keep a "recent window" in it, and move older history to referenced files?
>>>> (portable, but bounded growth)
>>>> - For "optional on commit", maybe make it a catalog capability (fast commits if the catalog can serve metadata), but still support an export/materialize step when portability is needed.
>>>>
>>>> Thanks,
>>>> Huaxin
>>>>
>>>> On Tue, Feb 10, 2026 at 2:58 PM Anton Okolnychyi <[email protected]> wrote:
>>>>
>>>>> I don't think we have any consensus or a concrete plan. In fact, I don't know what my personal preference is at this point. The intention of this thread is to gain that clarity. I don't think removing the root metadata file entirely is a good idea. It is great to have a way to describe the entire state of a table in a file. We just need to find a solution for streaming appends that suffer from the increasing size of the root metadata file.
>>>>>
>>>>> Like I said, making the generation of the JSON file on commit optional is one way to solve this problem. We can also think about offloading pieces of it to external files (say, old snapshots). This would mean we still have to write the root file on each commit, but it would be smaller. One clear downside is more complicated maintenance.
>>>>>
>>>>> Any other ideas/thoughts/feedback? Do people see this as a problem?
>>>>>
>>>>> On Tue, Feb 10, 2026 at 14:18 Yufei Gu <[email protected]> wrote:
>>>>>
>>>>>> Hi Anton, thanks for raising this. I would really like to make this optional and then build additional use cases on top of it. For example, a catalog like IRC could completely eliminate storage IO during commit and load, which is a big win. It could also provide better protection for encrypted Iceberg tables, since metadata.json files are plain text today.
>>>>>>
>>>>>> That said, do we have consensus that metadata.json can be optional? There are real portability concerns, and engine-side work also needs consideration.
>>>>>> For example, static tables and the Spark driver still expect to read this file directly from storage. It feels like the first step here is aligning on whether metadata.json can be optional at all, before we go deeper into how we get rid of it. What do you think?
>>>>>>
>>>>>> Yufei
>>>>>>
>>>>>> On Tue, Feb 10, 2026 at 11:23 AM Anton Okolnychyi <[email protected]> wrote:
>>>>>>
>>>>>>> While it may be common knowledge among Iceberg devs that writing the root JSON file on commit is somewhat optional with the right catalog, what can we do in V4 to solve this problem for everyone? My concern is the suboptimal behavior that new users get by default with HMS or Hadoop catalogs, and how this impacts their perception of Iceberg. We are doing a bunch of work for streaming (e.g. changelog scans, single-file commits, etc.), but the need to write the root JSON file may cancel all of that out.
>>>>>>>
>>>>>>> Let me throw some ideas out there.
>>>>>>>
>>>>>>> - Describe in the spec how catalogs can make the generation of the root metadata file optional. Ideally, implement that in a built-in catalog of choice as a reference implementation.
>>>>>>> - Offload portions of the root metadata file to external files and keep references to them.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> - Anton
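The "offload pieces to external files" idea and the bounded "recent window" hybrid discussed in the thread can be sketched together as below. All field names (e.g. `snapshot-history-ref`) and the sidecar file layout are invented for illustration; nothing here is spec'd:

```python
# Sketch of offloading older snapshots to a referenced sidecar file while
# keeping only a recent window inline in the root document. Field names
# and file layout are hypothetical, not part of the Iceberg spec.
import json


def split_root(metadata: dict, window: int, files: dict) -> dict:
    snapshots = sorted(metadata["snapshots"], key=lambda s: s["timestamp-ms"])
    recent, older = snapshots[-window:], snapshots[:-window]
    root = dict(metadata, snapshots=recent)
    if older:
        sidecar = f"metadata/snapshot-history-{older[-1]['snapshot-id']}.json"
        files[sidecar] = json.dumps({"snapshots": older})
        root["snapshot-history-ref"] = sidecar  # hypothetical field
    return root


def all_snapshots(root: dict, files: dict) -> list:
    older = []
    ref = root.get("snapshot-history-ref")
    if ref:  # extra IO, but the full snapshot set stays reachable from the root
        older = json.loads(files[ref])["snapshots"]
    return older + root["snapshots"]


storage: dict = {}
meta = {"snapshots": [{"snapshot-id": i, "timestamp-ms": i} for i in range(5)]}
root = split_root(meta, window=2, files=storage)
print([s["snapshot-id"] for s in root["snapshots"]])  # [3, 4]
print(len(all_snapshots(root, storage)))              # 5
```

This keeps the root file small and portable (satisfying the reachability point raised earlier in the thread) at the cost of extra IO to read full history and extra GC complexity for the sidecar files.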
