Thanks Péter for highlighting the Hive case. I’ve created a one-page doc to track the specific places that have a hard dependency on the metadata.json file being in storage, to help ground the ongoing discussion: https://docs.google.com/document/d/17PBhJ0IBxHxMKvCW6CstGOp7cZnboMDdpO6BCPO2kmA/edit?usp=sharing
Yufei

On Fri, Apr 17, 2026 at 12:54 AM Péter Váry <[email protected]> wrote:

> I don’t think splitting the metadata.json is the right approach.
>
> Making it optional in V4 could be a better direction, but many systems rely on it today. For example, Hive uses SerializableTable to ensure consistency between query planning and execution. As mentioned earlier, SerializableTable relies on StaticTableOperations, which reads the table metadata from the expected metadataFileLocation. Writing out a metadata.json each time we serialize a table could therefore introduce performance bottlenecks.
>
> That said, I agree we need a way to speed up metadata reads and updates to support more frequent table operations. Removing the need to serialize the metadata JSON could be a good path forward, as long as the metadata remains fully and reliably accessible whenever it is required.
>
> Yufei Gu <[email protected]> wrote (on Fri, Apr 17, 2026, 0:19):
>
>> Ryan, StaticTableOperations is the one reading the metadata.json files. Everything depending on it makes the assumption that metadata.json is in storage, including almost all metadata tables and some Spark actions. The executor use case I mentioned is somewhere like here,
>> https://github.com/apache/iceberg/blob/dde712ec9ed6c9d28183ee4615d50f97b246af5d/spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java#L215
>>
>> Broadcast<Table> tableBroadcast =
>>     sparkContext.broadcast(SerializableTableWithSize.copyOf(table));
>>
>> The driver broadcasts trimmed table metadata, and the executor picks up the full table metadata from storage.
>>
>> Yufei
>>
>>
>> On Thu, Apr 16, 2026 at 2:24 PM huaxin gao <[email protected]> wrote:
>>
>>> +1 to the direction Ryan and Yufei outlined. Making metadata.json optional in storage for v4 and fixing the REST client to not request all snapshots seems like the right path forward.
>>>
>>> On the executor side, Prashant's earlier work in #14944 <https://github.com/apache/iceberg/pull/14944> looks like a good starting point to remove the direct metadata file reads from SerializableTable. Happy to help review when that gets revived.
>>>
>>> Thanks,
>>>
>>> Huaxin
>>>
>>> On Thu, Apr 16, 2026 at 12:43 PM Amogh Jahagirdar <[email protected]> wrote:
>>>
>>>> I pretty much agree with about everything Yufei and Ryan said. I feel like sharding the metadata json across multiple files is overcomplicated when the REST protocol already abstracts which snapshots a client even sees. It would be much better for us to make progress on relaxing the requirement for metadata.json storage. We should also look at the client implementation defaults to make sure those are sane as well.
>>>>
>>>> +1 to removing the code where executors fetch full metadata from the metadata.json. I remember when we did the analysis on that PR, if I recall correctly, that effectively is dead code so I think there's a good cleanup opportunity there.
>>>>
>>>> Thanks,
>>>> Amogh Jahagirdar
>>>>
>>>> On Thu, Apr 16, 2026 at 11:09 AM Prashant Singh <[email protected]> wrote:
>>>>
>>>>> Hey Ryan / Yufei,
>>>>> Here is my one attempt to get rid of that, it was from gov pov, it's mostly from Serializable Table [1]
>>>>> If we are all onboard, I can clean up and revive this effort.
>>>>>
>>>>> [1] https://github.com/apache/iceberg/pull/14944#issuecomment-3812676977
>>>>>
>>>>> Best,
>>>>> Prashant Singh
>>>>>
>>>>> On Thu, Apr 16, 2026 at 9:08 AM Ryan Blue <[email protected]> wrote:
>>>>>
>>>>>> They do? Where is that?
>>>>>>
>>>>>> Definitely something we should remove as soon as we can.
>>>>>>
>>>>>> On Thu, Apr 16, 2026 at 8:58 AM Yufei Gu <[email protected]> wrote:
>>>>>>
>>>>>>> To add to that, some engines like Spark still assume metadata.json exists in storage.
>>>>>>> The executors load the file directly instead of checking the REST catalog for table metadata. We will need to modify that.
>>>>>>>
>>>>>>> Yufei
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 16, 2026 at 8:45 AM Ryan Blue <[email protected]> wrote:
>>>>>>>
>>>>>>>> I think that the problem of large metadata.json files is largely solved by the REST protocol, which does not need to send snapshots to clients. I agree with Anton's suggestion to relax the requirement that the metadata.json file has to be stored somewhere (for v4). As long as catalogs are required to be able to produce the full content of metadata.json when loading the table for a client requesting all snapshots, we don't need to worry about storing the file.
>>>>>>>>
>>>>>>>> There are two things to keep in mind though:
>>>>>>>> 1. I think the current Java REST implementation still requests all snapshots to commit, which we should fix.
>>>>>>>> 2. I think it is a bad idea to split up the metadata.json file for non-REST catalogs. This introduces way too much complexity that necessarily leaks out of the catalog implementation. I don't think this is a problem worth solving when we have a perfectly good solution that has significant benefits.
>>>>>>>>
>>>>>>>> Ryan
>>>>>>>>
>>>>>>>> On Thu, Apr 16, 2026 at 12:13 AM Innocent Djiofack <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Thank you for the replies. Steven, the change is scoped to only offloading snapshot history. Yufei, yes, this is a large change. I agree that removing the requirement for a metadata.json file per commit in storage would help with most of the concerns. If there is already a design doc for that direction, please share it with me.
>>>>>>>>> If not, I can start something around that line of reasoning.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Tue, Apr 14, 2026 at 4:09 PM Yufei Gu <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Separating snapshot history from table metadata feels like a large, invasive change since it would require updates across all clients and engines. If we instead remove the requirement for a metadata.json file per commit in storage, many of the current concerns could be addressed. This seems like a more practical path forward. There are already multiple discussions over there. I'd suggest moving forward with that direction.
>>>>>>>>>>
>>>>>>>>>> Yufei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 14, 2026 at 8:44 AM Steven Wu <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I understand the problem we are trying to solve here. But the actual proposed solution is unclear to me. The proposal seems to lack some details in the actual design/solution.
>>>>>>>>>>>
>>>>>>>>>>> How do the proposed snapshot read and write APIs differ from the current APIs? I can't tell the difference.
>>>>>>>>>>>
>>>>>>>>>>> > Once defined, this interface could be implemented by various backing stores, such as another file or even a Catalog.
>>>>>>>>>>>
>>>>>>>>>>> To support offloading, we probably have to update the table metadata in the table spec <https://iceberg.apache.org/spec/#table-metadata-fields>. Does this depend on making the metadata.json file optional? Or is this limited to just externalizing the snapshot list?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 14, 2026 at 2:53 AM Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Innocent
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe it's kind of redundant with the V4 initiative?
>>>>>>>>>>>> What are your thoughts on this?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> JB
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 14, 2026 at 6:44 AM Innocent Djiofack <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello Everyone,
>>>>>>>>>>>>>
>>>>>>>>>>>>> My name is Innocent and I have enjoyed working on the Apache Iceberg project so far and have learned a lot from people in the group.
>>>>>>>>>>>>> I wanted to follow up on a concern raised by Anton around the growing size of metadata.json and the problems it brings. Before going ahead and doing the implementation work, I wanted to share the high-level thinking with the community and get feedback. You will find the link to the proposal here <https://docs.google.com/document/d/1xpzpsA9BGSkxo58yUhSdDQaSu7_ITQLFmGarEOyM8P0/edit?tab=t.0#heading=h.7g59t9p9o1xi>. I would appreciate comments and feedback on it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> *DJIOFACK INNOCENT*
>>>>>>>>>>>>> *"Be better than the day before!" -*
>>>>>>>>>>>>> *+1 404 751 8024*
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> *DJIOFACK INNOCENT*
>>>>>>>>> *"Be better than the day before!" -*
>>>>>>>>> *+1 404 751 8024*
>>>>>>>>>
>>>>>>>>
