We have been hitting all the metadata problems you mentioned, Ryan. I’m on board to help improve this area however I can.
~ Anurag Mantripragada

> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng <[email protected]> wrote:
>
> I am interested in this idea and looking forward to collaboration.
>
> Thanks,
> Huang-Hsiang
>
>> On Jun 2, 2025, at 10:14 AM, namratha mk <[email protected]> wrote:
>>
>> Hello,
>>
>> I am interested in contributing to this effort.
>>
>> Thanks,
>> Namratha
>>
>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <[email protected]> wrote:
>>> Thanks for kicking this thread off Ryan, I'm interested in helping out
>>> here! I've been working on a proposal in this area and it would be great to
>>> collaborate with different folks and exchange ideas here, since I think a
>>> lot of people are interested in solving this problem.
>>>
>>> Thanks,
>>> Amogh Jahagirdar
>>>
>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <[email protected]> wrote:
>>>> Hi everyone,
>>>>
>>>> Like Russell’s recent note, I’m starting a thread to connect those of us
>>>> that are interested in the idea of changing Iceberg’s metadata in v4 so
>>>> that in most cases committing a change only requires writing one
>>>> additional metadata file.
>>>>
>>>> Idea: One-file commits
>>>>
>>>> The current Iceberg metadata structure requires writing at least one
>>>> manifest and a new manifest list to produce a new snapshot. The goal of
>>>> this work is to allow more flexibility by allowing the manifest list layer
>>>> to store data and delete files. As a result, only one file write would be
>>>> needed before committing the new snapshot.
>>>>
>>>> In addition, this work will also try to explore:
>>>>
>>>> - Avoiding small manifests that must be read in parallel and later compacted
>>>>   (metadata maintenance changes)
>>>> - Extend metadata skipping to use aggregated column ranges that are
>>>>   compatible with geospatial data (manifest metadata)
>>>> - Using soft deletes to avoid rewriting existing manifests (metadata DVs)
>>>>
>>>> If you’re interested in these problems, please reply!
>>>>
>>>> Ryan
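
For anyone skimming the thread, here is a rough sketch of the write-count difference behind the one-file commit idea in Ryan's message. It is a toy in-memory model, not Iceberg's API or any proposed v4 format: the class and function names (`Manifest`, `ManifestList`, `inline_files`, `commit_two_level`, `commit_one_file`) are all made up for illustration, and the final catalog/table-metadata commit step is left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Manifest:
    """Hypothetical stand-in for a manifest file tracking data/delete files."""
    files: List[str]


@dataclass
class ManifestList:
    """Hypothetical stand-in for the manifest list of one snapshot.

    `inline_files` models the idea that the manifest list layer could
    store data and delete files directly (an assumption for this sketch).
    """
    manifests: List[Manifest] = field(default_factory=list)
    inline_files: List[str] = field(default_factory=list)


def commit_two_level(parent: ManifestList, new_files: List[str]) -> Tuple[ManifestList, int]:
    """Current structure: write one manifest, then one new manifest list."""
    new_manifest = Manifest(files=list(new_files))      # metadata write 1
    new_list = ManifestList(                            # metadata write 2
        manifests=parent.manifests + [new_manifest],
        inline_files=list(parent.inline_files),
    )
    return new_list, 2


def commit_one_file(parent: ManifestList, new_files: List[str]) -> Tuple[ManifestList, int]:
    """Sketched idea: the new files go straight into the new manifest list."""
    new_list = ManifestList(                            # the only metadata write
        manifests=list(parent.manifests),
        inline_files=parent.inline_files + list(new_files),
    )
    return new_list, 1


def tracked_files(root: ManifestList) -> List[str]:
    """All files reachable from a snapshot's manifest list, sorted."""
    found = list(root.inline_files)
    for manifest in root.manifests:
        found.extend(manifest.files)
    return sorted(found)


if __name__ == "__main__":
    base = ManifestList(manifests=[Manifest(files=["a.parquet"])])
    snap_a, writes_a = commit_two_level(base, ["b.parquet"])
    snap_b, writes_b = commit_one_file(base, ["b.parquet"])
    print(writes_a, writes_b)
    print(tracked_files(snap_a) == tracked_files(snap_b))
```

Both paths end up tracking the same files; the difference is only how many metadata files are written before the snapshot can be committed. How and when inlined entries would later be moved into regular manifests is exactly the kind of question the thread is meant to work out, so the sketch does not try to model it.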
