Hey y'all!

We (Yi Fang, Steven Wu, and I) wanted to share some of our thoughts on how
one-file commits could work in Iceberg. This is pretty much just a
high-level overview of the concepts we think we need and how Iceberg would
behave. We haven't gone very far into the actual implementation or the
changes that would need to occur in the SDK to make this happen.

The high-level summary is:

Manifest Lists are out
Root Manifests take their place
  A Root Manifest can have data manifests, delete manifests, manifest
    delete vectors, data delete vectors and data files
  Manifest delete vectors allow for modifying a manifest without deleting
    it entirely
  Data files let you append without writing an intermediary manifest
  Having child data and delete manifests lets you still scale
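
To make the shape a bit more concrete, here is a rough Python sketch of how
a root manifest along these lines might be modeled. Every name in it
(RootManifest, Entry, EntryKind, live_data_files) is made up for
illustration and does not come from the doc or the Iceberg SDK, and the
in-memory dict standing in for child manifest contents is an assumption.
It only shows two of the ideas above: appending a data file directly into
the root, and masking entries of a child manifest with a manifest delete
vector instead of rewriting it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple


class EntryKind(Enum):
    # The five kinds of entries the summary says a root manifest can hold.
    DATA_MANIFEST = "data_manifest"
    DELETE_MANIFEST = "delete_manifest"
    MANIFEST_DV = "manifest_delete_vector"
    DATA_DV = "data_delete_vector"
    DATA_FILE = "data_file"


@dataclass(frozen=True)
class Entry:
    kind: EntryKind
    path: str
    # Only used by delete vectors: what they apply to and which positions.
    target: Optional[str] = None
    deleted_positions: FrozenSet[int] = frozenset()


@dataclass(frozen=True)
class RootManifest:
    """Hypothetical immutable root manifest; each change yields a new one."""

    entries: Tuple[Entry, ...] = ()

    def add_manifest(self, path: str) -> "RootManifest":
        return RootManifest(self.entries + (Entry(EntryKind.DATA_MANIFEST, path),))

    def append_data_file(self, path: str) -> "RootManifest":
        # Append without writing an intermediary manifest.
        return RootManifest(self.entries + (Entry(EntryKind.DATA_FILE, path),))

    def soft_delete(self, manifest: str, positions: Iterable[int]) -> "RootManifest":
        # Mask entries of a child manifest instead of rewriting it.
        dv = Entry(
            EntryKind.MANIFEST_DV,
            path=manifest + ".dv",  # made-up naming, purely illustrative
            target=manifest,
            deleted_positions=frozenset(positions),
        )
        return RootManifest(self.entries + (dv,))


def live_data_files(root: RootManifest, children: Dict[str, List[str]]) -> List[str]:
    """Resolve the data files visible through a root manifest.

    `children` maps a child manifest path to the data files it lists.
    """
    masked: Dict[str, set] = {}
    for e in root.entries:
        if e.kind is EntryKind.MANIFEST_DV and e.target is not None:
            masked.setdefault(e.target, set()).update(e.deleted_positions)

    files: List[str] = []
    for e in root.entries:
        if e.kind is EntryKind.DATA_MANIFEST:
            skip = masked.get(e.path, set())
            files.extend(f for i, f in enumerate(children[e.path]) if i not in skip)
        elif e.kind is EntryKind.DATA_FILE:
            files.append(e.path)
    return files
```

In this sketch each commit produces exactly one new RootManifest value,
which is the stand-in for the single file write. Delete manifests and data
delete vectors are modeled as entry kinds but not resolved here.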

Please take a look if you like,
https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0

I'm excited to see what other proposals and ideas are floating around the
community,
Russ

On Wed, Jul 2, 2025 at 6:29 PM John Zhuge <jzh...@apache.org> wrote:

> Very excited about the idea!
>
> On Wed, Jul 2, 2025 at 1:17 PM Anoop Johnson <anoop.k.john...@gmail.com>
> wrote:
>
>> I'm very interested in this initiative. Micah Kornfield and I presented
>> <https://youtu.be/4d4nqKkANdM?si=9TXgaUIXbq-l8idi&t=1405> on
>> high-throughput ingestion for Iceberg tables at the 2024 Iceberg Summit,
>> which leveraged Google infrastructure like Colossus for efficient appends.
>>
>> This new proposal is particularly exciting because it offers significant
>> advancements in commit latency and metadata storage footprint. Furthermore,
>> a consistent manifest structure promises to simplify the design and
>> codebase, which is a major benefit.
>>
>> A related idea I've been exploring is having a loose affinity between
>> data and delete manifests. While the current separation of data and delete
>> manifests in Iceberg is valuable for avoiding data file rewrites (and stats
>> updates) when deletes change, it does necessitate a join operation during
>> reads. I'd be keen to discuss approaches that could potentially reduce this
>> read-side cost while retaining the benefits of separate manifests.
>>
>> Best,
>> Anoop
>>
>>
>>
>> On Fri, Jun 13, 2025 at 11:06 AM Jagdeep Sidhu <sidhujagde...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I am new to the Iceberg community but would love to participate in these
>>> discussions to reduce the number of file writes, especially for small
>>> writes/commits.
>>>
>>> Thank you!
>>> -Jagdeep
>>>
>>> On Thu, Jun 5, 2025 at 4:02 PM Anurag Mantripragada
>>> <amantriprag...@apple.com.invalid> wrote:
>>>
>>>> We have been hitting all the metadata problems you mentioned, Ryan. I’m
>>>> on board to help however I can to improve this area.
>>>>
>>>>
>>>> ~ Anurag Mantripragada
>>>>
>>>> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng <hua...@apple.com.INVALID>
>>>> wrote:
>>>>
>>>> I am interested in this idea and looking forward to collaboration.
>>>>
>>>> Thanks,
>>>> Huang-Hsiang
>>>>
>>>> On Jun 2, 2025, at 10:14 AM, namratha mk <nmk...@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I am interested in contributing to this effort.
>>>>
>>>> Thanks,
>>>> Namratha
>>>>
>>>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <2am...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for kicking this thread off Ryan, I'm interested in helping out
>>>>> here! I've been working on a proposal in this area and it would be great
>>>>> to collaborate with different folks and exchange ideas here, since I
>>>>> think a lot of people are interested in solving this problem.
>>>>>
>>>>> Thanks,
>>>>> Amogh Jahagirdar
>>>>>
>>>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <rdb...@gmail.com> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Like Russell’s recent note, I’m starting a thread to connect those of
>>>>>> us that are interested in the idea of changing Iceberg’s metadata in v4
>>>>>> so that in most cases committing a change only requires writing one
>>>>>> additional metadata file.
>>>>>>
>>>>>> *Idea: One-file commits*
>>>>>>
>>>>>> The current Iceberg metadata structure requires writing at least one
>>>>>> manifest and a new manifest list to produce a new snapshot. The goal of
>>>>>> this work is to allow more flexibility by allowing the manifest list
>>>>>> layer to store data and delete files. As a result, only one file write
>>>>>> would be needed before committing the new snapshot. In addition, this
>>>>>> work will also try to explore:
>>>>>>
>>>>>>    - Avoiding small manifests that must be read in parallel and
>>>>>>    later compacted (metadata maintenance changes)
>>>>>>    - Extending metadata skipping to use aggregated column ranges that
>>>>>>    are compatible with geospatial data (manifest metadata)
>>>>>>    - Using soft deletes to avoid rewriting existing manifests
>>>>>>    (metadata DVs)
>>>>>>
>>>>>> If you’re interested in these problems, please reply!
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>
>>>>
>>>>
>
> --
> John Zhuge
>
