Re: [DISCUSS] v4 - One file commits

Amogh Jahagirdar Sat, 22 Nov 2025 13:36:28 -0800

Hey all,

Here is the meeting recording
<https://drive.google.com/file/d/1lG9sM-JTwqcIgk7JsAryXXCc1vMnstJs/view?usp=sharing>
and the generated meeting summary
<https://docs.google.com/document/d/1e50p8TXL2e3CnUwKMOvm8F4s2PeVMiKWHPxhxOW1fIM/edit?usp=sharing>.
Thanks all for attending yesterday!


On Thu, Nov 20, 2025 at 8:49 AM Amogh Jahagirdar <[email protected]> wrote:

> Hey folks,
>
> I was out for some time, but set up a sync for tomorrow at 9am PST. For
> this discussion, I do think it would be great to focus on the manifest DV
> representation, factoring in analyses on bitmap representation storage
> footprints, and the entry structure considering how we want to approach
> change detection. If there are other topics that people want to highlight,
> please do bring those up as well!
>
> I also recognize that this is fairly short notice, so please do
> reach out to me if this time is difficult to work with; next week is the
> Thanksgiving holidays here, and since people would be travelling/out I
> figured I'd try to schedule before then.
>
> Thanks,
> Amogh Jahagirdar
>
>
>
> On Fri, Oct 17, 2025 at 9:03 AM Amogh Jahagirdar <[email protected]> wrote:
>
>> Hey folks,
>>
>> Sorry for the delay, here's the recording link
>> <https://drive.google.com/file/d/1YOmPROXjAKYAWAcYxqAFHdADbqELVVf2/view>
>> from last week's discussion.
>>
>> Thanks,
>> Amogh Jahagirdar
>>
>> On Fri, Oct 10, 2025 at 9:44 AM Péter Váry <[email protected]>
>> wrote:
>>
>>> Same here.
>>> Please record if you can.
>>> Thanks, Peter
>>>
>>> On Fri, Oct 10, 2025, 17:39 Fokko Driesprong <[email protected]> wrote:
>>>
>>>> Hey Amogh,
>>>>
>>>> Thanks for the write-up. Unfortunately, I won’t be able to attend. Will
>>>> it be recorded? Thanks!
>>>>
>>>> Kind regards,
>>>> Fokko
>>>>
>>>> On Tue, Oct 7, 2025 at 20:36, Amogh Jahagirdar <[email protected]> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> I've set up time this Friday at 9am PST for another sync on single file
>>>>> commits. In terms of what would be great to focus on for the discussion:
>>>>>
>>>>> 1. Whether it makes sense to eliminate the tuple and instead
>>>>> represent it via lower/upper bounds. As a reminder, one of
>>>>> the goals is to avoid tying a partition spec to a manifest; in the root we
>>>>> can have a mix of files spanning different partition specs, and even in
>>>>> leaf manifests avoiding this coupling can enable more desirable clustering
>>>>> of metadata.
>>>>> In the vast majority of cases, we could leverage the property that a
>>>>> file is effectively partitioned if the lower/upper for a given field is
>>>>> equal. The nuance here is with the particular case of identity partitioned
>>>>> string/binary columns which can be truncated in stats. One approach is to
>>>>> require that writers must not produce truncated stats for identity
>>>>> partitioned columns. It's also important to keep in mind that all of this
>>>>> is just for the purpose of reconstructing the partition tuple, which is
>>>>> only required during equality delete matching. Another area we need to
>>>>> cover as part of this is exact bounds on stats. There are other options
>>>>> here as well, such as making all new equality deletes in V4 global and
>>>>> instead matching based on bounds, or keeping the tuple but basing each
>>>>> tuple on a union schema of all partition specs. I am adding a
>>>>> separate appendix section outlining the span of options here and the
>>>>> different tradeoffs.
>>>>> Once we get this more to a conclusive state, I'll move a summarized
>>>>> version to the main doc.
>>>>>
>>>>> 2. @[email protected] <[email protected]> has updated the doc
>>>>> with a section
>>>>> <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.rrpksmp8zkb#heading=h.qau0y5xkh9mn>
>>>>> on how we can do change detection from the root in a variety of write
>>>>> scenarios. I've done a review on it, and it covers the cases I would
>>>>> expect. It'd be good for folks to take a look and give feedback
>>>>> before we discuss. Thank you Steven for adding that section and all the
>>>>> diagrams.
>>>>>
>>>>> Thanks,
>>>>> Amogh Jahagirdar
>>>>>
>>>>> On Thu, Sep 18, 2025 at 3:19 PM Amogh Jahagirdar <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey folks, just following up from the discussion last Friday with a
>>>>>> summary and some next steps:
>>>>>>
>>>>>> 1.) For the various change detection cases, we concluded it's best
>>>>>> just to go through those offline on the doc since it's hard to
>>>>>> verify all that correctness in a large meeting setting.
>>>>>> 2.) We mostly discussed eliminating the partition tuple. In the
>>>>>> original proposal, I was mostly aiming for the ability to reconstruct
>>>>>> the tuple from the stats for the purpose of equality delete matching (a
>>>>>> file is partitioned if the lower and upper bounds are equal). There's
>>>>>> some nuance in how we need to handle identity partition values since
>>>>>> for string/binary they cannot be truncated. Another potential option is
>>>>>> to treat all equality deletes as effectively global and narrow their
>>>>>> application based on the stats values. This may require defining tight
>>>>>> bounds. I'm still collecting my thoughts on this one.
>>>>>>
>>>>>> Thanks folks! Please also let me know if any of the following links
>>>>>> are inaccessible for any reason.
>>>>>>
>>>>>> Meeting recording link:
>>>>>> https://drive.google.com/file/d/1gv8TrR5xzqqNxek7_sTZkpbwQx1M3dhK/view
>>>>>>
>>>>>> Meeting summary:
>>>>>> https://docs.google.com/document/d/131N0CDpzZczURxitN0HGS7dTqRxQT_YS9jMECkGGvQU
>>>>>>
>>>>>> On Mon, Sep 8, 2025 at 3:40 PM Amogh Jahagirdar <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Update: I moved the discussion time to this Friday at 9 am PST since
>>>>>>> I found out that quite a few folks involved in the proposals will be out
>>>>>>> next week, and I know some folks will also be out the week after that.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Amogh J
>>>>>>>
>>>>>>> On Mon, Sep 8, 2025 at 8:57 AM Amogh Jahagirdar <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey folks sorry for the late follow up here,
>>>>>>>>
>>>>>>>> Thanks @Kevin Liu <[email protected]> for sharing the
>>>>>>>> recording link of the previous discussion! I've set up another sync for
>>>>>>>> next Tuesday 09/16 at 9am PST. This time I've set it up from my
>>>>>>>> corporate email so we can get recordings and transcriptions (and I've
>>>>>>>> made sure to keep the meeting invite open so we don't have to manually
>>>>>>>> let people in).
>>>>>>>>
>>>>>>>> In terms of next steps, these are the areas which I think would be
>>>>>>>> good to focus on for establishing consensus:
>>>>>>>>
>>>>>>>> 1. How do we model the manifest entry structure so that changes to
>>>>>>>> manifest DVs can be obtained easily from the root? There are a few
>>>>>>>> options here; the most promising approach is to keep an additional DV
>>>>>>>> which encodes the diff in additional positions which have been removed
>>>>>>>> from a leaf manifest.
>>>>>>>>
>>>>>>>> 2. Modeling partition transforms via expressions and establishing a
>>>>>>>> unified table ID space so that we can simplify how partition tuples
>>>>>>>> may be represented via stats and also have a way in the future to store
>>>>>>>> stats on any derived column. I have a short proposal
>>>>>>>> <https://docs.google.com/document/d/1oV8dapKVzB4pZy5pKHUCj5j9i2_1p37BJSeT7hyKPpg/edit?tab=t.0>
>>>>>>>> for this that probably still needs some tightening up on the expression
>>>>>>>> modeling itself (and some prototyping) but the general idea for
>>>>>>>> establishing a unified table ID space is covered. All feedback welcome!
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Amogh Jahagirdar
>>>>>>>>
>>>>>>>> On Mon, Aug 25, 2025 at 1:34 PM Kevin Liu <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks Amogh. Looks like the recording for last week's sync is
>>>>>>>>> available on Youtube. Here's the link,
>>>>>>>>> https://www.youtube.com/watch?v=uWm-p--8oVQ
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Kevin Liu
>>>>>>>>>
>>>>>>>>> On Tue, Aug 12, 2025 at 9:10 PM Amogh Jahagirdar <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hey folks,
>>>>>>>>>>
>>>>>>>>>> Just following up on this to update the community on where we're
>>>>>>>>>> at and on my proposed next steps.
>>>>>>>>>>
>>>>>>>>>> I've been editing and merging the contents from our proposal into
>>>>>>>>>> the proposal
>>>>>>>>>> <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0#heading=h.unn922df0zzw>
>>>>>>>>>> from Russell and others. For any future comments on docs, please
>>>>>>>>>> comment on the linked proposal. I've also marked it on our doc in red
>>>>>>>>>> text so it's clear to redirect to the other proposal as a source of
>>>>>>>>>> truth for comments.
>>>>>>>>>>
>>>>>>>>>> In terms of next steps,
>>>>>>>>>>
>>>>>>>>>> 1. An important design decision point is around inline manifest
>>>>>>>>>> DVs, external manifest DVs or enabling both. I'm working on
>>>>>>>>>> measuring different approaches for the compressed DV
>>>>>>>>>> representation since that will inform how many entries can
>>>>>>>>>> reasonably fit in a small root manifest; from that we can derive
>>>>>>>>>> implications on different write patterns and determine the right
>>>>>>>>>> approach for storing these manifest DVs.
>>>>>>>>>>
>>>>>>>>>> 2. Another key point is around determining if/how we can
>>>>>>>>>> reasonably enable V4 to represent changes in the root manifest so
>>>>>>>>>> that readers can effectively just infer file level changes from the
>>>>>>>>>> root.
>>>>>>>>>>
>>>>>>>>>> 3. One of the aspects of the proposal is getting away from the
>>>>>>>>>> partition tuple requirement in the root, which currently forces an
>>>>>>>>>> association between a partition spec and a manifest. These aspects
>>>>>>>>>> can be modeled as essentially column stats, which gives a lot of
>>>>>>>>>> flexibility in the organization of the manifest. There are important
>>>>>>>>>> details around field ID spaces here which tie into how the stats are
>>>>>>>>>> structured. What we're proposing here is to have a unified expression
>>>>>>>>>> ID space that could also benefit us for storing things like virtual
>>>>>>>>>> columns down the line. I go into this in the proposal but I'm working
>>>>>>>>>> on separating the appropriate parts so that the original proposal can
>>>>>>>>>> mostly just focus on the organization of the content metadata tree
>>>>>>>>>> and not how we want to solve this particular ID space problem.
>>>>>>>>>>
>>>>>>>>>> 4. I'm planning on scheduling a recurring community sync starting
>>>>>>>>>> next Tuesday at 9am PST, every 2 weeks. If I get feedback from folks
>>>>>>>>>> that this time will never work, I can certainly adjust. For some
>>>>>>>>>> reason, I don't have the ability to add to the Iceberg Dev calendar,
>>>>>>>>>> so I'll figure that out and update the thread when the event is
>>>>>>>>>> scheduled.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 22, 2025 at 11:47 AM Russell Spitzer <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think this is a great way forward. Starting out with this much
>>>>>>>>>>> parallel development shows that we have a lot of consensus already :)
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 22, 2025 at 12:42 PM Amogh Jahagirdar <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey folks, just following up on this. It looks like our
>>>>>>>>>>>> proposal and the proposal that @Russell Spitzer
>>>>>>>>>>>> <[email protected]> shared are pretty aligned. I was
>>>>>>>>>>>> just chatting with Russell about this, and we think it'd be best
>>>>>>>>>>>> to combine both proposals and have a single large effort on this.
>>>>>>>>>>>> I can also set up a focused community discussion (similar to what
>>>>>>>>>>>> we're doing on the other V4 proposals) on this starting sometime
>>>>>>>>>>>> next week just to get things moving, if that works for people.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jul 14, 2025 at 9:48 PM Amogh Jahagirdar <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Russell,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for sharing the proposal! A few of us (Ryan, Dan, Anoop
>>>>>>>>>>>>> and I) have also been working on a proposal for an adaptive
>>>>>>>>>>>>> metadata tree structure as part of enabling more efficient one
>>>>>>>>>>>>> file commits. From a read of the summary, it's great to see that
>>>>>>>>>>>>> we're thinking along the same lines about how to tackle this
>>>>>>>>>>>>> fundamental area!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is our proposal:
>>>>>>>>>>>>> https://docs.google.com/document/d/1q2asTpq471pltOTC6AsTLQIQcgEsh0AvEhRWnCcvZn0
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 8:08 PM Russell Spitzer <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey y'all!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We (Yi Fang, Steven Wu and myself) wanted to share some
>>>>>>>>>>>>>> of the thoughts we had on how one-file commits could work in
>>>>>>>>>>>>>> Iceberg. This is pretty much just a high level overview of the
>>>>>>>>>>>>>> concepts we think we need and how Iceberg would behave.
>>>>>>>>>>>>>> We haven't gone very far into the actual implementation and
>>>>>>>>>>>>>> changes that would need to occur in the SDK to make this happen.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The high level summary is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Manifest Lists are out
>>>>>>>>>>>>>> Root Manifests take their place
>>>>>>>>>>>>>>   A Root manifest can have data manifests, delete manifests,
>>>>>>>>>>>>>> manifest delete vectors, data delete vectors and data files
>>>>>>>>>>>>>>   Manifest delete vectors allow for modifying a manifest
>>>>>>>>>>>>>> without deleting it entirely
>>>>>>>>>>>>>>   Data files let you append without writing an intermediary
>>>>>>>>>>>>>> manifest
>>>>>>>>>>>>>>   Having child data and delete manifests lets you still scale
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please take a look if you like,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm excited to see what other proposals and ideas are
>>>>>>>>>>>>>> floating around the community,
>>>>>>>>>>>>>> Russ
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jul 2, 2025 at 6:29 PM John Zhuge <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Very excited about the idea!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 2, 2025 at 1:17 PM Anoop Johnson <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm very interested in this initiative. Micah Kornfield and
>>>>>>>>>>>>>>>> I presented
>>>>>>>>>>>>>>>> <https://youtu.be/4d4nqKkANdM?si=9TXgaUIXbq-l8idi&t=1405>
>>>>>>>>>>>>>>>> on high-throughput ingestion for Iceberg tables at the 2024
>>>>>>>>>>>>>>>> Iceberg Summit, which leveraged Google infrastructure like
>>>>>>>>>>>>>>>> Colossus for efficient appends.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This new proposal is particularly exciting because it
>>>>>>>>>>>>>>>> offers significant advancements in commit latency and metadata
>>>>>>>>>>>>>>>> storage footprint. Furthermore, a consistent manifest structure
>>>>>>>>>>>>>>>> promises to simplify the design and codebase, which is a major
>>>>>>>>>>>>>>>> benefit.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A related idea I've been exploring is having a loose
>>>>>>>>>>>>>>>> affinity between data and delete manifests. While the current
>>>>>>>>>>>>>>>> separation of data and delete manifests in Iceberg is valuable
>>>>>>>>>>>>>>>> for avoiding data file rewrites (and stats updates) when deletes
>>>>>>>>>>>>>>>> change, it does necessitate a join operation during reads. I'd
>>>>>>>>>>>>>>>> be keen to discuss approaches that could potentially reduce this
>>>>>>>>>>>>>>>> read-side cost while retaining the benefits of separate
>>>>>>>>>>>>>>>> manifests.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Anoop
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jun 13, 2025 at 11:06 AM Jagdeep Sidhu <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am new to the Iceberg community but would love to
>>>>>>>>>>>>>>>>> participate in these discussions to reduce the number of file
>>>>>>>>>>>>>>>>> writes, especially for small writes/commits.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>> -Jagdeep
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jun 5, 2025 at 4:02 PM Anurag Mantripragada
>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We have been hitting all the metadata problems you
>>>>>>>>>>>>>>>>>> mentioned, Ryan. I’m on-board to help however I can to
>>>>>>>>>>>>>>>>>> improve this area.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~ Anurag Mantripragada
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng
>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am interested in this idea and looking forward to
>>>>>>>>>>>>>>>>>> collaboration.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Huang-Hsiang
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Jun 2, 2025, at 10:14 AM, namratha mk <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am interested in contributing to this effort.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Namratha
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for kicking this thread off Ryan, I'm interested
>>>>>>>>>>>>>>>>>>> in helping out here! I've been working on a proposal in
>>>>>>>>>>>>>>>>>>> this area and it would be great to collaborate with different
>>>>>>>>>>>>>>>>>>> folks and exchange ideas here, since I think a lot of people
>>>>>>>>>>>>>>>>>>> are interested in solving this problem.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Like Russell’s recent note, I’m starting a thread to
>>>>>>>>>>>>>>>>>>>> connect those of us that are interested in the idea of
>>>>>>>>>>>>>>>>>>>> changing Iceberg’s metadata in v4 so that in most cases
>>>>>>>>>>>>>>>>>>>> committing a change only requires writing one additional
>>>>>>>>>>>>>>>>>>>> metadata file.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> *Idea: One-file commits*
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The current Iceberg metadata structure requires writing
>>>>>>>>>>>>>>>>>>>> at least one manifest and a new manifest list to produce a
>>>>>>>>>>>>>>>>>>>> new snapshot. The goal of this work is to allow more
>>>>>>>>>>>>>>>>>>>> flexibility by allowing the manifest list layer to store data
>>>>>>>>>>>>>>>>>>>> and delete files. As a result, only one file write would be
>>>>>>>>>>>>>>>>>>>> needed before committing the new snapshot. In addition, this
>>>>>>>>>>>>>>>>>>>> work will also try to explore:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    - Avoiding small manifests that must be read in
>>>>>>>>>>>>>>>>>>>>    parallel and later compacted (metadata maintenance changes)
>>>>>>>>>>>>>>>>>>>>    - Extending metadata skipping to use aggregated column
>>>>>>>>>>>>>>>>>>>>    ranges that are compatible with geospatial data (manifest
>>>>>>>>>>>>>>>>>>>>    metadata)
>>>>>>>>>>>>>>>>>>>>    - Using soft deletes to avoid rewriting existing
>>>>>>>>>>>>>>>>>>>>    manifests (metadata DVs)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If you’re interested in these problems, please reply!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> John Zhuge
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
