Hey folks,

Sorry for the delay. Here's the recording link <https://drive.google.com/file/d/1YOmPROXjAKYAWAcYxqAFHdADbqELVVf2/view> from last week's discussion.

Thanks,
Amogh Jahagirdar

On Fri, Oct 10, 2025 at 9:44 AM Péter Váry <[email protected]> wrote:
> Same here.
> Please record if you can.
> Thanks, Peter
>
> On Fri, Oct 10, 2025, 17:39 Fokko Driesprong <[email protected]> wrote:
>
>> Hey Amogh,
>>
>> Thanks for the write-up. Unfortunately, I won’t be able to attend. Will it be recorded? Thanks!
>>
>> Kind regards,
>> Fokko
>>
>> On Tue, Oct 7, 2025 at 20:36, Amogh Jahagirdar <[email protected]> wrote:
>>
>>> Hey all,
>>>
>>> I've set up time this Friday at 9am PST for another sync on single file commits. In terms of what would be great to focus on for the discussion:
>>>
>>> 1. Whether it makes sense or not to eliminate the tuple, and instead represent the tuple via lower/upper boundaries. As a reminder, one of the goals is to avoid tying a partition spec to a manifest; in the root we can have a mix of files spanning different partition specs, and even in leaf manifests avoiding this coupling can enable more desirable clustering of metadata.
>>> In the vast majority of cases, we could leverage the property that a file is effectively partitioned if the lower and upper bounds for a given field are equal. The nuance here is with the particular case of identity partitioned string/binary columns, which can be truncated in stats. One approach is to require that writers must not produce truncated stats for identity partitioned columns. It's also important to keep in mind that all of this is just for the purpose of reconstructing the partition tuple, which is only required during equality delete matching. Another area we need to cover as part of this is exact bounds on stats. There are other options here as well, such as making all new equality deletes in V4 be global and instead match based on bounds, or keeping the tuple but each tuple is effectively based off a union schema of all partition specs.
>>> I am adding a separate appendix section outlining the span of options here and the different tradeoffs.
>>> Once we get this to a more conclusive state, I'll move a summarized version to the main doc.
>>>
>>> 2. @[email protected] <[email protected]> has updated the doc with a section <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.rrpksmp8zkb#heading=h.qau0y5xkh9mn> on how we can do change detection from the root in a variety of write scenarios. I've done a review on it, and it covers the cases I would expect. It'd be good for folks to take a look and give feedback before we discuss. Thank you Steven for adding that section and all the diagrams.
>>>
>>> Thanks,
>>> Amogh Jahagirdar
>>>
>>> On Thu, Sep 18, 2025 at 3:19 PM Amogh Jahagirdar <[email protected]> wrote:
>>>
>>>> Hey folks, just following up from the discussion last Friday with a summary and some next steps:
>>>>
>>>> 1.) For the various change detection cases, we concluded it's best just to go through those in an offline manner on the doc since it's hard to verify all that correctness in a large meeting setting.
>>>> 2.) We mostly discussed eliminating the partition tuple. On the original proposal, I was mostly aiming for the ability to reconstruct the tuple from the stats for the purpose of equality delete matching (a file is partitioned if the lower and upper bounds are equal). There's some nuance in how we need to handle identity partition values since for string/binary they cannot be truncated. Another potential option is to treat all equality deletes as effectively global and narrow their application based on the stats values. This may require defining tight bounds. I'm still collecting my thoughts on this one.
>>>>
>>>> Thanks folks! Please also let me know if any of the following links are inaccessible for any reason.
>>>>
>>>> Meeting recording link:
>>>> https://drive.google.com/file/d/1gv8TrR5xzqqNxek7_sTZkpbwQx1M3dhK/view
>>>> Meeting summary:
>>>> https://docs.google.com/document/d/131N0CDpzZczURxitN0HGS7dTqRxQT_YS9jMECkGGvQU
>>>>
>>>> On Mon, Sep 8, 2025 at 3:40 PM Amogh Jahagirdar <[email protected]> wrote:
>>>>
>>>>> Update: I moved the discussion time to this Friday at 9 am PST since I found out that quite a few folks involved in the proposals will be out next week, and I know some folks will also be out the week after that.
>>>>>
>>>>> Thanks,
>>>>> Amogh J
>>>>>
>>>>> On Mon, Sep 8, 2025 at 8:57 AM Amogh Jahagirdar <[email protected]> wrote:
>>>>>
>>>>>> Hey folks, sorry for the late follow up here,
>>>>>>
>>>>>> Thanks @Kevin Liu <[email protected]> for sharing the recording link of the previous discussion! I've set up another sync for next Tuesday 09/16 at 9am PST. This time I've set it up from my corporate email so we can get recordings and transcriptions (and I've made sure to keep the meeting invite open so we don't have to manually let people in).
>>>>>>
>>>>>> In terms of next steps, these are the areas which I think would be good to focus on for establishing consensus:
>>>>>>
>>>>>> 1. How do we model the manifest entry structure so that changes to manifest DVs can be obtained easily from the root? There are a few options here; the most promising approach is to keep an additional DV which encodes the diff in additional positions which have been removed from a leaf manifest.
>>>>>>
>>>>>> 2. Modeling partition transforms via expressions and establishing a unified table ID space so that we can simplify how partition tuples may be represented via stats and also have a way in the future to store stats on any derived column.
>>>>>> I have a short proposal <https://docs.google.com/document/d/1oV8dapKVzB4pZy5pKHUCj5j9i2_1p37BJSeT7hyKPpg/edit?tab=t.0> for this that probably still needs some tightening up on the expression modeling itself (and some prototyping), but the general idea for establishing a unified table ID space is covered. All feedback welcome!
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Amogh Jahagirdar
>>>>>>
>>>>>> On Mon, Aug 25, 2025 at 1:34 PM Kevin Liu <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks Amogh. Looks like the recording for last week's sync is available on YouTube. Here's the link:
>>>>>>> https://www.youtube.com/watch?v=uWm-p--8oVQ
>>>>>>>
>>>>>>> Best,
>>>>>>> Kevin Liu
>>>>>>>
>>>>>>> On Tue, Aug 12, 2025 at 9:10 PM Amogh Jahagirdar <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hey folks,
>>>>>>>>
>>>>>>>> Just following up on this to give the community an update on where we're at and my proposed next steps.
>>>>>>>>
>>>>>>>> I've been editing and merging the contents from our proposal into the proposal <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0#heading=h.unn922df0zzw> from Russell and others. For any future comments on docs, please comment on the linked proposal. I've also marked it on our doc in red text so it's clear to redirect to the other proposal as a source of truth for comments.
>>>>>>>>
>>>>>>>> In terms of next steps,
>>>>>>>>
>>>>>>>> 1. An important design decision point is around inline manifest DVs, external manifest DVs or enabling both.
>>>>>>>> I'm working on measuring different approaches for the compressed DV representation since that will inform how many entries can reasonably fit in a small root manifest; from that we can derive implications on different write patterns and determine the right approach for storing these manifest DVs.
>>>>>>>>
>>>>>>>> 2. Another key point is around determining if/how we can reasonably enable V4 to represent changes in the root manifest so that readers can effectively just infer file level changes from the root.
>>>>>>>>
>>>>>>>> 3. One of the aspects of the proposal is getting away from the partition tuple requirement in the root, which currently forces an association between a partition spec and a manifest. These aspects can be modeled as essentially column stats, which gives a lot of flexibility in the organization of the manifest. There are important details around field ID spaces here which tie into how the stats are structured. What we're proposing here is to have a unified expression ID space that could also benefit us for storing things like virtual columns down the line. I go into this in the proposal but I'm working on separating the appropriate parts so that the original proposal can mostly just focus on the organization of the content metadata tree and not how we want to solve this particular ID space problem.
>>>>>>>>
>>>>>>>> 4. I'm planning on scheduling a recurring community sync starting next Tuesday at 9am PST, every 2 weeks. If I get feedback from folks that this time will never work, I can certainly adjust.
>>>>>>>> For some reason, I don't have the ability to add to the Iceberg Dev calendar, so I'll figure that out and update the thread when the event is scheduled.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Amogh Jahagirdar
>>>>>>>>
>>>>>>>> On Tue, Jul 22, 2025 at 11:47 AM Russell Spitzer <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I think this is a great way forward. Starting out with this much parallel development shows that we have a lot of consensus already :)
>>>>>>>>>
>>>>>>>>> On Tue, Jul 22, 2025 at 12:42 PM Amogh Jahagirdar <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hey folks, just following up on this. It looks like our proposal and the proposal that @Russell Spitzer <[email protected]> shared are pretty aligned. I was just chatting with Russell about this, and we think it'd be best to combine both proposals and have a singular large effort on this. I can also set up a focused community discussion (similar to what we're doing on the other V4 proposals) on this starting sometime next week just to get things moving, if that works for people.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 14, 2025 at 9:48 PM Amogh Jahagirdar <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey Russell,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for sharing the proposal! A few of us (Ryan, Dan, Anoop and I) have also been working on a proposal for an adaptive metadata tree structure as part of enabling more efficient one file commits. From a read of the summary, it's great to see that we're thinking along the same lines about how to tackle this fundamental area!
>>>>>>>>>>>
>>>>>>>>>>> Here is our proposal:
>>>>>>>>>>> https://docs.google.com/document/d/1q2asTpq471pltOTC6AsTLQIQcgEsh0AvEhRWnCcvZn0
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 14, 2025 at 8:08 PM Russell Spitzer <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey y'all!
>>>>>>>>>>>>
>>>>>>>>>>>> We (Yi Fang, Steven Wu and myself) wanted to share some of the thoughts we had on how one-file commits could work in Iceberg. This is pretty much just a high level overview of the concepts we think we need and how Iceberg would behave. We haven't gone very far into the actual implementation and changes that would need to occur in the SDK to make this happen.
>>>>>>>>>>>>
>>>>>>>>>>>> The high level summary is:
>>>>>>>>>>>>
>>>>>>>>>>>> Manifest Lists are out
>>>>>>>>>>>> Root Manifests take their place
>>>>>>>>>>>> A Root manifest can have data manifests, delete manifests, manifest delete vectors, data delete vectors and data files
>>>>>>>>>>>> Manifest delete vectors allow for modifying a manifest without deleting it entirely
>>>>>>>>>>>> Data files let you append without writing an intermediary manifest
>>>>>>>>>>>> Having child data and delete manifests lets you still scale
>>>>>>>>>>>>
>>>>>>>>>>>> Please take a look if you like,
>>>>>>>>>>>>
>>>>>>>>>>>> https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0
>>>>>>>>>>>>
>>>>>>>>>>>> I'm excited to see what other proposals and ideas are floating around the community,
>>>>>>>>>>>> Russ
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 2, 2025 at 6:29 PM John Zhuge <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Very excited about the idea!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 2, 2025 at 1:17 PM Anoop Johnson <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm very interested in this initiative. Micah Kornfield and I presented <https://youtu.be/4d4nqKkANdM?si=9TXgaUIXbq-l8idi&t=1405> on high-throughput ingestion for Iceberg tables at the 2024 Iceberg Summit, which leveraged Google infrastructure like Colossus for efficient appends.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This new proposal is particularly exciting because it offers significant advancements in commit latency and metadata storage footprint. Furthermore, a consistent manifest structure promises to simplify the design and codebase, which is a major benefit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A related idea I've been exploring is having a loose affinity between data and delete manifests. While the current separation of data and delete manifests in Iceberg is valuable for avoiding data file rewrites (and stats updates) when deletes change, it does necessitate a join operation during reads. I'd be keen to discuss approaches that could potentially reduce this read-side cost while retaining the benefits of separate manifests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Anoop
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jun 13, 2025 at 11:06 AM Jagdeep Sidhu <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am new to the Iceberg community but would love to participate in these discussions to reduce the number of file writes, especially for small writes/commits.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>> -Jagdeep
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jun 5, 2025 at 4:02 PM Anurag Mantripragada <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have been hitting all the metadata problems you mentioned, Ryan. I’m on board to help however I can to improve this area.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ~ Anurag Mantripragada
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 3, 2025, at 2:22 AM, Huang-Hsiang Cheng <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am interested in this idea and looking forward to collaboration.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Huang-Hsiang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 2, 2025, at 10:14 AM, namratha mk <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am interested in contributing to this effort.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Namratha
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for kicking this thread off Ryan, I'm interested in helping out here! I've been working on a proposal in this area and it would be great to collaborate with different folks and exchange ideas here, since I think a lot of people are interested in solving this problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Amogh Jahagirdar
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, May 29, 2025 at 2:25 PM Ryan Blue <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Like Russell’s recent note, I’m starting a thread to connect those of us that are interested in the idea of changing Iceberg’s metadata in v4 so that in most cases committing a change only requires writing one additional metadata file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *Idea: One-file commits*
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The current Iceberg metadata structure requires writing at least one manifest and a new manifest list to produce a new snapshot. The goal of this work is to allow more flexibility by allowing the manifest list layer to store data and delete files. As a result, only one file write would be needed before committing the new snapshot. In addition, this work will also try to explore:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - Avoiding small manifests that must be read in parallel and later compacted (metadata maintenance changes)
>>>>>>>>>>>>>>>>>> - Extending metadata skipping to use aggregated column ranges that are compatible with geospatial data (manifest metadata)
>>>>>>>>>>>>>>>>>> - Using soft deletes to avoid rewriting existing manifests (metadata DVs)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If you’re interested in these problems, please reply!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> John Zhuge
