Hey Anurag, I wasn't able to make it to the sync but was hoping to watch the recording afterwards. I'm curious what the reasons were for discarding the Parquet-native approach. Could you please share a summary of what was discussed on that topic during the sync?
On Tue, Feb 10, 2026 at 8:20 PM Anurag Mantripragada <[email protected]> wrote:

> Hi all,
>
> Thank you for attending today's sync. Please find the meeting notes below. I apologize that we were unable to record the session, as attendees did not have record access.
>
> Key updates and discussion points:
>
> *Decisions:*
>
>    - Table Format vs. Parquet: There is a general consensus that column update support should reside in the table format. Consequently, we have discarded the Parquet-native approach.
>    - Metadata Representation: To maintain clean metadata and avoid complex resolution logic for readers, the goal is to keep only one metadata file per column. However, achieving this is challenging if we support partial updates, as multiple column files may exist for the same column (see open questions).
>    - Data Representation: Sparse column files are preferred for compact representation and are better suited for partial column updates. We can optimize the sparse representation for vectorized reads by filling in null or default values at read time for missing positions from the base file, which avoids joins during reads.
>
> *Open Questions:*
>
>    - We are still determining what restrictions are necessary when supporting partial updates. For instance, we need to decide whether to allow adding a new column and subsequently applying partial updates to it. This would involve managing both a base column file and subsequent update files.
>    - We need a better understanding of the use cases for partial updates.
>    - We need to further discuss the handling of equality deletes.
>
> If I missed anything, or if others took notes, please share them here. Thanks!
>
> I will go ahead and update the doc with what we have discussed so we can continue next time from where we left off.
>
> ~ Anurag
>
> On Mon, Feb 9, 2026 at 11:55 AM Anurag Mantripragada <[email protected]> wrote:
>
>> Hi all,
>>
>> This design <https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0> will be discussed tomorrow in a dedicated sync.
>>
>> Efficient column updates sync
>> Tuesday, February 10 · 9:00 – 10:00am
>> Time zone: America/Los_Angeles
>> Google Meet joining info
>> Video call link: https://meet.google.com/xsd-exug-tcd
>>
>> ~ Anurag
>>
>> On Fri, Feb 6, 2026 at 8:30 AM Anurag Mantripragada <[email protected]> wrote:
>>
>>> Hi Gabor,
>>>
>>> Thanks for the detailed example.
>>>
>>> I agree with Steven that Option 2 seems reasonable. I will add a section to the design doc regarding equality delete handling, and we can discuss this further during our meeting on Tuesday.
>>>
>>> ~Anurag
>>>
>>> On Fri, Feb 6, 2026 at 7:08 AM Steven Wu <[email protected]> wrote:
>>>
>>>> > 1) When deleting with eq-deletes: If there is a column update on the equality-field ID we use for the delete, reject deletion
>>>> > 2) When adding a column update on a column that is part of the equality field IDs in some delete, we reject the column update
>>>>
>>>> Gabor, this is a good scenario. The 2nd option makes sense to me, since equality ids are like primary key fields. If we have the 2nd rule enforced, the first option is not applicable anymore.
>>>>
>>>> On Fri, Feb 6, 2026 at 3:13 AM Gábor Kaszab <[email protected]> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> Thank you for the proposal, Anurag! I made a pass recently and I think there is some interference between column updates and equality deletes.
>>>>> Let me describe below:
>>>>>
>>>>> Steps:
>>>>>
>>>>> CREATE TABLE tbl (int a, int b);
>>>>> INSERT INTO tbl VALUES (1, 11), (2, 22);  -- creates the base data file
>>>>> DELETE FROM tbl WHERE b=11;               -- creates an equality delete file
>>>>> UPDATE tbl SET b=11;                      -- writes a column update
>>>>>
>>>>> SELECT * FROM tbl;
>>>>>
>>>>> Expected result: (2, 11)
>>>>>
>>>>> Data and metadata created after the above steps:
>>>>>
>>>>> Base file:      rows (1, 11), (2, 22); seqnum=1
>>>>> EQ-delete:      b=11; seqnum=2
>>>>> Column update:  field ids: [field_id_for_col_b]; data file content: (dummy_value), (11); seqnum=3
>>>>>
>>>>> Read steps:
>>>>>
>>>>> 1. Stitch the base file with the column update in the reader:
>>>>>    Rows: (1, dummy_value), (2, 11) (note: the dummy value can be either null or 11, see the proposal for more details)
>>>>>    Seqnum for the base file = 1, seqnum for the column update = 3
>>>>> 2. Apply the eq-delete b=11 (seqnum=2) on the stitched result.
>>>>> 3. The query result depends on which seqnum we carry forward to compare with the eq-delete's seqnum, but it is not correct in either case:
>>>>>    1. Use the seqnum from the base file: we get either an empty result if 'dummy_value' is 11, or (1, null) otherwise.
>>>>>    2. Use the seqnum from the last update file: no rows are deleted, and the result set is (1, dummy_value), (2, 11).
>>>>>
>>>>> Problem:
>>>>>
>>>>> The eq-delete should be applied midway through applying the column updates to the base file, based on sequence number, during the stitching process. If I'm not mistaken, this is not feasible with the way readers work.
>>>>>
>>>>> Proposal:
>>>>>
>>>>> Don't allow equality deletes together with column updates:
>>>>>
>>>>> 1) When deleting with eq-deletes: if there is a column update on the equality-field ID we use for the delete, reject deletion.
>>>>> 2) When adding a column update on a column that is part of the equality field IDs in some delete, we reject the column update.
>>>>>
>>>>> Alternatively, column updates could be controlled by an (immutable) table property, and eq-deletes would be rejected if the property indicates that column updates are turned on for the table.
>>>>>
>>>>> Let me know what you think!
>>>>>
>>>>> Best Regards,
>>>>> Gabor
>>>>>
>>>>> Anurag Mantripragada <[email protected]> wrote (Wed, Jan 28, 2026, 3:31):
>>>>>
>>>>>> Thank you everyone for the initial review comments. It is exciting to see so much interest in this proposal.
>>>>>>
>>>>>> I am currently reviewing and responding to each comment. The general themes of the feedback so far include:
>>>>>> - Including partial updates (column updates on a subset of rows in a table).
>>>>>> - Adding details on how SQL engines will write the update files.
>>>>>> - Adding details on split planning and row alignment for update files.
>>>>>>
>>>>>> I will think through these points and update the design accordingly.
>>>>>>
>>>>>> Best,
>>>>>> Anurag
>>>>>>
>>>>>> On Tue, Jan 27, 2026 at 6:25 PM Anurag Mantripragada <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Xianjin,
>>>>>>>
>>>>>>> Happy to learn from your experience in supporting backfill use-cases. Please feel free to review the proposal and add your comments.
>>>>>>> I will wait for a couple of days more to ensure everyone has a chance to review the proposal.
>>>>>>>
>>>>>>> ~ Anurag
>>>>>>>
>>>>>>> On Tue, Jan 27, 2026 at 6:42 AM Xianjin Ye <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Anurag and Peter,
>>>>>>>>
>>>>>>>> It's great to see that the partial column update has gained great interest in the community. I internally built a BackfillColumns action to efficiently backfill columns (by writing only the partial columns and copying the binary data of the other columns into a new DataFile). The speedup could be 10x for wide tables, but the write amplification is still there. I would be happy to collaborate on the work and eliminate the write amplification.
>>>>>>>>
>>>>>>>> On 2026/01/27 10:12:54 Péter Váry wrote:
>>>>>>>> > Hi Anurag,
>>>>>>>> >
>>>>>>>> > It's great to see how much interest there is in the community around this potential new feature. Gábor and I have actually submitted an Iceberg Summit talk proposal on this topic, and we would be very happy to collaborate on the work. I was mainly waiting for the File Format API to be finalized, as I believe this feature should build on top of it.
>>>>>>>> >
>>>>>>>> > For reference, our related work includes:
>>>>>>>> >
>>>>>>>> > - *Dev list thread:* https://lists.apache.org/thread/h0941sdq9jwrb6sj0pjfjjxov8tx7ov9
>>>>>>>> > - *Proposal document:* https://docs.google.com/document/d/1OHuZ6RyzZvCOQ6UQoV84GzwVp3UPiu_cfXClsOi03ww (not shared widely yet)
>>>>>>>> > - *Performance testing PR for readers and writers:* https://github.com/apache/iceberg/pull/13306
>>>>>>>> >
>>>>>>>> > During earlier discussions about possible metadata changes, another option came up that hasn't been documented yet: separating planner metadata from reader metadata. Since the planner does not need to know about the actual files, we could store the file composition in a separate file (potentially a Puffin file). This file could hold the column_files metadata, while the manifest would reference the Puffin file and blob position instead of the data filename.
>>>>>>>> > This approach has the advantage of keeping the existing metadata largely intact, and it could also give us a natural place later to add file-level indexes or Bloom filters for use during reads or secondary filtering. The downsides are the additional files and the increased complexity of identifying files that are no longer referenced by the table, so this may not be an ideal solution.
>>>>>>>> >
>>>>>>>> > I do have some concerns about the MoR metadata proposal described in the document. At first glance, it seems to complicate distributed planning, as all entries for a given file would need to be collected and merged to provide the information required by both the planner and the reader. Additionally, when a new column is added or updated, we would still need to add a new metadata entry for every existing data file.
>>>>>>>> > If we immediately write out the merged metadata, the total number of entries remains the same. The main benefit is avoiding rewriting statistics, which can be significant, but this comes at the cost of increased planning complexity. If we choose to store the merged statistics in the column_families entry, I don't see much benefit in excluding the rest of the metadata, especially since including it would simplify the planning process.
>>>>>>>> >
>>>>>>>> > As Anton already pointed out, we should also discuss how this change would affect split handling, particularly how to avoid double reads when row groups are not aligned between the original data files and the new column files.
>>>>>>>> >
>>>>>>>> > Finally, I'd like to see some discussion around the Java API implications: in particular, what API changes are required and how SQL engines would perform updates. Since the new column files must have the same number of rows as the original data files, with a strict one-to-one relationship, SQL engines would need access to the source filename, position, and deletion status in the DataFrame in order to generate the new files. This is more involved than a simple update and deserves some explicit consideration.
>>>>>>>> >
>>>>>>>> > Looking forward to your thoughts.
>>>>>>>> > Best regards,
>>>>>>>> > Peter
>>>>>>>> >
>>>>>>>> > On Tue, Jan 27, 2026, 03:58 Anurag Mantripragada <[email protected]> wrote:
>>>>>>>> >
>>>>>>>> > > Thanks Anton and others, for providing some initial feedback. I will address all your comments soon.
>>>>>>>> > >
>>>>>>>> > > On Mon, Jan 26, 2026 at 11:10 AM Anton Okolnychyi <[email protected]> wrote:
>>>>>>>> > >
>>>>>>>> > >> I had a chance to see the proposal before it landed, and I think it is a cool idea; both presented approaches would likely work. I am looking forward to discussing the tradeoffs and would encourage everyone to push/polish each approach to see which issues can be mitigated and which are fundamental.
>>>>>>>> > >>
>>>>>>>> > >> [1] Iceberg-native approach: better visibility into column files from the metadata, potentially better concurrency for non-overlapping column updates, no dependency on Parquet.
>>>>>>>> > >> [2] Parquet-native approach: almost no changes to the table format metadata beyond tracking of base files.
>>>>>>>> > >>
>>>>>>>> > >> I think [1] sounds a bit better on paper, but I am worried about the complexity in writers and readers (especially around keeping row groups aligned and split planning). It would be great to cover this in detail in the proposal.
>>>>>>>> > >>
>>>>>>>> > >> On Mon, Jan 26, 2026 at 09:00, Anurag Mantripragada <[email protected]> wrote:
>>>>>>>> > >>
>>>>>>>> > >>> Hi all,
>>>>>>>> > >>>
>>>>>>>> > >>> "Wide tables" with thousands of columns present significant challenges for AI/ML workloads, particularly when only a subset of columns needs to be added or updated.
>>>>>>>> > >>> Current Copy-on-Write (COW) and Merge-on-Read (MOR) operations in Iceberg apply at the row level, which leads to substantial write amplification in scenarios such as:
>>>>>>>> > >>>
>>>>>>>> > >>> - Feature Backfilling & Column Updates: Adding new feature columns (e.g., model embeddings) to petabyte-scale tables.
>>>>>>>> > >>> - Model Score Updates: Refreshing prediction scores after retraining.
>>>>>>>> > >>> - Embedding Refresh: Updating vector embeddings, which currently triggers a rewrite of the entire row.
>>>>>>>> > >>> - Incremental Feature Computation: Daily updates to a small fraction of features in wide tables.
>>>>>>>> > >>>
>>>>>>>> > >>> With the Iceberg V4 proposal introducing single-file commits and column stats improvements, this is an ideal time to address column-level updates to better support these use cases.
>>>>>>>> > >>>
>>>>>>>> > >>> I have drafted a proposal that explores both table-format enhancements and file-format (Parquet) changes to enable more efficient updates.
>>>>>>>> > >>>
>>>>>>>> > >>> Proposal Details:
>>>>>>>> > >>> - GitHub Issue: #15146 <https://github.com/apache/iceberg/issues/15146>
>>>>>>>> > >>> - Design Document: Efficient Column Updates in Iceberg <https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0>
>>>>>>>> > >>>
>>>>>>>> > >>> Next Steps:
>>>>>>>> > >>> I plan to create POCs to benchmark the approaches described in the document.
>>>>>>>> > >>>
>>>>>>>> > >>> Please review the proposal and share your feedback.
>>>>>>>> > >>>
>>>>>>>> > >>> Thanks,
>>>>>>>> > >>> Anurag
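
To make the sequence-number conflict in Gabor's example above easier to follow, here is a small, self-contained Python sketch that replays the read path he describes (stitch the base file with the column update, then apply the equality delete by sequence number). The data layout and the read() helper are invented for illustration and are not Iceberg reader APIs; the point is only that neither choice of carried-forward seqnum yields the expected result (2, 11).

# Toy replay of the read path in Gabor's example: stitch the base data file with
# the column-update file, then apply the equality delete by sequence number.
# Illustrative pseudocode only -- the structures and helper names are invented
# for this sketch and are not Iceberg's actual reader logic.

BASE_ROWS = [(1, 11), (2, 22)]   # base data file, seqnum = 1
EQ_DELETE_B = 11                 # DELETE FROM tbl WHERE b = 11, seqnum = 2
EQ_DELETE_SEQNUM = 2
UPDATE_SEQNUM = 3                # UPDATE tbl SET b = 11, written as a column update
EXPECTED = [(2, 11)]

def read(dummy, carried_seqnum):
    """Stitch column b onto the base file, then drop rows matching the eq-delete
    when the seqnum carried forward for the stitched row is lower than the delete's."""
    update_col_b = [dummy, 11]   # position 0 holds a dummy value (None or 11)
    stitched = [(a, b_new) for (a, _), b_new in zip(BASE_ROWS, update_col_b)]
    return [row for row in stitched
            if not (row[1] == EQ_DELETE_B and carried_seqnum < EQ_DELETE_SEQNUM)]

for dummy in (None, 11):
    # Option 3.1: carry the base file's seqnum (1 < 2, so the delete is applied
    # to the already-updated values): [] if dummy == 11, else [(1, None)].
    print(dummy, read(dummy, carried_seqnum=1))
    # Option 3.2: carry the update file's seqnum (3 >= 2, so the delete is
    # skipped entirely): [(1, dummy), (2, 11)].
    print(dummy, read(dummy, carried_seqnum=3))

# Neither option ever yields EXPECTED == [(2, 11)]: the delete would have to be
# applied between the base file (seqnum 1) and the column update (seqnum 3).
print("expected:", EXPECTED)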
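
On the "Data Representation" decision in the meeting notes above (sparse column files, with null or default values filled in at read time), here is a minimal sketch of what read-time densification could look like, assuming a sparse update is stored as (position, value) pairs; the layout and the densify() helper are assumptions for illustration, not part of the proposal.

# Minimal sketch of read-time densification of a sparse column-update file.
# Assumes the sparse file stores (row position, new value) pairs; this layout
# is an assumption for illustration, not the proposed format.

def densify(sparse_updates, base_row_count, fill=None):
    """Expand sparse (position, value) pairs into a dense column vector aligned
    with the base data file, filling missing positions with null/default so the
    stitched read is a positional zip rather than a join."""
    dense = [fill] * base_row_count
    for pos, value in sparse_updates:
        dense[pos] = value
    return dense

# Base file has 4 rows; only rows 1 and 3 were updated.
updated_b = densify([(1, 0.42), (3, 0.91)], base_row_count=4)
print(updated_b)  # [None, 0.42, None, 0.91]; missing positions fall back to
                  # null/default (or the base column) when stitching by position.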
