That generally aligns with my sensibilities as well (avoiding overriding the meaning of existing fields). It is notable that adding a CSN requires changes to the spec. What process would be required to get that landed in v4?
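
For concreteness, here is a rough sketch of how I picture the client-side resolution discussed further down the thread (pick, per table, the latest snapshot at or below a reference CSN). It is only an illustration under my own assumptions: the `Snapshot` class, its `csn` field, and both function names are hypothetical and are not taken from the Iceberg spec or from either proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    csn: int  # hypothetical catalog-assigned commit sequence number


def snapshot_as_of(snapshots: list[Snapshot], read_csn: int) -> Optional[Snapshot]:
    """Return the latest snapshot whose CSN is <= read_csn, or None if there is none."""
    visible = [s for s in snapshots if s.csn <= read_csn]
    return max(visible, key=lambda s: s.csn, default=None)


def consistent_view(
    tables: dict[str, list[Snapshot]], read_csn: int
) -> dict[str, Optional[Snapshot]]:
    """Resolve every table against the same reference CSN (assumed catalog-wide)."""
    return {name: snapshot_as_of(snaps, read_csn) for name, snaps in tables.items()}


# Made-up snapshot histories for two tables in the same catalog.
tables = {
    "orders": [Snapshot(101, 5), Snapshot(102, 9)],
    "payments": [Snapshot(201, 7), Snapshot(202, 12)],
}
view = consistent_view(tables, read_csn=9)
```

With a reference CSN of 9, `orders` resolves to snapshot 102 and `payments` to snapshot 201, since snapshot 202 was committed later (CSN 12). Whether the CSN is a logical counter or a hybrid clock value does not matter to this lookup, as long as it is monotonically increasing across the catalog.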
On Sun, Nov 9, 2025 at 2:40 PM Ryan Blue <[email protected]> wrote:

> I am fairly strongly opposed to repurposing the timestamp field for this.
> To move forward, I'd recommend working on catalog sequence numbers.
>
> On Sat, Nov 8, 2025 at 6:54 PM Dov Alperin <[email protected]> wrote:
>
>> Hi Iceberg community!
>> (I initially opened this message as its own thread in error, sorry about
>> that.)
>> I’m curious where this proposal landed. I work at Materialize
>> <http://materialize.com/> and we are keenly interested both in seeing this
>> proposal come to fruition and possibly in helping to implement it.
>>
>> I see there was a call in May, but I’m not sure what the conclusion was.
>> As spec v4 draws closer, I am curious which of the two proposals the
>> community favors here?
>>
>> Best,
>> Dov
>>
>> On Tue, May 27, 2025 at 01:09:05AM -0700, Maninderjit Singh wrote:
>> > Forgot to attach a link to the updated proposal
>> > <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#heading=h.ypbwvr181qn4>.
>> >
>> > On Tue, May 27, 2025 at 1:06 AM Maninderjit Singh <[email protected]> wrote:
>> >
>> > > Hi community,
>> > >
>> > > I have updated the proposal with both the options (overwriting the
>> > > existing timestamp-ms vs introducing a new sequence/timestamp field)
>> > > as we have initial consensus on using catalog authored
>> > > sequence/timestamp. Jagdeep, please review to ensure that the options
>> > > are correctly captured. I have also added additional arguments on why
>> > > we can't assume timestamp to be "informational" since it's being used
>> > > in critical paths and incorrect values can take the table offline.
>> > >
>> > > Also, I'm moving the meeting to Thursday to better accommodate
>> > > conflicts. I will also record the meeting in case anyone misses it and
>> > > is interested in the discussion.
>> > >
>> > > Sync for iceberg multi-table transactions
>> > > Thursday, May 29 · 9:00 – 10:00am
>> > > Time zone: America/Los_Angeles
>> > > Google Meet joining info
>> > > Video call link: https://meet.google.com/ffc-ttjs-vti
>> > >
>> > > Thanks,
>> > > Maninder
>> > >
>> > > On Mon, May 26, 2025 at 12:47 AM Péter Váry <[email protected]> wrote:
>> > >
>> > >> I'm interested but can't be there, so please record the meeting.
>> > >> Thanks,
>> > >> Peter
>> > >>
>> > >> Maninderjit Singh <[email protected]> wrote (on Sat, May 24, 2025, 2:30):
>> > >>
>> > >>> Hi dev community,
>> > >>> I was wondering if we could join a call next week for discussing the
>> > >>> multi-table transactions so we can make progress. I have shared a
>> > >>> meeting invite where anyone who's interested in the discussion can
>> > >>> join. Please let me know if this works.
>> > >>>
>> > >>> Thanks,
>> > >>> Maninder
>> > >>>
>> > >>> Sync for iceberg multi-table transactions
>> > >>> Friday, May 30 · 9:00 – 10:00am
>> > >>> Time zone: America/Los_Angeles
>> > >>> Google Meet joining info
>> > >>> Video call link: https://meet.google.com/ffc-ttjs-vti
>> > >>>
>> > >>> On Wed, May 21, 2025 at 10:25 AM Maninderjit Singh <[email protected]> wrote:
>> > >>>
>> > >>>> Hi dev community,
>> > >>>> Following up on the thread here to continue the discussion and get
>> > >>>> feedback since we couldn't get to it in sync. I think we have made
>> > >>>> some progress in the discussion that I want to capture while
>> > >>>> highlighting the items where we need to create consensus along with
>> > >>>> pros and cons. I would need help to add clarity and to make sure the
>> > >>>> arguments are captured correctly.
>> > >>>>
>> > >>>> *Things we agree on*
>> > >>>>
>> > >>>>    1. Don't maintain server side state for tracking the transactions.
>> > >>>>    2. Need global (catalog-wide) ordering of snapshots via some
>> > >>>>       (hybrid/logical) clock/CSN
>> > >>>>    3. Optionally expose the catalog's clock/CSN information without
>> > >>>>       changing how tables load
>> > >>>>    4. Loading consistent snapshot across multiple tables and
>> > >>>>       repeatable reads based on the reference clock/CSN
>> > >>>>
>> > >>>> *Things we disagree on*
>> > >>>>
>> > >>>>    1. Reuse existing timestamp field vs introduce a new field CSN
>> > >>>>
>> > >>>> *Reusing timestamp field approach*
>> > >>>>
>> > >>>>    - Pros:
>> > >>>>       1. Backwards compatibility, no change to table metadata spec so
>> > >>>>          could be used by existing v2 tables.
>> > >>>>       2. Fixes existing time travel and ordering issues
>> > >>>>       3. Simplifies and clarifies the spec (no new id for snapshots)
>> > >>>>       4. Common notion of timestamp that could be used to evaluate
>> > >>>>          causal relationships in other proposals like events or commit
>> > >>>>          reports.
>> > >>>>
>> > >>>>    - Cons:
>> > >>>>       1. Unique timestamp generation in milliseconds. Potential
>> > >>>>          mitigations:
>> > >>>>          https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&disco=AAABjwaxXeg
>> > >>>>       2. Concerns about client side timestamp being overridden.
>> > >>>>
>> > >>>> *Adding new CSN field*
>> > >>>>
>> > >>>>    - Pros:
>> > >>>>       1. Flexibility to use logical or hybrid clocks. Not sure how
>> > >>>>          clients can generate a hybrid clock timestamp here without
>> > >>>>          suffering from clock skew (Would be good to clarify this)?
>> > >>>>       2. No client side overriding concerns.
>> > >>>>
>> > >>>>    - Cons:
>> > >>>>       1. Not backwards compatible, requires new field in table
>> > >>>>          metadata so need to wait for v4
>> > >>>>       2. Does not fix time travel and snapshot-log ordering issues
>> > >>>>       3. Adds another id for snapshots that clients need to generate
>> > >>>>          and reason about.
>> > >>>>       4. Could not be extended to use in other proposals for causal
>> > >>>>          reasoning.
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Maninder
>> > >>>>
>> > >>>> On Tue, May 20, 2025 at 8:16 PM Maninderjit Singh <[email protected]> wrote:
>> > >>>>
>> > >>>>> Appreciate the feedback on the "catalog-authored timestamp" document
>> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0>!
>> > >>>>>
>> > >>>>> Ryan, I don't think we can get consistent time travel queries in
>> > >>>>> iceberg without fixing the timestamp field since it's what the spec
>> > >>>>> <https://iceberg.apache.org/spec/#point-in-time-reads-time-travel>
>> > >>>>> prescribes for time travel. Hence I took the liberty to re-use it
>> > >>>>> for the catalog timestamp, which ensures that snapshot-log is
>> > >>>>> correctly ordered for time travel. Additionally, the timestamp field
>> > >>>>> needs to be fixed to avoid breaking commits to the table due to
>> > >>>>> accidental large skews as per current spec; the scenario is
>> > >>>>> described in detail here
>> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#bookmark=id.6avx66vzo168>.
>> > >>>>> The other benefit of reusing the timestamp field is spec simplicity
>> > >>>>> and clarity on timestamp generation responsibilities without
>> > >>>>> requiring the need to manage yet another identifier (in addition to
>> > >>>>> sequence number, snapshot id and timestamp) for snapshots.
>> > >>>>>
>> > >>>>> Jagdeep, your concerns about overriding the timestamp field are
>> > >>>>> valid, but the reason I'm not too worried about it is that a client
>> > >>>>> can't assume a commit is successful without the commit being
>> > >>>>> acknowledged by the catalog, which returns the CommitTableResponse
>> > >>>>> <https://github.com/apache/iceberg/blob/c2478968e65368c61799d8ca4b89506a61ca3e7c/open-api/rest-catalog-open-api.yaml#L3997>
>> > >>>>> with new metadata (that has catalog authored timestamps in the
>> > >>>>> proposal). I'm happy to work with you to put something common
>> > >>>>> together and get the best out of the proposals.
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Maninder
>> > >>>>>
>> > >>>>> On Tue, May 20, 2025 at 5:48 PM Jagdeep Sidhu <[email protected]> wrote:
>> > >>>>>
>> > >>>>>> Thank you Ryan, Maninder and the rest of the community for feedback
>> > >>>>>> and ideas!
>> > >>>>>> Drew and I will take another pass and remove the catalog
>> > >>>>>> co-ordination requirement for LoadTable API, and bring the proposal
>> > >>>>>> closer to "catalog-authored timestamp" in the sense that clients
>> > >>>>>> can use CSN to find the right snapshot, but still leave it up to
>> > >>>>>> the Catalog what it wants to use for CSN (Hybrid clock timestamp or
>> > >>>>>> another monotonically increasing number).
>> > >>>>>>
>> > >>>>>> If more folks have feedback, please leave it in the doc or email
>> > >>>>>> list, so we can address it as well in the document update.
>> > >>>>>>
>> > >>>>>> Maninder, one reason we proposed a new field for
>> > >>>>>> CommitSequenceNumber instead of using an existing field is for
>> > >>>>>> backwards compatibility. Catalogs can start optionally exposing the
>> > >>>>>> new field, and interested clients can use the new field, but
>> > >>>>>> existing clients keep working as is. Existing and new clients can
>> > >>>>>> also keep working as is against the same tables in the same
>> > >>>>>> Catalog. My one worry is that having Catalog override the timestamp
>> > >>>>>> field for commits may break some existing clients. Today, Iceberg
>> > >>>>>> engines/clients do not expect the timestamp field in
>> > >>>>>> metadata/snapshot-log to be overwritten by the Catalog.
>> > >>>>>>
>> > >>>>>> How do you feel about taking the best from each proposal? That is,
>> > >>>>>> monotonically increasing commit sequence numbers (some catalogs can
>> > >>>>>> use timestamps, some can use logical clock but we don't have to
>> > >>>>>> enforce it - leave it up to Catalog), but keep client side logic
>> > >>>>>> for resolving the right snapshot using sequence numbers instead of
>> > >>>>>> adding that functionality to Catalog. Let me know!
>> > >>>>>>
>> > >>>>>> Thank you!
>> > >>>>>> -Jagdeep
>> > >>>>>>
>> > >>>>>> On Tue, May 20, 2025 at 2:45 PM Ryan Blue <[email protected]> wrote:
>> > >>>>>>
>> > >>>>>>> Thanks for the proposals! There are things that I think are good
>> > >>>>>>> about both of them. I think that the catalog-authored timestamps
>> > >>>>>>> proposal misunderstands the purpose of the timestamp field, but
>> > >>>>>>> does get right that a monotonically increasing "time" field
>> > >>>>>>> (really a sequence number) across tables enables the coordination
>> > >>>>>>> needed for snapshot isolated reads. I like that the sequence
>> > >>>>>>> number proposal leaves the meaning of the field to the catalog for
>> > >>>>>>> coordination, but it still proposes catalog coordination by
>> > >>>>>>> loading tables "at" some sequence number. Ideally, we would be
>> > >>>>>>> able to (optionally) expose this extra catalog information to
>> > >>>>>>> clients and not need to change how loading works.
>> > >>>>>>>
>> > >>>>>>> Ryan
>> > >>>>>>>
>> > >>>>>>> On Tue, May 20, 2025 at 9:45 AM Ryan Blue <[email protected]> wrote:
>> > >>>>>>>
>> > >>>>>>>> Hi everyone,
>> > >>>>>>>>
>> > >>>>>>>> To avoid passing copies of a file around for comments, I put the
>> > >>>>>>>> doc for commit sequence numbers into Google so we can comment on
>> > >>>>>>>> a central copy:
>> > >>>>>>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100239850723655533404&rtpof=true&sd=true
>> > >>>>>>>>
>> > >>>>>>>> Ryan
>> > >>>>>>>>
>> > >>>>>>>> On Fri, May 16, 2025 at 2:51 AM Maninderjit Singh <[email protected]> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> Thanks for the updated proposal Drew!
>> > >>>>>>>>> My preference for using the catalog authored timestamp is to
>> > >>>>>>>>> minimize changes to the REST spec so we can have good backwards
>> > >>>>>>>>> compatibility. I have quickly put together a draft proposal on
>> > >>>>>>>>> how this should work. Looking forward to feedback and
>> > >>>>>>>>> discussion.
>> > >>>>>>>>>
>> > >>>>>>>>> Draft Proposal: Catalog‑Authored Timestamps for Apache Iceberg
>> > >>>>>>>>> REST Catalog
>> > >>>>>>>>> <https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE>
>> > >>>>>>>>>
>> > >>>>>>>>> Thanks,
>> > >>>>>>>>> Maninder
>> > >>>>>>>>>
>> > >>>>>>>>> On Wed, May 14, 2025 at 6:12 PM Drew <[email protected]> wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Hi everyone,
>> > >>>>>>>>>>
>> > >>>>>>>>>> Thank you for feedback on the MTT proposal and during community
>> > >>>>>>>>>> sync. Based on it, Jagdeep and I have iterated on the document
>> > >>>>>>>>>> and added a second option to use *Catalog
>> > >>>>>>>>>> CommitSequenceNumbers*. Looking forward to getting more
>> > >>>>>>>>>> feedback on the proposal, where to add more details or
>> > >>>>>>>>>> approach/changes to consider. We appreciate everyone's time on
>> > >>>>>>>>>> this!
>> > >>>>>>>>>>
>> > >>>>>>>>>> The option introduces *Catalog CommitSequenceNumbers (CSNs)*,
>> > >>>>>>>>>> which allow clients/engines to read a consistent view of
>> > >>>>>>>>>> multiple tables without needing to register a transaction
>> > >>>>>>>>>> context with the catalog. This removes the need for
>> > >>>>>>>>>> registering a transaction context with Catalog, thus removing
>> > >>>>>>>>>> the need for transaction bookkeeping on the catalog side. For
>> > >>>>>>>>>> aborting transactions early, clients can use LoadTable with
>> > >>>>>>>>>> and without CSN to figure out if there is already a
>> > >>>>>>>>>> conflicting write on any of the tables being modified. We also
>> > >>>>>>>>>> removed the section where transactions were staging commits on
>> > >>>>>>>>>> Catalog, and changed the proposal to align with Eduard's PR
>> > >>>>>>>>>> around staging changes locally before commit
>> > >>>>>>>>>> (https://github.com/apache/iceberg/pull/6948).
>> > >>>>>>>>>>
>> > >>>>>>>>>> Jagdeep also clarified, with an example in a previous email,
>> > >>>>>>>>>> where a workload may require multi table snapshot isolation
>> > >>>>>>>>>> even if the tables are being updated without the Multi-Table
>> > >>>>>>>>>> commit API, though most MTT transactions will commit using the
>> > >>>>>>>>>> multi table commit API.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Maninder, for the approach of "common notion of time between
>> > >>>>>>>>>> clients and catalog" - I spent some time thinking about it,
>> > >>>>>>>>>> but cannot find a feasible way to do this. Yes, the catalogs
>> > >>>>>>>>>> can use a high precision clock, but clients cannot use Catalog
>> > >>>>>>>>>> Timestamp from API calls to set local clock due to network
>> > >>>>>>>>>> latency for request/response. For example, different requests
>> > >>>>>>>>>> to the same Catalog servers can return different timestamps
>> > >>>>>>>>>> based on network latency. Also, what if a client works with
>> > >>>>>>>>>> more than one Catalog? If you want to do a rough write-up or
>> > >>>>>>>>>> share a reference implementation that uses such an approach, I
>> > >>>>>>>>>> will be happy to brainstorm it more. Let us know!
>> > >>>>>>>>>>
>> > >>>>>>>>>> Here is the link to the updated proposal:
>> > >>>>>>>>>> <https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100384647237395649950&rtpof=true&sd=true>
>> > >>>>>>>>>>
>> > >>>>>>>>>> Thanks Again!
>> > >>>>>>>>>> - Drew
