Thanks, Maninder! Good idea. Is any meeting for this already scheduled?
Cheers,
Dmitri.

On Fri, Jun 27, 2025 at 1:52 AM Maninderjit Singh <parmar.maninder...@gmail.com> wrote:

> Thanks Dmitri!
> I will add this to the doc. Also, it might be a good idea to discuss it in
> a meeting so we can hash out the details.
>
> On Wed, Jun 25, 2025, 9:23 AM Dmitri Bourlatchkov <di...@apache.org> wrote:
>
>> Hi Maninder,
>>
>> Thanks for adding a section on opaque IDs and apologies for the delayed
>> reply from my side. I could not find a place to fit my text in the doc,
>> so I'm sending it in this email :)
>>
>> This option is mostly related to option 2 (CSN) but proposes to use
>> commit IDs (an alternative to CSN) that are opaque to clients. This is
>> the same as in your opaque ID section in the doc, but I hope that the
>> thoughts below might help to clarify how it is intended to work. The main
>> difference is that the resolution of commit IDs into snapshots is
>> delegated to catalog servers.
>>
>> Catalog Servers are free to use any implementation for commit IDs,
>> including monotonically increasing numbers (but they are not limited to
>> CSN).
>>
>> Catalog Servers produce a commit ID for every change, which will be
>> exposed to clients as a reasonably short string. Multi-table changes
>> naturally get the same commit ID.
>>
>> Commit IDs are part of REST Catalog responses, but do not have to be in
>> the metadata files. No Iceberg spec changes are required. REST API changes
>> are needed, but they are optional and transparent to clients, unless the
>> client wishes extra consistency guarantees.
>>
>> Clients can request table metadata for any table using a particular
>> commit ID. This mechanism can be used to ensure consistency in time-travel
>> queries.
>>
>> An engine can proceed as follows, while executing a multi-table change:
>> 1. Load table A - receive metadata and commit ID C1
>> 2. Load table B by providing C1 as a request parameter to the Catalog
>> server
>> 3. Load table C by providing C1 as a request parameter to the server
>> 4. Process data in tables A, B, C
>> 5. Update table A
>> 6. Update table B
>> 7. Submit metadata updates for A and B to the Catalog, passing C1 as the
>> “base” commit ID to the server. Additionally submit the name C as a “read
>> but not changed” table.
>> 8. The Catalog server checks whether the change has any conflicts between
>> C1 and the current state of the catalog (including validating that C has
>> not changed)
>> 9. The Catalog commits changes and returns commit ID C2 to the client
>> (this commit ID represents the committed state of the submitted metadata
>> changes).
>>
>> If the commit fails due to conflicts, the client receives a “conflict”
>> error and a commit ID C3, which represents the most up-to-date state of the
>> catalog (the state that was conflicting with the submitted changes). The
>> client then re-loads tables based on C3 and retries its workflows.
>>
>> Load table responses when a commit ID is provided do not have to return
>> all of the table's metadata. It is sufficient to return only the most
>> relevant snapshots (usually the latest plus its parent). This is similar to
>> the partial metadata loading proposal, but not critical for consistency
>> guarantees. The critical part is that the Catalog communicates to engines
>> what snapshot is current for a particular commit ID.
>>
>> Resolving Time Travel Queries: When a client executes a time travel
>> query, the client provides a timestamp when loading the first table that is
>> included in the query. The Catalog will resolve the timestamp to a commit
>> ID and include it in the response. The client uses the returned commit ID
>> to load subsequent tables.
>>
>> Optionally a new endpoint may be added to the REST Catalog API to handle
>> the resolution of timestamps to commit IDs.
>>
>> Caching Metadata on the Client Side: Reloading table metadata for a
>> particular snapshot could leverage the ETag mechanism to reduce the amount
>> of network traffic.
>>
>> Servers do not need to keep any in-progress state for transactions. The
>> same multi-table commit mechanism servers have for the existing commit
>> endpoint can be extended to also produce commit IDs. Resolving timestamps
>> to commit IDs is an implementation detail. Some changes in existing servers
>> will probably be required for that. Conceptually this problem does not
>> appear to be more complex than providing a monotonic CSN or implementing
>> the existing multi-table commit endpoint.
>>
>> Retention of the data related to time-travel is a server-side concern. If
>> a client wishes to time travel to a point that no longer has commit
>> tracking information, an error is returned.
>>
>> WDYT?
>>
>> Thanks,
>> Dmitri.
>>
>> On Thu, Jun 19, 2025 at 6:24 PM Maninderjit Singh <parmar.maninder...@gmail.com> wrote:
>>
>>> Thanks Dmitri for the review!
>>>
>>> We have been deliberate about not including server-side implementation
>>> details, for brevity and to allow each vendor to choose the best option
>>> for them. Having said that, I have included a few papers that you can
>>> reference.
>>>
>>> I have also added a new
>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#bookmark=id.typa1ivjs7pw>
>>> section under alternatives to explore opaque ids further. Could you
>>> validate and fill in the details? There are a few open questions and
>>> dependencies that would be required for this proposal:
>>>
>>> - Why do we even need an opaque id? Could we use tableIdentifier +
>>>   Sequence number as an implicit opaque id?
>>> - How are opaque ids compared across tables and with time?
>>> - It is not clear who issues the timestamp for opaque ids and how it
>>>   achieves consistency beyond repeatable reads.
>>> - Would this require a dependency on the partial metadata load proposal
>>>   <https://docs.google.com/document/d/1eXnT0ZiFvdm_Zvk6fLGT_UxVWO-HsiqVywqu1Uk8s7E/edit?tab=t.0#heading=h.t6emwabb4tkr>?
>>>
>>> Regards,
>>> Maninder
>>>
>>> On Thu, Jun 19, 2025 at 12:12 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>>>
>>>> Thanks for the quick response, Jagdeep!
>>>>
>>>> I can certainly add a section to the doc. Could you clarify what you
>>>> mean by "chatty protocol", though? I did not find that term in the
>>>> linked email discussion :)
>>>>
>>>> Thanks,
>>>> Dmitri.
>>>>
>>>> On Thu, Jun 19, 2025 at 2:28 PM Jagdeep Sidhu <sidhujagde...@gmail.com> wrote:
>>>>
>>>>> Hi Dmitri,
>>>>>
>>>>> Thank you for reviewing. As you said, we previously explored and
>>>>> dropped TransactionContext APIs with opaque IDs because they created a
>>>>> very chatty protocol and also led to complex transaction state
>>>>> management on the server side; a link to the old thread is below.
>>>>>
>>>>> Would you add a section to the existing document on the approach you
>>>>> are thinking of: opaque IDs without the chatty protocol and complex
>>>>> transaction state management on the Catalog Server? Then we can compare
>>>>> all of them and discuss the best path forward. Thank you!
>>>>>
>>>>> Older thread -
>>>>> https://lists.apache.org/thread/q7vgnfwdxng5q6mq45m0psghzy7553r7
>>>>>
>>>>> -Jagdeep
>>>>>
>>>>> On Thu, Jun 19, 2025 at 10:42 AM Dmitri Bourlatchkov <di...@apache.org> wrote:
>>>>>
>>>>>> Thanks for driving this proposal, Maninder!
>>>>>>
>>>>>> From my POV the need for Catalogs to provide a monotonic sequence
>>>>>> number has deep implications for the catalog implementations. I added a
>>>>>> related comment to the doc as well.
>>>>>>
>>>>>> The document does a good job at discussing the client operation. I'd
>>>>>> appreciate it if the server-side impact were considered in more depth
>>>>>> too, since the proposal implies changes on both sides.
>>>>>>
>>>>>> I know that an opaque "commit ID" was considered before; however, if
>>>>>> I'm not mistaken, previous discussions revolved around the idea of
>>>>>> a TransactionContext as an entity exposed via new APIs for sharing state
>>>>>> between clients/engines and the catalog. I'd like to revisit the idea of
>>>>>> opaque transaction IDs (managed by the catalog) but without the use
>>>>>> of TransactionContext. I made a brief comment about that in the doc, and
>>>>>> I'm willing to expand on this. I believe it can be implemented
>>>>>> without having a durable context object to represent a transaction
>>>>>> between the client and the catalog.
>>>>>>
>>>>>> The main idea for "opaque commit IDs" is to allow more flexibility
>>>>>> for Catalog implementations, while keeping the same client-side
>>>>>> guarantees (snapshot isolation, causally consistent multi-table
>>>>>> changes, etc.).
>>>>>>
>>>>>> Thanks,
>>>>>> Dmitri.
>>>>>>
>>>>>> On Mon, Jun 16, 2025 at 9:32 PM Maninderjit Singh <parmar.maninder...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Iceberg dev community,
>>>>>>>
>>>>>>> We have been iterating on the Multi Table Transactions proposal and
>>>>>>> have merged the proposals for using catalog authored timestamps and
>>>>>>> sequence numbers together, as well as incorporated feedback from the
>>>>>>> community:
>>>>>>> Proposal: Multi-table multi-statement transactions for
>>>>>>> Apache Iceberg REST Catalog
>>>>>>> <https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE>
>>>>>>>
>>>>>>> We have captured the tradeoffs involved with each approach as well
>>>>>>> as the reasoning for making those choices. We would love to hear your
>>>>>>> opinions on the consolidated proposal and which approach is more
>>>>>>> suitable for your requirements and why.
>>>>>>>
>>>>>>> Thank you in advance!
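
The commit-ID workflow from Dmitri's June 25 message in this thread (load every table at one commit ID, submit changes against a "base" commit ID together with the "read but not changed" tables, reload and retry on conflict) can be sketched with a toy in-memory catalog. This is only an illustration under stated assumptions: `ToyCatalog`, `CommitConflict`, the `cid-N` ID format, and the integer logical timestamps are all hypothetical names invented here; none of them is part of the Iceberg REST Catalog API or of the proposal doc, and a real server would be free to implement commit IDs differently.

```python
class CommitConflict(Exception):
    """Hypothetical error carrying the catalog's latest commit ID (C3 in the email)."""

    def __init__(self, latest_commit_id):
        super().__init__(f"conflict; latest commit is {latest_commit_id}")
        self.latest_commit_id = latest_commit_id


class ToyCatalog:
    """Hypothetical in-memory catalog. Clients must treat commit IDs as opaque strings."""

    def __init__(self, tables):
        # Commit log entries: (commit_id, logical_timestamp, {table_name: snapshot_id}).
        self._log = [("cid-1", 0, dict(tables))]

    def _state(self, commit_id):
        for cid, _, state in self._log:
            if cid == commit_id:
                return state
        raise LookupError(f"no commit tracking information for {commit_id}")

    def load_table(self, name, commit_id=None):
        """Return (snapshot_id, commit_id). Without a commit ID, use the latest state."""
        if commit_id is None:
            commit_id = self._log[-1][0]
        return self._state(commit_id)[name], commit_id

    def resolve_timestamp(self, ts):
        """Resolve a timestamp to the last commit ID made at or before it."""
        candidates = [cid for cid, t, _ in self._log if t <= ts]
        if not candidates:
            raise LookupError("no commit tracking information for that time")
        return candidates[-1]

    def commit(self, base_commit_id, updates, read_only=(), ts=0):
        """Apply updates atomically if no touched table changed since the base commit."""
        base = self._state(base_commit_id)
        latest_id, _, current = self._log[-1]
        touched = set(updates) | set(read_only)
        if any(base.get(t) != current.get(t) for t in touched):
            raise CommitConflict(latest_id)
        new_id = f"cid-{len(self._log) + 1}"
        # One commit ID covers every table changed in this multi-table commit.
        self._log.append((new_id, ts, {**current, **updates}))
        return new_id


catalog = ToyCatalog({"A": "a-s1", "B": "b-s1", "C": "c-s1"})

# Steps 1-3: load A, then load B and C pinned to the same commit ID.
snap_a, c1 = catalog.load_table("A")
snap_b, _ = catalog.load_table("B", c1)
snap_c, _ = catalog.load_table("C", c1)

# Steps 7-9: commit A and B against base C1, declaring C as read but not changed.
c2 = catalog.commit(c1, {"A": "a-s2", "B": "b-s2"}, read_only=["C"], ts=10)

# A writer still based on C1 conflicts, receives the latest commit ID, and retries.
try:
    catalog.commit(c1, {"A": "a-s3"}, ts=11)
except CommitConflict as conflict:
    c3 = conflict.latest_commit_id
    reloaded_a, _ = catalog.load_table("A", c3)
    c4 = catalog.commit(c3, {"A": "a-s3"}, ts=11)
```

In this sketch the server keeps no per-transaction state: a conflict check only compares the snapshot recorded at the base commit ID with the current one for each touched table. Loading with an old commit ID, or with the result of `resolve_timestamp`, returns the snapshot that was current at that point, which is the property the time-travel paragraph of the email relies on.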