To summarize the discussion from today's Iceberg Catalog Community Sync, here
are some of the key points:
- General agreement on the need for some flavor of mechanism for catalog
federation in line with this proposal
- We should come up with a more fitting name for the endpoint than
"notifications"
- Some debate over whether to just add behaviors to the updateTable or
registerTable endpoints; ultimately agreed that the behavior of these
tables is intended to be fundamentally different, and we want to avoid
accidentally dangerous implementations, so it's better to have a separate
endpoint
- The term "Notifications" by itself is too general for this purpose;
we might want something much more generalized for notifications in the
future and don't want a naming conflict
- This endpoint focuses on the semantics of a "force-update" without a
standard Iceberg commit protocol
- The endpoint should potentially be a "bulk endpoint", since the use
case is more likely to want to reflect batches at a time
- Some debate over whether this is strictly necessary, and whether
there would be any implicit atomicity expectations
- For this use case the goal is explicitly *not* to perform a
heavyweight commit protocol, so a bulk API is just an optimization to
avoid making a bunch of individual calls; some or all of the requests in
the bulk request could succeed or fail
- The receiving side should not have structured failure modes relating
to out-of-sync state -- e.g. the caller should not depend on response
state to determine consistency on the sending side
- This was debated, with pros/cons of sending meaningful response
errors
- Pro: Useful for the caller to receive some amount of feedback to
know whether the force-update made it through, whether there are other
issues preventing syncing, etc.
- Con: This is likely a slippery slope of scope creep that still
only partially addresses failure modes; instead, the overall system must
be designed for idempotency of declared updated state, and if consistency
is desired, the caller must not rely only on responses to reconcile state
anyway
- We want to separate out the discussion of the relative merits of a
push vs pull model of federation, so the merits of pull/polling/read-through
don't preclude adding this push-based endpoint
- In-depth discussion of relative pros/cons, but agreed that one
doesn't necessarily preclude the other, and this push endpoint targets a
particular use case
- Keep the notion of "external tables" only "implicit" instead of having
to plumb a new table type everywhere (for now?)
- We could document that tables that come into existence from this
endpoint have a different "ownership" semantic than those created by
createTable/registerTable, but the REST spec itself doesn't necessarily
need to expose any specific syntax/properties/etc. about these tables
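To make the "bulk force-update" shape discussed above concrete, here is a
minimal Python sketch of what such a request body could look like. This is
purely illustrative: the field names (`updates`, `metadata-location`,
`sequence-number`) and the overall shape are hypothetical placeholders, not
anything agreed in the spec discussion. Each item is independent, with no
batch atomicity implied.

```python
# Hypothetical sketch of a bulk "force-update" request body, per the
# discussion above: each item is independent (no batch atomicity), and
# the receiver simply trusts the declared latest metadata location
# rather than running an Iceberg commit protocol. All field names are
# placeholders, not agreed spec.

def build_bulk_force_update(updates):
    """Build a request body declaring the latest metadata for many tables."""
    return {
        "updates": [
            {
                "identifier": {"namespace": ns, "name": name},
                # Location of the already-committed metadata in the
                # external "owning" catalog.
                "metadata-location": location,
                # Monotonic watermark so stale or out-of-order
                # deliveries can be ignored idempotently.
                "sequence-number": seq,
            }
            for (ns, name, location, seq) in updates
        ]
    }

body = build_bulk_force_update([
    (["db1"], "events", "s3://bucket/db1/events/metadata/00005.json", 5),
    (["db1"], "users", "s3://bucket/db1/users/metadata/00012.json", 12),
])
```

Since no atomicity is implied, a response could report per-item outcomes,
but per the discussion above the caller should not rely on those responses
to reconcile consistency.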
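The "idempotency of declared updated state" point can also be sketched from
the receiving side. The following is a toy illustration (class and method
names are invented, and the watermark semantics are assumed, not specified):
the reflecting catalog applies a force-update only if its watermark is newer
than what is currently stored, so retries and out-of-order deliveries are
harmless no-ops.

```python
# Toy sketch of idempotent receiver-side handling of a force-update,
# per the "watermark semantics" mentioned in the thread: an update is
# applied only if its sequence number is newer than the stored one, so
# duplicate and out-of-order deliveries are safely ignored. All names
# here are hypothetical.

class ReflectingCatalog:
    def __init__(self):
        # table identifier -> (sequence-number, metadata-location)
        self._tables = {}

    def force_update(self, ident, metadata_location, sequence_number):
        """Return True if the update was applied, False if ignored as stale."""
        current = self._tables.get(ident)
        if current is not None and sequence_number <= current[0]:
            return False  # stale or duplicate delivery; no-op
        self._tables[ident] = (sequence_number, metadata_location)
        return True

catalog = ReflectingCatalog()
catalog.force_update("db1.events", "s3://b/m/00005.json", 5)  # applied
catalog.force_update("db1.events", "s3://b/m/00004.json", 4)  # out-of-order, ignored
catalog.force_update("db1.events", "s3://b/m/00005.json", 5)  # duplicate, ignored
```

With this shape, the sender never needs to inspect responses to converge:
re-sending the latest known state is always safe.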
Thanks everyone for the inputs to the discussion! Please feel free to chime
in here if I missed anything or got anything wrong from today's discussion.
On Fri, Sep 20, 2024 at 9:05 PM Dennis Huo <[email protected]> wrote:
> Thanks for the input, Christian!
>
> I agree a comprehensive solution would likely require some notion of
> pull-based approaches (and even federated read-through on-demand). I see
> some pros/cons to both push and pull approaches, and it seems in part to
> relate to:
>
> - Whether only the "reflecting catalog" is an Iceberg REST server, or
> only the "owning catalog" is an Iceberg REST server, or both
> - Whether it's "easier" to put the complexity of
> connection/credential/state management in the "owning catalog" or in the
> "reflecting catalog"
>
> Though the "push" approach glosses over some of the potential complexity
> on the "owning catalog" side, it does seem like a more minimal starting
> point that doesn't require any additional state or data model within the
> Iceberg REST server, but can still be useful as a building block even where
> integrations aren't necessarily formally defined via the REST spec. For
> example, for a single-tenant internal deployment of an Iceberg REST server
> whose goal is to reflect a subset of a large legacy Hive metastore (which
> is the "owning" catalog in this case) where engines are using the Iceberg
> HiveCatalog, it may not be easy to retrofit a shim to expose a compatible
> "/changes" endpoint, but might be possible to add a
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
> with some one-off code that pushes to an Iceberg REST endpoint.
>
> You raise a good point about failure scenarios though; this flavor of
> federation doesn't provide strong consistency guarantees without some
> sophisticated cooperation within the query layer to somehow reconcile
> whether things are "up to date". Any "periodically-scheduled" pull-based
> model would have the same lack of strong consistency though, so it seems
> like query-time read-through would be a necessary component of a long-term
> "strongly consistent" model.
>
> Ultimately I could see a hybrid approach being useful for production use
> cases, especially if there are things like caching/prefetching that are
> important for performance at the "reflecting" Iceberg REST catalog;
> push-based for a "fast-path", polling pull-based for scenarios where push
> isn't possible from the source or is unreliable, and read-through as the
> "authoritative" state.
>
> On Fri, Sep 20, 2024 at 1:23 AM Christian Thiel
> <[email protected]> wrote:
>
>> Hi Dennis,
>>
>>
>>
>> thanks for your initiative!
>> I believe externally owned Tables / Federation would enable a variety of
>> new use-cases in Data Mesh scenarios.
>>
>>
>>
>> Personally, I currently favor pull-based over push-based approaches
>> (think a /changes endpoint) for the following reasons:
>>
>> - Less ambiguity for failures / missing messages. What does the sender
>> do if a POST fails? How often is it retried? What is the fallback behavior?
>> If a message is missed, how would the reflecting catalog ever get back to
>> the correct state? In contrast, a pull-based approach is quite clear: the
>> reflecting catalog is responsible for storing a pointer and can handle
>> retries internally.
>> - Changes are not only relevant for other catalogs, but for a variety
>> of other systems that might want to act based on them. They might not have
>> a REST API and certainly don’t want to implement the Iceberg REST protocol
>> (i.e. /config).
>> - Pull-based approaches need less configuration – only the reflecting
>> catalog needs to be configured. This follows the behavior we already
>> implement in the other endpoints with other clients. I don’t think the
>> “owning” catalog must know where it’s federated to – very much like it
>> doesn’t need to know which query engines access it.
>> - The "push" feature itself is not part of the spec, making it easy
>> for catalogs to just implement the receiving end without the actual "push"
>> and still be 100% spec compliant -- without being fully interoperable with
>> other catalogs. This is also a problem with regard to my first point: push
>> and receive behaviors and expectations must match between sender and
>> receiver, and we don't have a good place to document the "push" part.
>>
>>
>>
>> I would design a /changes endpoint to only contain the information THAT
>> something changed, not WHAT changed – to keep it lightweight.
>> For full change tracking I believe event queues / streaming solutions
>> such as Kafka or NATS are better suited. Standardizing events could be a
>> second step. In our catalog we are just using CloudEvents wrapped around
>> `TableUpdates`, enriched with a few extensions.
>>
>>
>>
>> For both pull and push based approaches, your building block 1) is needed
>> anyway – so that’s surely a common ground.
>>
>>
>>
>> I would be interested to hear some more motivation from your side,
>> @Dennis, for choosing the push-based approach – maybe I am looking at this
>> too specifically through the lens of my own use-case.
>>
>>
>>
>> Thanks!
>> Christian
>>
>>
>>
>>
>>
>> *From: *Dennis Huo <[email protected]>
>> *Date: *Thursday, 19. September 2024 at 05:46
>> *To: *[email protected] <[email protected]>
>> *Subject: *[DISCUSS] Defining a concept of "externally owned" tables in
>> the REST spec
>>
>> Hi all,
>>
>>
>>
>> I wanted to follow up on some discussions that came up in one of the
>> Iceberg Catalog community syncs a while back relating to the concept of
>> tables that can be registered in an Iceberg REST Catalog but which have
>> their "source of truth" in some external catalog.
>>
>>
>>
>> The original context was that Apache Polaris currently adds a
>> Polaris-specific method "sendNotification" on top of the otherwise standard
>> Iceberg REST API (
>> https://github.com/apache/polaris/blob/0547e8b3a9e38fedc466348d05f3d448f4a03930/spec/rest-catalog-open-api.yaml#L977)
>> but the goal is to come up with something that the broader community can
>> align on to ensure standardization long term.
>>
>>
>>
>> This relates closely to a couple other more ambitious areas of discussion
>> that have also come up in community syncs:
>>
>> 1. Catalog Federation - defining the protocol(s) by which all our
>> different Iceberg REST Catalog implementations can talk to each other
>> cooperatively, where entity metadata might be read-through, pushed, or
>> pulled in various ways
>> 2. Generalized events and notifications - beyond serving the purpose
>> of federation, folks have proposed a generalized model that could also be
>> applied to things like workflow triggering
>>
>> In the narrowest formulation there are two building blocks to consider:
>>
>> 1. Expressing the concept of an "externally owned table" in an
>> Iceberg REST Catalog
>> - At the most basic level, this could just mean that the target REST
>> Catalog should refuse to perform mutation dances on the table (i.e.
>> reject updateTable/commitTransaction calls on such tables) because it
>> knows there's an external "source of truth" and wants to avoid causing
>> a split-brain problem
>> 2. Endpoint for doing a "simple" register/update of a table by
>> "forcing" the table metadata to the latest incarnation
>> - Instead of updates being something for this target REST Catalog to
>> perform a transaction protocol for, the semantic is that the "source of
>> truth" transaction is already committed in the external source, so this
>> target catalog's job is simply to "trust" the latest metadata (modulo
>> some watermark semantics to deal with transient errors and out-of-order
>> deliveries)
>>
>> Interestingly, it appears there was a GitHub issue filed a while back for
>> some formulation of (2) that was closed silently:
>> https://github.com/apache/iceberg/issues/7261
>>
>>
>>
>> It seems like there's an opportunity to find a good balance between
>> breadth of scope, generalizability and practicality in terms of what
>> building blocks can be defined in the core spec and what broader/ambitious
>> features can be built on top of it.
>>
>>
>>
>> Would love to hear everyone's thoughts on this.
>>
>>
>>
>> Cheers,
>>
>> Dennis
>>
>