Re: [DISCUSS] Facilitate the forwarding use cases of Iceberg Scan and Commit Metrics via Event

Alexandre Dutra Mon, 15 Jun 2026 06:45:14 -0700

Hi Yong, hi all,

Since the ability to filter seems to be of concern, I went ahead and
implemented an EventFilter API with a first implementation based on
Jakarta EL:


https://github.com/apache/polaris/pull/4773

In the above PR, event filters form a composable chain:

emitter -> filters (0..N) -> sanitizer (0..1) -> listener

It's easy enough to create another EventFilter implementation to do
some event sampling, as you suggested.

(Note: the goal is to do the same with sanitizers and make them 0..N
as well in the delivery pipeline.)
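
To make the chain concrete for readers of this thread, here is a minimal,
self-contained Java sketch of the idea. It is not the API from the PR: the
names EventPipelineSketch, Event, EventFilter, byCatalog, sampleEveryNth
and pipeline are all hypothetical, and it uses plain Java predicates
rather than Jakarta EL. It assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Illustrative only: every type and method name here is hypothetical.
final class EventPipelineSketch {

  /** Hypothetical event shape, not the Polaris event type. */
  record Event(String type, String catalog, String payload) {}

  /** Hypothetical filter contract: return true to let the event through. */
  interface EventFilter {
    boolean accept(Event event);
  }

  /** Keeps only events that belong to the given catalog. */
  static EventFilter byCatalog(String catalog) {
    return event -> catalog.equals(event.catalog());
  }

  /** Deterministic sampling: keeps the 1st, (n+1)th, (2n+1)th, ... event it sees. */
  static EventFilter sampleEveryNth(int n) {
    AtomicLong seen = new AtomicLong();
    return event -> seen.getAndIncrement() % n == 0;
  }

  /** Wires up: emitter -> filters (0..N) -> sanitizer (0..1) -> listener. */
  static Consumer<Event> pipeline(
      List<EventFilter> filters,
      UnaryOperator<Event> sanitizerOrNull,
      Consumer<Event> listener) {
    return event -> {
      for (EventFilter filter : filters) {
        if (!filter.accept(event)) {
          return; // the first rejecting filter drops the event
        }
      }
      Event out = sanitizerOrNull == null ? event : sanitizerOrNull.apply(event);
      listener.accept(out);
    };
  }

  public static void main(String[] args) {
    List<Event> delivered = new ArrayList<>();
    Consumer<Event> emit = pipeline(
        List.of(byCatalog("prod"), sampleEveryNth(2)),
        e -> new Event(e.type(), e.catalog(), "<redacted>"), // sanitizer
        delivered::add);                                     // listener

    for (int i = 0; i < 4; i++) {
      emit.accept(new Event("scan-report", "prod", "payload-" + i));
    }
    emit.accept(new Event("scan-report", "dev", "payload-dev"));

    System.out.println(delivered.size() + " of 5 events delivered");
  }
}
```

In this sketch the catalog filter runs before the sampler, so the "dev"
event is dropped without advancing the sampling counter, and 2 of the 5
emitted events reach the listener.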

Thanks,
Alex

On Mon, Jun 15, 2026 at 8:13 AM Yong Zheng <[email protected]> wrote:
>
> Hello team,
>
> Thanks Yufei for the summary and for raising this discussion.
>
> One concern I have is around filtering. Without some form of filtering or
> sampling, blindly writing every metrics event to logs (when debug logging
> is enabled) or persisting every metric to a backend could introduce
> significant overhead.
>
> For Polaris itself, processing and forwarding large volumes of scan metrics
> can consume resources that would otherwise be available for serving catalog
> requests. For deployments that persist metrics to a database, the
> additional storage, indexing, and write workload can also consume compute
> resources that could be used for core catalog operations instead.
>
> This is one of the reasons I think filtering should remain part of the
> discussion regardless of whether metrics are delivered through JDBC
> persistence, the event framework, or another implementation. High-volume
> scan metrics can easily generate far more traffic than many deployments
> actually need to retain or analyze.
>
> I agree with the SPI direction, but I do not think it addresses the
> underlying scalability concern by itself. The SPI provides flexibility in
> how metrics are handled, but the load is ultimately determined by the
> implementation behind it. Without some form of filtering or sampling,
> high-volume scan metrics can still generate substantial load regardless of
> whether the implementation uses JDBC persistence, event forwarding,
> logging, or something else.
>
> Thanks,
> Yong Zheng
>
> On Thu, Jun 11, 2026 at 10:30 PM EJ Wang <[email protected]>
> wrote:
>
> > Hi Yufei, Alex,
> >
> > Thanks Yufei for writing this up, and Alex for spelling out the operational
> > concerns. My read is that both points are compatible if we are clear about
> > the layering.
> >
> > I agree that Iceberg scan/commit metrics often behave like structured
> > telemetry events: append-only, high-volume, usually consumed
> > asynchronously, and often forwarded to external systems. Events/listeners
> > are a natural fit for that kind of delivery path.
> >
> > I also agree with Alex that event delivery does not make persistence,
> > filtering, retention, payload sizing, or performance free. Those are real
> > concerns, especially for high-volume scan reports.
> >
> > The way I would reconcile these is to distinguish the default battery from
> > extension implementations.
> >
> > The latest metrics sync alignment
> > <https://docs.google.com/document/d/100h7c4damrUzVuquYbBHM0EvA4LSWuW2IT2dN_7nYVA/edit?pli=1&tab=t.k96s2xyqr5u1#heading=h.uvb454otvxc0>
> > was not that Polaris should pick JDBC, events, or external telemetry as the
> > one built-in metrics subsystem. It was closer to: Polaris should define a
> > clean metrics reporting/emitting boundary, ship a small safe default, and
> > let deployments choose implementation paths behind that boundary.
> >
> > Under that framing, I would not make event
> > forwarding/Prometheus/Grafana/custom routing the default battery itself. I
> > would frame it as a useful non-default extension implementation of the
> > metrics reporting/emitting path.
> >
> > Concretely, I think the split could be:
> >
> > 1.  Polaris exposes a stable Iceberg metrics reporting/emitting SPI.
> > 2.  The built-in default battery stays minimal: based on the latest notes,
> > no-op or log-only is enough as the safe OSS default.
> > 3.  Durable JDBC metrics storage is one named extension implementation of
> > that SPI, not part of core persistence.
> > 4.  Event-based forwarding can be another named extension implementation of
> > that SPI, where the listener/extension owns delivery, filtering, retention,
> > payload handling, and destination-specific behavior.
> >
> > That keeps the useful part of Yufei's proposal: deployments that want
> > Grafana/dashboard integration or custom telemetry routing can choose an
> > event/listener-based implementation. It also keeps Alex's concerns scoped
> > to the implementation that chooses that delivery model, instead of making
> > them requirements for every Polaris deployment or for the built-in default.
> > So I am generally +1 on exploring the event-forwarding path, with the
> > layering caveat that I would treat it as an extension implementation of the
> > metrics reporting/emitting SPI, not as replacing the default battery or
> > collapsing metrics into core event persistence.
> >
> > Once that boundary is clear, which I'm pushing in PR4115
> > <https://github.com/apache/polaris/pull/4115#pullrequestreview-4481873839>,
> > integrations become implementation choices rather than architectural
> > changes.
> >
> > Thanks,
> > -ej
> >
> > On Thu, Jun 11, 2026 at 5:41 AM Alexandre Dutra <[email protected]> wrote:
> >
> > > > listeners can already implement whatever filtering logic they need
> > >
> > > True, but I think they would be reinventing the wheel quite often.
> > > There are some common filtering patterns such as filtering by catalog,
> > > namespace or table names or IDs. If we could provide this filter out
> > > of the box, that would be beneficial to many listeners.
> > >
> > > Thanks,
> > > Alex
> > >
> > >
> > > On Thu, Jun 11, 2026 at 3:35 AM Yufei Gu <[email protected]> wrote:
> > > >
> > > > Thanks all for the feedback! It seems we have some initial consensus that
> > > > using the event framework for metrics delivery is a reasonable direction
> > > > worth exploring. Most of the discussion now appears to be around impl
> > > > details and operational considerations.
> > > >
> > > > 1. Benchmarking is a great idea; using the existing tool makes sense. I
> > > > don't see it as a blocker though. The volume of scan metrics should be
> > > > similar to, or even lower than, the volume of LoadTable requests. Some
> > > > clients may not send scan metrics at all. If we're comfortable supporting
> > > > LoadTable events, I'm not sure why metrics events would require a
> > > > fundamentally different validation path, though benchmarking would
> > > > certainly help us tune the event bus and listener configuration.
> > > >
> > > > 2. I agree that separating the datasource for event and metrics
> > > > persistence is an active and worthwhile discussion. I think we should
> > > > continue that work regardless of the direction we take here.
> > > >
> > > > 3. Agreed on evaluating payload sizes. That said, it doesn't seem like a
> > > > major concern to me given that we already support larger payloads in some
> > > > existing events.
> > > >
> > > > 4. Filtering is a valid use case. My thinking is that custom event
> > > > listeners can already implement whatever filtering logic they need. I'm
> > > > not sure we need a generic filtering framework in Polaris itself yet, but
> > > > I'm open to further discussion if we find common requirements across
> > > > deployments.
> > > >
> > > > 5. Schema migration is a good point and something we should keep in mind
> > > > if metrics are persisted.
> > > >
> > > > 6. I also agree with Dmitri that we can continue improving the RDBMS
> > > > schema evolution story. That feels largely orthogonal to this proposal, so
> > > > perhaps it's best discussed in a separate thread.
> > > >
> > > > Thanks,
> > > > Yufei
> > > >
> > > >
> > > > On Wed, Jun 10, 2026 at 12:56 PM Dmitri Bourlatchkov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > +1 to all points from Alex's email.
> > > > >
> > > > > Re: Metrics Persistence: I believe we ought to make it as smooth as
> > > > > possible from the Polaris code maintenance perspective. Therefore, I
> > > > > propose starting the work to isolate the existing metrics schema from the
> > > > > MetaStore schema in parallel with the event bus work. I think it will be
> > > > > beneficial in its own right, regardless of how the event bus work
> > > > > progresses.
> > > > >
> > > > > PR [4397] is but the first step in that direction.
> > > > >
> > > > > Side note: we probably do not need to copy the whole schema SQL file on
> > > > > every revision, but I'm contemplating starting a separate thread on that.
> > > > >
> > > > > Once a separate metrics schema is established, I think it will be natural
> > > > > to also allow it to be on a different JDBC DataSource than the MetaStore
> > > > > schema.
> > > > >
> > > > > If the event bus work is successful, JDBC Metrics Persistence can become
> > > > > one of possibly many consumers for metrics events.
> > > > >
> > > > > With this approach, it should also be possible to write metrics to the
> > > > > database in batches. IIRC, Venkateshwaran brought this point up in the
> > > > > latest Metrics Sync meeting.
> > > > >
> > > > > Metrics filtering can probably progress in parallel too. I think it is a
> > > > > useful feature.
> > > > >
> > > > > [4397] https://github.com/apache/polaris/pull/4397
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jun 10, 2026 at 9:56 AM Alexandre Dutra <[email protected]> wrote:
> > > > >
> > > > > > Hi Yufei,
> > > > > >
> > > > > > The proposal to leverage the events subsystem for metrics delivery is
> > > > > > quite appealing, though it requires a thorough evaluation regarding
> > > > > > potential performance overhead.
> > > > > >
> > > > > > My primary considerations are as follows:
> > > > > >
> > > > > > 1) Given that scan reports can trigger a high volume of events, we
> > > > > > should conduct rigorous testing, potentially using the Polaris
> > > > > > benchmark tool. We need to determine the right configuration for the
> > > > > > event bus and for the event listener executor.
> > > > > >
> > > > > > 2) While the events subsystem handles dispatch and delivery natively,
> > > > > > it doesn't give persistence for free. My recollection is that we were
> > > > > > pursuing the idea of a metrics persistence system with a unique schema
> > > > > > and possibly a separate datasource, a process initiated by a recently
> > > > > > merged PR [1]. Is that still the case? Furthermore, we'd need to
> > > > > > implement data retention and purging, including for the current events
> > > > > > table [2].
> > > > > >
> > > > > > 3) If we consider the events table for metrics storage, we must
> > > > > > evaluate average payload sizes. Although a PR [3] was introduced to
> > > > > > prune large payloads (such as table metadata), this functionality is
> > > > > > still in its early stages and will evolve. Similar pruning would be
> > > > > > necessary for metrics reports if they are big.
> > > > > >
> > > > > > 4) As Yong suggested [4], we may still require more sophisticated
> > > > > > metrics filtering. The events subsystem currently only allows
> > > > > > filtering by event type or event category, which may not be granular
> > > > > > enough for our needs (as of today, it would only allow distinguishing
> > > > > > scan vs metrics reports). In that regard, I would welcome the
> > > > > > opportunity to implement a generic EventFilter interface with a
> > > > > > default implementation based on CEL.
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > [1]: https://github.com/apache/polaris/pull/4397
> > > > > > [2]: https://lists.apache.org/thread/krmddx8myov926sd0mbh4ogy8sdgrfgq
> > > > > > [3]: https://github.com/apache/polaris/pull/4225
> > > > > > [4]: https://lists.apache.org/thread/ogskc1szctkg5n0tdj0cm3pfkowcwx4z
> > > > > >
> > > > > > On Wed, Jun 10, 2026 at 2:04 AM Yufei Gu <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I've been thinking about how Polaris should support Iceberg scan and
> > > > > > > commit metrics. A few challenges have come up in recent discussions:
> > > > > > > 1. Sync metrics persistence chokes Polaris persistence due to the high
> > > > > > > volume of scan metrics [3].
> > > > > > > 2. We spent considerable time figuring out the metrics persistence,
> > > > > > > including the schema, SPIs, REST APIs [4].
> > > > > > > 3. Metric filtering remains a challenge [1].
> > > > > > > 4. We need to figure out how to purge metrics because they keep growing
> > > > > > > [2].
> > > > > > >
> > > > > > > Looking at these challenges, most of them are not really metrics
> > > > > > > problems. They are transport, delivery, retention, and lifecycle
> > > > > > > problems that the existing event framework already addresses. I'd like
> > > > > > > to propose using the event system to facilitate the current use cases
> > > > > > > of Iceberg scan and commit metrics rather than introducing a separate
> > > > > > > Polaris metrics subsystem. The metrics for current use cases are
> > > > > > > fundamentally events with structured telemetry attached. They are
> > > > > > > append only, generated by IRC endpoints, typically consumed
> > > > > > > asynchronously, and often forwarded to external systems. Since Polaris
> > > > > > > already needs to support them as part of IRC, treating them as event
> > > > > > > types seems like a natural fit.
> > > > > > >
> > > > > > > More importantly, I think Polaris should remain a catalog service and
> > > > > > > telemetry producer rather than a metrics warehouse. Instead of
> > > > > > > introducing a dedicated metrics subsystem along with storage,
> > > > > > > retention, query, and scaling concerns, we could build on the existing
> > > > > > > event framework:
> > > > > > >
> > > > > > >    - Emit them through the existing event mechanism. We will do that
> > > > > > >    anyway given it's an IRC endpoint.
> > > > > > >    - Let custom event listeners route them to the destination of
> > > > > > >    choice, such as Prometheus, Grafana, RDBMSs, or other systems.
> > > > > > >    - Reuse the existing event lifecycle, retention, and delivery
> > > > > > >    models. If temporary persistence is still required, the existing
> > > > > > >    event table can serve that purpose. The payload size is manageable
> > > > > > >    given that we have put the loadTable/LoadView response in events.
> > > > > > >
> > > > > > > This approach also gives deployments flexibility to filter, sample, or
> > > > > > > redirect high volume scan metrics without Polaris needing backend
> > > > > > > specific metric storage behavior. For example, event listeners can
> > > > > > > choose which metric events to process. We don't need to implement
> > > > > > > metric filtering logic [1].
> > > > > > >
> > > > > > > In short, my proposal is: Events provide the transport and lifecycle
> > > > > > > mechanism, while downstream metrics systems remain responsible for
> > > > > > > storage, querying, aggregation, and visualization.
> > > > > > >
> > > > > > > Curious what others think.
> > > > > > >
> > > > > > > 1. https://lists.apache.org/thread/ogskc1szctkg5n0tdj0cm3pfkowcwx4z
> > > > > > > 2. https://lists.apache.org/thread/5nst0f2ygnl2gj3j910q7m8nk2fvokc7
> > > > > > > 3. https://lists.apache.org/thread/zp2rvsdkq3mb46722o0hfl0zh7kdqyr8
> > > > > > > 4. https://lists.apache.org/thread/qj1y7cw4dygcnczmymdwkfkp4ysq41ts
> > > > > > >
> > > > > > >
> > > > > > > Yufei
> > > > > >
> > > > >
> > >
> >
