Hi Noritaka,

Thanks for writing this up. I think this is a good direction overall. Using OpenTelemetry as a vendor-neutral integration layer makes sense here. It gives Iceberg clients a standard way to integrate with modern observability platforms without adding per-backend integrations to the project.

Moreover, even for REST catalogs, the catalog itself is usually not the right place to serve as a full metrics monitoring platform. In practice, metrics still need to flow into systems like Prometheus, CloudWatch, Datadog, or Grafana Cloud for storage, dashboards, alerting, and correlation. The main extra value a REST catalog could provide is enrichment with catalog-specific metadata, but that feels like a relatively minor benefit compared to having a standard observability integration path overall.

Yufei

On Wed, May 20, 2026 at 6:39 PM Noritaka Sekiyama via dev <[email protected]> wrote:

> Hi all,
>
> I'd like to propose adding an OpenTelemetry-based MetricsReporter to
> iceberg-core that exports ScanReport and CommitReport to any OTLP-compatible
> backend.
>
> # Background
> Iceberg ships three built-in MetricsReporter implementations today:
> LoggingMetricsReporter, InMemoryMetricsReporter (Spark-internal), and
> RESTMetricsReporter (REST catalog only).
> None of them give users an out-of-the-box way to ship scan/commit metrics to
> an external observability platform.
> The gap applies to Spark users on non-REST catalogs and to all non-Spark
> engines (Trino, Flink, etc.).
>
> # Motivation
> OpenTelemetry is the vendor-neutral CNCF standard for telemetry, supported
> by every major observability backend (Prometheus, CloudWatch, Datadog,
> Grafana Cloud, etc.).
> A single OTLP-based MetricsReporter in Iceberg lets users reach all of
> these without per-vendor integrations in the project.
> This is complementary to #14360, which adds OTel support to HTTPClient at
> the REST-catalog HTTP layer; this proposal covers the Iceberg-level
> ScanReport / CommitReport layer.
>
> # Proposal
> Issue: https://github.com/apache/iceberg/issues/16169
> PR: https://github.com/apache/iceberg/pull/16250
>
> The reporter follows the same SDK-ownership philosophy as #14360 - the
> host application (Spark/Flink/Trino/...) registers an OpenTelemetrySdk via
> GlobalOpenTelemetry, and the reporter just looks up a Meter from it.
> The reporter has zero Iceberg-specific catalog properties; everything else
> is owned by the host.
>
> The PR has been validated end-to-end against two unrelated OTLP backends
> (Databricks Zerobus and Amazon CloudWatch) - full procedures and queries
> are linked from the PR.
>
> # On dependencies
> Given the current sensitivity around new runtime dependencies in 1.11, the
> PR adds only opentelemetry-api to iceberg-core as compileOnly.
> The OpenTelemetry SDK and OTLP exporters are not added to the runtime
> classpath - they come from the host application.
> opentelemetry-sdk / -sdk-testing are testImplementation only.
>
> # Questions for the community
>
> Q1. Any objection to taking the opentelemetry-api compileOnly dependency
> in iceberg-core?
> Q2. Module placement: iceberg-core (current PR), or a
> separate iceberg-opentelemetry module?
>
> Thanks,
> Noritaka Sekiyama, Databricks
