Hi Owen,

This is a great idea. I think we could support this to some degree, for
example by running a user-defined function in the pre-commit aggregator:
https://github.com/apache/iceberg/blob/15485f5523d08aae2a503c143c51b6df2debb655/flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergWriteAggregator.java#L109

This would be fairly limited, but it would, for instance, allow keeping
track of the watermark.
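
To make the aggregator idea a bit more concrete, here is a rough,
self-contained sketch of what such a hook could look like. Nothing in it
exists in Iceberg today: SnapshotPropertyContributor, WatermarkContributor,
summaryFor and the "flink.*" property keys are hypothetical names chosen
for illustration, and the sketch only assumes that the aggregator knows the
checkpoint id and the current watermark when it builds the committable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CustomSnapshotProperties {

  /** Hypothetical user hook; NOT part of the Iceberg or Flink API. */
  public interface SnapshotPropertyContributor {
    // Called once per checkpoint; returns extra snapshot summary entries.
    Map<String, String> contribute(long checkpointId, long watermark);
  }

  /** Example contributor that records the checkpoint id and the watermark. */
  public static class WatermarkContributor implements SnapshotPropertyContributor {
    @Override
    public Map<String, String> contribute(long checkpointId, long watermark) {
      Map<String, String> props = new LinkedHashMap<>();
      // Property keys are made up for this sketch.
      props.put("flink.checkpoint-id", Long.toString(checkpointId));
      props.put("flink.watermark", Long.toString(watermark));
      return props;
    }
  }

  /**
   * Stand-in for the step the aggregator would perform: start from the
   * statically configured properties and overlay the contributed ones.
   */
  public static Map<String, String> summaryFor(
      Map<String, String> staticProps,
      SnapshotPropertyContributor contributor,
      long checkpointId,
      long watermark) {
    Map<String, String> merged = new LinkedHashMap<>(staticProps);
    merged.putAll(contributor.contribute(checkpointId, watermark));
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> summary =
        summaryFor(Map.of("app", "demo"), new WatermarkContributor(), 7L, 1000L);
    System.out.println(summary);
  }
}
```

In a real integration the merged map would still have to travel with the
committable to the committer, which could then apply each entry through
SnapshotUpdate.set(property, value). How it travels is left open here.
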
Another extension point would be
https://github.com/apache/iceberg/blob/15485f5523d08aae2a503c143c51b6df2debb655/flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSinkWriter.java#L99

Cheers,
Max

On Tue, Nov 25, 2025 at 9:03 PM Owen Zhang via dev <[email protected]> wrote:
>
> Hi Team,
>
> I'd like to initiate a discussion on a feature that appears to be valuable
> for Flink + Iceberg users. (Related issue: #14662)
>
> Currently, Iceberg's FlinkSink offers a set of data statistics in snapshot
> summary. However, there is no mechanism for application-level code to
> populate custom/application-defined statistics into Iceberg snapshot
> properties at commit time. An example use case:
>
> A Flink job computes the event-time boundaries for data ingested in each
> checkpoint (min/max event time in that batch) and aims to include this
> information in the snapshot summary, alongside the built-in statistics. The
> snapshot summary is a natural place for such metadata, since the statistics
> directly describe the data in that specific snapshot and belong with the
> snapshot itself. At the same time, the logic for computing these statistics
> is application-specific, making it difficult to handle entirely within the
> Iceberg framework.
>
> We've explored workarounds (such as static variables and external store, see
> this PR) to pass these values to the committer, but these approaches are
> either not robust or add unnecessary complexity.
>
> Is there a recommended Flink-native approach for allowing applications to
> propagate custom, per-checkpoint metadata from Flink operators to the Iceberg
> committer to write to snapshot summary? If not, would the community be
> interested in supporting such a feature?
>
> Any guidance or pointers to related work would be appreciated. We’re also
> happy to contribute if this aligns with project goals.
>
> Thanks,
> Owen
