pvary commented on code in PR #6765:
URL: https://github.com/apache/iceberg/pull/6765#discussion_r1101007377
##########
docs/flink-getting-started.md:
##########
@@ -747,6 +747,44 @@ FlinkSink.builderFor(
.append();
```
+### Monitoring metrics
+
+The following Flink metrics are provided by the Flink Iceberg sink.
+
+Parallel writer metrics are added under the sub group of `IcebergStreamWriter`.
+They should have the following key-value tags.
+* table: full table name (like iceberg.my_db.my_table)
+* subtask_index: writer subtask index starting from 0
+
+| Metric name                | Metric type | Description |
+| -------------------------- | ----------- | ----------- |
+| lastFlushDurationMs        | Gauge       | The duration (in milliseconds) that writer subtasks take to flush and upload the files during checkpoint. |
+| flushedDataFiles           | Counter     | Number of data files flushed and uploaded. |
+| flushedDeleteFiles         | Counter     | Number of delete files flushed and uploaded. |
+| flushedReferencedDataFiles | Counter     | Number of data files referenced by the flushed delete files. |
+| dataFilesSizeHistogram     | Histogram   | Histogram distribution of data file sizes (in bytes). |
+| deleteFilesSizeHistogram   | Histogram   | Histogram distribution of delete file sizes (in bytes). |
+
+Committer metrics are added under the sub group of `IcebergFilesCommitter`.
+They should have the following key-value tags.
+* table: full table name (like iceberg.my_db.my_table)
+
+| Metric name                             | Metric type | Description |
+| --------------------------------------- | ----------- | ----------- |
+| lastCheckpointDurationMs                | Gauge       | The duration (in milliseconds) that the committer operator takes to checkpoint its state. |
+| lastCommitDurationMs                    | Gauge       | The duration (in milliseconds) that the Iceberg table commit takes. |
+| committedDataFilesCount                 | Counter     | Number of data files committed. |
+| committedDataFilesRecordCount           | Counter     | Number of records contained in the committed data files. |
+| committedDataFilesByteCount             | Counter     | Number of bytes contained in the committed data files. |
+| committedDeleteFilesCount               | Counter     | Number of delete files committed. |
+| committedDeleteFilesRecordCount         | Counter     | Number of records contained in the committed delete files. |
+| committedDeleteFilesByteCount           | Counter     | Number of bytes contained in the committed delete files. |
+| elapsedSecondsSinceLastSuccessfulCommit | Gauge       | Elapsed time (in seconds) since the last successful Iceberg commit. |
+
+`elapsedSecondsSinceLastSuccessfulCommit` is an ideal alerting metric for these scenarios.
Review Comment:
+1
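
For illustration, an alert on `elapsedSecondsSinceLastSuccessfulCommit` could compare the gauge value against a threshold derived from the job's checkpoint interval, since the sink commits on checkpoint. The sketch below is only one possible rule: the class `CommitStalenessCheck`, the method `isStale`, and the "tolerated missed checkpoints" parameter are hypothetical names, not part of Iceberg or Flink, and in practice this logic would more likely live in the alerting system than in the job itself.

```java
import java.time.Duration;

/** Hypothetical helper for illustration only; not an Iceberg or Flink API. */
final class CommitStalenessCheck {

    private CommitStalenessCheck() {}

    /**
     * Returns true when the value reported by the
     * elapsedSecondsSinceLastSuccessfulCommit gauge exceeds the allowed staleness.
     *
     * Assumption: a healthy job commits roughly once per checkpoint interval,
     * so the threshold is the interval times (tolerated missed checkpoints + 1).
     */
    static boolean isStale(long elapsedSeconds,
                           Duration checkpointInterval,
                           int toleratedMissedCheckpoints) {
        if (elapsedSeconds < 0 || toleratedMissedCheckpoints < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        long thresholdSeconds =
            checkpointInterval.getSeconds() * (toleratedMissedCheckpoints + 1L);
        return elapsedSeconds > thresholdSeconds;
    }
}
```

With a 60-second checkpoint interval and three tolerated misses, the threshold is 240 seconds, so a gauge reading of 400 would trigger the alert and a reading of 240 would not.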