danny0405 opened a new issue, #18731: URL: https://github.com/apache/hudi/issues/18731
## Background

With the recently introduced RLI support in Flink streaming, there is a chance to add more metrics to monitor the runtime behavior of RLI access.

## The Current Metrics

| Area | Metric | Type | Description |
|---|---|---|---|
| Write runtime | `recordWrittenPerSecond` | Meter | Records written per second by the Flink write task. |
| Write runtime | `currentCommitWrittenRecords` | Gauge | Records written in the current checkpoint/commit window. |
| Write runtime | `dataFlushCosts` | Gauge | Time spent flushing buffered data during checkpoint. |
| Write runtime | `writeBufferedSize` | Gauge | Current buffered write data size in the task. |
| Write runtime | `fileFlushTotalCosts` | Gauge | Total file flush time in the current checkpoint/commit window. |
| Write runtime | `numOfFilesWritten` | Gauge | Number of files written in the current window. |
| Write runtime | `numOfOpenHandle` | Gauge | Number of write handles opened in the current window. |
| Write runtime | `numOfRecordWriteFailures` | Gauge | Number of record-level write failures in the current window. |
| Write runtime | `handleSwitchPerSecond` | Meter | Rate of switching between write handles/files. |
| Write runtime | `handleCreationCosts` | Histogram | Distribution of write handle creation time. |
| Write runtime | `fileFlushCost` | Histogram | Distribution of single file flush time. |
| Commit metadata | `<action>.totalPartitionsWritten` | Gauge | Number of partitions touched by the completed action. |
| Commit metadata | `<action>.totalFilesInsert` | Gauge | New files created by the completed action. |
| Commit metadata | `<action>.totalFilesUpdate` | Gauge | Existing files updated by the completed action. |
| Commit metadata | `<action>.totalRecordsWritten` | Gauge | Total records written by the completed action. |
| Commit metadata | `<action>.totalUpdateRecordsWritten` | Gauge | Updated records written by the completed action. |
| Commit metadata | `<action>.totalInsertRecordsWritten` | Gauge | Inserted records written by the completed action. |
| Commit metadata | `<action>.totalBytesWritten` | Gauge | Bytes written by the completed action. |
| Commit metadata | `<action>.totalScanTime` | Gauge | Time spent scanning old data during the action. |
| Commit metadata | `<action>.totalCompactedRecordsUpdated` | Gauge | Records updated through compaction. |
| Commit metadata | `<action>.totalLogFilesCompacted` | Gauge | Log files compacted by the action. |
| Commit metadata | `<action>.totalLogFilesSize` | Gauge | Total size of compacted log files. |
| Commit metadata | `<action>.commitTime` | Gauge | Commit instant time as epoch milliseconds. |
| Commit metadata | `<action>.duration` | Gauge | Elapsed time from commit instant to metric update. |
| Compaction | `compaction.pendingCompactionCount` | Gauge | Number of pending compaction instants. |
| Compaction | `compaction.compactionDelay` | Gauge | Age in seconds of the earliest pending compaction. |
| Compaction | `compaction.compactionCost` | Gauge | Time spent executing a compaction operation. |
| Compaction | `compaction.compactionStateSignal` | Gauge | `0` means completed, `1` means rolled back. |
| Clustering | `clustering.pendingClusteringCount` | Gauge | Number of pending clustering instants. |
| Clustering | `clustering.clusteringDelay` | Gauge | Age in seconds of the earliest pending clustering. |
| Clustering | `clustering.clusteringCost` | Gauge | Time spent executing a clustering operation. |
| Bulk sort | `memoryUsedSizeInBytes` | Gauge | Memory currently used by external sort. |
| Bulk sort | `numSpillFiles` | Gauge | Number of spill files created by sort. |
| Bulk sort | `spillInBytes` | Gauge | Bytes spilled to disk by sort. |
| RocksDB index | `rocksdb.disk.total_sst_files_size` | Gauge | Total SST file size in RocksDB index state. |
| RocksDB index | `rocksdb.disk.live_sst_files_size` | Gauge | Live SST file size in RocksDB index state. |
| RocksDB index | `rocksdb.block_cache.capacity` | Gauge | RocksDB block cache capacity. |
| RocksDB index | `rocksdb.block_cache.usage` | Gauge | Current RocksDB block cache usage. |
| RocksDB index | `rocksdb.block_cache.hit_ratio` | Gauge | Overall RocksDB block cache hit ratio. |
| RocksDB index | `rocksdb.block_cache.data_hit_ratio` | Gauge | Data block cache hit ratio. |
| RocksDB index | `rocksdb.block_cache.index_hit_ratio` | Gauge | Index block cache hit ratio. |
| RocksDB index | `rocksdb.block_cache.filter_hit_ratio` | Gauge | Filter block cache hit ratio. |
| RocksDB index | `rocksdb.memtable.active_size` | Gauge | Size of the active RocksDB memtable. |
| RocksDB index | `rocksdb.memtable.all_size` | Gauge | Total size of all RocksDB memtables. |
| RocksDB index | `rocksdb.memtable.immutable_count` | Gauge | Number of immutable RocksDB memtables. |
| RocksDB index | `rocksdb.memtable.hit_ratio` | Gauge | RocksDB memtable lookup hit ratio. |
| Streaming read | `issuedInstant` | Gauge | Last Hudi instant issued by streaming read. |
| Streaming read | `issuedInstantDelay` | Gauge | Delay in seconds since the last issued instant. |
| Streaming read | `splitLatestCommit` | Gauge | Latest commit seen by the current read split. |
| Streaming read | `splitLatestCommitDelay` | Gauge | Delay in seconds since the split’s latest commit. |

`<action>` is usually `compaction` or `clustering`. Metadata-table compaction exposes the same compaction metrics with an `mdt.` prefix.
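
The delay gauges listed above (`compaction.compactionDelay`, `clustering.clusteringDelay`) are described as the age in seconds of the earliest pending instant. As a rough illustration of that definition, here is a minimal sketch in plain Java. It is not Hudi's implementation: the class name `PendingDelaySketch`, the method `delaySeconds`, the 17-digit `yyyyMMddHHmmssSSS` instant format, and the UTC time zone are all assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hudi.
final class PendingDelaySketch {

  // Assumption: instants are fixed-width strings like "20240101000500000"
  // (yyyyMMddHHmmss followed by three millisecond digits).
  private static final DateTimeFormatter INSTANT_FORMAT = new DateTimeFormatterBuilder()
      .appendPattern("yyyyMMddHHmmss")
      .appendValue(ChronoField.MILLI_OF_SECOND, 3)
      .toFormatter();

  // Returns the age in seconds of the earliest pending instant, or 0 if none is pending.
  static long delaySeconds(List<String> pendingInstants, Instant now) {
    if (pendingInstants.isEmpty()) {
      return 0L;
    }
    // Fixed-width timestamps sort chronologically under plain string ordering.
    String earliest = Collections.min(pendingInstants);
    // Assumption: instants are interpreted in UTC.
    Instant earliestTime = LocalDateTime.parse(earliest, INSTANT_FORMAT).toInstant(ZoneOffset.UTC);
    // Clamp to zero so clock skew never yields a negative delay.
    return Math.max(0L, Duration.between(earliestTime, now).getSeconds());
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2024-01-01T00:10:00Z");
    List<String> pending = Arrays.asList("20240101000830000", "20240101000500000");
    System.out.println(delaySeconds(pending, now)); // prints 300
  }
}
```

A value that keeps growing across checkpoints would suggest that the earliest pending plan is not being executed, which is the situation these gauges are meant to surface.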
