danny0405 opened a new issue, #18731:
URL: https://github.com/apache/hudi/issues/18731

   ## Background
   Record-level index (RLI) support was recently introduced in Flink streaming, so there is an opportunity to add more metrics for monitoring the runtime behavior of RLI access.
   
   ## The Current Metrics
   
   | Area | Metric | Type | Description |
   |---|---|---|---|
   | Write runtime | `recordWrittenPerSecond` | Meter | Records written per second by the Flink write task. |
   | Write runtime | `currentCommitWrittenRecords` | Gauge | Records written in the current checkpoint/commit window. |
   | Write runtime | `dataFlushCosts` | Gauge | Time spent flushing buffered data during checkpoint. |
   | Write runtime | `writeBufferedSize` | Gauge | Current buffered write data size in the task. |
   | Write runtime | `fileFlushTotalCosts` | Gauge | Total file flush time in the current checkpoint/commit window. |
   | Write runtime | `numOfFilesWritten` | Gauge | Number of files written in the current window. |
   | Write runtime | `numOfOpenHandle` | Gauge | Number of write handles opened in the current window. |
   | Write runtime | `numOfRecordWriteFailures` | Gauge | Number of record-level write failures in the current window. |
   | Write runtime | `handleSwitchPerSecond` | Meter | Rate of switching between write handles/files. |
   | Write runtime | `handleCreationCosts` | Histogram | Distribution of write handle creation time. |
   | Write runtime | `fileFlushCost` | Histogram | Distribution of single file flush time. |
   | Commit metadata | `<action>.totalPartitionsWritten` | Gauge | Number of partitions touched by the completed action. |
   | Commit metadata | `<action>.totalFilesInsert` | Gauge | New files created by the completed action. |
   | Commit metadata | `<action>.totalFilesUpdate` | Gauge | Existing files updated by the completed action. |
   | Commit metadata | `<action>.totalRecordsWritten` | Gauge | Total records written by the completed action. |
   | Commit metadata | `<action>.totalUpdateRecordsWritten` | Gauge | Updated records written by the completed action. |
   | Commit metadata | `<action>.totalInsertRecordsWritten` | Gauge | Inserted records written by the completed action. |
   | Commit metadata | `<action>.totalBytesWritten` | Gauge | Bytes written by the completed action. |
   | Commit metadata | `<action>.totalScanTime` | Gauge | Time spent scanning old data during the action. |
   | Commit metadata | `<action>.totalCompactedRecordsUpdated` | Gauge | Records updated through compaction. |
   | Commit metadata | `<action>.totalLogFilesCompacted` | Gauge | Log files compacted by the action. |
   | Commit metadata | `<action>.totalLogFilesSize` | Gauge | Total size of compacted log files. |
   | Commit metadata | `<action>.commitTime` | Gauge | Commit instant time as epoch milliseconds. |
   | Commit metadata | `<action>.duration` | Gauge | Elapsed time from commit instant to metric update. |
   | Compaction | `compaction.pendingCompactionCount` | Gauge | Number of pending compaction instants. |
   | Compaction | `compaction.compactionDelay` | Gauge | Age in seconds of the earliest pending compaction. |
   | Compaction | `compaction.compactionCost` | Gauge | Time spent executing a compaction operation. |
   | Compaction | `compaction.compactionStateSignal` | Gauge | `0` means completed, `1` means rolled back. |
   | Clustering | `clustering.pendingClusteringCount` | Gauge | Number of pending clustering instants. |
   | Clustering | `clustering.clusteringDelay` | Gauge | Age in seconds of the earliest pending clustering. |
   | Clustering | `clustering.clusteringCost` | Gauge | Time spent executing a clustering operation. |
   | Bulk sort | `memoryUsedSizeInBytes` | Gauge | Memory currently used by external sort. |
   | Bulk sort | `numSpillFiles` | Gauge | Number of spill files created by sort. |
   | Bulk sort | `spillInBytes` | Gauge | Bytes spilled to disk by sort. |
   | RocksDB index | `rocksdb.disk.total_sst_files_size` | Gauge | Total SST file size in RocksDB index state. |
   | RocksDB index | `rocksdb.disk.live_sst_files_size` | Gauge | Live SST file size in RocksDB index state. |
   | RocksDB index | `rocksdb.block_cache.capacity` | Gauge | RocksDB block cache capacity. |
   | RocksDB index | `rocksdb.block_cache.usage` | Gauge | Current RocksDB block cache usage. |
   | RocksDB index | `rocksdb.block_cache.hit_ratio` | Gauge | Overall RocksDB block cache hit ratio. |
   | RocksDB index | `rocksdb.block_cache.data_hit_ratio` | Gauge | Data block cache hit ratio. |
   | RocksDB index | `rocksdb.block_cache.index_hit_ratio` | Gauge | Index block cache hit ratio. |
   | RocksDB index | `rocksdb.block_cache.filter_hit_ratio` | Gauge | Filter block cache hit ratio. |
   | RocksDB index | `rocksdb.memtable.active_size` | Gauge | Size of the active RocksDB memtable. |
   | RocksDB index | `rocksdb.memtable.all_size` | Gauge | Total size of all RocksDB memtables. |
   | RocksDB index | `rocksdb.memtable.immutable_count` | Gauge | Number of immutable RocksDB memtables. |
   | RocksDB index | `rocksdb.memtable.hit_ratio` | Gauge | RocksDB memtable lookup hit ratio. |
   | Streaming read | `issuedInstant` | Gauge | Last Hudi instant issued by streaming read. |
   | Streaming read | `issuedInstantDelay` | Gauge | Delay in seconds since the last issued instant. |
   | Streaming read | `splitLatestCommit` | Gauge | Latest commit seen by the current read split. |
   | Streaming read | `splitLatestCommitDelay` | Gauge | Delay in seconds since the split’s latest commit. |
   
   `<action>` is usually `compaction` or `clustering`. Metadata-table compaction exposes the same compaction metrics with an `mdt.` prefix.
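
To make the naming convention concrete, here is a minimal Python sketch of how the `<action>` placeholder and the `mdt.` prefix might expand into full metric names. The helper names (`commit_metric_names`, `with_mdt_prefix`) are hypothetical, and the assumption that `mdt.` is prepended directly in front of the existing name is mine; check the actual names reported by your metrics reporter.

```python
# Hypothetical helpers for illustration only; not part of Hudi.

# A few of the commit-metadata suffixes listed in the table above.
COMMIT_METRIC_SUFFIXES = [
    "totalRecordsWritten",
    "totalBytesWritten",
    "duration",
]


def commit_metric_names(action, suffixes=COMMIT_METRIC_SUFFIXES):
    """Expand the `<action>` placeholder into concrete metric names."""
    return [f"{action}.{suffix}" for suffix in suffixes]


def with_mdt_prefix(name):
    """Assumption: metadata-table metrics prepend `mdt.` to the name."""
    return f"mdt.{name}"


if __name__ == "__main__":
    for action in ("compaction", "clustering"):
        for name in commit_metric_names(action):
            print(name)
    print(with_mdt_prefix("compaction.compactionCost"))
```

A list like this can be handy when building dashboard queries or alert rules that need to cover both actions.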

