[ https://issues.apache.org/jira/browse/SPARK-36236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384670#comment-17384670 ]
Apache Spark commented on SPARK-36236:
--------------------------------------

User 'vkorukanti' has created a pull request for this issue:
https://github.com/apache/spark/pull/33455

> RocksDB state store: Add additional metrics for better observability into
> state store operations
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36236
>                 URL: https://issues.apache.org/jira/browse/SPARK-36236
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Structured Streaming
>    Affects Versions: 3.1.2
>            Reporter: Venki Korukanti
>            Priority: Major
>
> Proposing to add the following new metrics to {{customMetrics}} under
> {{stateOperators}} in the {{StreamingQueryProgress}} event. These metrics
> give better visibility into the RocksDB-based state store in streaming jobs.
> * {{rocksdbGetCount}}: Number of get calls to the DB (doesn’t include gets
> from WriteBatch, the in-memory batch used for staging writes).
> * {{rocksdbPutCount}}: Number of put calls to the DB (doesn’t include puts
> to WriteBatch, the in-memory batch used for staging writes).
> * {{rocksdbTotalBytesReadByGet/rocksdbTotalBytesWrittenByPut}}: Number of
> uncompressed bytes read/written by get/put operations.
> * {{rocksdbReadBlockCacheHitCount/rocksdbReadBlockCacheMissCount}}: Indicate
> how effective the RocksDB block cache is at avoiding local disk reads.
> * {{rocksdbTotalBytesReadByCompaction/rocksdbTotalBytesWrittenByCompaction}}:
> Number of bytes the compaction process read from disk and wrote to disk.
> * {{rocksdbTotalCompactionTime}}: Time (in ns) taken by compactions (both
> background compactions and the optional compaction initiated during the
> commit).
> * {{rocksdbWriterStallDuration}}: Time (in ns) the writer was stalled due to
> a background compaction or flushing of the immutable memtables to disk.
> * {{rocksdbTotalBytesReadThroughIterator}}: Some stateful operations (such
> as timeout processing in FlatMapGroupsWithState and watermarking) require
> reading all the data in the DB through an iterator. This metric reports the
> total size of uncompressed data read using the iterator.
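
To illustrate how the metrics listed above might be consumed, here is a minimal Python sketch that derives a block cache hit ratio from a progress payload. It assumes the progress event is available as a plain dict with a `stateOperators` list whose entries each carry a `customMetrics` map, which is the shape the issue describes. The metric names are the ones proposed in the issue and may differ in a released Spark version. The helper names (`block_cache_hit_ratio`, `summarize_state_operators`) and the sample values are hypothetical and are not part of Spark or of the pull request.

```python
from typing import Any, Dict, List, Optional


def block_cache_hit_ratio(custom_metrics: Dict[str, Any]) -> Optional[float]:
    """Return hits / (hits + misses), or None if no block cache reads were recorded."""
    # Metric names as proposed in SPARK-36236 (assumed, may differ in releases).
    hits = int(custom_metrics.get("rocksdbReadBlockCacheHitCount", 0))
    misses = int(custom_metrics.get("rocksdbReadBlockCacheMissCount", 0))
    total = hits + misses
    return hits / total if total else None


def summarize_state_operators(progress: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect a few RocksDB metrics for each stateful operator in one progress event."""
    summaries = []
    for index, operator in enumerate(progress.get("stateOperators", [])):
        metrics = operator.get("customMetrics", {})
        summaries.append({
            "operator": index,
            "gets": metrics.get("rocksdbGetCount"),
            "puts": metrics.get("rocksdbPutCount"),
            "cacheHitRatio": block_cache_hit_ratio(metrics),
        })
    return summaries


# Hypothetical payload with made-up values, shaped like the event described above.
sample_progress = {
    "stateOperators": [
        {
            "customMetrics": {
                "rocksdbGetCount": 1200,
                "rocksdbPutCount": 300,
                "rocksdbReadBlockCacheHitCount": 900,
                "rocksdbReadBlockCacheMissCount": 100,
            }
        }
    ]
}

summary = summarize_state_operators(sample_progress)
print(summary)
```

In a running job the same functions could be applied to whatever dict form of the progress event the application has at hand. A persistently low ratio would suggest that most reads are falling through to local disk.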