[ 
https://issues.apache.org/jira/browse/SPARK-36236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

L. C. Hsieh reassigned SPARK-36236:
-----------------------------------

    Assignee: Venki Korukanti

> RocksDB state store: Add additional metrics for better observability into 
> state store operations
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36236
>                 URL: https://issues.apache.org/jira/browse/SPARK-36236
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Structured Streaming
>    Affects Versions: 3.1.2
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>            Priority: Major
>
> Proposing to add the following new metrics to {{customMetrics}} under
> {{stateOperators}} in the {{StreamingQueryProgress}} event. These metrics
> give better visibility into the RocksDB-based state store in streaming jobs.
>  * {{rocksdbGetCount}}: Number of get calls to the DB (doesn’t include gets
> from WriteBatch, the in-memory batch used for staging writes)
>  * {{rocksdbPutCount}}: Number of put calls to the DB (doesn’t include puts to
> WriteBatch, the in-memory batch used for staging writes)
>  * {{rocksdbTotalBytesReadByGet/rocksdbTotalBytesWrittenByPut}}: Number of 
> uncompressed bytes read/written by get/put operations
>  * {{rocksdbReadBlockCacheHitCount/rocksdbReadBlockCacheMissCount}}: Indicate
> how effective the RocksDB block cache is at avoiding local disk reads
>  * {{rocksdbTotalBytesReadByCompaction/rocksdbTotalBytesWrittenByCompaction}}:
> Number of bytes the compaction process read from disk and wrote to disk
>  * {{rocksdbTotalCompactionTime}}: Time (in ns) taken by compactions (both
> background and the optional compaction initiated during the commit)
>  * {{rocksdbWriterStallDuration}}: Time (in ns) the writer has stalled due to
> a background compaction or flushing of the immutable memtables to disk
>  * {{rocksdbTotalBytesReadThroughIterator}}: Some stateful operations
> (such as timeout processing in FlatMapGroupsWithState and watermarking)
> require reading all the data in the DB through an iterator. This metric
> reports the total size of uncompressed data read using the iterator.
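
As a rough illustration of how the proposed counters might be consumed, the sketch below derives a block cache hit ratio per state operator from a progress event parsed as JSON. It assumes the metrics land under {{stateOperators}} -> {{customMetrics}} with the names proposed above; the sample event, its values, and the helper name block_cache_hit_ratios are made up for illustration and are not part of Spark.

```python
import json

# Hypothetical excerpt of a StreamingQueryProgress event; all values are made up.
SAMPLE_PROGRESS_JSON = """
{
  "stateOperators": [
    {
      "customMetrics": {
        "rocksdbGetCount": 1200,
        "rocksdbPutCount": 800,
        "rocksdbReadBlockCacheHitCount": 900,
        "rocksdbReadBlockCacheMissCount": 100
      }
    }
  ]
}
"""


def block_cache_hit_ratios(progress):
    """Return one block cache hit ratio per state operator.

    An operator with no recorded block cache reads yields None.
    """
    ratios = []
    for op in progress.get("stateOperators", []):
        metrics = op.get("customMetrics", {})
        hits = metrics.get("rocksdbReadBlockCacheHitCount", 0)
        misses = metrics.get("rocksdbReadBlockCacheMissCount", 0)
        total = hits + misses
        ratios.append(hits / total if total else None)
    return ratios


progress = json.loads(SAMPLE_PROGRESS_JSON)
print(block_cache_hit_ratios(progress))  # prints [0.9]
```

A persistently low ratio would suggest that many reads are served from local disk and not from the block cache, which is the signal the hit and miss counters are meant to expose.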


