Jungtaek Lim created SPARK-24441:
------------------------------------

             Summary: Expose total size of states in HDFSBackedStateStoreProvider
                 Key: SPARK-24441
                 URL: https://issues.apache.org/jira/browse/SPARK-24441
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.3.0
            Reporter: Jungtaek Lim

While Spark exposes state metrics for a single state, it still does not expose the overall memory usage of the cached state (loadedMaps) in HDFSBackedStateStoreProvider.

Because HDFSBackedStateStoreProvider caches multiple versions of the entire state in a hash map, the cache can occupy much more memory than a single version of the state. With the default value of minVersionsToRetain, the cache map can grow to more than 100 times the size of a single state.

It would be better to expose this total as well, so that end users can determine the actual memory usage of the state.
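
To illustrate the arithmetic behind this concern, here is a rough back-of-the-envelope sketch. It is not the HDFSBackedStateStoreProvider implementation: the function names `simulate_cache` and `total_cached_state_bytes` are hypothetical, and the model assumes that every cached version holds a full copy of the state at a constant size and that versions older than the retention window are evicted after each commit.

```python
from typing import Dict


def simulate_cache(single_state_bytes: int,
                   versions_committed: int,
                   min_versions_to_retain: int) -> Dict[int, int]:
    """Model a version -> size cache (a stand-in for loadedMaps).

    Assumption: each version stores a full copy of the state, and only the
    newest `min_versions_to_retain` versions are kept after each commit.
    """
    loaded: Dict[int, int] = {}
    for version in range(1, versions_committed + 1):
        loaded[version] = single_state_bytes
        # Evict every version that falls outside the retention window.
        oldest_to_keep = version - min_versions_to_retain + 1
        for stale in [v for v in loaded if v < oldest_to_keep]:
            del loaded[stale]
    return loaded


def total_cached_state_bytes(loaded: Dict[int, int]) -> int:
    """Sum the estimated size of all cached versions."""
    return sum(loaded.values())


if __name__ == "__main__":
    single = 10 * 1024 * 1024  # assume a 10 MiB state per version
    cache = simulate_cache(single, versions_committed=250,
                           min_versions_to_retain=100)
    ratio = total_cached_state_bytes(cache) / single
    print(f"cached versions: {len(cache)}, total/single ratio: {ratio:.0f}")
```

Under these assumptions the per-version metric reports one state's worth of memory while the cache holds as many copies as the retention setting allows, which is the gap the proposed metric is meant to close.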