[ https://issues.apache.org/jira/browse/SPARK-24441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-24441:
------------------------------------

    Assignee: Apache Spark

> Expose total size of states in HDFSBackedStateStoreProvider
> -----------------------------------------------------------
>
> Key: SPARK-24441
> URL: https://issues.apache.org/jira/browse/SPARK-24441
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 2.3.0
> Reporter: Jungtaek Lim
> Assignee: Apache Spark
> Priority: Major
>
> While Spark exposes state metrics for a single state, it does not yet
> expose the overall memory usage of state (loadedMaps) in
> HDFSBackedStateStoreProvider.
> Because HDFSBackedStateStoreProvider caches multiple versions of the
> entire state in a hash map, the cache can occupy much more memory than a
> single version of the state. Based on the default value of
> minVersionsToRetain, the cache map can grow to more than 100 times the
> size of a single state. It would be better to expose this as well so that
> end users can determine the actual memory usage for state.
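
The growth described in the issue can be illustrated with a toy model. This is only a sketch in plain Python, not Spark code: `commit_version` and `total_entries` are hypothetical helpers, the cache is modelled as a dict keyed by version, every version is assumed to hold a full copy of the state, and a retention window of 100 versions is assumed to match the figure quoted in the description. It counts entries rather than bytes, so it shows the ratio between the cache and a single version, not real memory usage.

```python
from typing import Dict

State = Dict[str, int]


def commit_version(loaded_maps: Dict[int, State], version: int,
                   updates: State, min_versions_to_retain: int) -> None:
    """Add a new version to the cache and evict versions outside the window.

    Toy assumption: each version is a full copy of the previous version
    with the updates applied on top.
    """
    state = dict(loaded_maps.get(version - 1, {}))  # full copy of previous version
    state.update(updates)
    loaded_maps[version] = state

    # Keep only the newest `min_versions_to_retain` versions.
    oldest_kept = version - min_versions_to_retain + 1
    for old in [v for v in loaded_maps if v < oldest_kept]:
        del loaded_maps[old]


def total_entries(loaded_maps: Dict[int, State]) -> int:
    """Total number of entries held across all cached versions."""
    return sum(len(state) for state in loaded_maps.values())


if __name__ == "__main__":
    cache: Dict[int, State] = {}
    for version in range(1, 151):
        # The same 10 keys are rewritten in every batch, so a single
        # version of the state always holds 10 entries.
        batch = {f"key{i}": version for i in range(10)}
        commit_version(cache, version, batch, min_versions_to_retain=100)

    latest = len(cache[max(cache)])
    print("entries in latest version:", latest)
    print("entries across the cache: ", total_entries(cache))
```

Under these assumptions the cache holds as many full copies as there are retained versions, which is why a metric for a single version understates what the provider actually keeps in memory.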