[jira] [Assigned] (SPARK-24441) Expose total size of states in HDFSBackedStateStoreProvider
[ https://issues.apache.org/jira/browse/SPARK-24441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-24441:

    Assignee: Apache Spark

> Expose total size of states in HDFSBackedStateStoreProvider
> -----------------------------------------------------------
>
>                 Key: SPARK-24441
>                 URL: https://issues.apache.org/jira/browse/SPARK-24441
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.3.0
>            Reporter: Jungtaek Lim
>            Assignee: Apache Spark
>            Priority: Major
>
> While Spark exposes state metrics for a single state, it still doesn't
> expose the overall memory usage of state (loadedMaps) in
> HDFSBackedStateStoreProvider.
> Since HDFSBackedStateStoreProvider caches multiple versions of the entire
> state in a hash map, this can occupy much more memory than a single version
> of the state. Based on the default value of minVersionsToRetain, the cache
> map can grow to more than 100 times the size of a single state. It would be
> better to expose it as well so that end users can determine the actual
> memory usage for state.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-24441) Expose total size of states in HDFSBackedStateStoreProvider
[ https://issues.apache.org/jira/browse/SPARK-24441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-24441:

    Assignee:     (was: Apache Spark)

> Expose total size of states in HDFSBackedStateStoreProvider
> -----------------------------------------------------------
>
>                 Key: SPARK-24441
>                 URL: https://issues.apache.org/jira/browse/SPARK-24441
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.3.0
>            Reporter: Jungtaek Lim
>            Priority: Major
>
> While Spark exposes state metrics for a single state, it still doesn't
> expose the overall memory usage of state (loadedMaps) in
> HDFSBackedStateStoreProvider.
> Since HDFSBackedStateStoreProvider caches multiple versions of the entire
> state in a hash map, this can occupy much more memory than a single version
> of the state. Based on the default value of minVersionsToRetain, the cache
> map can grow to more than 100 times the size of a single state. It would be
> better to expose it as well so that end users can determine the actual
> memory usage for state.
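
To illustrate the concern in the issue description, here is a toy model in Python. It is only a sketch, not Spark code: the names `build_cache`, `estimate_map_bytes`, and `total_cache_bytes` are hypothetical, the cache is a plain dict standing in for loadedMaps, and it assumes every cached version is a fully independent copy of the state. Actual memory use in Spark depends on how much data the cached versions share, so the ratio printed here should be read as an upper-bound style illustration.

```python
import sys


def estimate_map_bytes(state):
    """Rough size of one state version: the dict itself plus its keys and values."""
    return sys.getsizeof(state) + sum(
        sys.getsizeof(k) + sys.getsizeof(v) for k, v in state.items()
    )


def build_cache(num_versions, keys_per_version):
    """Hypothetical stand-in for loadedMaps: version number -> full copy of the state."""
    cache = {}
    for version in range(1, num_versions + 1):
        # Assumption: each version holds its own independent copy of every entry.
        cache[version] = {
            f"key-{i}": f"value-{i}" for i in range(keys_per_version)
        }
    return cache


def total_cache_bytes(cache):
    """Sum the estimated size of every cached version."""
    return sum(estimate_map_bytes(state) for state in cache.values())


# 100 retained versions mirrors the "more than 100 times" figure in the description.
cache = build_cache(num_versions=100, keys_per_version=1000)
latest = estimate_map_bytes(cache[max(cache)])
total = total_cache_bytes(cache)

print(f"latest version only: {latest} bytes")
print(f"all cached versions: {total} bytes ({total / latest:.1f}x)")
```

A metric that reports only the latest version would show the first number, while the process actually holds something closer to the second, which is the gap the issue asks to expose.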