[ https://issues.apache.org/jira/browse/SPARK-33827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved SPARK-33827.
----------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 30827
[https://github.com/apache/spark/pull/30827]

> Unload State Store asap once it becomes inactive
> ------------------------------------------------
>
>                 Key: SPARK-33827
>                 URL: https://issues.apache.org/jira/browse/SPARK-33827
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Structured Streaming (SS) maintains state stores in executors across
> batches. Due to the nature of Spark scheduling, a state store might be
> allocated on another executor in the next batch. The state store loaded on
> the previous executor then becomes inactive.
> Currently we run a maintenance task periodically to unload inactive state
> stores, so there is some delay between when a state store becomes inactive
> and when it is unloaded.
> Per the discussion on https://github.com/apache/spark/pull/30770 with
> [~kabhwan], I think the preference is to unload an inactive state store asap.
> However, we can force Spark to always allocate a state store to the same
> executor by using the task locality configuration. This reduces the chance
> of having inactive state stores.
> With the locality configuration in place, I think we would rarely see
> inactive state stores. There is still a chance that an executor fails and
> its state stores are reallocated, but in that case the inactive state store
> is lost as well, so it is not an issue.
> So unloading inactive stores asap is only useful when we don't use task
> locality to force state store locality across batches.
> The change required to make driver-executor communication bi-directional
> for state store management looks non-trivial. If we can already reduce the
> chance of inactive stores, is it still worth making a non-trivial change
> here?
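
To illustrate the delay described in the issue, here is a minimal Python sketch of a toy model. It is not Spark's actual implementation, and the names `Executor`, `Coordinator`, `run_batch`, and `maintenance` are hypothetical, introduced only for this illustration. It shows a state store staying loaded on its previous executor after the partition moves to another one, until a periodic maintenance pass unloads it.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Toy executor (hypothetical) holding loaded state stores by partition id."""
    name: str
    loaded: set = field(default_factory=set)


class Coordinator:
    """Toy driver-side tracker (hypothetical): active executor per partition."""

    def __init__(self):
        self.active = {}  # partition id -> executor name

    def report_active(self, partition, executor_name):
        self.active[partition] = executor_name

    def is_active(self, partition, executor_name):
        return self.active.get(partition) == executor_name


def run_batch(assignment, executors, coordinator):
    """Load each partition's store on its assigned executor and report it."""
    for partition, name in assignment.items():
        executors[name].loaded.add(partition)
        coordinator.report_active(partition, name)


def maintenance(executor, coordinator):
    """Periodic pass: unload stores no longer mapped to this executor."""
    inactive = {
        p for p in executor.loaded
        if not coordinator.is_active(p, executor.name)
    }
    executor.loaded -= inactive
    return inactive


executors = {"exec-1": Executor("exec-1"), "exec-2": Executor("exec-2")}
coordinator = Coordinator()

# Batch 0: both partitions run on exec-1.
run_batch({0: "exec-1", 1: "exec-1"}, executors, coordinator)
# Batch 1: the scheduler places partition 1 on exec-2 instead.
run_batch({0: "exec-1", 1: "exec-2"}, executors, coordinator)

# Until maintenance runs, exec-1 still holds the inactive store for partition 1.
stale_before = set(executors["exec-1"].loaded)
unloaded = maintenance(executors["exec-1"], coordinator)
stale_after = set(executors["exec-1"].loaded)
print(stale_before, unloaded, stale_after)
```

In this model, if batch 1 had kept partition 1 on `exec-1` (the effect the issue attributes to task locality), the maintenance pass would find nothing to unload.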