[ 
https://issues.apache.org/jira/browse/SPARK-33827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-33827.
----------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 30827
[https://github.com/apache/spark/pull/30827]

> Unload State Store asap once it becomes inactive
> ------------------------------------------------
>
>                 Key: SPARK-33827
>                 URL: https://issues.apache.org/jira/browse/SPARK-33827
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 3.2.0
>
>
> SS maintains state stores in executors across batches. Due to the nature of 
> Spark scheduling, a state store might be allocated on another executor in 
> next batch. The state store in previous batch becomes inactive.
> Now we run a maintenance task periodically to unload inactive state stores. 
> So there will be some delays between a state store becomes inactive and it is 
> unloaded.
> Per the discussion on https://github.com/apache/spark/pull/30770 with 
> [~kabhwan], I think the preference is to unload inactive state store asap.
> However, we can force Spark to always allocate a state store to same 
> executor, by using task locality configuration. This can reduce the 
> possibility to have inactive state store.
> Normally, I think with locality configuration, we might not able to see 
> inactive state store generally. There is still chance an executor can be 
> failed and reallocated, but in this case, inactive state store is also lost 
> too. So it is not an issue.
> So unloading inactive store asap is only useful when we don't use task 
> locality to force state store locality across batches.
> The required change to make driver-executor bi-directional for state store 
> management looks non-trivial. If we already can reduce possibility of 
> inactive store, is it still worth making non-trivial here?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to