[ 
https://issues.apache.org/jira/browse/SPARK-33827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

L. C. Hsieh updated SPARK-33827:
--------------------------------
    Description: 
SS maintains state stores in executors across batches. Due to the nature of 
Spark scheduling, a state store might be allocated on another executor in next 
batch. The state store in previous batch becomes inactive.

Now we run a maintenance task periodically to unload inactive state stores. So 
there will be some delays between a state store becomes inactive and it is 
unloaded.

Per the discussion on https://github.com/apache/spark/pull/30770 with 
[~kabhwan], I think the preference is to unload inactive state store asap.

However, we can force Spark to always allocate a state store to same executor, 
by using task locality configuration. This can reduce the possibility to have 
inactive state store.

Normally, I think with locality configuration, we might not able to see 
inactive state store generally. There is still chance an executor can be failed 
and reallocated, but in this case, inactive state store is also lost too. So it 
is not an issue.

So unloading inactive store asap is only useful when we don't use task locality 
to force state store locality across batches.

The required change to make driver-executor bi-directional for state store 
management looks non-trivial. If we already can reduce possibility of inactive 
store, is it still worth making non-trivial here?







  was:
SS maintains state stores in executors across batches. Due to the nature of 
Spark scheduling, a state store might be allocated on another executor in next 
batch. The state store in previous batch becomes inactive.

Now we run a maintenance task periodically to unload inactive state stores. So 
there will be some delays between a state store becomes inactive and it is 
unloaded.

Per the discussion on https://github.com/apache/spark/pull/30770 with 
[~kabhwan], I think the preference is to unload inactive state asap.

However, we can force Spark to always allocate a state store to same executor, 
by using task locality configuration. This can reduce the possibility to have 
inactive state store.

Normally, I think with locality configuration, we might not able to see 
inactive state store generally. There is still chance an executor can be failed 
and reallocated, but in this case, inactive state store is also lost too. So it 
is not an issue.

The required change to make driver-executor bi-directional for state store 
management looks non-trivial. If we already can reduce possibility of inactive 
store, is it still worth making non-trivial here?








> Unload State Store asap once it becomes inactive
> ------------------------------------------------
>
>                 Key: SPARK-33827
>                 URL: https://issues.apache.org/jira/browse/SPARK-33827
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Priority: Major
>
> SS maintains state stores in executors across batches. Due to the nature of 
> Spark scheduling, a state store might be allocated on another executor in 
> next batch. The state store in previous batch becomes inactive.
> Now we run a maintenance task periodically to unload inactive state stores. 
> So there will be some delays between a state store becomes inactive and it is 
> unloaded.
> Per the discussion on https://github.com/apache/spark/pull/30770 with 
> [~kabhwan], I think the preference is to unload inactive state store asap.
> However, we can force Spark to always allocate a state store to same 
> executor, by using task locality configuration. This can reduce the 
> possibility to have inactive state store.
> Normally, I think with locality configuration, we might not able to see 
> inactive state store generally. There is still chance an executor can be 
> failed and reallocated, but in this case, inactive state store is also lost 
> too. So it is not an issue.
> So unloading inactive store asap is only useful when we don't use task 
> locality to force state store locality across batches.
> The required change to make driver-executor bi-directional for state store 
> management looks non-trivial. If we already can reduce possibility of 
> inactive store, is it still worth making non-trivial here?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to