[ 
https://issues.apache.org/jira/browse/SPARK-12260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051949#comment-15051949
 ] 

Mao, Wei commented on SPARK-12260:
----------------------------------

Recovering from a process restart takes several steps: 
1) Dump the in-memory state when StreamingContext.stop is invoked. 
2) Load the dump back from HDFS or another source, and convert it to an RDD.
3) Pass the RDD into updateStateByKey as the initial state.

As you said, step 3) is already supported by the current code, which is great. But 
step 1) is missing. In short, the main purpose of this JIRA is to add a new 
callback function to StreamingContext.stop, so that users have a chance to dump 
specified in-memory state during streaming context shutdown. 
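
To make the three steps concrete, here is a rough sketch of the flow in plain 
Python. It does not use Spark at all: ToyStreamingContext, on_stop, dump_state, 
load_state and run_once are hypothetical names invented for illustration, a 
local JSON file stands in for HDFS, and a dict stands in for the state RDD. It 
only shows the shape of the proposed stop-time callback, not an actual API.

```python
import json
import os


class ToyStreamingContext:
    """Hypothetical stand-in for a streaming context; this is not Spark's API."""

    def __init__(self, initial_state=None):
        # Step 3: state loaded from a previous run seeds the new run.
        self.state = dict(initial_state or {})
        self._stop_callbacks = []

    def on_stop(self, callback):
        # The proposed hook (assumed name): register a function to run in stop().
        self._stop_callbacks.append(callback)

    def process_batch(self, records):
        # Keep a running sum per key, in the spirit of updateStateByKey.
        for key, value in records:
            self.state[key] = self.state.get(key, 0) + value

    def stop(self):
        # Step 1: hand the in-memory state to each callback before shutting down.
        for callback in self._stop_callbacks:
            callback(self.state)


def dump_state(path):
    """Build a stop callback that writes the state to a JSON file."""
    def _dump(state):
        with open(path, "w") as f:
            json.dump(state, f)
    return _dump


def load_state(path):
    # Step 2: read the dump back; a missing file means this is the first run.
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def run_once(path, records):
    """One job lifetime: load old state, process a batch, dump state on stop."""
    ctx = ToyStreamingContext(initial_state=load_state(path))
    ctx.on_stop(dump_state(path))
    ctx.process_batch(records)
    ctx.stop()
    return ctx.state
```

Calling run_once twice with the same path models a stop followed by a restart: 
the second run starts from the counts the first run dumped, rather than from an 
empty state.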

> Graceful Shutdown with In-Memory State
> --------------------------------------
>
>                 Key: SPARK-12260
>                 URL: https://issues.apache.org/jira/browse/SPARK-12260
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>            Reporter: Mao, Wei
>              Labels: streaming
>
> Users often stop and restart their streaming jobs for tasks such as 
> maintenance, software upgrades or even application logic updates. When a job 
> restarts, it should pick up where it left off, i.e. any state information that 
> existed when the job stopped should be used as the initial state when the job 
> restarts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

