Hi Yunfan,

Jobs are supposed to correctly restart from both savepoints and checkpoints
with different parallelisms if only operator states and keyed states are
used. In the cases where there exist unpartitionable states (e.g., those
are produced by the Checkpointed interface), the job will fail to restart
if the parallelism is changed.

In Flink, both operator states and keyed states are described as
collections of objects, hence are partitionable. To be specific, operator
states are composed of a list of objects. When the parallelism changes,
these objects will be redistributed to the tasks evenly.

The assignment of keyed states shares a similar idea. The keyed states are
composed of a set of key groups. When the parallelism changes, these key
groups will also be redistributed to the tasks.  The restoring of keyed
states varies in different state backend settings.  In Flink-1.2, the
rocksdb state backend will download all the key-value pairs in its key
group range and insert them into a new rocksdb instance to recover the
states.

You can find more details about the scaling of keyed states and operator
states in the following links.
Dynamic Scaling: Key Groups
<https://docs.google.com/document/d/1G1OS1z3xEBOrYD4wSu-LuBCyPUWyFd9l3T9WyssQ63w/edit#heading=h.2suhu1fjxmp4>
FLIP-8: Rescalable Non-Partitioned State
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-8%3A+Rescalable+Non-Partitioned+State>

May the information helps you.

Regards
Xiaogang


2017-05-21 11:43 GMT+08:00 yunfan123 <yunfanfight...@foxmail.com>:

> How this exactly works?
> For example, I  save my state using rocksdb + hdfs.
> When I change the parallelism of my job,  can start from checkpoint work?
>
>
>
> --
> View this message in context: http://apache-flink-user-
> mailing-list-archive.2336050.n4.nabble.com/Question-about-
> start-with-checkpoint-tp13234.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive
> at Nabble.com.
>

Reply via email to