Hi Dongwoo

I think there are two separate configurations for state: one is the state
backend and the other is the snapshot (checkpoint) storage. Flink creates a
snapshot of a stateful operator's state once that operator has received the
checkpoint barriers from all of its inputs.

As @Feng mentioned above, users can configure which state backend to use
with the option: state.backend

The snapshot of the state can be stored in the JobManager. When the state is
large, Flink supports storing the snapshot in distributed storage via the
option: state.checkpoints.dir
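
As a rough illustration, the two settings could look like this in
flink-conf.yaml. This is only a sketch, assuming Flink 1.13 or later; the
HDFS path is a made-up placeholder, and state.checkpoint-storage is an
additional option (not mentioned above) that selects where snapshots go.

```yaml
# Where the working state lives while the job runs.
# "hashmap" keeps it on the TaskManager JVM heap; "rocksdb" keeps it in an
# embedded RocksDB instance on the TaskManager's local disk.
state.backend: hashmap

# Where checkpoint snapshots are written.
# "jobmanager" keeps them in the JobManager's heap; "filesystem" writes them
# to the directory given by state.checkpoints.dir.
state.checkpoint-storage: filesystem

# Hypothetical path, replace with your own durable storage location.
state.checkpoints.dir: hdfs:///flink/checkpoints
```

The first key only affects the running state, while the last two only
affect where the snapshots end up, so they can be changed independently.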

Best,
Shammon FY


On Mon, Apr 10, 2023 at 12:31 AM Feng Jin <jinfeng1...@gmail.com> wrote:

> Hi Dongwoo
>
>
> This can be quite confusing.
> Before Flink 1.13, Flink's state backend was actually a hybrid concept that
> covered three types of state backends:
> *MemoryStateBackend*, *FsStateBackend*, and *RocksDBStateBackend*.
>
> The default *MemoryStateBackend* keeps the working state on the heap, and
> the checkpoint snapshot is stored in the JobManager.
>
>
> You can refer to this migration document for more information:
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#migrating-from-legacy-backends
> .
>
>
> Best
> Feng
>
> On Sun, Apr 9, 2023 at 10:23 PM Dongwoo Kim <dongwoo7....@gmail.com>
> wrote:
>
>> Hi community, I’m new to Flink and trying to learn its concepts to
>> prepare for migrating a Heron application to Flink.
>> I have a quick question about this flink document.
>> (
>> https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/concepts/stateful-stream-processing/#snapshotting-operator-state
>> )
>>
>> What I understood is that state is stored in the configured state backend,
>> which can be either the TaskManager’s heap or RocksDB.
>> And checkpoint snapshots are stored by default in the JobManager’s heap,
>> and in practice mostly in a distributed file system.
>> But the document says the following, which is confusing to me. Isn’t the
>> second line talking about checkpoint storage or a checkpoint backend, not
>> the state backend? Thanks in advance, enjoy your weekend!
>>
>> *"Because the state of a snapshot may be large, it is stored in a
>> configurable state backend
>> <https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/state/state_backends/>.
>> By default, this is the JobManager’s memory, but for production use a
>> distributed reliable storage should be configured (such as HDFS)” *
>>
>
