Hi Dongwoo,

I think there are two separate configurations related to state: one is the state backend and the other is the snapshot storage. Flink creates a snapshot of each state once the stateful operator has received the checkpoint barriers from all of its inputs.

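For illustration, the two settings might look like this in flink-conf.yaml. This is only a sketch: the `rocksdb` value and the HDFS path are example choices I am assuming, not taken from your setup.

```yaml
# State backend: where operators keep their working state (example value).
state.backend: rocksdb
# Snapshot storage: directory where checkpoint snapshots are written
# (placeholder path, replace with your own).
state.checkpoints.dir: hdfs://namenode:8020/flink/checkpoints
```
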
As @Feng mentioned, users can configure different state backends with the option:
state.backend

The snapshot of the state can be stored in the JobManager's memory. When the state is large, Flink supports storing the snapshot in distributed storage with the option:
state.checkpoints.dir

Best,
Shammon FY

On Mon, Apr 10, 2023 at 12:31 AM Feng Jin <jinfeng1...@gmail.com> wrote:

> Hi Dongwoo
>
> This can be quite confusing.
> Before Flink 1.13, Flink's state backend was actually a hybrid concept that
> included three types of state backends:
> *MemoryStateBackend*, *FsStateBackend*, and *RocksDBStateBackend*.
>
> The default *MemoryStateBackend* uses the heap as the backend, and the
> state snapshots are stored in the JobManager.
>
> You can refer to this migration document for more information:
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#migrating-from-legacy-backends
> .
>
> Best
> Feng
>
> On Sun, Apr 9, 2023 at 10:23 PM Dongwoo Kim <dongwoo7....@gmail.com>
> wrote:
>
>> Hi community,
>>
>> I’m new to Flink and am trying to learn its concepts to prepare for
>> migrating a Heron application to Flink.
>> I have a quick question about this Flink document.
>> (
>> https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/concepts/stateful-stream-processing/#snapshotting-operator-state
>> )
>>
>> What I understood is that state is stored in the configured state backend,
>> which can be either the TaskManager’s heap or RocksDB.
>> And checkpoint snapshots are stored by default in the JobManager’s heap,
>> and mostly in a distributed file system.
>> But the document says the following, which is confusing to me. Isn’t
>> the second line talking about checkpoint storage or a checkpoint backend,
>> not the state backend? Thanks in advance, enjoy your weekend!
>>
>> *"Because the state of a snapshot may be large, it is stored in a
>> configurable state backend
>> <https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/state/state_backends/>.
>> By default, this is the JobManager’s memory, but for production use a
>> distributed reliable storage should be configured (such as HDFS)"*