Re: s3 statebackend user state size

Chen Qin Tue, 10 May 2016 18:12:44 -0700

Hi Ufuk,

Yes, it does help with Rocksdb backend!
After tune checkpoint frequency align with network throughput, task manager 
released and job get cancelled are gone.


Chen


> On May 10, 2016, at 10:33 AM, Ufuk Celebi <u...@apache.org> wrote:
> 
>> On Tue, May 10, 2016 at 5:07 PM, Chen Qin <qinnc...@gmail.com> wrote:
>> Future, to keep large key/value space, wiki point out using rocksdb as
>> backend. My understanding is using rocksdb will write to local file systems
>> instead of sync to s3. Does flink support memory->rocksdb(local disk)->s3
>> checkpoint state split yet? Or would implement kvstate interface makes flink
>> take care of large state problem?
> 
> Hey Chen,
> 
> when you use RocksDB, you only need to explicitly configure the file
> system checkpoint directory, for which you can use S3:
> 
> new RocksDBStateBackend(new URI("s3://..."))
> 
> The local disk path are configured via the general Flink temp
> directory configuration (see taskmanager.tmp.dirs in
> https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html,
> default is /tmp).
> 
> State is written to the local RocksDB instance and the RocksDB files
> are copied to S3 on checkpoints.
> 
> Does this help?
> 
> – Ufuk

Re: s3 statebackend user state size

Reply via email to