Re: incremental Checkpointing , Rocksdb HA

2016-06-18 Thread Chen Qin
Thanks everyone. We were reasoning about the expense of drawing snapshots of large state as a major benefit of using RocksDB compared to the JDBC backend. Our use case is money-related event processing. It requires keeping weeks-long large windows; the major data source's ingestion QPS is in the hundreds, a

Re: incremental Checkpointing , Rocksdb HA

2016-06-10 Thread Stephan Ewen
Hi! Incremental checkpointing is still being worked on. Aljoscha, Till and I have thought this through a lot and now have a pretty good understanding of how we want to do it with respect to coordination, savepoints, restore, garbage collecting unneeded checkpoints, etc. We want to put this in

Re: incremental Checkpointing , Rocksdb HA

2016-06-09 Thread Nick Dimiduk
IIRC, all the above support data locality from back in the MR days. Not sure how much data you're planning to checkpoint though -- is locality really that important for transient processor state? On Thu, Jun 9, 2016 at 11:06 AM, CPC wrote: > Cassandra backend would be interesting especially if

Re: incremental Checkpointing , Rocksdb HA

2016-06-09 Thread CPC
A Cassandra backend would be interesting, especially if Flink could benefit from Cassandra data locality. The Cassandra/Spark integration uses this information to schedule Spark tasks. On 9 June 2016 at 19:55, Nick Dimiduk wrote: > You might also consider support for a Bigtable > backend: HBas

Re: incremental Checkpointing , Rocksdb HA

2016-06-09 Thread Nick Dimiduk
You might also consider support for a Bigtable backend: HBase/Accumulo/Cassandra. The data model should be similar (identical?) to RocksDB and you get HA, recoverability, and support for really large state "for free". On Thursday, June 9, 2016, Chen Qin wrote: > Hi there, > > What is progress on

incremental Checkpointing , Rocksdb HA

2016-06-09 Thread Chen Qin
Hi there, What is the progress on incremental checkpointing? Do the Flink devs have a plan to work on this, or a JIRA to track it? Super interested to know. I am also researching and considering using RocksDBStateBackend without running an HDFS cluster or talking to S3. One primitive idea is to use ZK to store / notify st