Re: Wiring batch and stream together

2018-05-11 Thread Fabian Hueske
Hi Peter, Building the state for a DataStream job in a DataSet (batch) job is currently not possible. You can, however, implement a DataStream job that reads the batch data and builds the state. Once all data has been processed, you'd need to save the state as a savepoint and can resume a streaming job
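
The approach in this reply (build the state from the bounded data, take a savepoint, then resume a streaming job from it) can be modeled in a few lines of plain Python. This is only a conceptual sketch, not Flink code: `update_state`, `take_savepoint`, `restore_savepoint`, the `(key, value)` record shape, and the running-sum update are all hypothetical stand-ins chosen for illustration.

```python
import json
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int]  # hypothetical record shape: (key, value)


def update_state(state: Dict[str, int], records: Iterable[Record]) -> Dict[str, int]:
    """Apply one per-key update to every record (here: a running sum per key).

    The same function is used for the historical and the live records,
    which is what makes the bootstrapped state compatible with the stream.
    """
    for key, value in records:
        state[key] = state.get(key, 0) + value
    return state


def take_savepoint(state: Dict[str, int]) -> str:
    """Stand-in for a savepoint: serialize the keyed state."""
    return json.dumps(state, sort_keys=True)


def restore_savepoint(snapshot: str) -> Dict[str, int]:
    """Stand-in for resuming from a savepoint: load the keyed state back."""
    return json.loads(snapshot)


# Phase 1: process the bounded (historical) data, then snapshot the state.
historical = [("a", 1), ("b", 2), ("a", 3)]
snapshot = take_savepoint(update_state({}, historical))

# Phase 2: resume from the snapshot and continue with live records.
live = [("a", 10), ("c", 5)]
resumed = update_state(restore_savepoint(snapshot), live)
```

The point of the sketch is that both phases share one update function and one state format, so the second phase can pick up exactly where the first one stopped.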

Wiring batch and stream together

2018-05-02 Thread Peter Zende
Hi, We have a Flink streaming pipeline (1.4.2) which reads from Kafka, uses mapWithState with RocksDB, and writes the updated states to Cassandra. We would also like to reprocess the ingested records from HDFS. For this, we are considering computing the latest state of the records over the whole dataset