Hi Anand,

About "Cancel Job with Savepoint", Congxian is right.
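For reference, cancel-with-savepoint is issued from the Flink CLI like this (the savepoint target directory and job ID below are placeholders, not values from this thread):

```shell
# Cancel the job and take a savepoint as the final state operation;
# the checkpoint scheduler is stopped first, so no further checkpoints
# are triggered after the savepoint completes.
# <jobId> and the target directory are placeholders.
bin/flink cancel -s hdfs:///flink/savepoints <jobId>
```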
As for the duplicates: you should use the Kafka producer transactions (available since Kafka 0.11), which provide the EXACTLY_ONCE semantic [1].

Thanks,
vino.

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/connectors/kafka.html#kafka-011

Congxian Qiu <qcx978132...@gmail.com> wrote on Fri, Oct 12, 2018 at 7:55 PM:

> AFAIK, "Cancel Job with Savepoint" will stop the checkpoint scheduler, then
> trigger a savepoint, then cancel your job. There will be no more checkpoints
> after that.
>
> <anand.gopin...@ubs.com> wrote on Fri, Oct 12, 2018 at 1:30 AM:
>
>> Hi,
>>
>> I had a couple of questions about savepoints/checkpoints.
>>
>> When I issue "Cancel Job with Savepoint", how is that instruction
>> coordinated with checkpoints? Am I certain the savepoint will be the last
>> operation (i.e. no more checkpoints)?
>>
>> I have a Kafka source > operation > Kafka sink task in Flink, and it looks
>> like on restart from the savepoint there are duplicates written to the sink
>> topic in Kafka. The duplicates overlap with the last few events prior to the
>> savepoint, and I am trying to work out what could have happened.
>>
>> My FlinkKafkaProducer011 is set to Semantic.AT_LEAST_ONCE, but
>> env.enableCheckpointing(parameters.getInt("checkpoint.interval"),
>> CheckpointingMode.EXACTLY_ONCE).
>>
>> I thought at-least-once still implies that flushes to Kafka only occur
>> with a checkpoint.
>>
>> One theory is that a further checkpoint occurred after/during the
>> savepoint, which would have flushed events to Kafka that are not in my
>> savepoint.
>>
>> Any pointers to schoolboy errors I may have made would be appreciated.
>>
>> -----
>>
>> Also, am I right in thinking that if I have managed state with the RocksDB
>> backend that is using 1 GB on disk, but substantially less keyed state in
>> memory, a savepoint needs to save the full 1 GB to complete?
>>
>> Regards,
>> Anand
>
> --
> Blog: http://www.klion26.com
> GTalk: qcx978132955
> 一切随心 (Everything follows the heart)
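P.S. A minimal configuration sketch of the transactional producer suggested above, assuming Flink 1.6 with the flink-connector-kafka-0.11 dependency on the classpath. The topic name, broker address, checkpoint interval, and class name are placeholders, not values from this thread:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class ExactlyOnceSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Transactions are committed on checkpoint completion, so
        // checkpointing must be enabled for EXACTLY_ONCE to take effect.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        // Must not exceed the broker's transaction.max.timeout.ms,
        // otherwise the producer fails at startup.
        props.setProperty("transaction.timeout.ms", "900000");

        DataStream<String> events = env.socketTextStream("localhost", 9999);

        // EXACTLY_ONCE makes the sink write inside Kafka transactions;
        // on restore, uncommitted writes from before the savepoint are
        // aborted instead of showing up as duplicates for read_committed
        // consumers.
        FlinkKafkaProducer011<String> producer = new FlinkKafkaProducer011<>(
                "output-topic",                                            // placeholder topic
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer011.Semantic.EXACTLY_ONCE);

        events.addSink(producer);
        env.execute("exactly-once sink sketch");
    }
}
```

Note that downstream consumers must set isolation.level=read_committed to actually filter out aborted transactions.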