Hello,

We have an Apache Beam streaming application running under Flink native Kubernetes. It consolidates AWS Kinesis records into Parquet files every few minutes.
To manage the lifecycle of this app, we use the REST API to stop the job with a savepoint and then restart the cluster/job from that savepoint. This normally works as expected, but we run into problems when the data schema changes. In that case, as expected, stopping the job with "drain": true results in a clean upgrade without issues.

To avoid over-complicating our release workflows, we are evaluating doing a "drain" restart every time we do a new release. However, we have come across the following warning:

> Use the --drain flag if you want to terminate the job permanently. If you want to resume the job at a later point in time, then do not drain the pipeline because it could lead to incorrect results when the job is resumed. [1]

It's not clear what kind of "incorrect results" we could face here - can anybody elaborate? Our own tests show that we do not lose events from the Kinesis stream after the restart.

Thanks,
Pedro

[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#terminating-a-job

--
*Pedro Facal San Luis* <ped...@empathy.co>
Data Team Lead
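P.S. For context, the stop call we issue looks roughly like the sketch below (the host, job id, and savepoint path are hypothetical placeholders, not our real values). It uses Flink's POST /jobs/<jobID>/stop endpoint, where "drain": true advances the watermark to MAX_WATERMARK before taking the savepoint and stopping:

```shell
#!/bin/sh
# Hypothetical JobManager address and job id - substitute your own
# (the job id can be found via GET /jobs/overview).
FLINK_HOST="http://flink-jobmanager:8081"
JOB_ID="<job-id>"

# Request body for POST /jobs/<jobID>/stop.
# "drain": true  -> emit MAX_WATERMARK before stopping, flushing all
#                   event-time timers and windows (the case we are testing).
# "drain": false -> plain stop-with-savepoint, watermark untouched.
BODY='{"targetDirectory": "s3://my-bucket/savepoints", "drain": true}'

# Printed as an echo so the request can be reviewed before actually
# running it against a live cluster.
echo curl -s -X POST -H "Content-Type: application/json" \
     -d "$BODY" "$FLINK_HOST/jobs/$JOB_ID/stop"
```

After the stop completes, we read the savepoint path out of the returned trigger id's status and start the new cluster with that path as the restore target.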