lrsb commented on PR #16011: URL: https://github.com/apache/iceberg/pull/16011#issuecomment-4288652472
> > [..] so a restart or rescale with a fresh Flink JobID would stamp restored write results with the new id.
>
> Regular job restarts or rescalings do not change the Flink JobID. The JobID only changes on resubmission of the job, which can be triggered by the Kubernetes operator or some other external process.
>
> That said, I'm not convinced we are fixing an actual bug here. `DynamicCommitter` will commit once the job restarts and restores its state. There will be no new entries in the commit state from the aggregator with the new JobID. The to-be-committed entries after restore will all carry the old JobID. So the logic will work fine as-is.

Regarding the JobID change during rescaling: that happens when the scaling is applied by stopping the job with a savepoint and then resubmitting it with the parallelism overridden at the vertex level, without involving the stock Flink autoscaler. I agree this is not the standard setup, but it is a possibility when the autoscaler cannot be used.
