In one case, however, we do want to retain the same cluster id (think ingress on k8s, and thus SLAs with external touch points) even though it is essentially a new job (we added an incompatible change, but at the interface level it retains the same contract). The only way forward seems to be to remove the chroot/subcontext from ZK and relaunch, essentially deleting any vestiges of the previous incarnation. And that is fine if that is indeed the process.
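For reference, a minimal sketch of that cleanup, assuming a ZooKeeper 3.5+ CLI and HDFS-backed HA storage; the host and both paths are illustrative placeholders, not values taken from this thread:

    # Hedged sketch: wiping all HA state for a job cluster before
    # relaunching under the same cluster id. Host and paths are placeholders.

    # Delete the cluster's chroot/subcontext in ZooKeeper
    # (deleteall needs ZooKeeper 3.5+; on 3.4 the equivalent is rmr)
    bin/zkCli.sh -server zk-host:2181 deleteall /flink/my-cluster-id

    # Also clear the HA storage directory (high-availability.storageDir),
    # so no stale checkpoint/job-graph metadata is recovered on relaunch
    hadoop fs -rm -r hdfs:///flink/ha/my-cluster-id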
On Fri, Feb 8, 2019 at 7:58 AM Till Rohrmann <trohrm...@apache.org> wrote:

> If you keep the same cluster id, the upgraded job should pick up
> checkpoints from the completed checkpoint store. However, I would
> recommend taking a savepoint and resuming from that savepoint, because
> then you can also specify that you allow non-restored state, for example.
>
> Cheers,
> Till
>
> On Fri, Feb 8, 2019 at 11:20 AM Vishal Santoshi <vishal.santo...@gmail.com>
> wrote:
>
>> Is the rationale for using a jobID of 000000* roughly the same? As in, a
>> Flink job cluster is a single job and thus a single job id suffices? I
>> am more wondering about the case where we make compatible changes to a
>> job and want to resume (given we are in HA mode and thus have a
>> chroot/subcontext on ZK for the job cluster); it would make no sense to
>> give it a brand new job id?
>>
>> On Thu, Feb 7, 2019 at 4:42 AM Till Rohrmann <trohrm...@apache.org>
>> wrote:
>>
>>> Hi Sergey,
>>>
>>> the rationale for using a K8s Job instead of a Deployment is that a
>>> Flink job cluster should terminate after it has successfully executed
>>> the Flink job. This is unlike a session cluster, which should run
>>> forever and for which a K8s Deployment would be better suited.
>>>
>>> If in your use case a K8s Deployment works better, then I would
>>> suggest changing the `job-cluster-job.yaml` accordingly.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Feb 5, 2019 at 4:12 PM Sergey Belikov <belikov.ser...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> my team is currently experimenting with Flink running in Kubernetes
>>>> (job cluster setup). We found out that with the JobManager deployed
>>>> as a "Job" we can't simply update certain values in the job's yaml,
>>>> e.g. spec.template.spec.containers.image (
>>>> https://github.com/kubernetes/kubernetes/issues/48388#issuecomment-319493817).
>>>> This causes trouble in our CI/CD pipelines, so we are thinking about
>>>> using a "Deployment" instead of a "Job".
>>>>
>>>> With that being said, I'm wondering what the motivation was behind
>>>> using the "Job" resource for deploying the JobManager? And are there
>>>> any pitfalls related to using a Deployment instead of a Job for the
>>>> JobManager?
>>>>
>>>> Thank you in advance.
>>>> --
>>>> Best regards,
>>>> Sergey Belikov
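For context, the savepoint-and-resume path Till recommends above maps to the standard Flink CLI. A minimal sketch, where the job id (the all-zeros id of a job cluster), savepoint directory, savepoint name, and jar are placeholders:

    # Trigger a savepoint for the running job into a target directory
    bin/flink savepoint 00000000000000000000000000000000 hdfs:///flink/savepoints

    # Resume the upgraded job from that savepoint; -n (--allowNonRestoredState)
    # permits state in the savepoint that the new job no longer declares
    bin/flink run -s hdfs:///flink/savepoints/savepoint-000000-abcdef123456 \
      -n ./usrlib/my-upgraded-job.jar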