Hi Gyula,

We are on Flink 1.14. Upgrading is hard for us because users on our platform already depend on this version. Sorry, I had not noticed this configuration. It is confusing because the Flink operator announces support for Flink 1.13 through 1.17/1.18.
Is there another solution that would work in our situation?

Thanks,
Richard Su

> On Dec 7, 2023, at 16:22, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> Hi!
>
> What Flink version are you using?
> The operator always sets execution.shutdown-on-application-finish to false
> so that finished / failed application clusters do not exit immediately
> and we can observe them.
>
> However, this is only available in Flink 1.15 and above.
>
> Cheers,
> Gyula
>
> On Thu, Dec 7, 2023 at 9:15 AM richard.su <richardsuc...@gmail.com> wrote:
>
>> Hi Community, I have found the cause of this issue, but I'm not sure
>> whether it has a solution. I have tried Flink operator 1.6, and the
>> issue still exists there.
>>
>> If there is no solution, I think we could create a Jira issue to follow up.
>>
>> When we create a bounded streaming job that eventually reaches the
>> Finished status, Flink shuts down the Kubernetes cluster once the job
>> goes from Running to Finished. This happens in the flink-kubernetes
>> package, in KubernetesResourceManagerDriver's deregisterApplication
>> method, which deletes the JM deployment directly, within about a second
>> (in our env).
>> But with our operator config, when the JM deployment status is Ready and
>> no savepoint is in progress, the observer interval is 15s, which means
>> the operator will never observe the job status change.
>> So if the job failed rather than finished, we cannot tell the difference.
>> All we know is that the JM deployment is Missing and the job status is
>> Reconciling.
>> We want to integrate the Flink operator into our platform, but it
>> cannot monitor the real job status, which is weird.
>>
>> Maybe this is still related to the cleanup logic of Flink native mode.
>> From my side, it is hard to handle on the operator side because we cannot
>> directly get the exit code of the container when the pod is missing and
>> the JM deployment is missing.
>>
>> Thanks for taking the time to read this.
>> Richard Su
>>>
>>> On Dec 6, 2023, at 13:34, richard.su <richardsuc...@gmail.com> wrote:
>>>
>>> More information to reproduce this problem:
>>>
>>> version: flink operator 1.4
>>> mode: native
>>> job: wordcount
>>> language: java
>>> type: FlinkDeployment
>>>
>>>> On Dec 6, 2023, at 10:52, richard.su <richardsuc...@gmail.com> wrote:
>>>>
>>>> Hi Community, the default configuration of the Flink operator is:
>>>>
>>>> kubernetes.operator.reconcile.interval: 15s
>>>> kubernetes.operator.observer.progress-check.interval: 5s
>>>>
>>>> When a bounded streaming job has already stopped or failed, the JM
>>>> deployment stays Missing. If I set:
>>>>
>>>> kubernetes.operator.jm-deployment-recover.enabled: false
>>>>
>>>> then the Flink operator can only observe the job status as Reconciling
>>>> and the JM deployment status as Missing.
>>>>
>>>> We cannot check whether the Flink job finished or failed, because
>>>> within the observer.progress-check interval the Flink web UI is
>>>> already down.
>>>>
>>>> So we hope someone in the community can show us a way to monitor a
>>>> bounded streaming job's status.
>>>>
>>>> Thanks.
>>>>
>>>> Richard Su
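
For readers following the timing argument in this thread, here is a minimal sketch of why a 15s observer interval can miss a terminal state that is visible for only about a second. It is a toy simulation, not operator or Flink code: the function names, the state strings, and the job end time of 20s are hypothetical, and only the 1s cleanup window and the 15s interval come from the messages above.

```python
def state_at(t, job_end, cleanup_delay):
    """State a poller would see at time t (seconds). Toy model only."""
    if t < job_end:
        return "RUNNING"
    if t < job_end + cleanup_delay:
        # The terminal status is visible only in this short window.
        return "FINISHED"
    # After cleanup the JobManager deployment is gone.
    return "MISSING"


def observed_states(job_end, cleanup_delay, poll_interval, horizon):
    """States seen by a poller firing at 0, poll_interval, 2*poll_interval, ..."""
    polls = int(horizon // poll_interval) + 1
    return [state_at(i * poll_interval, job_end, cleanup_delay)
            for i in range(polls)]


# Hypothetical job that ends at t=20s, with cleanup after 1s.
slow = observed_states(job_end=20, cleanup_delay=1, poll_interval=15, horizon=45)
fast = observed_states(job_end=20, cleanup_delay=1, poll_interval=0.5, horizon=45)

print("15s polling saw FINISHED:", "FINISHED" in slow)
print("0.5s polling saw FINISHED:", "FINISHED" in fast)
```

In this model the 15s poller sees the job go from RUNNING straight to MISSING, which matches the Reconciling / Missing observation described above. Polling faster only narrows the gap. Keeping the cluster alive after the job ends, as execution.shutdown-on-application-finish does on Flink 1.15 and above, removes the race instead.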