Maximilian Michels created FLINK-32170:
------------------------------------------

             Summary: Continue metric collection on intermittent job restarts
                 Key: FLINK-32170
                 URL: https://issues.apache.org/jira/browse/FLINK-32170
             Project: Flink
          Issue Type: Improvement
          Components: Autoscaler, Kubernetes Operator
            Reporter: Maximilian Michels


If the underlying infrastructure is unstable, e.g. because of Kubernetes pod
evictions, jobs will sometimes restart. Each restart also restarts the
autoscaler's metric collection process and discards all metrics collected so
far. If the interruption is short, e.g. less than one minute, we could
consider resuming metric collection once the job is back in RUNNING state.
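
The proposed behavior could be sketched roughly as follows. This is only an
illustration of the idea, not the autoscaler's actual code: the class
ResumableMetricHistory, its methods, and the maxInterruption parameter are
all hypothetical names introduced here, and the one-minute threshold is just
the example value mentioned above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.SortedMap;
import java.util.TreeMap;

/**
 * Hypothetical sketch: keeps collected metrics across short job
 * interruptions and discards them only after a long one.
 * None of these names come from the Flink code base.
 */
class ResumableMetricHistory {
    // Longest downtime for which existing metrics are still kept (assumed setting).
    private final Duration maxInterruption;
    // Collected metric values keyed by collection time.
    private final SortedMap<Instant, Double> history = new TreeMap<>();
    // Time at which the job left RUNNING; null while the job is running.
    private Instant interruptedAt;

    ResumableMetricHistory(Duration maxInterruption) {
        this.maxInterruption = maxInterruption;
    }

    /** Stores one metric observation. */
    void record(Instant now, double value) {
        history.put(now, value);
    }

    /** Called when the job is observed in any state other than RUNNING. */
    void onJobNotRunning(Instant now) {
        if (interruptedAt == null) {
            interruptedAt = now; // remember only the start of the interruption
        }
    }

    /**
     * Called when the job is observed in RUNNING state.
     * Returns true if the existing metrics were kept, false if they were discarded.
     */
    boolean onJobRunning(Instant now) {
        if (interruptedAt == null) {
            return true; // no interruption happened
        }
        Duration downtime = Duration.between(interruptedAt, now);
        interruptedAt = null;
        if (downtime.compareTo(maxInterruption) > 0) {
            history.clear(); // long interruption: start collection from scratch
            return false;
        }
        return true; // short interruption: resume with existing metrics
    }

    /** Number of metric observations currently held. */
    int size() {
        return history.size();
    }
}
```

With maxInterruption set to one minute, a 30-second restart would keep the
collected history, while a restart lasting several minutes would clear it, as
happens on every restart today.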



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
