Re: ArgoCD unable to process health from k8s FlinkDeployment - stuck in Processing

Gyula Fóra Fri, 09 Dec 2022 04:53:14 -0800

Hi!

The resource lifecycle state is currently not shown explicitly in the
status.


You are confusing it with the reconciliation status. At the moment you can only
get the lifecycle state through the Java client:

https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/status/CommonStatus.java

This seems to be a very common request so we should probably expose this
field directly in the status, even though it could technically be derived
from other fields.
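
To illustrate what "derived from other fields" could look like, here is a minimal sketch. It is not the operator's implementation: the class, the enum, and all three parameters are hypothetical stand-ins for status fields, and it assumes that an UPGRADING reconciliation state exists and that the status records whether the last reconciled spec was marked stable.

```java
public class LifecycleSketch {

    /** Simplified subset of lifecycle states, for illustration only. */
    public enum Lifecycle { CREATED, UPGRADING, FAILED, DEPLOYED, STABLE }

    /**
     * Derives a lifecycle state from three observed values.
     * All parameters are hypothetical stand-ins, not the operator's API.
     */
    public static Lifecycle derive(
            String reconciliationState, String jobState, boolean lastSpecStable) {
        if (reconciliationState == null) {
            // Assumption: no reconciliation state means nothing was deployed yet.
            return Lifecycle.CREATED;
        }
        if ("UPGRADING".equals(reconciliationState)) {
            return Lifecycle.UPGRADING;
        }
        if ("FAILED".equals(jobState)) {
            return Lifecycle.FAILED;
        }
        // Deployed and not failed: STABLE only once the spec is marked stable.
        return lastSpecStable ? Lifecycle.STABLE : Lifecycle.DEPLOYED;
    }

    public static void main(String[] args) {
        System.out.println(derive("DEPLOYED", "RUNNING", false));
        System.out.println(derive("DEPLOYED", "RUNNING", true));
    }
}
```

Under these assumptions, a resource whose reconciliation state is DEPLOYED and whose job is RUNNING would stay DEPLOYED until the spec is marked stable, which matches the situation described in the quoted message.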

Could you please open a Jira ticket for this improvement?

Cheers
Gyula

On Fri, 9 Dec 2022 at 05:46, Edgar H <kaotix...@gmail.com> wrote:

> Morning all,
>
> Recently been testing the Flink k8s operator (flink.apache.org/v1beta1)
> and although the jobs do start up and run perfectly fine, their status in
> ArgoCD is not yet as it should be. Some details:
>
> When describing the flinkdeployment I'm currently trying to test, the
> following appears in events:
>
>   Type    Reason         Age   From                  Message
>   ----    ------         ----  ----                  -------
>   Normal  Submit         22m   JobManagerDeployment  Starting deployment
>   Normal  StatusChanged  21m   Job                   Job status changed from RECONCILING to CREATED
>   Normal  StatusChanged  20m   Job                   Job status changed from CREATED to RUNNING
>
> On top of it, the reconciliation timestamp and the state are as follows:
>
>     Reconciliation Timestamp:  1670581014190
>     State:                     DEPLOYED
>
> From what I've read in the docs, the flinkdeployment is not considered
> healthy until the state is STABLE, right?
>
>
>    - DEPLOYED : The resource is deployed/submitted to Kubernetes, but
>    it’s not yet considered to be stable and might be rolled back in the future
>    - STABLE : The resource deployment is considered to be stable and
>    won’t be rolled back
>
>
> The jobs have been running for some hours already; one of them throws some
> exceptions, but that doesn't cause downtime. What does it take for the job to
> be in the STABLE state rather than just DEPLOYED? Would that be the cause of
> the Processing... health status in ArgoCD, or is it just that, internally in
> k8s, the Flink operator can't really tell that the pods are running well?
>
