Hi,
I declared an invalid pod template, one without a container named
flink-main-container, to test the rollback feature.
Please pay attention to the podTemplate section in the old and new specs.


Last stable spec:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3://flink-data/savepoints
    state.checkpoints.dir: s3://flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
          - name: TZ
            value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless


New spec:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3://flink-data/savepoints
    state.checkpoints.dir: s3://flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - env:
          - name: TZ
            value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless
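
To make the difference easier to spot, here is a minimal Python sketch that
compares the pod-template containers of the two specs above. The dictionaries
only mirror the podTemplate part of the YAML, and the helper name
container_names is made up for illustration; it is not part of Flink or the
operator.

```python
# Only the podTemplate part of each spec, written as plain dicts (illustrative).
last_stable_pod_template = {
    "spec": {
        "containers": [
            {
                "name": "flink-main-container",
                "env": [{"name": "TZ", "value": "Asia/Shanghai"}],
            }
        ]
    }
}

new_pod_template = {
    "spec": {
        "containers": [
            # The "name" key is deliberately missing here.
            {"env": [{"name": "TZ", "value": "Asia/Shanghai"}]}
        ]
    }
}


def container_names(pod_template):
    """Return the name of every container in a pod template (None if unset)."""
    containers = pod_template.get("spec", {}).get("containers", [])
    return [container.get("name") for container in containers]


print(container_names(last_stable_pod_template))  # ['flink-main-container']
print(container_names(new_pod_template))          # [None]
```

So the only change between the two specs is that the first container in the
new pod template has no name.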

--

Best,
Hjw




At 2023-02-20 08:48:46, "Shammon FY" <zjur...@gmail.com> wrote:

Hi


I cannot see the difference between the two configurations, but the error
message `Failure executing: POST at:
https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments.
Message: Deployment.apps "basic-example" is invalid` is strange. Could you
check whether the k8s configuration has changed?



Best,
Shammon




On Mon, Feb 20, 2023 at 12:56 AM hjw <hjw_em...@163.com> wrote:

I tested the application upgrade rollback feature, but it did not work: the
Flink application-mode job did not roll back to the last stable spec.
As shown in the following example, I declared an invalid pod template, one
without a container named flink-main-container, to test the rollback feature.
However, the deployment of the Flink application job simply failed, and no
rollback happened.


Error:
org.apache.flink.client.deployment.ClusterDeploymentException: Could not create Kubernetes cluster "basic-example".
 at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:292)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments. Message: Deployment.apps "basic-example" is invalid: [spec.template.spec.containers[0].name: Required value, spec.template.spec.containers[0].image: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].name, message=Required value, reason=FieldValueRequired, additionalProperties={}), StatusCause(field=spec.template.spec.containers[0].image, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=apps, kind=Deployment, name=flink-bdra-sql-application-job-s3p, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "flink-bdra-sql-application-job-s3p" is invalid: [spec.template.spec.containers[0].name: Required value, spec.template.spec.containers[0].image: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
 at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:673)
 at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
 at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:560)
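
As far as I understand, the two "Required value" causes in the error above are
what Kubernetes reports for a container that has neither a name nor an image.
The sketch below is a simplified stand-in for that check, not the actual
Kubernetes or Flink validation code; the function name
missing_required_fields is hypothetical.

```python
# Fields the error above reports as required for each container.
REQUIRED_CONTAINER_FIELDS = ("name", "image")


def missing_required_fields(containers):
    """Return the field path of every required container field that is unset."""
    problems = []
    for index, container in enumerate(containers):
        for field in REQUIRED_CONTAINER_FIELDS:
            if not container.get(field):
                problems.append(f"spec.template.spec.containers[{index}].{field}")
    return problems


# The container from the new spec: only env is set.
broken = [{"env": [{"name": "TZ", "value": "Asia/Shanghai"}]}]
print(missing_required_fields(broken))
# ['spec.template.spec.containers[0].name', 'spec.template.spec.containers[0].image']
```

My assumption is that the last stable spec passes this check because its
container is named flink-main-container, so Flink treats it as the main
container and fills in the image from spec.image.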


Env:
Flink version: 1.16
Flink Kubernetes Operator: 1.3.1


Last stable spec:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3:///flink-data/savepoints
    state.checkpoints.dir: s3:///flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3:///flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
          - name: TZ
            value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless


New spec:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3:///flink-data/savepoints
    state.checkpoints.dir: s3:///flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3:///flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - env:
          - name: TZ
            value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless

--

Best,
Hjw
