[jira] [Commented] (FLINK-28210) FlinkSessionJob fails after FlinkDeployment is updated

2022-06-23 Thread Daniel Crowe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558270#comment-17558270
 ] 

Daniel Crowe commented on FLINK-28210:
--

Thank you. I'll give it a go.

> FlinkSessionJob fails after FlinkDeployment is updated
> --
>
> Key: FLINK-28210
> URL: https://issues.apache.org/jira/browse/FLINK-28210
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.0.0
> Environment: The [quick start|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/] was followed to install minikube and the flink operator.
>
> minikube 1.24.1
> kubectl 1.24.2
> flink operator: 1.0.0
> Reporter: Daniel Crowe
> Priority: Major
>
> I created a flink deployment using this example:
> {code}
> curl https://raw.githubusercontent.com/apache/flink-kubernetes-operator/main/examples/basic-session-job.yaml -o basic-session-job.yaml
> kubectl create -f basic-session-job.yaml 
> {code}
> Then I modified the memory allocated to the jobManager and applied the change:
> {code}
> kubectl apply -f basic-session-job.yaml 
> {code}
> The job manager is restarted to apply the change, but the jobs are not. 
> Looking at the operator logs, it appears that something is failing during job 
> status observation:
> {noformat}
> 2022-06-23 03:29:51,189 o.a.f.k.o.c.FlinkSessionJobController [INFO ][default/basic-session-job-example2] Starting reconciliation
> 2022-06-23 03:29:51,190 o.a.f.k.o.o.JobStatusObserver  [INFO ][default/basic-session-job-example2] Observing job status
> 2022-06-23 03:29:51,205 o.a.f.k.o.c.FlinkSessionJobController [INFO ][default/basic-session-job-example] Starting reconciliation
> 2022-06-23 03:29:51,206 o.a.f.k.o.o.JobStatusObserver  [INFO ][default/basic-session-job-example] Observing job status
> 2022-06-23 03:29:51,208 o.a.f.k.o.c.FlinkDeploymentController [INFO ][default/basic-session-cluster] Starting reconciliation
> 2022-06-23 03:29:51,227 o.a.f.k.o.c.FlinkDeploymentController [INFO ][default/basic-session-cluster] End of reconciliation
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28210) FlinkSessionJob fails after FlinkDeployment is updated

2022-06-23 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557875#comment-17557875
 ] 

Gyula Fora commented on FLINK-28210:


Yes. At this point in 1.0.0 this is an expected limitation of the session mode. 
If you enable HA as in 
[https://github.com/apache/flink-kubernetes-operator/blob/main/examples/basic-checkpoint-ha.yaml#L30-L31],
that would hopefully make it work.

We will try to improve this behaviour in later versions. This is related to 
https://issues.apache.org/jira/browse/FLINK-27979

cc [~aitozi] 
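
For reference, here is a minimal sketch of what enabling Kubernetes HA on the session cluster from this issue might look like. The two {{high-availability}} keys are standard Flink configuration options. The storage path is a placeholder assumption and would need to point at durable storage that the JobManager can reach (for example a mounted persistent volume or an object store). This has not been verified against operator 1.0.0.

{code}
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-session-cluster
spec:
  image: flink:1.15
  flinkVersion: v1_15
  flinkConfiguration:
    # Use Flink's Kubernetes HA services so job metadata can be recovered
    # after the JobManager is restarted.
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    # Assumption: placeholder path, replace with storage available in your cluster.
    high-availability.storageDir: file:///flink-data/ha
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  serviceAccount: flink
{code}

The FlinkSessionJob resources would stay unchanged. Only the FlinkDeployment gains the {{flinkConfiguration}} section.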






[jira] [Commented] (FLINK-28210) FlinkSessionJob fails after FlinkDeployment is updated

2022-06-23 Thread Daniel Crowe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557865#comment-17557865
 ] 

Daniel Crowe commented on FLINK-28210:
--

Is this the file you are after?

{noformat}

#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#  http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
# limitations under the License.


apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-session-cluster
spec:
  image: flink:1.15
  flinkVersion: v1_15
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  serviceAccount: flink
---
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: basic-session-job-example
spec:
  deploymentName: basic-session-cluster
  job:
    jarURI: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.15.0/flink-examples-streaming_2.12-1.15.0-TopSpeedWindowing.jar
    parallelism: 4
    upgradeMode: stateless

---
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: basic-session-job-example2
spec:
  deploymentName: basic-session-cluster
  job:
    jarURI: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.15.0/flink-examples-streaming_2.12-1.15.0.jar
    parallelism: 2
    upgradeMode: stateless
    entryClass: org.apache.flink.streaming.examples.statemachine.StateMachineExample
{noformat}







[jira] [Commented] (FLINK-28210) FlinkSessionJob fails after FlinkDeployment is updated

2022-06-22 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557800#comment-17557800
 ] 

Gyula Fora commented on FLINK-28210:


This is expected if HA is not configured for the session FlinkDeployment. Can 
you share your session yaml?



