[jira] [Commented] (FLINK-34009) Apache flink: Checkpoint restoration issue on Application Mode of deployment

2024-01-07 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804118#comment-17804118
 ] 

Vijay commented on FLINK-34009:
---

Since Flink supports multi-job execution in Application mode (with HA 
disabled), we need more details on how to restore jobs from checkpoints 
when the application or Flink itself is upgraded. Please help us resolve 
this issue. Thanks.

> Apache flink: Checkpoint restoration issue on Application Mode of deployment
> 
>
> Key: FLINK-34009
> URL: https://issues.apache.org/jira/browse/FLINK-34009
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
> Reporter: Vijay
> Priority: Major
>
> Hi Team,
> Good day, and happy new year 2024.
> We are running Flink 1.18 on our cluster. The Job Manager is deployed in 
> "Application mode" with HA disabled (high-availability.type: NONE). With 
> this configuration we are able to start multiple jobs of a single 
> application (using env.executeAsync()).
> Note: We have also set up checkpointing to an S3 instance with 
> RETAIN_ON_CANCELLATION mode (plus the other required settings).
> Say we start two jobs of the same application (e.g. Jobidxxx1 and 
> jobidxxx2) and both are running in the k8s environment. When we perform a 
> minor Flink upgrade, or upgrade our application with minor changes, we stop 
> the Job Manager and Task Manager instances, apply the upgrade, and then 
> start the Job Manager and Task Manager instances again. On startup we 
> expect the jobs to be restored from the last checkpoint, but no job is 
> restored when the Job Manager starts. Please let us know whether this is a 
> bug or the expected behavior of Flink in Application mode.
> Additional information: If we enable HA (using Zookeeper) in Application 
> mode, we are able to start only one job (i.e., per-job behavior). In that 
> setup, when we perform a minor Flink upgrade or upgrade our application 
> with minor changes, checkpoint restoration works properly when the Job 
> Manager and Task Managers restart.
> It seems that checkpoint restoration and HA are interrelated, but why does 
> checkpoint restoration not work when HA is disabled?
>  
> If anyone has experienced similar issues or has any suggestions, please 
> let us know; it would be highly appreciated. Thanks in advance for your 
> assistance.
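
A note for readers who need to restore manually while HA is disabled: Flink documents resuming a job from a retained checkpoint by passing the checkpoint path where a savepoint path is expected. Retained checkpoints are written under <checkpoint-dir>/<job-id>/chk-<n>, so the newest one has to be chosen by number, not by string order, and separately for each job directory. The following is a minimal sketch of that selection, not Flink code; the class name LatestCheckpoint and the sample S3 paths are hypothetical, and how the directory listing is obtained is left out.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: picks the newest retained checkpoint directory of one job. */
final class LatestCheckpoint {
    // Retained checkpoints are laid out as <checkpoint-dir>/<job-id>/chk-<n>.
    private static final Pattern CHK = Pattern.compile(".*/chk-(\\d+)/?");

    /** Returns the path with the highest chk-<n> number, ignoring other entries. */
    static Optional<String> latest(List<String> paths) {
        String best = null;
        long bestId = -1;
        for (String path : paths) {
            Matcher m = CHK.matcher(path);
            if (m.matches()) {
                long id = Long.parseLong(m.group(1));
                if (id > bestId) {
                    bestId = id;
                    best = path;
                }
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Sample listing of one job's checkpoint directory (placeholder paths).
        List<String> listing = List.of(
                "s3://bucket/checkpoints/jobidxxx1/chk-9",
                "s3://bucket/checkpoints/jobidxxx1/chk-12",
                "s3://bucket/checkpoints/jobidxxx1/shared",
                "s3://bucket/checkpoints/jobidxxx1/taskowned");
        System.out.println(latest(listing).orElse("no retained checkpoint found"));
    }
}
```

Comparing the numeric suffix matters because a plain string sort would rank chk-9 above chk-12.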



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34009) Apache flink: Checkpoint restoration issue on Application Mode of deployment

2024-01-07 Thread Vijay (Jira)
Vijay created FLINK-34009:
-

 Summary: Apache flink: Checkpoint restoration issue on Application Mode of deployment
 Key: FLINK-34009
 URL: https://issues.apache.org/jira/browse/FLINK-34009
 Project: Flink
 Issue Type: Bug
 Components: Runtime / Checkpointing
 Affects Versions: 1.18.0
 Environment: Flink version: 1.18
 Zookeeper version: 3.7.2
 Env: Custom flink docker image (with embedded application class) deployed over 
 kubernetes (v1.26.11).
 Reporter: Vijay


Hi Team,

Good day, and happy new year 2024.

We are running Flink 1.18 on our cluster. The Job Manager is deployed in 
"Application mode" with HA disabled (high-availability.type: NONE). With this 
configuration we are able to start multiple jobs of a single application 
(using env.executeAsync()).

Note: We have also set up checkpointing to an S3 instance with 
RETAIN_ON_CANCELLATION mode (plus the other required settings).

Say we start two jobs of the same application (e.g. Jobidxxx1 and jobidxxx2) 
and both are running in the k8s environment. When we perform a minor Flink 
upgrade, or upgrade our application with minor changes, we stop the Job 
Manager and Task Manager instances, apply the upgrade, and then start the Job 
Manager and Task Manager instances again. On startup we expect the jobs to be 
restored from the last checkpoint, but no job is restored when the Job Manager 
starts. Please let us know whether this is a bug or the expected behavior of 
Flink in Application mode.

Additional information: If we enable HA (using Zookeeper) in Application mode, 
we are able to start only one job (i.e., per-job behavior). In that setup, 
when we perform a minor Flink upgrade or upgrade our application with minor 
changes, checkpoint restoration works properly when the Job Manager and Task 
Managers restart.

It seems that checkpoint restoration and HA are interrelated, but why does 
checkpoint restoration not work when HA is disabled?

If anyone has experienced similar issues or has any suggestions, please let us 
know; it would be highly appreciated. Thanks in advance for your assistance.





[jira] [Commented] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-27 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800927#comment-17800927
 ] 

Vijay commented on FLINK-33944:
---

[~martijnvisser] Can we use aligned checkpoints instead of savepoints for the 
restore process when Flink is upgraded?

> Apache Flink: Process to restore more than one job on job manager startup 
> from the respective savepoints
> 
>
> Key: FLINK-33944
> URL: https://issues.apache.org/jira/browse/FLINK-33944
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Checkpointing
> Affects Versions: 1.18.0
> Reporter: Vijay
> Priority: Major
>
>  
> We are using Flink 1.18 for our Flink cluster. The job manager is deployed 
> in "Application mode" and we are looking for a process to restore multiple 
> jobs (each from its own savepoint directory) when the job manager starts. 
> Currently, we can restore only one job when running "standalone-job.sh" 
> with --fromSavepoint and --allowNonRestoredState. However, we need a way to 
> trigger multiple job executions via the Java client (each from its 
> respective savepoint location) on JobManager startup.
> Note: We are not using a Kubernetes native deployment; we are using the 
> k8s standalone mode of deployment.
> Additional Query: If there is a process to restore multiple jobs from 
> their respective savepoints in "Application mode", is the same supported 
> in Session mode?
> *Expected process:*
>  # Before starting the Flink/application image upgrade, trigger 
> savepoints for all currently running jobs.
>  # Once the savepoints have completed for all jobs, scale down the job 
> manager and task manager instances.
>  # Update the image version on the k8s deployment to the updated 
> application image.
>  # After the image version is updated, scale up the job manager and task 
> manager.
>  # We need a process to restore the previously running jobs from the 
> savepoint directory and start all the jobs.
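
Step 1 of the expected process can be scripted against Flink's REST API, which accepts POST /jobs/<jobId>/savepoints with a JSON body containing target-directory and cancel-job. The sketch below only builds the requests and does not send them; the class name SavepointRequests, the REST address, the job IDs and the target paths are placeholders, not values from this issue. The trigger call is asynchronous, so a real script would also need to poll for completion before scaling down.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

/** Hypothetical helper for step 1: builds one savepoint-trigger request per running job. */
final class SavepointRequests {
    /** JSON body for the REST endpoint POST /jobs/<jobId>/savepoints. */
    static String body(String targetDirectory) {
        // Assumes the directory contains no characters that need JSON escaping.
        return "{\"target-directory\": \"" + targetDirectory + "\", \"cancel-job\": false}";
    }

    /** Builds (but does not send) the trigger request for a single job. */
    static HttpRequest trigger(String restBaseUrl, String jobId, String targetDirectory) {
        return HttpRequest.newBuilder(URI.create(restBaseUrl + "/jobs/" + jobId + "/savepoints"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(targetDirectory)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder job IDs; real IDs would come from the cluster (e.g. GET /jobs).
        for (String jobId : List.of("jobA", "jobB")) {
            HttpRequest request =
                    trigger("http://localhost:8081", jobId, "s3://bucket/savepoints/" + jobId);
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Giving each job its own target directory keeps the job-to-savepoint mapping explicit, which step 5 depends on.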





[jira] [Commented] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-27 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800823#comment-17800823
 ] 

Vijay commented on FLINK-33944:
---

[~martijnvisser] In "application mode" we can run multiple job executions of a 
single Flink application. "Session mode" can be configured to do the same, and 
it also supports job executions from multiple Flink applications. We want to 
use "application mode" to trigger a savepoint for each job execution and to 
restore each job execution after the Flink upgrade / image upgrade. Please 
confirm whether the existing version supports this requirement in 
"application mode".
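
One way to picture per-job restore from the Java side is a map from each job to its own savepoint, turned into per-job settings. The sketch below uses the option keys execution.savepoint.path and execution.savepoint.ignore-unclaimed-state, which are the configuration counterparts of --fromSavepoint and --allowNonRestoredState in Flink 1.18. The class name RestorePlan, the job names and the paths are hypothetical. Whether settings supplied per job before each executeAsync() call are honored in application mode is an assumption that would need to be verified; nothing in this thread confirms it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: derives per-job restore settings from a job -> savepoint map. */
final class RestorePlan {
    /** Returns, for each job, the options that point it at its own savepoint. */
    static Map<String, Map<String, String>> settingsPerJob(
            Map<String, String> savepointByJob, boolean allowNonRestoredState) {
        Map<String, Map<String, String>> plan = new LinkedHashMap<>();
        savepointByJob.forEach((job, path) -> {
            Map<String, String> options = new LinkedHashMap<>();
            options.put("execution.savepoint.path", path);
            // Config counterpart of --allowNonRestoredState.
            options.put("execution.savepoint.ignore-unclaimed-state",
                    Boolean.toString(allowNonRestoredState));
            plan.put(job, options);
        });
        return plan;
    }

    public static void main(String[] args) {
        // Placeholder job names and savepoint locations.
        Map<String, String> savepoints = new LinkedHashMap<>();
        savepoints.put("job-a", "s3://bucket/savepoints/job-a");
        savepoints.put("job-b", "s3://bucket/savepoints/job-b");
        settingsPerJob(savepoints, true)
                .forEach((job, options) -> System.out.println(job + " -> " + options));
    }
}
```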



[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800694#comment-17800694
 ] 

Vijay commented on FLINK-33943:
---

Thanks [~wanglijie] for your inputs.

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
> Issue Type: Bug
> Components: Build System
> Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
> Reporter: Vijay
> Priority: Major
>
> Hi Team,
> *Note:* I am not sure whether I picked the right component when raising 
> this issue.
> Good day. I am using Flink 1.18 and Zookeeper 3.7.2 for our Flink cluster. 
> The Job Manager is deployed in "Application mode", and when HA is disabled 
> (high-availability.type: NONE) we are able to start multiple jobs (using 
> env.executeAsync()) for a single application. But when I set up Zookeeper 
> as the HA type (high-availability.type: zookeeper), we see only one job 
> running on the Flink dashboard. The parameters configured for the 
> Zookeeper-based HA setup in flink-conf.yaml are listed below. Please let 
> us know if anyone has experienced similar issues or has any suggestions. 
> Thanks in advance for your assistance.
> *Note:* We are using a streaming application; the flink-conf.yaml settings 
> follow at the end.
> *Additional query:* Does "Session mode" support HA for multiple execute() 
> executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha
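
Separately from the single-job behavior asked about here, two values in the list above may be worth a second look on Kubernetes: a storageDir without a filesystem scheme and a Zookeeper quorum on localhost. The following is a rough lint sketch under those assumptions, not a Flink check; the class name HaConfigLint and its rules are illustrative heuristics only, and both values can be valid (for example with a shared volume or a Zookeeper sidecar container).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lint for the HA settings listed above; heuristics only. */
final class HaConfigLint {
    static List<String> warnings(Map<String, String> conf) {
        List<String> out = new ArrayList<>();
        String dir = conf.getOrDefault("high-availability.storageDir", "");
        if (!dir.contains("://")) {
            // HA metadata should live where a restarted JobManager can read it.
            out.add("high-availability.storageDir has no scheme: " + dir);
        }
        String quorum = conf.getOrDefault("high-availability.zookeeper.quorum", "");
        if (quorum.startsWith("localhost") || quorum.startsWith("127.0.0.1")) {
            // Inside a pod, localhost only reaches containers of the same pod.
            out.add("high-availability.zookeeper.quorum points at localhost: " + quorum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Values copied from the issue description.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("high-availability.type", "zookeeper");
        conf.put("high-availability.storageDir", "/opt/flink/data");
        conf.put("high-availability.cluster-id", "test");
        conf.put("high-availability.zookeeper.quorum", "localhost:2181");
        conf.put("high-availability.zookeeper.path.root", "/dp/configs/flinkha");
        warnings(conf).forEach(System.out::println);
    }
}
```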





[jira] [Commented] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800692#comment-17800692
 ] 

Vijay commented on FLINK-33944:
---

[~wanglijie] Do you have any input on this request for information about the 
savepoint restore process for multiple jobs, either via the Java client or on 
JobManager startup (via standalone-job.sh or jobmanager.sh)? 
"standalone-job.sh" supports restoring only one job from a savepoint on 
JobManager startup.



[jira] [Updated] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33944:
--
Description: 
 
We are using Flink 1.18 for our Flink cluster. The job manager is deployed in 
"Application mode" and we are looking for a process to restore multiple jobs 
(each from its own savepoint directory) when the job manager starts. 
Currently, we can restore only one job when running "standalone-job.sh" with 
--fromSavepoint and --allowNonRestoredState. However, we need a way to trigger 
multiple job executions via the Java client (each from its respective 
savepoint location) on JobManager startup.

Note: We are not using a Kubernetes native deployment; we are using the k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from their 
respective savepoints in "Application mode", is the same supported in Session 
mode?

*Expected process:*
 # Before starting the Flink/application image upgrade, trigger savepoints for 
all currently running jobs.
 # Once the savepoints have completed for all jobs, scale down the job manager 
and task manager instances.
 # Update the image version on the k8s deployment to the updated application 
image.
 # After the image version is updated, scale up the job manager and task 
manager.
 # We need a process to restore the previously running jobs from the savepoint 
directory and start all the jobs.

  was:
 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" using the --fromSavepoint and 
--allowNonRestoredState. However, we need a way to trigger multiple job 
executions via Java client.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from its 
respective savepoints on "Application mode" of deployment, is the same 
supported on Session mode of deployment or not?

*Expected process:*
 # Before starting with the Flink/application image upgrade, trigger the 
savepoints for all the current running jobs.
 # Once the savepoints process completed for all jobs, will trigger the scale 
down of job manager and task manager instances.
 # Update the image version on the k8s deployment with the update application 
image.
 # After image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from the savepoint 
dir and start all the jobs.




[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie]  Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in upcoming 
versions, or is there a technical reason why it is not currently supported?


was (Author: JIRAUSER303619):
[~wanglijie]  Thanks for the prompt update. Is there a plan to support of HA 
functionality on application mode (for multiple exections) be supported in near 
future versions? (or) is there is technical reason why its not supported 
currently?



[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie]  Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in upcoming 
versions, or is there a technical reason why it is not currently supported?


was (Author: JIRAUSER303619):
[~wanglijie]  Thanks for the prompt update. Is there a plan to support of HA 
functionality on application mode (for multiple exections) in near future 
versions? (or) is there is technical reason why its not supported currently?



[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie]  Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in upcoming 
versions, or is there a technical reason why it is not currently supported?


was (Author: JIRAUSER303619):
Thanks for the prompt update. Is there a plan to support of HA functionality on 
application mode (for multiple exections) be supported in near future versions? 
(or) is there is technical reason why its not supported currently?



[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800677#comment-17800677
 ] 

Vijay commented on FLINK-33943:
---

Thanks for the prompt update. Is there a plan to support HA functionality in 
application mode (for multiple executions) in upcoming versions, or is there a 
technical reason why it is not currently supported?



[jira] [Updated] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33944:
--
Description: 
 
We are using Flink 1.18 for our Flink cluster. The job manager is deployed in 
"Application mode" and we are looking for a process to restore multiple jobs 
(each from its own savepoint directory) when the job manager starts. 
Currently, we can restore only one job when running "standalone-job.sh" with 
--fromSavepoint and --allowNonRestoredState. However, we need a way to trigger 
multiple job executions via the Java client.

Note: We are not using a Kubernetes native deployment; we are using the k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from their 
respective savepoints in "Application mode", is the same supported in Session 
mode?

*Expected process:*
 # Before starting the Flink/application image upgrade, trigger savepoints for 
all currently running jobs.
 # Once the savepoints have completed for all jobs, scale down the job manager 
and task manager instances.
 # Update the image version on the k8s deployment to the updated application 
image.
 # After the image version is updated, scale up the job manager and task 
manager.
 # We need a process to restore the previously running jobs from the savepoint 
directory and start all the jobs.

  was:
 
We are using Flink 1.18 for our Flink cluster. The job manager is deployed in
"Application mode", and we are looking for a way to restore multiple jobs (each
from its own savepoint directory) when the job manager starts. Currently, we
can restore only one job when running "standalone-job.sh" with the
--fromSavepoint and --allowNonRestoredState options. However, we need a way to
trigger multiple job executions via the Java client.

Note: We are not using the native Kubernetes deployment; we run Flink in
standalone mode on k8s.

*Expected process:*
 # Before starting the Flink/application image upgrade, trigger savepoints for
all currently running jobs.
 # Once the savepoints have completed for all jobs, scale down the job manager
and task manager instances.
 # Update the image version on the k8s deployment to the updated application
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their savepoint
directories and start all the jobs.



[jira] [Created] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)
Vijay created FLINK-33944:
-

 Summary: Apache Flink: Process to restore more than one job on job 
manager startup from the respective savepoints
 Key: FLINK-33944
 URL: https://issues.apache.org/jira/browse/FLINK-33944
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.18.0
Reporter: Vijay


 
We are using Flink 1.18 for our Flink cluster. The job manager is deployed in
"Application mode", and we are looking for a way to restore multiple jobs (each
from its own savepoint directory) when the job manager starts. Currently, we
can restore only one job when running "standalone-job.sh" with the
--fromSavepoint and --allowNonRestoredState options. However, we need a way to
trigger multiple job executions via the Java client.

Note: We are not using the native Kubernetes deployment; we run Flink in
standalone mode on k8s.

*Expected process:*
 # Before starting the Flink/application image upgrade, trigger savepoints for
all currently running jobs.
 # Once the savepoints have completed for all jobs, scale down the job manager
and task manager instances.
 # Update the image version on the k8s deployment to the updated application
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their savepoint
directories and start all the jobs.
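
One possible shape for step 5, offered as a sketch rather than a confirmed answer: record each job's savepoint path before the upgrade, and have the application apply a per-job setting before each job is submitted. The keys execution.savepoint.path and execution.savepoint.ignore-unclaimed-state are Flink configuration options (the latter is the configuration counterpart of --allowNonRestoredState). The job names, the paths and the restore_settings helper are hypothetical, and whether per-job values are honored when several jobs are submitted from one application in Application mode is an assumption that would need to be verified.

```python
def restore_settings(
    savepoints: dict[str, str],
    allow_non_restored_state: bool = False,
) -> dict[str, dict[str, str]]:
    """Map each job name to the settings that point it at its own savepoint."""
    plan = {}
    for job_name, path in sorted(savepoints.items()):
        if not path:
            # Refuse to start a job silently without state.
            raise ValueError(f"no savepoint path recorded for job {job_name!r}")
        plan[job_name] = {
            "execution.savepoint.path": path,
            "execution.savepoint.ignore-unclaimed-state": str(
                allow_non_restored_state
            ).lower(),
        }
    return plan


if __name__ == "__main__":
    # Hypothetical job names and savepoint locations, for demonstration only.
    recorded = {
        "job-a": "s3://example-bucket/savepoints/job-a",
        "job-b": "s3://example-bucket/savepoints/job-b",
    }
    for name, settings in restore_settings(recorded).items():
        print(name, settings)
```

Keeping the plan as plain key/value pairs means the same mapping could be handed to the application as arguments or a mounted file; the application side (applying the values before each submission) is not shown here.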





[jira] [Updated] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33943:
--
Description: 
Hi Team,

*Note:* I am not sure whether I picked the right component when raising this
issue.

Good day. We are using Flink 1.18 and ZooKeeper 3.7.2 for our Flink cluster.
The job manager is deployed in "Application mode". When HA is disabled
(high-availability.type: NONE), we are able to start multiple jobs (using
env.executeAsync()) for a single application. But when we set ZooKeeper as the
HA type (high-availability.type: zookeeper), only one job is shown as running
on the Flink dashboard. The parameters used for the ZooKeeper-based HA setup in
flink-conf.yaml are listed below. Please let us know if anyone has experienced
a similar issue or has any suggestions. Thanks in advance for your assistance.

*Additional query:* Does the "Session mode" of deployment support HA for
multiple execute() calls?

*Note:* We are running a streaming application; the flink-conf.yaml settings
are as follows.
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha

  was:
Hi Team,

Note: I am not sure whether I picked the right component when raising this
issue.

Good day. We are using Flink 1.18 and ZooKeeper 3.7.2 for our Flink cluster.
The job manager is deployed in "Application mode". When HA is disabled
(high-availability.type: NONE), we are able to start multiple jobs (using
env.executeAsync()) for a single application. But when we set ZooKeeper as the
HA type (high-availability.type: zookeeper), only one job is shown as running
on the Flink dashboard. The parameters used for the ZooKeeper-based HA setup in
flink-conf.yaml are listed below. Please let us know if anyone has experienced
a similar issue or has any suggestions. Thanks in advance for your assistance.

Note: We are running a streaming application; the flink-conf.yaml settings are
as follows.
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha



[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800670#comment-17800670
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:22 AM:
-

[~wanglijie] Does HA in session mode support multiple execute/executeAsync
calls? Sorry, I am unable to find any documentation on HA in session mode and
its features / limitations.


was (Author: JIRAUSER303619):
[~wanglijie] Does HA in session mode support multiple execute/executeAsync
calls?


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800670#comment-17800670
 ] 

Vijay commented on FLINK-33943:
---

[~wanglijie] Does HA in session mode support multiple execute/executeAsync
calls?


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800669#comment-17800669
 ] 

Vijay commented on FLINK-33943:
---

 
The issue can be reproduced by setting high-availability.type: zookeeper (with
the configuration specified in the issue) and, in the Flink client code,
calling env.executeAsync() for multiple job instances of the same application.
Then open the dashboard and check the number of running jobs (the same can be
checked via a REST API call): you will find only one job running. When HA is
disabled (high-availability.type: NONE), multiple jobs are running (again
visible via the REST API).

REST API: http://:8081/v1/jobs
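
As a rough aid for the REST check described above, the response of /v1/jobs can be tallied per job status. This is a minimal sketch: the base URL is a placeholder to be supplied by the reader, and count_by_status and fetch_jobs are hypothetical helper names. It assumes the overview document lists jobs under a "jobs" key with "id" and "status" fields.

```python
import json
import urllib.request
from collections import Counter


def count_by_status(jobs_overview: dict) -> Counter:
    """Count jobs per status in the JSON document returned by /v1/jobs."""
    return Counter(job["status"] for job in jobs_overview.get("jobs", []))


def fetch_jobs(base_url: str) -> dict:
    """Fetch the job overview; base_url is a placeholder for the job manager address."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/v1/jobs") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Offline demonstration with a hand-written document in the same shape.
    sample = json.loads('{"jobs": [{"id": "job-1", "status": "RUNNING"}]}')
    print(dict(count_by_status(sample)))
```

Running count_by_status(fetch_jobs(...)) once with HA disabled and once with ZooKeeper HA enabled would give two directly comparable RUNNING counts.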


[jira] [Created] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)
Vijay created FLINK-33943:
-

 Summary: Apache flink: Issues after configuring HA (using 
zookeeper setting)
 Key: FLINK-33943
 URL: https://issues.apache.org/jira/browse/FLINK-33943
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.18.0
 Environment: Flink version: 1.18

Zookeeper version: 3.7.2

Env: Custom flink docker image (with embedded application class) deployed over 
kubernetes (v1.26.11).

 
Reporter: Vijay


Hi Team,

Note: I am not sure whether I picked the right component when raising this
issue.

Good day. We are using Flink 1.18 and ZooKeeper 3.7.2 for our Flink cluster.
The job manager is deployed in "Application mode". When HA is disabled
(high-availability.type: NONE), we are able to start multiple jobs (using
env.executeAsync()) for a single application. But when we set ZooKeeper as the
HA type (high-availability.type: zookeeper), only one job is shown as running
on the Flink dashboard. The parameters used for the ZooKeeper-based HA setup in
flink-conf.yaml are listed below. Please let us know if anyone has experienced
a similar issue or has any suggestions. Thanks in advance for your assistance.

Note: We are running a streaming application; the flink-conf.yaml settings are
as follows.
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha
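
A small pre-flight check over settings like the ones listed above might look as follows. It is a sketch under assumptions: parse_flat_conf and ha_warnings are hypothetical helpers, the parser only handles flat "key: value" lines (with the Jira list markers removed), and the multi-job warning reflects the Flink deployment documentation's note that HA in Application Mode is only supported for single-execute() applications, which should be checked against the documentation for the version in use.

```python
def parse_flat_conf(text: str) -> dict[str, str]:
    """Parse flat 'key: value' lines, skipping blank lines and comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values like host:port stay intact.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def ha_warnings(conf: dict[str, str], planned_jobs: int) -> list[str]:
    """Return human-readable warnings for an HA setup that may not work."""
    warnings = []
    ha_type = conf.get("high-availability.type", "NONE").lower()
    if ha_type == "none":
        return warnings
    if planned_jobs > 1:
        warnings.append(
            "HA is enabled but the application plans several "
            "execute()/executeAsync() calls"
        )
    if ha_type == "zookeeper":
        for key in (
            "high-availability.storageDir",
            "high-availability.zookeeper.quorum",
        ):
            if key not in conf:
                warnings.append(f"missing setting: {key}")
    return warnings


if __name__ == "__main__":
    sample = """
    high-availability.type: zookeeper
    high-availability.zookeeper.quorum: localhost:2181
    high-availability.storageDir: /opt/flink/data
    """
    for warning in ha_warnings(parse_flat_conf(sample), planned_jobs=2):
        print(warning)
```

The check is deliberately conservative: it only reports conditions and leaves the decision (single job per application, or HA disabled) to the operator.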


