[jira] [Commented] (FLINK-34696) GSRecoverableWriterCommitter is generating excessive data blobs

2024-03-15 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827557#comment-17827557
 ] 

Tobias Hofer commented on FLINK-34696:
--

Will the temporary bucket prevent the use of object compose? Letting the 
GSRecoverableWriterCommitter not recursively compose objects? To ask more 
precisely: will the use of a temporary bucket just double the amount of data 
that is being written (one time in temporary bucket, the other time in the 
final target)?

 

> GSRecoverableWriterCommitter is generating excessive data blobs
> ---
>
> Key: FLINK-34696
> URL: https://issues.apache.org/jira/browse/FLINK-34696
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Simon-Shlomo Poil
>Priority: Major
>
> The `composeBlobs` method in 
> `org.apache.flink.fs.gs.writer.GSRecoverableWriterCommitter` is designed to 
> merge multiple small blobs into a single large blob using Google Cloud 
> Storage's compose method. This process is iterative, combining the result 
> from the previous iteration with 31 new blobs until all blobs are merged. 
> Upon completion of the composition, the method proceeds to remove the 
> temporary blobs.
> *Issue:*
> This methodology results in significant, unnecessary data storage consumption 
> during the blob composition process, incurring considerable costs due to 
> Google Cloud Storage pricing models.
> *Example to Illustrate the Problem:*
>  - Initial state: 64 blobs, each 1 GB in size (totaling 64 GB).
>  - After 1st step: 32 blobs are merged into a single blob, increasing total 
> storage to 96 GB (64 original + 32 GB new).
>  - After 2nd step: The newly created 32 GB blob is merged with 31 more blobs, 
> raising the total to 159 GB.
>  - After 3rd step: The final blob is merged, culminating in a total of 223 GB 
> to combine the original 64 GB of data. This results in an overhead of 159 GB.
> *Impact:*
> This inefficiency has a profound impact, especially at scale, where terabytes 
> of data can incur overheads in the petabyte range, leading to unexpectedly 
> high costs. Additionally, we have observed an increase in storage exceptions 
> thrown by the Google Storage library, potentially linked to this issue.
> *Suggested Solution:*
> To mitigate this problem, we propose modifying the `composeBlobs` method to 
> immediately delete source blobs once they have been successfully combined. 
> This change could significantly reduce data duplication and associated costs. 
> However, the implications for data recovery and integrity need careful 
> consideration to ensure that this optimization does not compromise the 
> ability to recover data in case of a failure during the composition process.
> *Steps to Reproduce:*
> 1. Initiate the blob composition process in an environment with a significant 
> number of blobs (e.g., 64 blobs of 1 GB each).
> 2. Observe the temporary increase in data storage as blobs are iteratively 
> combined.
> 3. Note the final amount of data storage used compared to the initial total 
> size of the blobs.
> *Expected Behavior:*
> The blob composition process should minimize unnecessary data storage use, 
> efficiently managing resources to combine blobs without generating excessive 
> temporary data overhead.
> *Actual Behavior:*
> The current implementation results in significant temporary increases in data 
> storage, leading to high costs and potential system instability due to 
> frequent storage exceptions.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26808) [flink v1.14.2] Submit jobs via REST API not working after set web.submit.enable: false

2023-03-03 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696167#comment-17696167
 ] 

Tobias Hofer commented on FLINK-26808:
--

The expectation is that Flink behaves as documented.

In 
[https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/] 
I can read the following:
{quote}{{{}web.submit.enable{}}}: Enables uploading and starting jobs through 
the Flink UI {_}(true by default){_}. Please note that even when this is 
disabled, session clusters still accept jobs through REST requests (HTTP 
calls). This flag only guards the feature to upload jobs in the UI.
{quote}
I would like to be able to disable submission via UI but still allow to job 
sumission to work.

This behavior impacts the Flink Kubernetes Operator. It fails with

{color:#174ea6}Warning | SESSIONJOBEXCEPTION | 
org.apache.flink.runtime.rest.util.RestClientException: [Not found: 
/v1/jars/upload]"{color}

> [flink v1.14.2] Submit jobs via REST API not working after set 
> web.submit.enable: false
> ---
>
> Key: FLINK-26808
> URL: https://issues.apache.org/jira/browse/FLINK-26808
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.14.2
>Reporter: Luís Costa
>Priority: Minor
>
> Greetings,
> I am using flink version 1.14.2 and after changing web.submit.enable to 
> false, job submission via REST API is no longer working. 
> The app that uses flink receives a 404 with "Not found: /jars/upload" 
> Looking into 
> [documentation|[https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/]]
>   saw that web.upload.dir is only used if  {{web.submit.enable}} is true, if 
> not it will be used JOB_MANAGER_WEB_TMPDIR_KEY
> Doing a curl to /jars it returns:
> {code:java}
> curl -X GET http://localhost:8081/jars
> HTTP/1.1 404 Not Found
> {"errors":["Unable to load requested file /jars."]} {code}
> Found this issue related to option web.submit.enable 
> https://issues.apache.org/jira/browse/FLINK-13799
> Could you please let me know if this is an issue that you are already aware?
> Thanks in advance
> Best regards,
> Luís Costa
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31064) Error while retrieving the leader gateway

2023-02-15 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689279#comment-17689279
 ] 

Tobias Hofer commented on FLINK-31064:
--

Hi [~gyfora], in the meantime I'm not sure anymore that the exception in the 
description is really representing the problem, because I was aware of the very 
same exception in the logs of a properly running session.

The only other log entry that my be a hint can be found in the operator log.

{color:#174ea6}ERROR | ReconcilerExecutor-flinksessionjobcontroller-31 | 
org.apache.flink.kubernetes.operator.observer.sessionjob.FlinkSessionJobObserver
 - Missing Session Job{color}

But next output says: {color:#174ea6}nothing to do...{color} even so no job is 
running.

Find attached my session and job config.

> Error while retrieving the leader gateway
> -
>
> Key: FLINK-31064
> URL: https://issues.apache.org/jira/browse/FLINK-31064
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
> Environment:  
>  
>Reporter: Tobias Hofer
>Priority: Major
> Attachments: jobmanager_log.txt, my-job.yaml, my-session.yaml
>
>
> JobManager enters corrupt state after restart (e.g. increasing restartNonce).
> {code:java}
> WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever           
>    | Error while retrieving the leader gateway. Retrying to connect to 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
> org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not 
> complete the operation. Number of retries has been exhausted.
>     at 
> org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown
>  Source)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)
>     at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
> ...
> Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
>     ... 5 more
> Caused by: org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: 
> Could not connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     ... 7 more
> Caused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]
>     at 
> akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>     at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
>     at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)
>     at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
>     at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:598)
>     at 
> org.apache.flin

[jira] [Updated] (FLINK-31064) Error while retrieving the leader gateway

2023-02-15 Thread Tobias Hofer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Hofer updated FLINK-31064:
-
Attachment: my-job.yaml

> Error while retrieving the leader gateway
> -
>
> Key: FLINK-31064
> URL: https://issues.apache.org/jira/browse/FLINK-31064
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
> Environment:  
>  
>Reporter: Tobias Hofer
>Priority: Major
> Attachments: jobmanager_log.txt, my-job.yaml, my-session.yaml
>
>
> JobManager enters corrupt state after restart (e.g. increasing restartNonce).
> {code:java}
> WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever           
>    | Error while retrieving the leader gateway. Retrying to connect to 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
> org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not 
> complete the operation. Number of retries has been exhausted.
>     at 
> org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown
>  Source)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)
>     at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
> ...
> Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
>     ... 5 more
> Caused by: org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: 
> Could not connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     ... 7 more
> Caused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]
>     at 
> akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>     at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
>     at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)
>     at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
>     at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:598)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.connectInternal(AkkaRpcService.java:549)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.connect(AkkaRpcService.java:232)
>     at 
> org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$null$0(RpcGatewayRetriever.java:66)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown 
> Source)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown 
> Source)
>     at 
> org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$createGateway$1(RpcGatewayRetriever.java:64)
>     at 
> org.apache.flink

[jira] [Updated] (FLINK-31064) Error while retrieving the leader gateway

2023-02-15 Thread Tobias Hofer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Hofer updated FLINK-31064:
-
Attachment: my-session.yaml

> Error while retrieving the leader gateway
> -
>
> Key: FLINK-31064
> URL: https://issues.apache.org/jira/browse/FLINK-31064
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
> Environment:  
>  
>Reporter: Tobias Hofer
>Priority: Major
> Attachments: jobmanager_log.txt, my-session.yaml
>
>
> JobManager enters corrupt state after restart (e.g. increasing restartNonce).
> {code:java}
> WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever           
>    | Error while retrieving the leader gateway. Retrying to connect to 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
> org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not 
> complete the operation. Number of retries has been exhausted.
>     at 
> org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown
>  Source)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)
>     at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)
>     at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
> ...
> Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
>     ... 5 more
> Caused by: org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: 
> Could not connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
>     ... 7 more
> Caused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]
>     at 
> akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>     at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
>     at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)
>     at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)
>     at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
>     at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
>     at 
> scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89)
>     at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:598)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.connectInternal(AkkaRpcService.java:549)
>     at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.connect(AkkaRpcService.java:232)
>     at 
> org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$null$0(RpcGatewayRetriever.java:66)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown 
> Source)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown 
> Source)
>     at 
> org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$createGateway$1(RpcGatewayRetriever.java:64)
>     at 
> org.apache.flink.util.con

[jira] [Updated] (FLINK-31064) Error while retrieving the leader gateway

2023-02-14 Thread Tobias Hofer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Hofer updated FLINK-31064:
-
Description: 
JobManager enters corrupt state after restart (e.g. increasing restartNonce).
{code:java}
WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever             
 | Error while retrieving the leader gateway. Retrying to connect to 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not complete 
the operation. Number of retries has been exhausted.
    at 
org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
Source)
    at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown
 Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
Source)
    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown 
Source)
    at 
org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
connect to rpc endpoint under address 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
    at 
scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
    ... 5 more
Caused by: org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: 
Could not connect to rpc endpoint under address 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.
    ... 7 more
Caused by: akka.actor.ActorNotFound: Actor not found for: 
ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]
    at akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
    at 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
    at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)
    at 
akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
    at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)
    at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)
    at 
akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
    at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
    at 
scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
    at 
scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)
    at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72)
    at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89)
    at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130)
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:598)
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.connectInternal(AkkaRpcService.java:549)
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.connect(AkkaRpcService.java:232)
    at 
org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$null$0(RpcGatewayRetriever.java:66)
    at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(Unknown 
Source)
    at java.base/java.util.concurrent.CompletableFuture.thenCompose(Unknown 
Source)
    at 
org.apache.flink.runtime.webmonitor.retriever.impl.RpcGatewayRetriever.lambda$createGateway$1(RpcGatewayRetriever.java:64)
    at 
org.apache.flink.util.concurrent.FutureUtils.retryOperationWithDelay(FutureUtils.java:259)
    at 
org.apache.flink.util.concurrent.FutureUtils.lambda$null$4(FutureUtils.java:279)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown 
Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at 
org.apache.flink.runtime.concurrent.akka.ActorSystemScheduledExecutorAdapter$ScheduledFutureTask.run(ActorSystemScheduledExecutorAdapter.java:171)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$withContextClassLoader$0(Cl

[jira] [Commented] (FLINK-31064) Error while retrieving the leader gateway

2023-02-14 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688451#comment-17688451
 ] 

Tobias Hofer commented on FLINK-31064:
--

I found the following entry in the meantime: 
[https://www.mail-archive.com/user@flink.apache.org/msg37424.html|https://www.mail-archive.com/user@flink.apache.org/msg37424.html.]

Suggesting to do
{code:java}
strategy:
  type: RollingUpdate
  rollingUpdate: maxSurge: 0
  maxUnavailable: 1 {code}
But operator CRD do not allow me to change the update strategy (or at least I 
don't know how).

> Error while retrieving the leader gateway
> -
>
> Key: FLINK-31064
> URL: https://issues.apache.org/jira/browse/FLINK-31064
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
> Environment:  
>  
>Reporter: Tobias Hofer
>Priority: Major
> Attachments: jobmanager_log.txt
>
>
> JobManager enters corrupt state after restart (e.g. increasing restartNonce).
> {code:java}
> WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever   
>    | Errorwhile retrieving the leader gateway. Retrying to connect to 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.org.apache.flink.util.concurrent.FutureUtils$RetryException:
>  Could not complete the operation. Number of retries has been exhausted.    
> at 
> org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)    at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275) 
>    at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)    at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:61)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:53)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at 
> java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown 
> Source)    at java.base/java.lang.Thread.run(Unknown Source)Caused by: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
>     ... 5 moreCaused by: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    ... 7 
> moreCaused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]    
> at akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)   
>  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)    at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)    
> at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
>     at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)    
> at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)    at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecu

[jira] [Commented] (FLINK-31064) Error while retrieving the leader gateway

2023-02-14 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688449#comment-17688449
 ] 

Tobias Hofer commented on FLINK-31064:
--

I first tried to get help on the user mailing list. Without success so far.

https://lists.apache.org/thread/bb4vnw8c7l4725tlokz9xp2bjjdkgmo1.

> Error while retrieving the leader gateway
> -
>
> Key: FLINK-31064
> URL: https://issues.apache.org/jira/browse/FLINK-31064
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.1
> Environment:  
>  
>Reporter: Tobias Hofer
>Priority: Major
> Attachments: jobmanager_log.txt
>
>
> JobManager enters corrupt state after restart (e.g. increasing restartNonce).
> {code:java}
> WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever   
>    | Errorwhile retrieving the leader gateway. Retrying to connect to 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.org.apache.flink.util.concurrent.FutureUtils$RetryException:
>  Could not complete the operation. Number of retries has been exhausted.    
> at 
> org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)    at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275) 
>    at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>     at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown
>  Source)    at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:61)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:53)
>     at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
> Source)    at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
>     at 
> java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown 
> Source)    at java.base/java.lang.Thread.run(Unknown Source)Caused by: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
>     at 
> scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
>     ... 5 moreCaused by: 
> org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
> connect to rpc endpoint under address 
> akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    ... 7 
> moreCaused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]    
> at akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)   
>  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)    at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)    
> at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
>     at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)    
> at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)    at 
> akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
>     at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)    
> at 
> scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
>     at 
> scala.concurrent.impl.Promise$De

[jira] [Created] (FLINK-31064) Error while retrieving the leader gateway

2023-02-14 Thread Tobias Hofer (Jira)
Tobias Hofer created FLINK-31064:


 Summary: Error while retrieving the leader gateway
 Key: FLINK-31064
 URL: https://issues.apache.org/jira/browse/FLINK-31064
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.3.1
 Environment:  

 
Reporter: Tobias Hofer
 Attachments: jobmanager_log.txt

JobManager enters corrupt state after restart (e.g. increasing restartNonce).
{code:java}
WARN  | flink-akka.actor.default-dispatcher-5 | RpcGatewayRetriever 
 | Errorwhile retrieving the leader gateway. Retrying to connect to 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.org.apache.flink.util.concurrent.FutureUtils$RetryException:
 Could not complete the operation. Number of retries has been exhausted.    at 
org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:293)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
Source)    at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
Source)    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown 
Source)    at 
org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)   
 at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
    at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
Source)    at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown 
Source)    at 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown 
Source)    at 
scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:61)
    at 
scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:53)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown 
Source)    at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(UnknownSource)
    at java.base/java.util.concurrent.CompletableFuture$Completion.run(Unknown 
Source)    at java.base/java.lang.Thread.run(Unknown Source)Caused by: 
java.util.concurrent.CompletionException: 
org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
connect to rpc endpoint under address 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.lambda$resolveActorAddress$11(AkkaRpcService.java:604)
    at 
scala.concurrent.java8.FuturesConvertersImpl$CF$$anon$1.accept(FutureConvertersImpl.scala:59)
    ... 5 moreCaused by: 
org.apache.flink.runtime.rpc.exceptions.RpcConnectionException: Could not 
connect to rpc endpoint under address 
akka.tcp://flink@my-session.flink:6123/user/rpc/resourcemanager_*.    ... 7 
moreCaused by: akka.actor.ActorNotFound: Actor not found for: 
ActorSelection[Anchor(akka://flink/), Path(/user/rpc/resourcemanager_*)]    at 
akka.actor.ActorSelection.$anonfun$resolveOne$1(ActorSelection.scala:74)    at 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)    at 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:63)
    at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:81)    
at 
akka.dispatch.internal.SameThreadExecutionContext$$anon$1.unbatchedExecute(SameThreadExecutionContext.scala:21)
    at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:130)    at 
akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:124)    at 
akka.dispatch.internal.SameThreadExecutionContext$$anon$1.execute(SameThreadExecutionContext.scala:20)
    at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)    at 
scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallback(Promise.scala:312)
    at 
scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:303)    
at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:72)    at 
akka.actor.ActorSelection.resolveOne(ActorSelection.scala:89)    at 
akka.actor.ActorSelection.resolveOne(ActorSelection.scala:130)    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.resolveActorAddress(AkkaRpcService.java:598)
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.connectInternal(AkkaRpcService.java:549)
    at 
org.apache.flink.runtime.rpc.akka.AkkaRpcService.connect(AkkaRpcService.java:232)

[jira] [Commented] (FLINK-30046) Cannot use digest in Helm chart to reference image

2022-11-30 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641239#comment-17641239
 ] 

Tobias Hofer commented on FLINK-30046:
--

Hi [~mbalassi], I'm sorry for the situation. I try to contact the webmaster to 
get a copy.

The guide mainly proposed to use a single value named {{version}} to handle 
tags and digests transparently. The guide also contained a proposal of the 
changes in the Helm chart templates. Nothing worldshaking if you're used to 
create Helm charts (but I'm not).

> Cannot use digest in Helm chart to reference image
> --
>
> Key: FLINK-30046
> URL: https://issues.apache.org/jira/browse/FLINK-30046
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0, 
> kubernetes-operator-1.2.1
>Reporter: Tobias Hofer
>Assignee: Márton Balassi
>Priority: Major
> Fix For: kubernetes-operator-1.3.0
>
>
> Images can be referenced by tag only.
> Referencing images by digest has a number of advantages:
>  # Avoid unexpected or undesirable image changes.
>  # Increase security and awareness by knowing the specific image running in 
> your environment.
> The following document describes a template to handle tags and digests:
> [Adding Image Digest References to Your Helm 
> Charts|https://blog.andyserver.com/2021/09/adding-image-digest-references-to-your-helm-charts/]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30046) Cannot use digest in Helm chart to reference image

2022-11-16 Thread Tobias Hofer (Jira)
Tobias Hofer created FLINK-30046:


 Summary: Cannot use digest in Helm chart to reference image
 Key: FLINK-30046
 URL: https://issues.apache.org/jira/browse/FLINK-30046
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0, 
kubernetes-operator-1.2.1
Reporter: Tobias Hofer


Images can be referenced by tag only.

Referencing images by digest has a number of advantages:
 # Avoid unexpected or undesirable image changes.
 # Increase security and awareness by knowing the specific image running in 
your environment.

The following document describes a template to handle tags and digests:

[Adding Image Digest References to Your Helm 
Charts|https://blog.andyserver.com/2021/09/adding-image-digest-references-to-your-helm-charts/]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)