rmetzger edited a comment on pull request #18692:
URL: https://github.com/apache/flink/pull/18692#issuecomment-1034859586


   Sadly, the JobResultStore (JRS) still doesn't work on K8s when using a MinIO S3 implementation:
   ```
   2022-02-10 12:20:23,679 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Starting the resource manager.
   2022-02-10 12:20:23,765 INFO  org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Start SessionDispatcherLeaderProcess.
   2022-02-10 12:20:25,060 INFO  org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Stopping SessionDispatcherLeaderProcess.
   2022-02-10 12:20:25,164 INFO  org.apache.flink.runtime.jobmanager.DefaultJobGraphStore     [] - Stopping DefaultJobGraphStore.
   2022-02-10 12:20:25,255 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Fatal error occurred in the cluster entrypoint.
   java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of globally-terminated jobs from JobResultStore
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_322]
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) [?:1.8.0_322]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
   Caused by: org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of globally-terminated jobs from JobResultStore
        at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:186) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:178) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
        ... 3 more
   Caused by: java.io.FileNotFoundException: No such file or directory: s3://vvc-eu-west-1-dev-store/myorg/myscope/3d78a6e7-4c88-4e6f-8e59-4fb4b6dd6319-test-job-name-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/ha/job-result-store/default
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2344) ~[?:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226) ~[?:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160) ~[?:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1961) ~[?:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$9(S3AFileSystem.java:1940) ~[?:?]
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) ~[?:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1940) ~[?:?]
        at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.listStatus(HadoopFileSystem.java:170) ~[?:?]
        at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.listStatus(PluginFileSystemFactory.java:141) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.highavailability.FileSystemJobResultStore.getDirtyResultsInternal(FileSystemJobResultStore.java:158) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.withReadLock(AbstractThreadsafeJobResultStore.java:118) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.getDirtyResults(AbstractThreadsafeJobResultStore.java:100) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:184) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:178) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
        ... 3 more
   2022-02-10 12:20:25,384 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting StandaloneApplicationClusterEntryPoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally..
   ```
   
   The directory exists:
   
   ```
   AWS_ACCESS_KEY_ID=admin AWS_SECRET_ACCESS_KEY=password aws --endpoint-url http://localhost:9000 s3 ls s3://vvc-eu-west-1-dev-store/myorg/myscope/3d78a6e7-4c88-4e6f-8e59-4fb4b6dd6319-test-job-name-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/ha/job-result-store/default
   ```
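
The trace shows the `FileNotFoundException` surfacing from a `listStatus` call made by `FileSystemJobResultStore.getDirtyResultsInternal`. As a rough illustration of one way such a lookup could tolerate a directory that the file system reports as missing, here is a minimal sketch that treats "no such directory" as "no dirty results yet". It is not the Flink implementation and not necessarily the fix for this PR: it uses `java.nio.file` on a local file system instead of Flink's `FileSystem` API, and the class name `DirtyResultLister`, the method `listDirtyResults`, and the `_DIRTY.json` suffix are assumptions made only for this example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Flink code): lists dirty-result files in a base directory. */
final class DirtyResultLister {

    // File-name pattern assumed for illustration only.
    static final String DIRTY_GLOB = "*_DIRTY.json";

    private DirtyResultLister() {}

    /**
     * Returns the entries of baseDir matching DIRTY_GLOB.
     * A missing base directory is interpreted as "nothing stored yet"
     * and yields an empty list instead of an exception.
     */
    static List<Path> listDirtyResults(Path baseDir) throws IOException {
        List<Path> results = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(baseDir, DIRTY_GLOB)) {
            for (Path entry : stream) {
                results.add(entry);
            }
        } catch (NoSuchFileException e) {
            // The directory does not exist: report no dirty results.
            return new ArrayList<>();
        }
        // Any other IOException (permissions, not a directory, ...) still propagates.
        return results;
    }
}
```

Whether swallowing the missing-directory case is appropriate depends on why the store reports the path as absent; on an object store, a "directory" without any objects under it may not exist as an entry of its own, which is the situation this sketch assumes.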


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


