[ https://issues.apache.org/jira/browse/FLINK-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967225#comment-16967225 ]
Zhu Zhu commented on FLINK-14572:
---------------------------------

From the root cause, the case fails when trying to get the jar files from the blob server, because the file does not exist. This does not seem to be related to the NG scheduler, since the scheduler has not been created yet at that point.

From the logic of {{BlobsCleanupITCase#testBlobServerCleanup()}}, a job is only submitted if the jar upload via {{BlobClient.uploadFiles}} succeeded. So it is highly likely that the uploaded jar was removed unexpectedly before the job was submitted.

However, I have no idea why the file could have been removed, and I cannot reproduce the problem locally even with hundreds of re-runs.

[~trohrmann] [~gjy] Do you have any ideas about this?

Root cause:

07:47:47,787 ERROR org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Failed to submit job 15838a6ef77eb89697d3def42c1a58b0.
java.lang.RuntimeException: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
	at org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:36)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not set up JobManager
	at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:152)
	at org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:84)
	at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:381)
	at org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:34)
	... 7 more
Caused by: java.lang.Exception: Cannot set up the user code libraries: /tmp/junit6489941706970935338/junit9210755917417967354/blobStore-1474738d-89f8-4ab0-88b2-9df867ba4cc1/incoming/temp-00000001
	at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:131)
	... 10 more
Caused by: java.nio.file.NoSuchFileException: /tmp/junit6489941706970935338/junit9210755917417967354/blobStore-1474738d-89f8-4ab0-88b2-9df867ba4cc1/incoming/temp-00000001
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
	at java.nio.file.Files.move(Files.java:1395)
	at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:410)
	at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:497)
	at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:444)
	at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:417)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:120)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerJob(BlobLibraryCacheManager.java:91)
	at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:128)
	... 10 more

> BlobsCleanupITCase failed in Travis stage core - scheduler_ng
> -------------------------------------------------------------
>
>                 Key: FLINK-14572
>                 URL: https://issues.apache.org/jira/browse/FLINK-14572
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination, Tests
>    Affects Versions: 1.10.0
>            Reporter: Gary Yao
>            Priority: Critical
>              Labels: scheduler-ng, test-stability
>             Fix For: 1.10.0
>
>
> {noformat}
> java.lang.AssertionError:
> Expected: is <true>
>      but: was <false>
> 	at org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanup(BlobsCleanupITCase.java:220)
> 	at org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanupFinishedJob(BlobsCleanupITCase.java:133)
> {noformat}
> https://api.travis-ci.com/v3/job/250445874/log.txt

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
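
For reference, the innermost exception in the root-cause trace is what {{java.nio.file.Files.move}} throws when its source path no longer exists. The following is only a minimal, standalone sketch of that failure mode, not Flink's code: the class name, method name, and file names are hypothetical, and the explicit delete merely simulates an unexpected removal whose real cause is still unknown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class MissingTempFileSketch {

    /**
     * Hypothetical helper: returns true if moving a temp file into the
     * "store" fails with NoSuchFileException because the source is gone.
     */
    static boolean moveFailsWithMissingSource() {
        try {
            Path dir = Files.createTempDirectory("blobStore-sketch");
            Path incoming = dir.resolve("temp-00000001"); // stands in for the uploaded jar
            Path stored = dir.resolve("stored-blob");      // stands in for the final store location
            try {
                Files.write(incoming, new byte[] {1, 2, 3});

                // Assumption: something removes the file between upload and use.
                Files.delete(incoming);

                Files.move(incoming, stored);
                return false; // move unexpectedly succeeded
            } catch (NoSuchFileException e) {
                return true; // same exception type as in the reported trace
            } finally {
                Files.deleteIfExists(stored);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("NoSuchFileException on move: " + moveFailsWithMissingSource());
    }
}
```

This only shows that a missing source file is sufficient to produce the observed exception; it says nothing about which component removed the file in the failing Travis run.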