Guowei Ma created FLINK-21600:
---------------------------------

             Summary: Resuming Savepoint (rocks, no parallelism change, heap timers) end-to-end test failed
                 Key: FLINK-21600
                 URL: https://issues.apache.org/jira/browse/FLINK-21600
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.12.2
            Reporter: Guowei Ma

{code:java}
2021-03-03T23:23:22.9286204Z The program finished with the following exception:
2021-03-03T23:23:22.9286551Z
2021-03-03T23:23:22.9287394Z org.apache.flink.util.FlinkException: Could not stop with a savepoint job "6c88aa43b703192e88987473e722c22c".
2021-03-03T23:23:22.9288135Z at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:581)
2021-03-03T23:23:22.9288846Z at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
2021-03-03T23:23:22.9289787Z at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:569)
2021-03-03T23:23:22.9290721Z at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1069)
2021-03-03T23:23:22.9291800Z at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
2021-03-03T23:23:22.9292565Z at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
2021-03-03T23:23:22.9293368Z at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
2021-03-03T23:23:22.9294111Z Caused by: java.util.concurrent.TimeoutException
2021-03-03T23:23:22.9294842Z at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
2021-03-03T23:23:22.9295735Z at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
2021-03-03T23:23:22.9296418Z at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:579)
2021-03-03T23:23:22.9297527Z ... 6 more
2021-03-03T23:23:22.9622887Z Mar 03 23:23:22 Waiting for job (6c88aa43b703192e88987473e722c22c) to reach terminal state FINISHED ...
2021-03-03T23:36:57.3094695Z Mar 03 23:36:57 Test (pid: 3517) did not finish after 900 seconds.
{code}
killing logs
{code:java}
2021-03-03T23:36:57.3232128Z Mar 03 23:36:57 java.util.concurrent.RejectedExecutionException: event executor terminated
2021-03-03T23:36:57.3233348Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:926) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3234766Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:353) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3236170Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:346) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3237567Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:828) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3265526Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:818) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3267472Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:471) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3268806Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:87) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3275491Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:81) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3277092Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.MultithreadEventLoopGroup.register(MultithreadEventLoopGroup.java:86) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3279030Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:323) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3280277Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:155) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3281591Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:139) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3282748Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:123) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3283868Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.submitRequest(RestClient.java:421) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3284969Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:344) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3286053Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:258) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3287244Z Mar 03 23:36:57 at org.apache.flink.client.program.rest.RestClusterClient.lambda$sendRetriableRequest$23(RestClusterClient.java:777) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3288091Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) [?:?]
2021-03-03T23:36:57.3288763Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?]
2021-03-03T23:36:57.3289660Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610) [?:?]
2021-03-03T23:36:57.3290339Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649) [?:?]
2021-03-03T23:36:57.3291011Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
2021-03-03T23:36:57.3291678Z Mar 03 23:36:57 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
2021-03-03T23:36:57.3292340Z Mar 03 23:36:57 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
2021-03-03T23:36:57.3292927Z Mar 03 23:36:57 at java.lang.Thread.run(Thread.java:834) [?:?]
2021-03-03T23:36:57.3294023Z Mar 03 23:36:57 2021-03-03 23:23:22,926 ERROR org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.rejectedExecution [] - Failed to submit a listener notification task. Event loop shut down?
2021-03-03T23:36:57.3294797Z Mar 03 23:36:57 java.util.concurrent.RejectedExecutionException: event executor terminated
2021-03-03T23:36:57.3330282Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:926) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3332091Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:353) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3333454Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:346) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3334983Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:828) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3336322Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:818) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3337809Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:841) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3339118Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:498) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3368256Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:183) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3369794Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:95) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3371107Z Mar 03 23:36:57 at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:30) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3372302Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.submitRequest(RestClient.java:425) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3373399Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:344) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3374497Z Mar 03 23:36:57 at org.apache.flink.runtime.rest.RestClient.sendRequest(RestClient.java:258) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3375694Z Mar 03 23:36:57 at org.apache.flink.client.program.rest.RestClusterClient.lambda$sendRetriableRequest$23(RestClusterClient.java:777) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3376535Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) [?:?]
2021-03-03T23:36:57.3377218Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?]
2021-03-03T23:36:57.3377874Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610) [?:?]
2021-03-03T23:36:57.3378745Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649) [?:?]
2021-03-03T23:36:57.3379443Z Mar 03 23:36:57 at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
2021-03-03T23:36:57.3380305Z Mar 03 23:36:57 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
2021-03-03T23:36:57.3381397Z Mar 03 23:36:57 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
2021-03-03T23:36:57.3382019Z Mar 03 23:36:57 at java.lang.Thread.run(Thread.java:834) [?:?]
2021-03-03T23:36:57.3382989Z Mar 03 23:36:57 2021-03-03 23:23:22,926 ERROR org.apache.flink.client.cli.CliFrontend [] - Error while running the command.
2021-03-03T23:36:57.3383776Z Mar 03 23:36:57 org.apache.flink.util.FlinkException: Could not stop with a savepoint job "6c88aa43b703192e88987473e722c22c".
2021-03-03T23:36:57.3385129Z Mar 03 23:36:57 at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:581) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3422634Z Mar 03 23:36:57 at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3423852Z Mar 03 23:36:57 at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:569) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2021-03-03T23:36:57.3425311Z Mar 03 23:36:57 at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1069) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)