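Hi,

Which Flink version are you using? Could you also share the log of the TM (TaskManager) that ran Co-Process (1/1) (d0309f26a545e74643382ed3f758269b)? Judging from the log you posted, it should be on the machine 083f69d029de.
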
Best,
Congxian

Z-Z <zz9876543...@qq.com> wrote on Friday, July 17, 2020 at 6:22 PM:

> Hi all, I ran into a problem during deployment. I stopped a job through the REST API and saved its savepoint (steps: /jobs/overview ---> /jobs/{jobid}/savepoints ---> /jobs/{jobid}/savepoints/{triggerid}). When I deploy the job with the flink command and pass the savepoint, it fails, but uploading the jar through the web UI with the savepoint does not fail. The error stack is as follows:
>
> 2020-07-17 09:51:48,925 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job 7639673873b707aa86c4387aa7b4aac3 with allocation id e8865cdbfe4c3c33099c7112bc2e3231.
> 2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source -> Filter (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:49,346 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,493 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from RUNNING to FAILED.
> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:191)
>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:255)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
>     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for LegacyKeyedCoProcessOperator_65e7116c7aa972ad18a796ae22bd6327_(1/1) from any of the 1 provided restore options.
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:304)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:131)
>     ... 9 more
> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception.
>     at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:336)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:548)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:288)
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
>     ... 11 more
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:197)
>     at java.io.DataInputStream.readFully(DataInputStream.java:169)
>     at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:221)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:168)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:151)
>     at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:279)
>     ... 15 more
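
For anyone trying to reproduce this, the three REST steps from the question (/jobs/overview, /jobs/{jobid}/savepoints, /jobs/{jobid}/savepoints/{triggerid}) could be scripted roughly as below. This is only a sketch, not the original poster's script. The helper names `http_fetch` and `stop_with_savepoint` are made up for illustration, and the JSON field names (`target-directory`, `cancel-job`, `request-id`, `status.id`, `operation.location`, `failure-cause`) are what I recall from the Flink REST API docs of the 1.10 era, so please check them against the docs for your version.

```python
import json
import time
import urllib.request


def http_fetch(base_url):
    """Return a fetch(method, path, body) function bound to a JobManager REST address."""
    def fetch(method, path, body=None):
        data = json.dumps(body).encode("utf-8") if body is not None else None
        req = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return fetch


def stop_with_savepoint(fetch, job_name, target_dir, poll_seconds=1.0, max_polls=60):
    """Cancel a running job with a savepoint and return the savepoint location."""
    # Step 1: /jobs/overview, look up the id of the running job by name.
    overview = fetch("GET", "/jobs/overview")
    running = [j for j in overview["jobs"]
               if j["name"] == job_name and j["state"] == "RUNNING"]
    if len(running) != 1:
        raise RuntimeError(f"expected exactly one running job named {job_name!r}")
    job_id = running[0]["jid"]

    # Step 2: /jobs/{jobid}/savepoints, trigger the savepoint and cancel the job.
    # Field names are assumed from the REST docs; verify for your Flink version.
    trigger = fetch("POST", f"/jobs/{job_id}/savepoints",
                    {"target-directory": target_dir, "cancel-job": True})
    trigger_id = trigger["request-id"]

    # Step 3: /jobs/{jobid}/savepoints/{triggerid}, poll until the operation is done.
    for _ in range(max_polls):
        status = fetch("GET", f"/jobs/{job_id}/savepoints/{trigger_id}")
        if status["status"]["id"] == "COMPLETED":
            operation = status.get("operation", {})
            if "failure-cause" in operation:
                raise RuntimeError(f"savepoint failed: {operation['failure-cause']}")
            return operation["location"]
        time.sleep(poll_seconds)
    raise TimeoutError("savepoint did not complete in time")
```

Usage would be something like `stop_with_savepoint(http_fetch("http://localhost:8081"), "my-job", "file:///tmp/savepoints")`, where the address, job name and directory are placeholders. The returned location is the path to pass to `flink run -s <savepointPath> ...`. Waiting for COMPLETED and using the returned location as-is at least rules out restoring from an unfinished or wrong path; whether that has anything to do with the EOFException above is not something the posted log confirms.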