Hi

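Which version of Flink are you using? Could you also share the TaskManager
log for Co-Process (1/1) (d0309f26a545e74643382ed3f758269b)? Judging from the
log you posted, that task should be running on the machine 083f69d029de.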
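
In case it helps to double-check the savepoint itself, here is a minimal sketch of the REST steps you described. It assumes the JobManager REST endpoint is reachable at a base URL such as http://localhost:8081 (an assumption, adjust to your setup); the helper names are made up for illustration, and the requests are only built here, not sent. The idea is to confirm that the status is COMPLETED before the savepoint path is used for a restore.

```python
import json
import urllib.request


def savepoint_trigger_request(base_url, job_id, target_dir, cancel_job=True):
    """Build (but do not send) the POST request for /jobs/{jobid}/savepoints."""
    body = json.dumps(
        {"target-directory": target_dir, "cancel-job": cancel_job}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/jobs/{job_id}/savepoints",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def savepoint_status_url(base_url, job_id, trigger_id):
    """URL to poll for the result: /jobs/{jobid}/savepoints/{triggerid}."""
    return f"{base_url}/jobs/{job_id}/savepoints/{trigger_id}"


def savepoint_location(status_response):
    """Return the savepoint path once the operation is COMPLETED.

    Returns None while the operation is still in progress and raises
    if the response reports a failure cause.
    """
    if status_response["status"]["id"] != "COMPLETED":
        return None
    operation = status_response.get("operation", {})
    if "failure-cause" in operation:
        raise RuntimeError(f"savepoint failed: {operation['failure-cause']}")
    return operation["location"]
```

The path returned by savepoint_location is the value that would be passed to `flink run -s` on the command line, so it is worth comparing it with the path used in the failing deployment.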

Best,
Congxian


On Fri, Jul 17, 2020 at 6:22 PM, Z-Z <zz9876543...@qq.com> wrote:

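> Hi all, I ran into a problem during deployment. I stopped a job through the
> REST API and saved its savepoint (steps: /jobs/overview
> ---> /jobs/{jobid}/savepoints --->
> /jobs/{jobid}/savepoints/{triggerid}). When I deploy the job with that
> savepoint using the flink command, it fails, but uploading the jar through
> the web UI with the same savepoint works. The error stack trace is below: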
> 2020-07-17 09:51:48,925 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager -
> Request slot with profile ResourceProfile{UNKNOWN} for job
> 7639673873b707aa86c4387aa7b4aac3 with allocation id
> e8865cdbfe4c3c33099c7112bc2e3231.
> 2020-07-17 09:51:48,952 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source -> Filter (1/1)
> (1177659bff014e8dbc3f0508055d4307) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,952 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Deploying Source: Custom Source -> Filter (1/1) (attempt #0) to
> e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,953 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2)
> switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,953 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Deploying Source: Custom Source (1/1) (attempt #0) to
> e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,954 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da)
> switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,954 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Deploying Source: Custom Source (1/1) (attempt #0) to
> e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,954 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from
> SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,954 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Deploying Co-Process (1/1) (attempt #0) to
> e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
> 2020-07-17 09:51:48,955 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1)
> (618b75fcf5ea05fb5c6487bec6426e31) switched from SCHEDULED to DEPLOYING.
> 2020-07-17 09:51:48,955 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Deploying Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1)
> (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de
> (dataPort=35758)
> 2020-07-17 09:51:49,346 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1)
> (618b75fcf5ea05fb5c6487bec6426e31) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,370 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da)
> switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,370 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2)
> switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,377 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from
> DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,377 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Source: Custom Source -> Filter (1/1)
> (1177659bff014e8dbc3f0508055d4307) switched from DEPLOYING to RUNNING.
> 2020-07-17 09:51:49,493 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph
> - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from
> RUNNING to FAILED.
> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>         at
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:191)
>         at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:255)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
>         at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed
> state backend for
> LegacyKeyedCoProcessOperator_65e7116c7aa972ad18a796ae22bd6327_(1/1) from
> any of the 1 provided restore options.
>         at
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>         at
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:304)
>         at
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:131)
>         ... 9 more
> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught
> unexpected exception.
>         at
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:336)
>         at
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:548)
>         at
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:288)
>         at
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
>         at
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
>         ... 11 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:197)
>         at java.io.DataInputStream.readFully(DataInputStream.java:169)
>         at
> org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85)
>         at
> org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:221)
>         at
> org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:168)
>         at
> org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:151)
>         at
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:279)
>         ... 15 more
