Is it a local filesystem or Hadoop? If it's a NAS, then why does the exception say
"Caused by: org.apache.hadoop.hdfs.BlockMissingException"?
It seems you configured a Hadoop (HDFS) state store while pointing it at a NAS mount.
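
To illustrate the distinction, here is a minimal sketch that only inspects the URI scheme of a storage directory string. It does not call Flink or Hadoop, and the class and method names (StorageDirCheck, schemeOf, looksLikeLocalMount) are made up for this example. The assumption is that a NAS mount would normally be addressed with a file:// URI, while an hdfs:// URI goes through the HDFS client, which would match the BlockMissingException. A path without a scheme is resolved according to the cluster's default filesystem configuration, which this sketch cannot see.

```java
import java.net.URI;

public class StorageDirCheck {

    /** Returns the URI scheme of a storage directory, or "(none)" if the path has no scheme. */
    public static String schemeOf(String storageDir) {
        String scheme = URI.create(storageDir).getScheme();
        return scheme == null ? "(none)" : scheme;
    }

    /** Assumption for this sketch: a NAS mount is a local path and is addressed with the file scheme. */
    public static boolean looksLikeLocalMount(String storageDir) {
        return "file".equals(schemeOf(storageDir));
    }

    public static void main(String[] args) {
        // Example values only; substitute the directories from your own configuration.
        String[] candidates = {
            "file:///flink/checkpoints",
            "hdfs:///flink/checkpoints",
            "/flink/checkpoints"
        };
        for (String dir : candidates) {
            System.out.println(dir + " -> scheme=" + schemeOf(dir)
                    + ", localMount=" + looksLikeLocalMount(dir));
        }
    }
}
```

If the configured directory reports hdfs (or no scheme, with HDFS as the default filesystem), the job graph is read through HDFS and not from the NAS mount.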

Regards
Bhaskar



On Thu, Nov 28, 2019 at 11:36 AM 曾祥才 <xcz200...@qq.com> wrote:

> /flink/checkpoints is an external persistent store (a NAS directory mounted
> on the job manager)
>
>
>
>
> ------------------ Original message ------------------
> *From:* "Vijay Bhaskar"<bhaskar.eba...@gmail.com>;
> *Sent:* Thursday, November 28, 2019, 2:29 PM
> *To:* "曾祥才"<xcz200...@qq.com>;
> *Cc:* "user"<user@flink.apache.org>;
> *Subject:* Re: JobGraphs not cleaned up in HA mode
>
> The following conditions are mandatory for running in HA mode:
>
> a) You need a persistent, common external store that the jobmanager and
> task managers both use when writing state.
> b) You need a persistent external store for ZooKeeper to store the
> JobGraph.
>
> ZooKeeper refers to the path
> /flink/checkpoints/submittedJobGraph480ddf9572ed to get the job graph, but
> the jobmanager is unable to find it.
> It seems /flink/checkpoints is not the external persistent store.
>
>
> Regards
> Bhaskar
>
> On Thu, Nov 28, 2019 at 10:43 AM seuzxc <xcz200...@qq.com> wrote:
>
>> Hi, I have the same problem with Flink 1.9.1. Is there any solution to fix it?
>> When k8s redeploys the jobmanager, the error looks like this (it seems ZK did
>> not remove the submitted job info, but the jobmanager removed the file):
>>
>>
>> Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /147dd022ec91f7381ad4ca3d290387e9. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
>>         at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
>>         at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:696)
>>         at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:681)
>>         at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobs(Dispatcher.java:662)
>>         at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$26(Dispatcher.java:821)
>>         at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:72)
>>         ... 9 more
>> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1651346363-10.20.1.81-1525354906737:blk_1083182315_9441494 file=/flink/checkpoints/submittedJobGraph480ddf9572ed
>>         at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1052)
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>
>
