My guess is that the tmp directory got cleaned up on your host, so Flink
couldn't restore its in-memory state from it upon startup.

Take a look at this article on configuring temporary I/O directories; I
think it is relevant:
https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#configuring-temporary-io-directories
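
As a rough sketch of what that could look like in flink-conf.yaml, assuming
the io.tmp.dirs key described on that page (older releases used
taskmanager.tmp.dirs), and with a purely hypothetical path that you would
replace with a directory your host does not clean up automatically:

    # flink-conf.yaml
    # Hypothetical location; any directory outside /tmp that is writable
    # by the Flink process and not subject to periodic cleanup should do.
    io.tmp.dirs: /var/lib/flink/tmp

The task manager would need a restart to pick up the change.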

On Thu, Nov 1, 2018 at 8:51 PM Dmitry Minaev <mina...@gmail.com> wrote:

> Hi everyone,
>
> I'm having an issue when restarting a job in Flink. I'm doing a simple
> stop with savepoint and then start from the savepoint. Savepoints are
> stored in a separate folder, and there is no configuration for the "/tmp"
> folder in my setup. There is only 1 task manager and parallelism is 1.
>
> I'm getting FileNotFoundException:
>
> 31 Oct 2018 23:40:35,837 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph -
> filter-business-metrics -> Sink: data_feed (1/1)
> (51ce53532932c33805291dc188d2f99e) switched from DEPLOYING to RUNNING.
> 31 Oct 2018 23:40:35,837 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph -
> agents-working-on-interactions (1/1) (72a916158d07f2353fb270848d95ba2f)
> switched from DEPLOYING to RUNNING.
> 31 Oct 2018 23:40:35,929 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph -
> interaction-details (1/1) (c004e64e90c0dbd3bc007459bc3d7420) switched from
> RUNNING to FAILED.
> java.io.FileNotFoundException:
> /tmp/flink-io-7bfd6603-c115-463d-bcfc-b97e31be5a37/f7ce787242e6afd91c3cbeccc2f74bc4a7dd0e6e600ff83e51bc5be9a95750f9.0.buffer
> (No such file or directory)
>         at java.io.RandomAccessFile.open0(Native Method)
>         at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>         at
> org.apache.flink.streaming.runtime.io.BufferSpiller.createSpillingChannel(BufferSpiller.java:259)
>         at
> org.apache.flink.streaming.runtime.io.BufferSpiller.<init>(BufferSpiller.java:120)
>         at
> org.apache.flink.streaming.runtime.io.BarrierBuffer.<init>(BarrierBuffer.java:149)
>         at
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.<init>(StreamInputProcessor.java:129)
>         at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.init(OneInputStreamTask.java:56)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:235)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>         at java.lang.Thread.run(Thread.java:748)
>
> I've checked the logs and there are no errors prior to that. The job was
> stopped with no issues, and it was starting normally and passed multiple
> operators setting them to RUNNING state. But for several other operators it
> throws this FileNotFoundException.
>
> Any help is appreciated.
>
> -- Regards, Dmitry
>
