Hi Matthias,

You are correct. When I checked my savepoint folder again a few minutes
later, the data was there, so the savepoint seems to have completed after
the CLI gave up waiting. Would increasing the client timeout resolve the
problem?
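
On the timeout question: the stack trace shows the CLI giving up inside
CompletableFuture.get(...) in CliFrontend.triggerSavepoint, which suggests
the client stopped waiting while the savepoint itself kept running. Below is
a minimal sketch of raising that wait. It assumes this Flink version reads
the client-side wait from the client.timeout option (default 60 s in releases
that have it; older releases used akka.client.timeout), and that the file
edited is the flink-conf.yaml used by the flink-client container. The 600 s
value is only an illustration, not a recommendation.

```yaml
# flink-conf.yaml on the machine that runs ./bin/flink (assumed location:
# /opt/flink/conf/flink-conf.yaml in the flink-client container)

# Assumption: the CLI's wait for the savepoint response is governed by
# client.timeout. 600 s is an arbitrary example value.
client.timeout: 600 s
```

Setting it in the client's configuration file rather than on the command
line is deliberate: depending on the release, the CLI may read this value
before the -D options are applied.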

On Fri, May 28, 2021 at 8:21 AM Matthias Pohl <matth...@ververica.com>
wrote:

> Hi Robert,
> it would be interesting to see the corresponding taskmanager/jobmanager
> logs. That would help in finding out why the savepoint creation failed.
> Just to verify: The savepoint data wasn't written to S3 even after the
> timeout happened, was it?
>
> Best,
> Matthias
>
> On Thu, May 27, 2021 at 7:50 PM Robert Cullen <cinquate...@gmail.com>
> wrote:
>
>> I triggered a savepoint from a currently running job. Although the
>> directory structure gets created in the MinIO S3 store, the command
>> ultimately fails without writing the data.
>>
>> root@flink-client:/opt/flink# ./bin/flink list --target kubernetes-session -Dkubernetes.cluster-id=flink-jobmanager -Dkubernetes.namespace=cmdaa
>> 2021-05-27 17:37:00,409 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Retrieve flink cluster flink-jobmanager successfully, JobManager Web Interface: http://flink-jobmanager-rest.cmdaa:8081
>> Waiting for response...
>> ------------------ Running/Restarting Jobs -------------------
>> 27.05.2021 16:50:00 : 72f614340dc1a7416d0613362d1ef83b : Streaming Log Count (RUNNING)
>> --------------------------------------------------------------
>> No scheduled jobs.
>> root@flink-client:/opt/flink# ./bin/flink savepoint 72f614340dc1a7416d0613362d1ef83b --target kubernetes-session -Dkubernetes.cluster-id=flink-jobmanager -Dkubernetes.namespace=cmdaa
>> 2021-05-27 17:37:58,776 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Retrieve flink cluster flink-jobmanager successfully, JobManager Web Interface: http://flink-jobmanager-rest.cmdaa:8081
>> Triggering savepoint for job 72f614340dc1a7416d0613362d1ef83b.
>> Waiting for response...
>>
>> ------------------------------------------------------------
>>  The program finished with the following exception:
>>
>> org.apache.flink.util.FlinkException: Triggering a savepoint for the job 72f614340dc1a7416d0613362d1ef83b failed.
>>         at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:777)
>>         at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:754)
>>         at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
>>         at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:751)
>>         at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1072)
>>         at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
>>         at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>         at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
>> Caused by: java.util.concurrent.TimeoutException
>>         at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
>>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
>>         at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:771)
>>         ... 7 more
>> root@flink-client:/opt/flink#
>>
>> --
>> Robert Cullen
>> 240-475-4490
>>
>

-- 
Robert Cullen
240-475-4490
