Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-17 Thread Chesnay Schepler
From: Chesnay Schepler; Date: Thursday, June 16, 2022 at 11:32 AM; To: Peter Westermann, user@flink.apache.org; Subject: Re: Sporadic issues with savepoint status lookup in Flink 1.15

Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Peter Westermann
From: Chesnay Schepler; Date: Thursday, June 16, 2022 at 11:32 AM; To: Peter Westermann, user@flink.apache.org; Subject: Re: Sporadic issues with savepoint status lookup in Flink 1.15

Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Chesnay Schepler
From: Chesnay Schepler; Date: Thursday, June 16, 2022 at 10:55 AM; To: Peter Westermann, user@flink.apache.org; Subject: Re: Sporadic issues with savepoint status lookup in Flink 1.15

Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Chesnay Schepler
Are there any log messages from the CompletedOperationCache in the logs? On 16/06/2022 16:54, Chesnay Schepler wrote: There is an expected case where this might happen: if too much time has elapsed since the savepoint was completed (default 5 minutes; controlled by rest.async.store-duration)
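To make the expected case above concrete, here is a minimal sketch of the timing check one could apply to a failed lookup. It assumes you can read the savepoint completion time and the lookup time from your own logs; the function name and the strict "greater than" comparison are illustrative assumptions, not Flink's internal logic.

```python
from datetime import datetime, timedelta

# Default retention quoted in the thread for rest.async.store-duration.
STORE_DURATION = timedelta(minutes=5)


def result_may_have_expired(completed_at, looked_up_at,
                            store_duration=STORE_DURATION):
    """Return True if the lookup happened after the retention window elapsed.

    Hypothetical helper: both timestamps are assumed to come from your own
    logs or client-side bookkeeping.
    """
    return looked_up_at - completed_at > store_duration
```

If this returns False for a failing lookup, the failure happened inside the retention window and is probably not the expected expiry case.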

Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Peter Westermann
There is an expected case where this might happen: if too much time has elapsed since the savepoint was completed (default 5 minutes

Re: Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Chesnay Schepler
There is an expected case where this might happen: if too much time has elapsed since the savepoint was completed (default 5 minutes; controlled by rest.async.store-duration) Did this happen earlier than that? On 16/06/2022 15:53, Peter Westermann wrote: We recently upgraded one of our Flink

Sporadic issues with savepoint status lookup in Flink 1.15

2022-06-16 Thread Peter Westermann
We recently upgraded one of our Flink clusters to version 1.15.0 and are now seeing sporadic issues when stopping a job with a savepoint via the REST API. This happens for /jobs/:jobid/savepoints and /jobs/:jobid/stop: The job finishes with a savepoint but the triggerId returned from the REST API
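For readers unfamiliar with the flow being described, the following is a rough sketch of stopping a job with a savepoint and then polling the status by trigger ID. The base URL, job ID, and target directory are placeholders, the helper names are hypothetical, and treating an unknown trigger ID as HTTP 404 is an assumption. The JSON field names (`request-id`, `status.id`, `operation.location`, `failure-cause`) follow the Flink REST API as I understand it; check them against the documentation for your version.

```python
import json
import time
import urllib.error
import urllib.request


def _parse(raw):
    """Parse a JSON body, falling back to the raw text if it is not JSON."""
    try:
        return json.loads(raw) if raw else {}
    except ValueError:
        return {"raw": raw.decode(errors="replace")}


def http_json(method, url, body=None):
    """Send a JSON request and return (status_code, parsed_body)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, _parse(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, _parse(err.read())


def stop_with_savepoint(base_url, job_id, target_dir, request=http_json,
                        poll_interval=1.0, max_polls=60):
    """Trigger stop-with-savepoint, then poll until the operation completes."""
    status, body = request(
        "POST", f"{base_url}/jobs/{job_id}/stop",
        {"targetDirectory": target_dir, "drain": False})
    if status >= 300:
        raise RuntimeError(f"trigger failed: {status} {body}")
    trigger_id = body["request-id"]

    status_url = f"{base_url}/jobs/{job_id}/savepoints/{trigger_id}"
    for _ in range(max_polls):
        status, body = request("GET", status_url)
        if status == 404:
            # Assumption: an unknown trigger ID surfaces as 404.
            raise LookupError(f"trigger id {trigger_id} not found")
        if status >= 300:
            raise RuntimeError(f"status lookup failed: {status} {body}")
        if body.get("status", {}).get("id") == "COMPLETED":
            operation = body.get("operation", {})
            if "failure-cause" in operation:
                raise RuntimeError(f"savepoint failed: {operation}")
            return operation.get("location")
        time.sleep(poll_interval)
    raise TimeoutError(f"savepoint {trigger_id} still in progress")
```

A call such as `stop_with_savepoint("http://localhost:8081", job_id, "s3://bucket/savepoints")` would return the savepoint location on success. The `LookupError` branch corresponds to the symptom reported here, where the job finishes with a savepoint but the trigger ID can no longer be resolved.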