Hi Dongwon,

Thanks for reporting the issue. I've created a ticket for it [1], and we will analyse it and try to fix it soon. In the meantime, it should be safe for you to ignore this problem. If this failure happens only rarely, you can always retry the stop-with-savepoint command, and there should be no visible side effects for you.
Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-24846

On Tue, 9 Nov 2021 at 03:55, Dongwon Kim <eastcirc...@gmail.com> wrote:

> Hi community,
>
> I failed to stop a job with savepoint with the following message:
>
>> Inconsistent execution state after stopping with savepoint. At least one
>> execution is still in one of the following states: FAILED, CANCELED. A
>> global fail-over is triggered to recover the job
>> 452594f3ec5797f399e07f95c884a44b.
>
> The job manager said
>
>> A savepoint was created at
>> hdfs://mobdata-flink-hdfs/driving-habits/svpts/savepoint-452594-f60305755d0e
>> but the corresponding job 452594f3ec5797f399e07f95c884a44b didn't terminate
>> successfully.
>
> while complaining about
>
>> Mailbox is in state QUIESCED, but is required to be in state OPEN for put
>> operations.
>
> Is it okay to ignore this kind of error?
>
> Please see the attached files for the detailed context.
>
> FYI,
> - I used the latest 1.14.0
> - I started the job with "$FLINK_HOME"/bin/flink run --target yarn-per-job
> - I couldn't reproduce the exception using the same jar, so I might not be
>   able to provide DEBUG messages
>
> Best,
>
> Dongwon
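
For anyone who wants to script the retry suggested in the reply, here is a minimal sketch, not something taken from the thread. It assumes the stop command exits with a non-zero status when the stop fails, and that a fixed pause is enough for the job to recover from the global fail-over before the next attempt. The function name stop_with_savepoint and the parameters attempts, delay_s and run are made up for illustration; the command line in the trailing comment is a placeholder, and any YARN-specific options your setup needs are not shown.

```python
import subprocess
import time


def stop_with_savepoint(cmd, attempts=3, delay_s=10.0, run=subprocess.run):
    """Run cmd until it exits with status 0, at most `attempts` times.

    Returns the 1-based number of the attempt that succeeded and raises
    RuntimeError if every attempt fails. `run` is injectable so the helper
    can be exercised without a Flink installation.
    """
    for attempt in range(1, attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Assumption: a fixed pause gives the job time to restart
            # after the fail-over before we ask it to stop again.
            time.sleep(delay_s)
    raise RuntimeError(f"stop-with-savepoint failed after {attempts} attempts")


# Hypothetical invocation; the savepoint directory and job ID are placeholders:
# stop_with_savepoint(
#     ["flink", "stop", "--savepointPath", "<savepoint-dir>", "<job-id>"]
# )
```

A more careful version would check that the job is RUNNING again before retrying rather than sleeping for a fixed time.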