Hi Diogo,

The idea is that a savepoint operation can also fail. Because the operation
is asynchronous, the status only denotes whether it is still in progress or
has completed. A savepoint operation counts as completed whether it
succeeded or failed. In the failure case, the failure cause should tell
you what went wrong with the operation. Does this make sense?
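
For your script, a rough sketch of how the response could be interpreted
might look like the following. It relies only on the "status" / "id" and
"operation" / "failure-cause" fields from the response you posted. The
function name and the three result strings are made up for illustration,
and treating every status other than COMPLETED as still in progress is an
assumption on my side.

```python
from typing import Any, Dict


def classify_savepoint_response(response: Dict[str, Any]) -> str:
    """Map a savepoint status response to IN_PROGRESS, FAILED or SUCCEEDED.

    Hypothetical helper: the result strings are illustrative, not part of
    any API.
    """
    status_id = (response.get("status") or {}).get("id")
    if status_id != "COMPLETED":
        # Assumption: anything other than COMPLETED means "keep polling".
        return "IN_PROGRESS"

    # COMPLETED only says the asynchronous operation has finished.
    # Whether it succeeded depends on the presence of a failure cause.
    operation = response.get("operation") or {}
    if operation.get("failure-cause") is not None:
        return "FAILED"
    return "SUCCEEDED"


# Response shaped like the one from the original mail (stack trace omitted).
failed_response = {
    "status": {"id": "COMPLETED"},
    "operation": {
        "failure-cause": {
            "class": "java.util.concurrent.CompletionException",
        }
    },
}
print(classify_savepoint_response(failed_response))
```

With a check like this, the retry loop would stop polling once the status is
COMPLETED and then fail the rolling update if a failure cause is present.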

Cheers,
Till

On Mon, May 17, 2021 at 7:00 AM Diogo Santos <diogodssan...@gmail.com>
wrote:

> Hi guys,
>
> We developed some scripts to improve the rolling updates in our pipelines.
> One of the tasks they perform is to trigger a savepoint and wait for the
> response until the status is Completed or until the retry limit is
> reached.
>
> We noticed that the response sometimes has the status Completed even
> though the request failed:
>
> {
>     "status": {
>         "id": "COMPLETED"
>     },
>     "operation": {
>         "failure-cause": {
>             "class": "java.util.concurrent.CompletionException",
>             "stack-trace": "java.util.concurrent.CompletionException: ....
> )\n\t... 47 more\n",
>             "serialized-throwable": "..."
>         }
>     }
> }
>
> An easy way to reproduce the issue is to put the job in a restart loop and
> trigger a savepoint.
>
> Shouldn't the status be in-progress in this case?
>
