Hi everyone,

I'd like to propose a change to the Web UI: replace the Attempt column
with an Attempt Number column on the subtask list page.

From the very beginning, the attempt number shown has been calculated in
the frontend as subtask.attempt + 1, which means the attempt number shown on
the web UI differs from the one used in the runtime, the logs and the
metrics. Users may get confused since they can't find logs or metrics
of the subtask with the same attempt number.
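
To make the mismatch concrete, here is a minimal sketch of the two display
rules. It is not the actual Web UI code: the class and function names are
hypothetical, and it assumes the runtime numbers the first attempt as 0.

```python
from dataclasses import dataclass


@dataclass
class SubtaskAttempt:
    """Hypothetical stand-in for one row of the subtask list."""
    subtask_index: int
    attempt: int  # attempt number as used by the runtime, logs and metrics (assumed to start at 0)


def displayed_attempt_current(row: SubtaskAttempt) -> int:
    # Current behavior described above: the frontend shows subtask.attempt + 1.
    return row.attempt + 1


def displayed_attempt_proposed(row: SubtaskAttempt) -> int:
    # Proposed behavior: show the runtime value unchanged.
    return row.attempt


row = SubtaskAttempt(subtask_index=3, attempt=0)
print(displayed_attempt_current(row), displayed_attempt_proposed(row))
```

With the current rule, a user who reads the number off the UI and searches the
logs for it would be looking for an attempt one higher than the one shown in
the runtime.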

Fortunately, for now users don't need to care about the attempt number,
since there can be only one attempt of each subtask at a time. However, the
confusion seems inevitable once speculative execution [1] or the attempt
history is introduced, since multiple attempts of the same subtask can then
be executed or presented at the same time.

I suggest that the attempt number shown on the web UI be changed to
align with the one on the runtime side, which is used in logging and metrics
reporting. To avoid confusion, the column should also be renamed to
"Attempt Number". The changes should only affect the Web UI. No REST API
needs to change. What do you think?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job

Best,
Gen
