Hello,
I am analyzing the logs from a Flink batch job and am seeing the following
two lines:
2016-05-30 15:32:31,701 INFO ... - DataSource (at ${path}) (4/4) (7efe8fcfe9c7c7e6cd4683e1b5c06a3a) switched from SCHEDULED to DEPLOYING
2016-05-30 15:32:31,701 INFO ... - DataSource (at ${path}) (4/4) (e54bda3e413816b5d35046468cedbf86) switched from SCHEDULED to DEPLOYING
The two lines are identical except for the task ID. Both tasks later
transitioned to FINISHED.
My understanding is that the (x/y) part of the log reports the task number
/ total number of parallel tasks. I therefore expected the x component and
the task path to functionally determine the task ID, except in situations
where the first task fails.
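
To make the check concrete, here is a minimal Python sketch of how one could
look for such cases in a log. The regular expression is only an assumption
derived from the two lines quoted above (other Flink versions may log this
differently), and the names TRANSITION, ids_per_subtask and
ambiguous_subtasks are my own, not anything from Flink.

```python
import re
from collections import defaultdict

# Assumed format, taken only from the two quoted log lines:
#   ... - <task name> (x/y) (<32 hex chars>) switched from <OLD> to <NEW>
TRANSITION = re.compile(
    r"- (?P<name>.+?) \((?P<index>\d+)/(?P<total>\d+)\) "
    r"\((?P<task_id>[0-9a-f]{32})\) "
    r"switched from (?P<old>\w+) to (?P<new>\w+)"
)


def ids_per_subtask(lines):
    """Map (task name, x, y) to the set of task IDs logged for it."""
    seen = defaultdict(set)
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            key = (match["name"], int(match["index"]), int(match["total"]))
            seen[key].add(match["task_id"])
    return dict(seen)


def ambiguous_subtasks(lines):
    """Keep only the (task name, x, y) keys that appear with more than one ID."""
    return {key: ids for key, ids in ids_per_subtask(lines).items() if len(ids) > 1}


# The two lines from this mail, each joined onto a single line.
SAMPLE = [
    "2016-05-30 15:32:31,701 INFO ... - DataSource (at ${path}) (4/4) "
    "(7efe8fcfe9c7c7e6cd4683e1b5c06a3a) switched from SCHEDULED to DEPLOYING",
    "2016-05-30 15:32:31,701 INFO ... - DataSource (at ${path}) (4/4) "
    "(e54bda3e413816b5d35046468cedbf86) switched from SCHEDULED to DEPLOYING",
]

if __name__ == "__main__":
    for key, ids in ambiguous_subtasks(SAMPLE).items():
        print(key, sorted(ids))
```

If my expectation held, ambiguous_subtasks would return an empty dict for a
run without failures; for the two lines above it reports the (4/4)
DataSource entry with both IDs.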
Can somebody shed some light on the execution semantics of the scheduler
that would explain this behavior?
Cheers,
Alex