xintongsong edited a comment on pull request #13464:
URL: https://github.com/apache/flink/pull/13464#issuecomment-698301073
I can see how the mapping simplifies things. My concern is whether this
simplification hurts not only optimality but also correctness. I am not entirely
sure about this, so I'll try to explain my concern with an example.
* 2 requirement profiles: A & B
* 2 slot profiles: X & Y
* A can only be fulfilled by X
* B can be fulfilled by X or Y
* Resource-requirement mapping status
  * A: 1 -> X: 1
  * B: 2 -> X: 1, Y: 1
  * excess -> Y: 1
Now a slot of profile X is lost. Since neither A nor B has more resources
than it requires, either of them might be deducted.
If A is deducted, the excess Y cannot be used, and we would need to request
a new resource for A.
* A: 1 -> X: 0
* B: 2 -> X: 1, Y: 1
* excess -> Y: 1
If B is deducted, then the excess Y can be used, and we do not need to
allocate new resources.
* A: 1 -> X: 1
* B: 2 -> Y: 2
* excess -> none
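The two outcomes above can be reproduced with a small sketch. This is only a toy model of the bookkeeping in the example, not Flink's actual slot-tracking code: the names `CAN_FULFIL`, `REQUIRED`, `deduct` and `refill_from_excess` are hypothetical, and the greedy refill is an assumption made for illustration.

```python
from copy import deepcopy

# Which slot profiles can fulfil each requirement profile (from the example).
CAN_FULFIL = {"A": {"X"}, "B": {"X", "Y"}}
# Number of slots each requirement profile needs (from the example).
REQUIRED = {"A": 1, "B": 2}


def deduct(mapping, requirement, slot_profile):
    """Return a copy of the mapping with one lost slot deducted from `requirement`."""
    mapping = deepcopy(mapping)
    if mapping[requirement].get(slot_profile, 0) <= 0:
        raise ValueError(f"{requirement} holds no slot of profile {slot_profile}")
    mapping[requirement][slot_profile] -= 1
    return mapping


def refill_from_excess(mapping, excess):
    """Greedily move excess slots to under-fulfilled requirements.

    Returns (mapping, excess, missing), where `missing` is the number of
    slots that would still have to be newly requested.
    """
    mapping, excess = deepcopy(mapping), dict(excess)
    missing = 0
    for req, needed in REQUIRED.items():
        have = sum(mapping[req].values())
        while have < needed:
            usable = next(
                (p for p in sorted(CAN_FULFIL[req]) if excess.get(p, 0) > 0), None
            )
            if usable is None:
                # No matching excess slot: the remainder must be newly allocated.
                missing += needed - have
                break
            excess[usable] -= 1
            mapping[req][usable] = mapping[req].get(usable, 0) + 1
            have += 1
    return mapping, excess, missing


# Initial status: A: 1 -> X: 1, B: 2 -> X: 1, Y: 1, excess -> Y: 1
initial = {"A": {"X": 1}, "B": {"X": 1, "Y": 1}}
initial_excess = {"Y": 1}

# A slot of profile X is lost; compare deducting it from A versus from B.
map_a, excess_a, missing_if_a = refill_from_excess(deduct(initial, "A", "X"), initial_excess)
map_b, excess_b, missing_if_b = refill_from_excess(deduct(initial, "B", "X"), initial_excess)

print("deduct A:", map_a, "excess:", excess_a, "new slots needed:", missing_if_a)
print("deduct B:", map_b, "excess:", excess_b, "new slots needed:", missing_if_b)
```

In this model, deducting from A leaves the excess Y unusable and one slot to be requested, while deducting from B absorbs the excess Y and needs no new slot.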
Assume all tasks are in the running state. If a slot assigned to requirement A
is lost and RM deducts B, then RM will not assign a new slot to the job, and JM
cannot deploy the tasks from the lost slot to the excess slot Y. Either the tasks
cannot recover, or JM will have to stop some tasks in a slot X and move them
to the excess Y. On the other hand, if a slot assigned to requirement B is lost
and RM deducts A, then JM will have no problem recovering the failed tasks in
slot Y, but RM still allocates and assigns a new slot to the job. Even if the
job returns the unneeded slot, RM may keep trying to allocate new slots for the
job, because it sees that the acquired resources for this job do not match
the required resources.