[GitHub] [flink] xintongsong edited a comment on pull request #13464: [FLINK-19307][coordination] Add ResourceTracker

2020-09-25 Thread GitBox


xintongsong edited a comment on pull request #13464:
URL: https://github.com/apache/flink/pull/13464#issuecomment-698301073







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong edited a comment on pull request #13464: [FLINK-19307][coordination] Add ResourceTracker

2020-09-24 Thread GitBox


xintongsong edited a comment on pull request #13464:
URL: https://github.com/apache/flink/pull/13464#issuecomment-698685698


   Sorry for the typo. Just corrected.
   
   Having heuristics for triggering re-assignment on both the JM and RM sides 
sounds promising to me. Just to add another idea, we may also consider exact 
matching between requirement and resource profiles that are not `UNKNOWN`. We 
can keep the discussion on the pros and cons of the two approaches, and on 
potential other ideas, open.
   
   I think this issue should not block this PR. In any case, we do not have 
different profiles at the moment. I was just trying to better understand the 
status and limitations of the current implementation.







[GitHub] [flink] xintongsong edited a comment on pull request #13464: [FLINK-19307][coordination] Add ResourceTracker

2020-09-24 Thread GitBox


xintongsong edited a comment on pull request #13464:
URL: https://github.com/apache/flink/pull/13464#issuecomment-698301073


   I can see how the mapping simplifies things. My concern is whether this 
simplification hurts not only optimality but also correctness. I am not 
entirely sure about this, so I'll try to explain my concern with an example.
   
   * 2 requirement profiles: A & B
   * 2 slot profiles: X & Y
   * A can only be fulfilled by X
   * B can be fulfilled by X or Y
   * Resource-requirement mapping status
     * A: 1 -> X: 1
     * B: 2 -> X: 1, Y: 1
     * excess -> Y: 1
   
   Now a slot of profile X is lost. Since neither A nor B has more resources 
than it requires, either of them might be deducted.
   
   If A is deducted, the excess Y cannot be used, and we would need to request 
a new resource for A.
   * A: 1 -> X: 0
   * B: 2 -> X: 1, Y: 1
   * excess -> Y: 1
   
   If B is deducted, then the excess Y can be used, and we do not need to 
allocate new resources.
   * A: 1 -> X: 1
   * B: 2 -> Y: 2
   * excess -> none
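
   To make the two outcomes above concrete, here is a minimal standalone 
sketch in plain Java. It does not use any Flink classes. The class and method 
names (`DeductionSketch`, `canFulfill`, `newSlotsNeeded`) are hypothetical, 
and it assumes the tracker only keeps per-requirement counts and may reuse an 
excess slot whenever the profiles are compatible.

```java
import java.util.HashMap;
import java.util.Map;

public class DeductionSketch {

    // Compatibility rule taken from the example: A needs X, B accepts X or Y.
    static boolean canFulfill(String requirement, String slotProfile) {
        return requirement.equals("A")
                ? slotProfile.equals("X")
                : slotProfile.equals("X") || slotProfile.equals("Y");
    }

    // Loses one slot of profile X, deducts it from the given requirement,
    // and returns how many new slots would still have to be requested.
    static int newSlotsNeeded(String deductedRequirement) {
        Map<String, Integer> required = new HashMap<>();
        required.put("A", 1);
        required.put("B", 2);

        // Initial state: both requirements are fully satisfied, one excess Y.
        Map<String, Integer> assigned = new HashMap<>(required);
        Map<String, Integer> excess = new HashMap<>();
        excess.put("Y", 1);

        // The lost X slot is booked against the chosen requirement.
        assigned.merge(deductedRequirement, -1, Integer::sum);

        int toRequest = 0;
        for (String requirement : required.keySet()) {
            int shortfall = required.get(requirement) - assigned.get(requirement);
            // Try to cover the shortfall with compatible excess slots first.
            for (Map.Entry<String, Integer> slot : excess.entrySet()) {
                if (canFulfill(requirement, slot.getKey())) {
                    int reused = Math.min(shortfall, slot.getValue());
                    slot.setValue(slot.getValue() - reused);
                    shortfall -= reused;
                }
            }
            toRequest += shortfall;
        }
        return toRequest;
    }

    public static void main(String[] args) {
        System.out.println("deduct A -> new slots needed: " + newSlotsNeeded("A")); // 1
        System.out.println("deduct B -> new slots needed: " + newSlotsNeeded("B")); // 0
    }
}
```

   The same lost slot leads to a different number of new allocations depending 
only on which requirement is deducted, which is the ambiguity I am worried 
about.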
   
   Assume all tasks are in the running state. If a slot assigned to 
requirement A is lost and RM deducts B, then RM will not assign a new slot to 
the job, and JM cannot deploy the tasks from the lost slot to the excess slot 
Y. Either the tasks cannot recover, or JM will have to stop some tasks in a 
slot X and move them to the excess Y. On the other hand, if a slot assigned to 
requirement B is lost and RM deducts A, then JM will have no problem recovering 
the failed tasks in slot Y, but RM will still allocate and assign a new slot to 
the job. Even if the job returns the unneeded slot, RM may keep trying to 
allocate a new slot for the job, because it sees that the acquired resources 
for this job do not match the required resources.


