bmarcott commented on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-571475603

> I think it would be useful to add in a high level description of what exactly the approach is trying to do

I agree; we should all have a good idea of what the goal is here. I understand the high-level goal to be to only let delay scheduling **delay** the taskset for the period determined by the locality wait. Delaying here means not scheduling a task when one otherwise would have been scheduled, were it not for delay scheduling (indicated by the new boolean returned by the TSM). We know there has been no **delay** when there have been no rejects since, and including, an "all resource offer". I am now refraining from using the term "all resources utilized", since there are other reasons resources may not be fully utilized, such as blacklisting, which caused problems with my previous approach.

> Please correct me if I'm wrong, but I think essentially at a high level you are proposing to only reset the timer whenever all the currently free resources are scheduled on. So if you start out with 4 executors, you don't reset the timer until all of them have been scheduled. At that point you reset the timer.

It is not strictly required that all 4 executors be utilized. The new approach directly measures the effects of delay scheduling: as long as there are no rejects due to delay scheduling, the timer is reset.

> Which means that when a running task finishes and you go to start a new one or a new executor comes in, it has to wait the timeout.

The timer is per TSM, not per task, so a single task doesn't necessarily have to wait when another one completes. The same applies to executors: as long as the TSM is not rejecting resource offers due to delay scheduling, the timer is reset.
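As a sketch of the boolean described above, here is a toy Python model (my reading of the idea, not Spark's actual `TaskSetManager` API; `Task`, `resource_offer`, and `local_only` are invented names). The key point it illustrates: the TSM reports a reject as a *delay* only when it refused a usable offer while waiting for locality, so refusing because nothing is pending does not count as a delay:

```python
from dataclasses import dataclass

@dataclass
class Task:
    preferred_hosts: frozenset  # hosts where this task's input data is local

def resource_offer(pending, host, local_only):
    """Try to schedule one of `pending` on an offer from `host`.
    Returns (launched_task_or_None, rejected_due_to_delay_scheduling)."""
    for i, task in enumerate(pending):
        if host in task.preferred_hosts:
            return pending.pop(i), False   # local task launched: no delay
    if pending and local_only:
        return None, True                  # waiting for locality: a delay reject
    if pending:
        return pending.pop(0), False       # allowed to run non-local: no delay
    return None, False                     # nothing left to schedule: NOT a delay
```

Only the second case should count against the locality-wait timer; the last case is why blacklisting or an idle TSM does not look like a delay under this approach.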
> And for the fair scheduler case, the way you are passing the flags in handles it resetting the timer if it has used its fair share?

New flags are not passed in; new flags are *returned* by the TSM, saying whether it rejected an offer due to delay scheduling. Fair and FIFO scheduling are handled the same way in the new approach: the scheduler determines which resources are offered to which TSMs, and the TSMs return whether they were "delayed."

I believe another interesting bug exists in the master code (where the timer is reset on task launch): suppose that, because of FAIR scheduling (or any other reason), a TSM cannot launch tasks for the locality wait period. Currently that TSM would be penalized and would increase its locality level. With the new approach I linked, there would be an all-resource offer in which that TSM is offered 0 resources; but it also wouldn't reject any resources due to delay scheduling, so its timer would be reset.

> > Also just to clarify a few things:
> >
> > > offer 1 resource, no reject - timer is reset
> >
> > so the assumption here is that we are out of all resources because we either had 1 task finish or we added a new executor, and if nothing was rejected we scheduled that fully and we were fully scheduled before this offer? This is why you have the last line: "offer 1 resource, no reject - no timer reset because previous offer was rejected" -> since the previous offer was rejected there are still free resources, so we don't reset.

Whatever assumptions or circumstances caused the case, the fact is that the TSM is not being delayed by delay scheduling in that case, so the timer should be reset. When an "all resource offer" has no rejects, we know there is no delay, so we can reset the timer. For single offers, we only know there has been no delay if there hasn't been any delay since the last all-resource offer.

> So I assume by this you mean the startup case, but I'm not sure that is true.
> You get an "all free resource" case when you first submitTasks. I think there are 2 cases - static allocation and dynamic allocation. Generally with static you will get your executors before you start any application code, so it won't matter if it makes offers before that. With dynamic allocation generally you won't have any executors, so this perhaps is the case: on submitTasks you offer all, but there are no offers because there are no executors yet. Which case are you referring to?

No, I believe it doesn't matter to the approach whether executors are statically or dynamically allocated. The case I am referring to is this: imagine you have 2 resources and an "all resource offer" happens every second. When TSM1 is submitted, it also gets an "all resource offer"; assume it rejects both resources, causing a preexisting TSM2 to utilize them. Assume those 2 tasks then finish, and the freed resources are offered one by one to TSM1, which accepts both, all within 1 second (before any "all resource offer"). This should reset the timer, but it won't in the implementation.
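To make the reset rule concrete, here is a toy Python model (my reading of the proposal as described in this thread, not Spark code; `count_resets` and the event encoding are invented for illustration). It replays offer events and counts timer resets, covering both the FAIR-starvation case discussed earlier (a TSM offered zero resources rejects nothing, so an all-resource offer still resets its timer) and the two-resource corner case just described (no reset even though TSM1 accepts every single offer):

```python
def count_resets(events):
    # Reset rule as described in this thread:
    #  - an "all resource offer" with no delay-scheduling rejects resets the timer;
    #  - a single offer accepted with no rejects since the last all-resource
    #    offer also resets it.
    # Each event is (kind, rejected_due_to_delay), kind in {"all", "single"}.
    rejected_since_all = False
    resets = 0
    for kind, rejected in events:
        if kind == "all":
            rejected_since_all = rejected
            if not rejected:
                resets += 1
        else:  # single offer
            if rejected:
                rejected_since_all = True
            elif not rejected_since_all:
                resets += 1
    return resets

# FAIR-starved TSM: offered 0 resources each round, so it rejects nothing;
# its timer is reset on every all-resource offer.
starved = [("all", False)] * 5

# Two-resource corner case: TSM1 rejects the all-resource offer (TSM2 takes
# both resources), then accepts both freed resources one by one before the
# next all-resource offer -- no reset, though arguably there should be one.
corner = [("all", True), ("single", False), ("single", False)]
```

`count_resets(starved)` is 5, while `count_resets(corner)` is 0, which is exactly the gap described in the last paragraph above.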