bmarcott commented on issue #26696: [WIP][SPARK-18886][CORE] Make locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-571475603

> I think it would be useful to add in a high level description of what exactly the approach is trying to do

I agree; we should all have a good idea of what the goal is here. I understand the high-level goal to be to only let delay scheduling **delay** the taskset for the period determined by the locality wait. Delaying here means not scheduling a task when one otherwise would have been scheduled, were it not for delay scheduling (indicated by the new boolean returned by the TSM). We know there has been no **delay** when there have been no rejects since, and including, an "all resource offer". I am now refraining from using the term "all resources utilized", since there are other reasons resources may not be fully utilized, such as blacklisting, which caused problems with my previous approach.

> Please correct me if I'm wrong, but I think essentially at a high level you are proposing to only reset the timer whenever all the currently free resources are scheduled on. So if you start out with 4 executors, you don't reset the timer until all of them have been scheduled. At that point you reset the timer.

It is not strictly required that all 4 executors be utilized. The new approach directly measures the effects of delay scheduling: as long as there are no rejects due to delay scheduling, the timer is reset.

> Which means that when a running task finishes and you go to start a new one or a new executor comes in, it has to wait the timeout.

The timer is per TSM, not per task, so a single task doesn't necessarily have to wait when another one completes. The same applies to executors: as long as the TSM is not rejecting resource offers due to delay scheduling, the timer is reset.
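As a sketch of the boolean described above, here is a toy Python model (my reading of the idea, not Spark's actual `TaskSetManager` API; `Task`, `resource_offer`, and `local_only` are invented names). The key point it illustrates: the TSM reports a reject as a *delay* only when it refused a usable offer while waiting for locality, so refusing because nothing is pending does not count as a delay:

```python
from dataclasses import dataclass

@dataclass
class Task:
    preferred_hosts: frozenset  # hosts where this task's input data is local

def resource_offer(pending, host, local_only):
    """Try to schedule one of `pending` on an offer from `host`.
    Returns (launched_task_or_None, rejected_due_to_delay_scheduling)."""
    for i, task in enumerate(pending):
        if host in task.preferred_hosts:
            return pending.pop(i), False   # local task launched: no delay
    if pending and local_only:
        return None, True                  # waiting for locality: a delay reject
    if pending:
        return pending.pop(0), False       # allowed to run non-local: no delay
    return None, False                     # nothing left to schedule: NOT a delay
```

Only the second case should count against the locality-wait timer; the last case is why blacklisting or an idle TSM does not look like a delay under this approach.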
> And for the fair scheduler case, the way you are passing the flags in handles it resetting the timer if it has used its fair share?

New flags are not passed in; new flags are *returned* by the TSM, saying whether it rejected an offer due to delay scheduling. Fair and FIFO scheduling are handled the same way in the new approach: the scheduler determines which resources are offered to which TSMs, and the TSMs return whether they were "delayed."

I believe another interesting bug exists in the master code (where the timer is reset on task launch): suppose that, because of FAIR scheduling (or any other reason), a TSM cannot launch tasks for the locality wait period. Currently that TSM would be penalized and would increase its locality level. With the new approach I linked, there would be an all-resource offer in which that TSM is offered 0 resources; but it also wouldn't reject any resources due to delay scheduling, so its timer would be reset.

> > Also just to clarify a few things:
> >
> > > offer 1 resource, no reject - timer is reset
> >
> > so the assumption here is that we are out of all resources because we either had 1 task finish or we added a new executor, and if nothing was rejected we scheduled that fully and we were fully scheduled before this offer? This is why you have the last line: "offer 1 resource, no reject - no timer reset because previous offer was rejected" -> since the previous offer was rejected there are still free resources, so we don't reset.

Whatever assumptions or circumstances caused the case, the fact is that the TSM is not being delayed by delay scheduling in that case, so the timer should be reset. When an "all resource offer" has no rejects, we know there is no delay, so we can reset the timer. For single offers, we only know there has been no delay if there hasn't been any delay since the last all-resource offer.

> So I assume by this you mean the startup case, but I'm not sure that is true.
> You get an "all free resource" case when you first submitTasks. I think there are 2 cases - static allocation and dynamic allocation. Generally with static you will get your executors before you start any application code, so it won't matter if it makes offers before that. With dynamic allocation generally you won't have any executors, so this perhaps is the case: on submitTasks you offer all, but there are no offers because there are no executors yet. Which case are you referring to?

No, I believe it doesn't matter to the approach whether executors are statically or dynamically allocated. The case I am referring to is this: imagine you have 2 resources and an "all resource offer" happens every second. When TSM1 is submitted, it also gets an "all resource offer"; assume it rejects both resources, causing a preexisting TSM2 to utilize them. Assume those 2 tasks then finish, and the freed resources are offered one by one to TSM1, which accepts both, all within 1 second (before any "all resource offer"). This should reset the timer, but it won't in the implementation.
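To make the reset rule concrete, here is a toy Python model (my reading of the proposal as described in this thread, not Spark code; `count_resets` and the event encoding are invented for illustration). It replays offer events and counts timer resets, covering both the FAIR-starvation case discussed earlier (a TSM offered zero resources rejects nothing, so an all-resource offer still resets its timer) and the two-resource corner case just described (no reset even though TSM1 accepts every single offer):

```python
def count_resets(events):
    # Reset rule as described in this thread:
    #  - an "all resource offer" with no delay-scheduling rejects resets the timer;
    #  - a single offer accepted with no rejects since the last all-resource
    #    offer also resets it.
    # Each event is (kind, rejected_due_to_delay), kind in {"all", "single"}.
    rejected_since_all = False
    resets = 0
    for kind, rejected in events:
        if kind == "all":
            rejected_since_all = rejected
            if not rejected:
                resets += 1
        else:  # single offer
            if rejected:
                rejected_since_all = True
            elif not rejected_since_all:
                resets += 1
    return resets

# FAIR-starved TSM: offered 0 resources each round, so it rejects nothing;
# its timer is reset on every all-resource offer.
starved = [("all", False)] * 5

# Two-resource corner case: TSM1 rejects the all-resource offer (TSM2 takes
# both resources), then accepts both freed resources one by one before the
# next all-resource offer -- no reset, though arguably there should be one.
corner = [("all", True), ("single", False), ("single", False)]
```

`count_resets(starved)` is 5, while `count_resets(corner)` is 0, which is exactly the gap described in the last paragraph above.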