bmarcott edited a comment on issue #26696: [WIP][SPARK-18886][CORE] Make 
locality wait time be the time since a TSM's available slots were fully utilized
URL: https://github.com/apache/spark/pull/26696#issuecomment-568617329
 
 
   I think I came up with a much better approach 
[here](https://github.com/apache/spark/compare/master...bmarcott:nmarcott-fulfill-slots-2?expand=1).
    It avoids trying to simulate scheduling logic like the previous approach, 
which had a lot of discrepancies as well as high time complexity.
   
   This change makes `TaskSetManager.resourceOffer` return an explicit 
boolean indicating whether it rejected the resource due to delay scheduling.
   
   An `isAllFreeResources` boolean parameter was also added to 
`TaskSchedulerImpl.resourceOffers`. It tells the scheduler that the offers 
represent all free resources, as opposed to a single resource.
   
   Timers are then reset only if no resources were rejected due to delay 
scheduling since the last offer that included all free resources.
   
   Example event sequence:
   - offer 1 resource that was rejected - no timer reset
   - offer all resources with no rejects - timer is reset
   - offer 1 resource, no reject - timer is reset
   - offer 1 resource that was rejected - no timer reset
   - offer 1 resource, no reject - no timer reset because previous offer was rejected
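
   The reset rule and the event sequence above can be sketched as a small 
state machine. This is only an illustrative model, not the code from the 
branch: the class `LocalityTimerModel`, the method `onOffer`, and the 
parameter `rejectedDueToDelay` are hypothetical names, and the initial state 
(no reset until an "all free resources" offer has been seen) is an assumption 
inferred from the description in this comment.

```scala
// Hypothetical model of the reset rule described above; not code from the PR.
final class LocalityTimerModel {
  // True only when no offer has been rejected due to delay scheduling since
  // the last offer that contained all free resources. Starts as false because
  // no "all free resources" offer has been seen yet (assumed initial state).
  private var noRejectsSinceLastFullOffer: Boolean = false

  /**
   * Records one offer and reports whether the locality wait timer is reset.
   *
   * @param isAllFreeResources whether the offer covers all free resources
   * @param rejectedDueToDelay whether a resource in the offer was rejected
   *                           because of delay scheduling
   */
  def onOffer(isAllFreeResources: Boolean, rejectedDueToDelay: Boolean): Boolean = {
    if (isAllFreeResources) {
      // A full offer starts a new window: it is clean only if nothing was rejected.
      noRejectsSinceLastFullOffer = !rejectedDueToDelay
    } else if (rejectedDueToDelay) {
      // A rejected single offer taints the window until the next full offer.
      noRejectsSinceLastFullOffer = false
    }
    noRejectsSinceLastFullOffer
  }
}
```

   Replaying the five events from the sequence against this model gives no 
reset, reset, reset, no reset, no reset. A fresh model that only ever sees 
single-resource offers never reports a reset, even when none of them is 
rejected.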
   
   Here is a breakdown of when resources are offered (this behavior is not 
changed):
   Single executors are offered when:
   - a task finishes
   - a new executor is launched
   
   All free resources are offered when:
   - periodically, every `spark.scheduler.revive.interval` (default 1 second)
   - a task set is submitted
   - a task fails
   - the `speculationScheduler`, which runs on a fixed delay, revives offers 
because there are speculative tasks
   - an executor is lost
    
   One remaining case that isn't handled:
   Before any "all free resources" offer, all free resources are offered one 
by one and none of them are rejected.
   This case should reset the timer, but won't with the current 
implementation.
   
   Any thoughts, or do you know of any other issues with this approach?
