Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/18874
  
    So I think the issue with the locality wait is that the scheduler resets 
the timer (the 3s wait) whenever it schedules any task at the current locality 
level (in this case node local) on any node. So it can take a lot longer than 
3 seconds to fall back to rack local for any specific task, and if no tasks 
are node local on a given node, that node can wait a long time before falling 
back.
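    To make this concrete, here is a minimal sketch (my own simplified model, 
not the actual TaskSetManager code) of a single per-level timer that resets on 
every launch at that level, which is why a specific task can end up waiting 
far longer than the configured 3s:

        object DelaySchedulingModel {
          // Assumed default of spark.locality.wait (3s).
          val localityWaitMs = 3000L

          // Shared timer for the NODE_LOCAL level: reset whenever *any*
          // task launches node local, not tracked per pending task.
          var lastNodeLocalLaunch: Long = System.currentTimeMillis()

          def recordNodeLocalLaunch(): Unit = {
            lastNodeLocalLaunch = System.currentTimeMillis()
          }

          // Fallback to RACK_LOCAL is only allowed once the shared timer
          // expires, no matter how long an individual task has waited.
          def canFallBackToRackLocal(now: Long): Boolean =
            now - lastNodeLocalLaunch > localityWaitMs
        }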
    I think it would be better to track this on a per-task basis. I don't see 
a reason for a task to wait 60+ seconds, skipping over rack local nodes; 
locality doesn't matter that much for the majority of applications, and you 
are just wasting time before starting.
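    For contrast, a hypothetical per-task variant (the names here are mine, 
just to illustrate the idea) would key the wait off the time each task first 
became schedulable, so one busy node can't keep resetting the clock for every 
pending task:

        import scala.collection.mutable

        object PerTaskDelayModel {
          val localityWaitMs = 3000L
          // taskId -> time the task first became schedulable
          val firstSchedulableAt = mutable.Map.empty[Int, Long]

          def canFallBackToRackLocal(taskId: Int, now: Long): Boolean =
            now - firstSchedulableAt.getOrElse(taskId, now) > localityWaitMs
        }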
    I still need to look at the scheduler logic more to confirm this, but 
either way I think this change is good to have. I'm going to file a separate 
JIRA for that shortly.

