Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/1313#issuecomment-50253029

Hi @mateiz, thanks for the comments.

If we just add a NO_PREF level, we avoid the unnecessary waiting when we only have no-pref tasks. However, in the following scenario we still need to wait for some time if we only have PROCESS_LOCAL and NO_PREFS tasks.

Suppose we have T1 (PROCESS_LOCAL), T2 (PROCESS_LOCAL), and T3 (NO_PREFS). The valid localities would then be PROCESS_LOCAL, NODE_LOCAL (because a PROCESS_LOCAL task is also NODE_LOCAL), and NO_PREFS. After we have scheduled T1 and T2, we need to wait for 3s to check whether we have any NODE_LOCAL tasks. There are none, so only then do we move on to NO_PREFS and launch T3.

In the previous discussion, we thought that this type of waiting is also unnecessary (at least, it is not there in the current master branch). The current PR ensures that NO_PREFS tasks can only be launched after PROCESS_LOCAL and NODE_LOCAL, and that once these two higher-priority levels are all consumed, we don't wait unnecessarily.

Maybe I missed some comments... I don't think I did any refactoring here?

I'm rebasing the PR and adding the comments.
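To make the T1/T2/T3 scenario concrete, here is a toy Python model of it. This is only a sketch and not Spark's scheduler code: the function `launch_times`, its `skip_empty_levels` flag, and the rule "an empty level costs one full wait" are assumptions made for illustration, and the 3s wait is the value quoted above.

```python
from typing import Dict, List

# Locality levels in priority order, as named in the comment above.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "NO_PREFS"]


def launch_times(tasks: Dict[str, str], wait_s: float,
                 skip_empty_levels: bool) -> Dict[str, float]:
    """Toy model (hypothetical): return the launch time of each task.

    Levels are visited in priority order. Tasks pending at a level are
    launched immediately. A level with no pending tasks costs `wait_s`
    seconds before falling through, unless `skip_empty_levels` is set.
    """
    pending: Dict[str, List[str]] = {level: [] for level in LEVELS}
    for name, level in tasks.items():
        pending[level].append(name)

    now = 0.0
    launched: Dict[str, float] = {}
    for level in LEVELS:
        if pending[level]:
            for name in pending[level]:
                launched[name] = now
        elif not skip_empty_levels:
            # Assumed behaviour: waiting on a level that has nothing to run.
            now += wait_s
    return launched


tasks = {"T1": "PROCESS_LOCAL", "T2": "PROCESS_LOCAL", "T3": "NO_PREFS"}
with_wait = launch_times(tasks, wait_s=3.0, skip_empty_levels=False)
without_wait = launch_times(tasks, wait_s=3.0, skip_empty_levels=True)
print(with_wait["T3"], without_wait["T3"])  # prints: 3.0 0.0
```

In this model T3 is delayed by the 3s NODE_LOCAL wait in the first case and launched right away in the second, which is the difference described above.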