maryannxue opened a new pull request #26633: [SPARK-29994] Add WILDCARD task location
URL: https://github.com/apache/spark/pull/26633
 
 
   ### What changes were proposed in this pull request?
   This PR adds a new WILDCARD task location that can match any host. This 
WILDCARD location can be used together with other regular locations in the list 
of preferred locations to indicate that the task can be assigned to any 
host/executor if none of the preferred locations is available.
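
   The matching rule described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `WILDCARD` marker value, the `can_run_on` function, and the host names are hypothetical and are not part of Spark's scheduler API.

```python
# Hypothetical sketch: names here are illustrative, not Spark's scheduler API.
WILDCARD = "*"  # assumed marker meaning "any host"


def can_run_on(preferred_locations, offered_host, available_hosts):
    """Return True if a task may be launched on offered_host.

    preferred_locations: list of host names, optionally containing WILDCARD.
    available_hosts: set of hosts that currently have free slots.
    """
    # A task with no preferences can run anywhere.
    if not preferred_locations:
        return True

    regular = [loc for loc in preferred_locations if loc != WILDCARD]

    # A regular preferred location always matches.
    if offered_host in regular:
        return True

    # The wildcard only kicks in when none of the regular
    # preferred locations is currently available.
    if WILDCARD in preferred_locations:
        return not any(loc in available_hosts for loc in regular)

    return False
```

   With this sketch, a task preferring `["host1", WILDCARD]` accepts an offer from `host3` only while `host1` has no free slot, whereas a task preferring just `["host1"]` keeps rejecting `host3`.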
   
   ### Why are the changes needed?
   This is motivated by a requirement from LocalShuffledRowRDD. When the 
number of initial mappers of LocalShuffledRowRDD is smaller than the number of 
worker nodes, short-running tasks that all wait on their preferred locations 
can cause serious regressions, even though they could have finished quickly 
on non-preferred locations.
   
   We have a "locality wait time" configuration that allows a task set to 
downgrade its locality requirement after a certain time has passed. However, 
this configuration applies to all task sets in the scheduler, while the 
penalty of a locality miss differs from task to task. Thus, we need a 
finer-grained option that lets individual tasks opt out of locality.
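
   To make the difference concrete, here is a toy model of when a task could start under each mechanism. It is a simplification under stated assumptions, not Spark's scheduling logic: `earliest_launch_time` is a hypothetical helper, non-preferred hosts are assumed to be free at time 0, and the 3-second wait in the usage note is just an example value.

```python
def earliest_launch_time(has_wildcard, preferred_host_free_at, locality_wait):
    """Toy model: time (seconds) at which a task can be launched.

    Assumes some non-preferred host is free at time 0.
    has_wildcard: whether the task lists a wildcard location (hypothetical flag).
    preferred_host_free_at: when the task's preferred host gets a free slot.
    locality_wait: global wait before the task set downgrades locality.
    """
    if preferred_host_free_at <= 0:
        # The preferred host is free right away; locality is satisfied.
        return 0.0
    if has_wildcard:
        # The task opted out of locality: take the free non-preferred host now.
        return 0.0
    # Otherwise the task waits for its preferred host or for the global
    # locality wait to expire, whichever comes first.
    return float(min(preferred_host_free_at, locality_wait))
```

   In this model, with a 3-second wait and a preferred host that is busy for 10 seconds, a regular task starts at 3.0 while a task with a wildcard location starts at 0.0, and no other task set's wait is affected.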
   
   ### Does this PR introduce any user-facing change?
   No.
   
   ### How was this patch tested?
   Added unit tests.
