[
https://issues.apache.org/jira/browse/SPARK-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187950#comment-15187950
]
Apache Spark commented on SPARK-13779:
--------------------------------------
User 'rdblue' has created a pull request for this issue:
https://github.com/apache/spark/pull/11612
> YarnAllocator cancels and resubmits container requests with no locality
> preference
> ----------------------------------------------------------------------------------
>
> Key: SPARK-13779
> URL: https://issues.apache.org/jira/browse/SPARK-13779
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.6.0
> Reporter: Ryan Blue
>
> SPARK-9817 attempts to improve locality by considering the set of pending
> container requests. Pending requests with a locality preference that is no
> longer needed or no locality preference are cancelled, then resubmitted with
> updated locality preferences.
> When running over data in S3, some stages have no locality information so the
> result is that the current logic cancels all pending requests and resubmits
> them (still without locality preferences) on every call to
> [{{updateResourceRequests()}}|https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L273].
> I propose the following update to avoid this problem:
> # Cancel any pending requests with stale locality preferences
> # Calculate N new requests, where N is the number of new containers +
> cancelled stale requests + outstanding requests without locality
> # If the number of new requests with a locality preference was larger than
> the available count (new containers + cancelled stale requests), then cancel
> enough requests with no locality preference to be able to submit all of the
> requests with a locality preference.
> # If the number of new requests with a locality preference is smaller than
> the available count then submit all of the locality requests and a request
> with no locality preference for the remaining available count. No pending
> requests with no locality are cancelled.
> This strategy only cancels requests with no locality preference if a new
> request can be made that has a locality preference. Cancelling stale locality
> requests happens as it does today. I've tested this on large S3 jobs (50,000+
> tasks) and it fixes the request thrashing problem.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]