[jira] [Comment Edited] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

Mridul Muralidharan (JIRA) Thu, 09 Jul 2015 01:37:32 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620103#comment-14620103
 ]


Mridul Muralidharan edited comment on SPARK-2089 at 7/9/15 8:36 AM:
--------------------------------------------------------------------

Let us keep the two pieces of functionality separate.
Dynamic allocation is not used for a lot of use cases because of the 
requirements it places on the deployment (none of our workflows use it, for 
example, and I do not see this changing anytime soon), so it would make sense 
to fix this regression, which was introduced in 1.0.


was (Author: mridulm80):
Let us keep the two pieces of functionality separate.
Dynamic allocation is not used for a lot of use cases (none of our workflows 
use it, for example, and I do not see this changing anytime soon), so it would 
make sense to fix this regression, which was introduced in 1.0.

> With YARN, preferredNodeLocalityData isn't honored 
> ---------------------------------------------------
>
>                 Key: SPARK-2089
>                 URL: https://issues.apache.org/jira/browse/SPARK-2089
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.0.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>
> When running in YARN cluster mode, apps can pass preferred locality data when 
> constructing a Spark context that will dictate where to request executor 
> containers.
> This is currently broken because of a race condition.  The Spark-YARN code 
> runs the user class and waits for it to start up a SparkContext.  During its 
> initialization, the SparkContext will create a YarnClusterScheduler, which 
> notifies a monitor in the Spark-YARN code that .  The Spark-YARN code then 
> immediately fetches the preferredNodeLocationData from the SparkContext and 
> uses it to start requesting containers.
> But in the SparkContext constructor that takes the preferredNodeLocationData, 
> setting preferredNodeLocationData comes after the rest of the initialization, 
> so, if the Spark-YARN code comes around quickly enough after being notified, 
> the data that's fetched is the empty unset version.  This occurred during all 
> of my runs.
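
The ordering problem described above can be sketched in a few lines. This is a 
minimal illustration in Python, not Spark's actual code: SketchContext, 
allocator, run_sketch, and the assign_first flag are all hypothetical names, 
and the second event exists only to force the unlucky interleaving so the 
result is repeatable. The constructor signals readiness before it assigns the 
preferred locations, so a watcher that reads immediately sees the empty default.

```python
import threading


class SketchContext:
    """Hypothetical stand-in for a context object; not Spark's SparkContext."""

    def __init__(self, preferred_locations, shared, ready, fetched,
                 assign_first=False):
        self.preferred_locations = {}  # empty, unset default
        if assign_first:
            # One possible ordering fix: assign before anyone is notified.
            self.preferred_locations = preferred_locations
        # "Rest of initialization": publish the context and notify the watcher.
        shared["ctx"] = self
        ready.set()
        # Assumption for the sketch: block until the watcher has read, which
        # models the watcher "coming around quickly enough".
        fetched.wait(timeout=5)
        # In the buggy ordering, the assignment only happens here, too late.
        self.preferred_locations = preferred_locations


def allocator(shared, ready, fetched, seen):
    """Waits for the notification, then immediately fetches the locations."""
    ready.wait(timeout=5)
    seen.append(dict(shared["ctx"].preferred_locations))
    fetched.set()


def run_sketch(assign_first=False):
    """Returns (what the allocator saw, what the context finally holds)."""
    shared, seen = {}, []
    ready, fetched = threading.Event(), threading.Event()
    watcher = threading.Thread(target=allocator,
                               args=(shared, ready, fetched, seen))
    watcher.start()
    ctx = SketchContext({"host1": ["split0"]}, shared, ready, fetched,
                        assign_first)
    watcher.join()
    return seen[0], ctx.preferred_locations
```

Calling run_sketch() returns an empty dict as the allocator's view even though 
the context ends up holding the data; run_sketch(assign_first=True) shows the 
allocator seeing the data once the assignment precedes the notification.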



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
