[ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095247#comment-14095247 ]

Mridul Muralidharan commented on SPARK-2089:
--------------------------------------------

Since I am not maintaining the code anymore, I don't have a strong preference 
either way.
I am also not sure what the format means, by the way - I see multiple nodes 
and racks mentioned in the same group ...

In general though, I am not convinced it is a good direction to take.
1) It is a workaround for a design issue and has non-trivial performance 
implications (serializing into this form only to immediately deserialize it is 
expensive for large inputs; not to mention, it gets shipped to the executors 
for no reason).
2) It locks us into a format which provides inadequate information - the 
number of blocks per node, the size of each block, etc. is lost (or maybe I 
just did not understand what the format is!).
3) We are currently investigating evolving in the opposite direction - adding 
more information so that we can be more specific about where to allocate 
executors.
For example, I can see a fairly near-term need to associate executors with 
accelerator cards (and to break the implicit OFF_HEAP -> tachyon assumption).
A string representation makes this fragile to evolve; a rough sketch of a 
structured alternative follows below.
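
To make the concern in 2) and 3) concrete, here is a purely hypothetical 
sketch (not anything that exists in Spark today) of the kind of structured 
preference I would rather see than a flat string - it keeps per-node block 
counts and sizes, and new fields such as accelerator requirements can be added 
without breaking any parser:

    // Hypothetical sketch only, not an existing Spark API.
    case class BlockLocality(
      host: String,        // node holding the blocks
      rack: String,        // rack of that node
      numBlocks: Int,      // number of input blocks on this host
      totalBytes: Long,    // total size of those blocks
      replicas: Int)       // replicas available on other hosts

    case class ExecutorPreference(
      localities: Seq[BlockLocality],
      accelerators: Seq[String] = Nil)  // e.g. a future GPU/accelerator tag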

As I mentioned before, the current YARN allocation model in Spark is a very 
naive implementation - which I did not expect to survive this long: it was 
taken directly from our prototype.
We really should be modifying it to consider the cost of data transfer and 
prioritize allocation that way (number of blocks on a node/rack, size of the 
blocks, number of replicas available, etc.); see the sketch below this 
paragraph.
For small datasets on small enough clusters this is not relevant, but it has 
implications as we grow along both axes.
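
Again purely as an illustration (the names and weighting are made up, reusing 
the hypothetical BlockLocality sketched above, and this is not existing Spark 
code), the kind of prioritization I mean is roughly:

    // Rank candidate hosts by how costly it would be NOT to allocate there.
    def allocationOrder(prefs: Seq[BlockLocality]): Seq[String] =
      prefs
        .groupBy(_.host)
        .map { case (host, ls) =>
          // more local bytes and fewer replicas elsewhere => higher priority
          val cost = ls.map(l => l.totalBytes.toDouble / math.max(1, l.replicas)).sum
          (host, cost)
        }
        .toSeq
        .sortBy { case (_, cost) => -cost }   // most expensive-to-move data first
        .map { case (host, _) => host }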

> With YARN, preferredNodeLocalityData isn't honored 
> ---------------------------------------------------
>
>                 Key: SPARK-2089
>                 URL: https://issues.apache.org/jira/browse/SPARK-2089
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.0.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>
> When running in YARN cluster mode, apps can pass preferred locality data when 
> constructing a Spark context that will dictate where to request executor 
> containers.
> This is currently broken because of a race condition.  The Spark-YARN code 
> runs the user class and waits for it to start up a SparkContext.  During its 
> initialization, the SparkContext will create a YarnClusterScheduler, which 
> notifies a monitor in the Spark-YARN code that the SparkContext has started 
> up.  The Spark-YARN code then 
> immediately fetches the preferredNodeLocationData from the SparkContext and 
> uses it to start requesting containers.
> But in the SparkContext constructor that takes the preferredNodeLocationData, 
> setting preferredNodeLocationData comes after the rest of the initialization, 
> so if the Spark-YARN code comes around quickly enough after being notified, 
> the data that's fetched is the empty, unset version.  This occurred during 
> all of my runs.
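
Schematically (hypothetical names, not the real SparkContext code), the 
ordering problem described above amounts to something like:

    // Schematic only: the field a notified reader wants is assigned *after*
    // the initialization step that wakes that reader up.
    class SparkContextLike(preferred: Map[String, Set[String]]) {
      @volatile var preferredNodeLocationData: Map[String, Set[String]] = Map.empty

      createScheduler()                       // signals the YARN-side monitor
      preferredNodeLocationData = preferred   // ...but only populated afterwards,
                                              // so a fast reader still sees Map.empty

      private def createScheduler(): Unit = { /* scheduler creation + notify */ }
    }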


