[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572323#comment-14572323 ]
Saisai Shao commented on SPARK-4352:
------------------------------------

Hi Sandy,

Thanks a lot for pointing this out. Consider this situation: we have 20 tasks with no locality preference and 10 tasks preferring nodes <a, b, d>, with 18 executors at 2 cores per executor. Based on your description, we would request:

5 container requests on <a, b, d>
12 container requests with no locality preference.

But I think the following allocation would be better (see the sketch after the quoted issue below):

15 container requests on <a, b, d>
3 container requests with no locality preference.

Besides, I think the over-allocation case where {{task number <= executor number * cores}} would ideally be avoided by ExecutorAllocationManager: ideally only 15 executors are enough, but currently 3 more executors are requested.

Sorry for any misunderstanding :).

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Saisai Shao
>            Priority: Critical
>         Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
>
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.
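To make the two splits compared in the comment concrete, here is a minimal, self-contained Scala sketch. Everything in it ({{splitRequests}}, {{executorsFor}}, the {{preferAll}} flag) is a hypothetical illustration, not Spark's actual ExecutorAllocationManager API, and its no-preference count differs by one from the thread's numbers, so treat it as showing the shape of the two strategies rather than reproducing them exactly.

{code:scala}
// Hypothetical sketch of the two container-request splits discussed above.
// None of these names come from Spark's real dynamic-allocation code.
object LocalitySplitSketch {

  // Executors needed to run `tasks` at `cores` slots each (ceiling division).
  private def executorsFor(tasks: Int, cores: Int): Int =
    (tasks + cores - 1) / cores

  /**
   * Split a total container request into (locality-preferred, no-preference).
   *
   * preferAll = false: size the preferred request only by the tasks that
   *   actually prefer <a, b, d> (the earlier description in this thread).
   * preferAll = true: place as many of the needed executors as possible on
   *   the preferred nodes, since no-preference tasks can run there too
   *   (this comment's suggestion).
   */
  def splitRequests(
      total: Int,
      localityTasks: Int,
      allTasks: Int,
      cores: Int,
      preferAll: Boolean): (Int, Int) = {
    val preferred =
      if (preferAll) math.min(total, executorsFor(allTasks, cores))
      else math.min(total, executorsFor(localityTasks, cores))
    (preferred, total - preferred)
  }

  def main(args: Array[String]): Unit = {
    // Scenario from the comment: 10 tasks preferring <a, b, d>,
    // 30 tasks total, 2 cores per executor, 18 containers requested.
    println(splitRequests(18, 10, 30, 2, preferAll = false)) // (5,13)
    println(splitRequests(18, 10, 30, 2, preferAll = true))  // (15,3)
  }
}
{code}

The second strategy works because a locality preference on YARN is soft: placing all 15 needed executors on <a, b, d> still lets the no-preference tasks run there, while maximizing the chance that the locality-preferring tasks land on their preferred nodes.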