[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567016#comment-14567016 ]
Saisai Shao commented on SPARK-4352:
------------------------------------

Hi [~sandyr], thanks a lot for your suggestion.

IIUC, the algorithm you describe tries to make the executor requests proportional to the node preferences. Say the desired tasks on the cluster are in the ratio 3 : 3 : 2 : 1; you then try to allocate executors following that ratio. But I'm curious about how the algorithm behaves in the 7 and 18 situations. What you describe is:

requests for 5 executors with nodes = <a, b, c, d>
requests for 2 executors with nodes = <a, b, c>

That is 7 : 7 : 7 : 5. Would it be better like this:

requests for 2 executors with nodes = <a, b, c, d>
requests for 2 executors with nodes = <a, b, c>
requests for 3 executors with nodes = <a, b>

Here it is 7 : 7 : 4 : 2.

Also, for the 18 situation, why not:

requests for 6 executors with nodes = <a, b, c, d>
requests for 6 executors with nodes = <a, b, c>
requests for 6 executors with nodes = <a, b>

Would you please help to explain it? Maybe I missed something :).

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Saisai Shao
>            Priority: Critical
>         Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.
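The per-node tallies quoted in the comment above (7 : 7 : 7 : 5, 7 : 7 : 4 : 2, and the 18-executor proposal) can be checked with a small sketch. It only counts, for each node, how many requested executors list that node as a preferred location. The helper name `per_node_counts` and the `(num_executors, nodes)` tuple representation are hypothetical and chosen for illustration; they are not part of Spark's or YARN's API.

```python
from collections import Counter


def per_node_counts(requests):
    """Tally, per node, how many requested executors name it as preferred.

    `requests` is a list of (num_executors, preferred_nodes) pairs.
    This is an illustrative representation, not a Spark data structure.
    """
    counts = Counter()
    for num_executors, nodes in requests:
        for node in nodes:
            counts[node] += num_executors
    return dict(counts)


# 7-executor case as described in the comment.
described_7 = [(5, ["a", "b", "c", "d"]), (2, ["a", "b", "c"])]

# 7-executor alternative proposed in the comment.
alternative_7 = [
    (2, ["a", "b", "c", "d"]),
    (2, ["a", "b", "c"]),
    (3, ["a", "b"]),
]

# 18-executor alternative proposed in the comment.
alternative_18 = [
    (6, ["a", "b", "c", "d"]),
    (6, ["a", "b", "c"]),
    (6, ["a", "b"]),
]

print(per_node_counts(described_7))     # {'a': 7, 'b': 7, 'c': 7, 'd': 5}
print(per_node_counts(alternative_7))   # {'a': 7, 'b': 7, 'c': 4, 'd': 2}
print(per_node_counts(alternative_18))  # {'a': 18, 'b': 18, 'c': 12, 'd': 6}
```

Under this counting, the 18-executor alternative gives 18 : 18 : 12 : 6, which reduces exactly to the desired 3 : 3 : 2 : 1, while the 7-executor alternative (7 : 7 : 4 : 2) only approximates it because 7 executors cannot be split into three equal groups.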