[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567345#comment-14567345 ]
Steve Loughran commented on SPARK-4352:
---------------------------------------

As usual, when YARN-1042 is done, life gets easier: the AM just asks YARN for anti-affine placement. If you look at how other YARN clients have implemented anti-affinity (TWILL-82), the blacklist is used to block off all nodes already in use, with a request-at-a-time ramp-up to avoid more than one outstanding request being granted on the same node. (There's a rough sketch of that approach after the list below.)

As well as anti-affinity, life would be even better with dynamic container resizing: if a single executor could expand or shrink its CPU capacity on demand, you'd only need one executor per node and could handle multiple tasks by running more work there. (This does nothing for RAM consumption, though.)

Now, for some other fun:

# You may want to consider which surplus containers to release, both outstanding requests and those actually granted. In particular, if you want to cancel more than one outstanding request, which do you choose? Any of them? The newest? The oldest? The node with the worst reliability statistics? Killing the newest works if you assume that the older containers have generated more host-local data that you wish to reuse.
# History may also be a factor in placement. If you are starting a session which continues or extends previous work, the previous locations of the executors may be the first locality clue. Ask for containers on those nodes and there's a high likelihood that the output data from the previous session will be stored locally on one of the nodes a container is assigned.
# Testing. There aren't any tests, are there? It's possible to simulate some of the basic operations; you just need to isolate the code which examines the application state and generates container request/release events from the actual interaction with the RM. I've done this before, with the allocate/cancel requests [generating a list of operations to be submitted or simulated|https://github.com/apache/incubator-slider/blob/develop/slider-core/src/main/java/org/apache/slider/server/appmaster/state/AppState.java#L1908]. Combined with a [mock YARN engine|https://github.com/apache/incubator-slider/tree/develop/slider-core/src/test/groovy/org/apache/slider/server/appmaster/model/mock], that lets us do things like [test historical placement logic|https://github.com/apache/incubator-slider/tree/develop/slider-core/src/test/groovy/org/apache/slider/server/appmaster/model/history], as well as whether to re-request containers on nodes where containers have recently failed. While that mock setup isn't that realistic, it can be used to test basic placement and failure-handling logic. More succinctly: you can write tests for this stuff by splitting request generation from the API calls and testing the request/release logic standalone; the second sketch below shows the shape of that split.
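For the blacklist-based anti-affinity, a minimal Scala sketch of the TWILL-82 style approach against the YARN AMRMClient API; the AntiAffineAllocator class and its bookkeeping are invented here for illustration and are not code from Twill, Slider or Spark. The only real YARN calls used are updateBlacklist() and addContainerRequest().

{code}
import scala.collection.JavaConverters._

import org.apache.hadoop.yarn.api.records.{Container, Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

class AntiAffineAllocator(
    amrm: AMRMClient[ContainerRequest],
    capability: Resource,
    priority: Priority) {

  private var outstanding = 0   // in-flight container requests, never more than 1
  private var pending = 0       // executors still wanted but not yet requested

  /** Ask for `n` more anti-affine executors. */
  def requestExecutors(n: Int): Unit = {
    pending += n
    maybeIssueNextRequest()
  }

  /** Call from the allocate-response handler when the RM grants a container. */
  def onContainerAllocated(container: Container): Unit = {
    outstanding -= 1
    pending -= 1
    // Block off the node that was just granted before the next request goes
    // out, so the RM can never place a second executor there.
    val host = container.getNodeId.getHost
    amrm.updateBlacklist(List(host).asJava, java.util.Collections.emptyList[String]())
    maybeIssueNextRequest()
  }

  private def maybeIssueNextRequest(): Unit = {
    if (outstanding == 0 && pending > 0) {
      // No node/rack hints: the growing blacklist is what enforces anti-affinity.
      amrm.addContainerRequest(new ContainerRequest(capability, null, null, priority))
      outstanding = 1
    }
  }
}
{code}

The request-at-a-time ramp-up is the "outstanding <= 1" invariant: a new request only goes out once the previous one has been granted and its host blacklisted, so the RM never gets the chance to satisfy two requests on the same node.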
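And for the testing point, a sketch of what splitting request generation from the API calls could look like. The planner below is pure decision logic with no AMRMClient in sight; every type in it is made up, and the newest-first trim and previous-host hint simply encode the heuristics from points 1 and 2 above.

{code}
sealed trait Operation
case class RequestContainer(preferredNodes: Seq[String]) extends Operation
case class CancelRequest(requestId: Long) extends Operation
case class ReleaseContainer(containerId: String) extends Operation

case class OutstandingRequest(id: Long, createdAt: Long)
case class RunningExecutor(containerId: String, host: String, startedAt: Long)

object ExecutorPlanner {

  /**
   * Pure decision logic: no YARN calls, so it can be unit-tested directly.
   * Surplus is trimmed newest-first, on the assumption that older containers
   * have accumulated more host-local data worth keeping.
   */
  def plan(
      target: Int,
      outstanding: Seq[OutstandingRequest],
      running: Seq[RunningExecutor],
      previousHosts: Seq[String]): Seq[Operation] = {
    val current = outstanding.size + running.size
    if (current < target) {
      // Use the previous session's hosts as the first locality clue (point 2).
      Seq.fill(target - current)(RequestContainer(previousHosts))
    } else {
      val surplus = current - target
      // Cancel unsatisfied requests first (cheapest), newest first (point 1)...
      val cancels = outstanding.sortBy(-_.createdAt).take(surplus)
        .map(r => CancelRequest(r.id))
      // ...then, if still over target, release the newest running executors.
      val releases = running.sortBy(-_.startedAt).take(surplus - cancels.size)
        .map(e => ReleaseContainer(e.containerId))
      cancels ++ releases
    }
  }
}
{code}

Such a planner can then be exercised in a plain unit test, no RM or mock YARN engine required:

{code}
val ops = ExecutorPlanner.plan(
  target = 2,
  outstanding = Seq(OutstandingRequest(id = 1, createdAt = 100L),
                    OutstandingRequest(id = 2, createdAt = 200L)),
  running = Seq(RunningExecutor("container_01", "node1", startedAt = 50L)),
  previousHosts = Nil)
assert(ops == Seq(CancelRequest(2)))   // the newest outstanding request is dropped
{code}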
> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Saisai Shao
>            Priority: Critical
>         Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an application takes resources on every node in the cluster. preferredNodeLocalityData provides a sort of hacky workaround that has been broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to demand from the application. When this occurs, it would be useful to look at the pending tasks and communicate their location preferences to the cluster resource manager.