You seem to have missed the configuration variable "spreadOutApps"
and its comment:

// As a temporary workaround before better ways of configuring memory, we allow users to set
// a flag that will perform round-robin scheduling across the nodes (spreading out each app
// among all the nodes) instead of trying to consolidate each app onto a small # of nodes.

2015-03-14 10:41 GMT+08:00 bit1...@163.com <bit1...@163.com>:

> Hi, sparkers,
> When I read the code that computes resource allocation for a newly
> submitted application in the Master#schedule method, I got a question
> about data locality:
>
> // Pack each app into as few nodes as possible until we've assigned all its cores
> for (worker <- workers if worker.coresFree > 0 && worker.state == WorkerState.ALIVE) {
>   for (app <- waitingApps if app.coresLeft > 0) {
>     if (canUse(app, worker)) {
>       val coresToUse = math.min(worker.coresFree, app.coresLeft)
>       if (coresToUse > 0) {
>         val exec = app.addExecutor(worker, coresToUse)
>         launchExecutor(worker, exec)
>         app.state = ApplicationState.RUNNING
>       }
>     }
>   }
> }
>
> It looks like the resource allocation policy here is that the Master will
> assign as few workers as possible, as long as those few workers have
> enough resources for the application.
> My question is: assuming the data the application will process is spread
> across all the worker nodes, is data locality lost under the above policy?
> I am not sure whether I have understood this correctly or have missed something.
>
> ------------------------------
> bit1...@163.com

-- 王海华
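To make the difference concrete, here is a minimal, self-contained sketch of the two policies. This is not Spark's actual Master code: the `Worker` case class, the `allocate` helper, and its parameters are made up for illustration. With `spreadOut = true` cores are granted round-robin, one per worker per pass (spreading the app across nodes, which helps locality when data is spread out); with `spreadOut = false` each worker is drained before moving to the next (consolidation, as in the quoted loop).

```scala
// Hypothetical model of a worker with some free cores (not Spark's WorkerInfo).
case class Worker(id: String, var coresFree: Int)

// Sketch of the two allocation policies toggled by spreadOutApps.
// Returns a map from worker id to the number of cores assigned to the app.
def allocate(workers: Seq[Worker], coresWanted: Int, spreadOut: Boolean): Map[String, Int] = {
  var left = coresWanted
  val assigned = scala.collection.mutable.Map[String, Int]().withDefaultValue(0)
  if (spreadOut) {
    // Round-robin: keep cycling over workers, granting one core per pass,
    // so the app ends up spread over as many nodes as possible.
    while (left > 0 && workers.exists(_.coresFree > 0)) {
      for (w <- workers if left > 0 && w.coresFree > 0) {
        w.coresFree -= 1
        assigned(w.id) += 1
        left -= 1
      }
    }
  } else {
    // Consolidate: take as many cores as each worker offers before moving on,
    // packing the app onto as few nodes as possible.
    for (w <- workers if left > 0 && w.coresFree > 0) {
      val take = math.min(w.coresFree, left)
      w.coresFree -= take
      assigned(w.id) += take
      left -= take
    }
  }
  assigned.toMap
}
```

For example, with three workers offering 4 free cores each and an app wanting 6 cores, spreading out yields 2 cores on each of the three workers, while consolidating yields 4 cores on the first worker and 2 on the second, leaving the third untouched.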