[ 
https://issues.apache.org/jira/browse/YARN-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715534#comment-13715534
 ] 

Omkar Vinit Joshi commented on YARN-685:
----------------------------------------

Thanks, [~raviprak]. Looking at your response, the reducer placement is fairly 
distributed and random. But ReduceTaskAttemptImpl doesn't seem to do anything 
special for reduce tasks. It only requests containers with the requested memory 
on any node manager, so the MR job may get a container on any node manager that 
satisfies the request from the resource manager scheduler. Even though the 
placement is fairly random, I don't see why we should do that.
{code}
  public ReduceTaskAttemptImpl(TaskId id, int attempt,
      EventHandler eventHandler, Path jobFile, int partition,
      int numMapTasks, JobConf conf,
      TaskAttemptListener taskAttemptListener,
      Token<JobTokenIdentifier> jobToken,
      Credentials credentials, Clock clock,
      AppContext appContext) {
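    // The empty String[] below is the host-preference argument: no hosts are
    // named, so the container request for a reducer can match any node.
    super(id, attempt, eventHandler, taskAttemptListener, jobFile, partition,
        conf, new String[] {}, jobToken, credentials, clock,
        appContext);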
    this.numMapTasks = numMapTasks;
  }
{code}

[~devaraj.k] The placement is clearly random and fairly distributed across the 
cluster. However, do we really need that? Why can't we try to place reducers 
close to the mappers? Thoughts?
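
To make the "reducers close to mappers" idea concrete, here is a rough sketch of 
how a host preference could be derived instead of passing an empty array. 
Everything in it is hypothetical: the class ReducerLocalitySketch, the method 
preferredHosts, and the per-host map output sizes are assumptions for 
illustration only, not existing MapReduce or YARN APIs, and this is not how the 
current code behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: ranks hosts by how much map output
// they hold so that a reducer could ask for a container near its input.
final class ReducerLocalitySketch {

    // Returns up to maxHosts host names, largest map output first.
    // Ties are broken by host name so the result is deterministic.
    // An empty input yields an empty array, which matches today's
    // "no preference" behaviour of passing new String[] {}.
    static String[] preferredHosts(Map<String, Long> mapOutputBytesByHost,
                                   int maxHosts) {
        List<Map.Entry<String, Long>> entries =
                new ArrayList<>(mapOutputBytesByHost.entrySet());
        entries.sort((a, b) -> {
            int byBytes = Long.compare(b.getValue(), a.getValue());
            return byBytes != 0 ? byBytes : a.getKey().compareTo(b.getKey());
        });
        int n = Math.min(Math.max(maxHosts, 0), entries.size());
        String[] hosts = new String[n];
        for (int i = 0; i < n; i++) {
            hosts[i] = entries.get(i).getKey();
        }
        return hosts;
    }
}
```

The returned array would take the place of the empty String[] in the 
constructor above. Whether the scheduler would honour such a preference for 
reducers, and whether it helps given that each reducer pulls from every mapper, 
is exactly the open question here.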
                
> Capacity Scheduler is not distributing the reducers tasks across the cluster
> ----------------------------------------------------------------------------
>
>                 Key: YARN-685
>                 URL: https://issues.apache.org/jira/browse/YARN-685
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Devaraj K
>
> If the total memory required by the reducers is less than the total cluster 
> memory, the scheduler does not assign the reducers to all the nodes uniformly 
> (or even approximately uniformly). This happens even when no other jobs or 
> job tasks are running in the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
