[jira] [Commented] (MAPREDUCE-5279) Jobs can deadlock if headroom is limited by cpu instead of memory

Zhijie Shen (JIRA) Thu, 18 Sep 2014 16:38:00 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139733#comment-14139733
 ]


Zhijie Shen commented on MAPREDUCE-5279:
----------------------------------------

[~vvasudev], thanks for the patch. The patch follows the existing logic and 
expands the dimension from memory only to (memory, cpu). It looks fine to me 
overall; here are some comments.

1. Can we reuse org.apache.hadoop.yarn.util.resource.Resources, and mark it 
@LimitedPrivate for both yarn and mapreduce?

2. It's not related to this patch, but I think we need to fix this problem: 
SchedulerResourceTypes is generated by protobuf, so we shouldn't refer to it 
directly. Doing so could break binary compatibility if we upgrade protobuf to 
a new version. The other enums in proto take the following approach: define a 
SchedulerResourceTypes java enum and a SchedulerResourceTypesProto protobuf 
enum, then define convertTo/FromProtoFormat methods in ProtoUtils to convert 
one to the other. Please file a ticket for this issue.

3. We should reflect the vcore usage in history as well, but that may not be a 
trivial change. Let's file a separate ticket for it.
{code}
+          eventHandler.handle(new JobHistoryEvent(jobId,
+            new NormalizedResourceEvent(
+              org.apache.hadoop.mapreduce.TaskType.MAP, mapResourceRequest
+                .getMemory())));
{code}

4. Can we combine testReduceSchedulingWithNotEnoughCpu/Memory into shared test 
logic, so that the two tests differ only in the limit that is set?
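
To illustrate comment 2, here is a minimal sketch of the enum pattern. It is 
not the actual Hadoop code: SchedulerResourceTypesProto below is a hand-written 
stand-in so the example compiles on its own (the real one would be generated by 
protoc), the MEMORY and CPU constants are assumed for illustration, and the 
ProtoUtils class shown is a hypothetical subset that converts by constant name.

```java
public class SchedulerResourceTypesSketch {

  /** Stand-in for the protobuf-generated enum (assumption: really produced by protoc). */
  enum SchedulerResourceTypesProto { MEMORY, CPU }

  /** Hand-written enum that the rest of the code would refer to instead. */
  enum SchedulerResourceTypes { MEMORY, CPU }

  /** Hypothetical subset of ProtoUtils: converts between the two enums by constant name. */
  static final class ProtoUtils {
    private ProtoUtils() {}

    static SchedulerResourceTypesProto convertToProtoFormat(SchedulerResourceTypes e) {
      return SchedulerResourceTypesProto.valueOf(e.name());
    }

    static SchedulerResourceTypes convertFromProtoFormat(SchedulerResourceTypesProto e) {
      return SchedulerResourceTypes.valueOf(e.name());
    }
  }

  public static void main(String[] args) {
    // Round-trip every constant through the proto form and back.
    for (SchedulerResourceTypes t : SchedulerResourceTypes.values()) {
      SchedulerResourceTypesProto proto = ProtoUtils.convertToProtoFormat(t);
      SchedulerResourceTypes back = ProtoUtils.convertFromProtoFormat(proto);
      System.out.println(t + " -> " + proto + " -> " + back);
    }
  }
}
```

With this split, callers only ever see the hand-written enum, and the generated 
type stays confined to the conversion methods.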

> Jobs can deadlock if headroom is limited by cpu instead of memory
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-5279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5279
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, scheduler
>    Affects Versions: 2.0.3-alpha
>            Reporter: Peng Zhang
>            Assignee: Peng Zhang
>            Priority: Critical
>         Attachments: MAPREDUCE-5279-v2.patch, MAPREDUCE-5279.patch, 
> apache-mapreduce-5279.3.patch, apache-mapreduce-5279.4.patch, 
> apache-mapreduce-5279.5.patch
>
>
> YARN-2 introduced cpu as a scheduling dimension, but MR RMContainerAllocator 
> doesn't take virtual cores into account while scheduling reduce tasks.
> This may cause more reduce tasks to be scheduled because memory alone is enough. 
> On a small cluster, this will end in deadlock: all running containers are 
> reduce tasks, but the map phase is not finished. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
