[jira] [Commented] (YARN-957) Capacity Scheduler tries to reserve the memory more than what node manager reports.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759786#comment-13759786 ]

Omkar Vinit Joshi commented on YARN-957:

[~ste...@apache.org] will post an updated patch for branch-2.

> Capacity Scheduler tries to reserve the memory more than what node manager reports.
> ---
>
> Key: YARN-957
> URL: https://issues.apache.org/jira/browse/YARN-957
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Omkar Vinit Joshi
> Assignee: Omkar Vinit Joshi
> Priority: Blocker
> Fix For: 2.1.1-beta
> Attachments: YARN-957-20130730.1.patch, YARN-957-20130730.2.patch, YARN-957-20130730.3.patch, YARN-957-20130731.1.patch, YARN-957-20130830.1.patch, YARN-957-20130904.1.patch, YARN-957-20130904.2.patch
>
> I have 2 node managers:
> * one with 1024 MB memory (nm1)
> * a second with 2048 MB memory (nm2)
>
> I am submitting a simple MapReduce application with one mapper and one reducer of 1024 MB each. The steps to reproduce this are:
> * Stop nm2, the 2048 MB node. (This is done to make sure this node's heartbeat doesn't reach the RM first.)
> * Submit the application. As soon as the RM receives the first node's (nm1) heartbeat, it tries to reserve memory for the AM container (2048 MB). However, nm1 has only 1024 MB of memory.
> * Start nm2 (2048 MB) again.
>
> The application now hangs forever. There are two potential issues here:
> * The scheduler should not reserve memory on a node manager that can never satisfy the request; i.e., the node manager's total capability is 1024 MB, yet 2048 MB is reserved on it. But it still does that.
> * Say 2048 MB is reserved on nm1, but nm2 comes back with 2048 MB available memory. If the original request was made without any locality constraint, the scheduler should unreserve the memory on nm1 and allocate the requested 2048 MB container on nm2.

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
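The first issue above, reserving on a node whose total capability can never satisfy the request, comes down to a simple guard before reserving. A minimal sketch follows; the class and method names are hypothetical for illustration, not the actual YARN-957 patch (which, per the commit message later in this thread, touches SchedulerNode and LeafQueue):

```java
// Illustrative sketch only: ReservationCheck and fitsOnNode are hypothetical
// names, not the actual YARN-957 patch code.
public class ReservationCheck {

    /**
     * A request larger than the node's *total* registered capability can
     * never be satisfied on that node, so reserving there would block the
     * application forever. Comparing only against currently *available*
     * memory is not enough.
     */
    static boolean fitsOnNode(int requestedMb, int nodeTotalMb) {
        return requestedMb <= nodeTotalMb;
    }

    public static void main(String[] args) {
        // nm1 registered 1024 MB total; the AM container needs 2048 MB,
        // so the scheduler must not reserve here.
        System.out.println(fitsOnNode(2048, 1024));
        // nm2 registered 2048 MB total; the request can be placed there.
        System.out.println(fitsOnNode(2048, 2048));
    }
}
```

The fix described in the commit message ("requests that need more than a node's total capability were incorrectly allocated on that node") amounts to adding exactly this kind of total-capability check to the scheduling path.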
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759063#comment-13759063 ]

Hudson commented on YARN-957:

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1540 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1540/)

YARN-957. Fixed a bug in CapacityScheduler because of which requests that need more than a node's total capability were incorrectly allocated on that node, causing apps to hang. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1520187)

Files changed:
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759045#comment-13759045 ]

Hudson commented on YARN-957:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1513 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1513/) (same YARN-957 commit, rev 1520187, and file list as the Hadoop-Mapreduce-trunk build above)
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758980#comment-13758980 ]

Hudson commented on YARN-957:

SUCCESS: Integrated in Hadoop-Yarn-trunk #323 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/323/) (same YARN-957 commit, rev 1520187, and file list as the Hadoop-Mapreduce-trunk build above)
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758642#comment-13758642 ]

Hudson commented on YARN-957:

SUCCESS: Integrated in Hadoop-trunk-Commit #4369 (See https://builds.apache.org/job/Hadoop-trunk-Commit/4369/) (same YARN-957 commit, rev 1520187, and file list as the Hadoop-Mapreduce-trunk build above)
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758268#comment-13758268 ]

Hadoop QA commented on YARN-957:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12601443/YARN-957-20130904.2.patch against trunk revision:

* +1 @author. The patch does not contain any @author tags.
* +1 tests included. The patch appears to include 2 new or modified test files.
* +1 javac. The applied patch does not increase the total number of javac compiler warnings.
* +1 javadoc. The javadoc tool did not generate any warning messages.
* +1 eclipse:eclipse. The patch built with eclipse:eclipse.
* +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
* +1 release audit. The applied patch does not increase the total number of release audit warnings.
* +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
* +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1835//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1835//console

This message is automatically generated.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758115#comment-13758115 ]

Hadoop QA commented on YARN-957:

+1 overall on the latest attachment http://issues.apache.org/jira/secure/attachment/12601417/YARN-957-20130904.1.patch against trunk revision: all checks passed (@author, tests included, javac, javadoc, eclipse:eclipse, findbugs, release audit, core tests, contrib tests), with the same check-by-check results as the QA run above.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1833//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1833//console
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758044#comment-13758044 ]

Omkar Vinit Joshi commented on YARN-957:

Thanks, Vinod. Addressed the comments:

bq. Use Resource.newInstance instead of RecordFactory.
Fixed.

bq. The log message in LeafQueue should be at WARN level.
Fixed.

bq. The test looks good, but let's not have hard-coded waits like the following in the test.
Yes, changed it.
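The last review point, replacing hard-coded waits in tests, is usually handled with a bounded poll loop: check the condition repeatedly until it holds or a deadline passes, so a passing test returns early instead of always sleeping the full duration. A minimal sketch of the idea follows; the `WaitFor` helper is hypothetical, not the actual MockRM or test code from the patch:

```java
// Hypothetical helper, not the actual MockRM/test code from the patch:
// poll a condition with a deadline instead of sleeping a fixed duration.
public class WaitFor {

    /** Functional interface for the condition the test is waiting on. */
    interface Condition {
        boolean isMet();
    }

    /** Returns true as soon as the condition holds, false on timeout. */
    static boolean waitFor(Condition c, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (c.isMet()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        // One final check in case the condition became true at the deadline.
        return c.isMet();
    }

    public static void main(String[] args) throws InterruptedException {
        // A condition that already holds returns immediately, so a passing
        // test no longer pays the full hard-coded sleep.
        System.out.println(waitFor(() -> true, 5000L, 10L));
    }
}
```

A test would call something like `waitFor(() -> rm.getReservedMB() == 0, timeout, interval)` and assert the returned boolean, failing only when the condition genuinely never becomes true within the timeout.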
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755115#comment-13755115 ]

Hadoop QA commented on YARN-957:

+1 overall on the latest attachment http://issues.apache.org/jira/secure/attachment/12600849/YARN-957-20130830.1.patch against trunk revision: all checks passed (@author, tests included, javac, javadoc, eclipse:eclipse, findbugs, release audit, core tests, contrib tests), with the same check-by-check results as the QA runs above.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1810//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1810//console
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755089#comment-13755089 ]

Omkar Vinit Joshi commented on YARN-957:

Uploading a patch which only fixes the excess memory reservation issue.
[jira] [Commented] (YARN-957) Capacity Scheduler tries to reserve the memory more than what node manager reports.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749526#comment-13749526 ] Omkar Vinit Joshi commented on YARN-957: Thanks, Arun. Sure, I will separate these and raise them as individual issues. Reducing the scope of this ticket to address only the node manager maximum-resource check; I will upload the patch soon.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725693#comment-13725693 ] Hadoop QA commented on YARN-957:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595262/YARN-957-20130731.1.patch against trunk revision .

* {color:green}+1 @author{color}. The patch does not contain any @author tags.
* {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
* {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
* {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
* {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
* {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
* {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
* {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
* {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1628//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1628//console

This message is automatically generated.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724768#comment-13724768 ] Hadoop QA commented on YARN-957:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595105/YARN-957-20130730.3.patch against trunk revision .

* {color:green}+1 @author{color}. The patch does not contain any @author tags.
* {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
* {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
* {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
* {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
* {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
* {color:red}-1 release audit{color}. The applied patch generated 1 release audit warning.
* {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
* {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1619//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1619//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1619//console

This message is automatically generated.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724741#comment-13724741 ] Omkar Vinit Joshi commented on YARN-957: Updating test cases.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724228#comment-13724228 ] Sandy Ryza commented on YARN-957: I believe we fixed this for the Fair Scheduler in YARN-289.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724219#comment-13724219 ] Omkar Vinit Joshi commented on YARN-957: Attaching a preliminary patch; I will add a patch with test cases soon. Today this occurs because we do not check a node's maximum resource capability before assigning a container to it (or reserving resources on it). This patch adds that check. [~sandyr], is this issue also present in the Fair Scheduler?
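The check described in the comment above can be sketched as follows. This is a hypothetical illustration of the idea, not the actual YARN-957 patch; the names and the three-way decision are assumptions:

```java
// Hypothetical sketch: compare a request against the node's total
// capability before assigning or reserving a container on it.
public class MaxCapabilityCheck {
    enum Decision { ASSIGN, RESERVE, SKIP_NODE }

    static Decision schedule(int requestMb, int nodeTotalMb, int nodeAvailableMb) {
        if (requestMb > nodeTotalMb) {
            // The node could never run this container, even when idle;
            // reserving here would block the application forever.
            return Decision.SKIP_NODE;
        }
        if (requestMb <= nodeAvailableMb) {
            return Decision.ASSIGN;   // fits right now
        }
        return Decision.RESERVE;      // fits eventually; wait for space to free up
    }

    public static void main(String[] args) {
        // The bug: without the first branch, the 2048 MB AM request is
        // reserved on the 1024 MB node nm1 instead of being skipped.
        System.out.println(schedule(2048, 1024, 1024)); // SKIP_NODE
        System.out.println(schedule(2048, 2048, 1024)); // RESERVE
        System.out.println(schedule(1024, 2048, 2048)); // ASSIGN
    }
}
```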
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719006#comment-13719006 ] Omkar Vinit Joshi commented on YARN-957: Thanks [~hitesh] for pointing this out; I am listing it here so it gets fixed. Today we can even have a situation where a container other than the AM is reserved on the same node manager where the AM is running. Say node manager memory = 8 GB, AM memory = 2 GB, and the additional container request = 8 GB. There is no point in reserving this container on that node manager: at most 6 GB can ever be free while the AM is running. We should probably look for another node manager instead of reserving on this one and waiting forever.
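The arithmetic in that scenario can be made explicit. A minimal sketch, with a hypothetical helper (not YARN code), of why the reservation can never complete:

```java
// Illustrative check for the scenario above: a reservation can only
// complete if the request fits in the node's capacity minus memory
// that will never be released while the application runs (here, the
// application's own AM container). Hypothetical helper, not YARN code.
public class UnreleasableCheck {
    static boolean reservationCanComplete(int requestMb, int nodeTotalMb, int unreleasableMb) {
        return requestMb <= nodeTotalMb - unreleasableMb;
    }

    public static void main(String[] args) {
        int nodeTotalMb = 8 * 1024;   // NM memory = 8 GB
        int amMb = 2 * 1024;          // the app's AM runs on this node
        int taskMb = 8 * 1024;        // additional 8 GB container request

        // At most 6 GB can ever free up while the AM is alive, so an
        // 8 GB reservation on this node would wait forever.
        System.out.println(reservationCanComplete(taskMb, nodeTotalMb, amMb)); // false
    }
}
```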
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718753#comment-13718753 ] Omkar Vinit Joshi commented on YARN-957: No, this is completely different: here the RM is trying to reserve more memory on a node manager than the node actually has.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718644#comment-13718644 ] Bikas Saha commented on YARN-957: Probably a duplicate of YARN-389.