[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
[ https://issues.apache.org/jira/browse/YARN-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-957:
-----------------------------------------
    Attachment: YARN-957-20130904.2.patch

Same patch with trailing whitespace removed. Will commit when Jenkins says okay.

> Capacity Scheduler tries to reserve more memory than the node manager reports.
> -------------------------------------------------------------------------------
>
>          Key: YARN-957
>          URL: https://issues.apache.org/jira/browse/YARN-957
>      Project: Hadoop YARN
>   Issue Type: Bug
>     Reporter: Omkar Vinit Joshi
>     Assignee: Omkar Vinit Joshi
>     Priority: Blocker
>  Attachments: YARN-957-20130730.1.patch, YARN-957-20130730.2.patch,
>               YARN-957-20130730.3.patch, YARN-957-20130731.1.patch,
>               YARN-957-20130830.1.patch, YARN-957-20130904.1.patch,
>               YARN-957-20130904.2.patch
>
> I have two node managers:
> * nm1 with 1024 MB of memory
> * nm2 with 2048 MB of memory
> I am submitting a simple MapReduce application with one mapper and one
> reducer, each requesting 1024 MB. Steps to reproduce:
> * Stop nm2 (the 2048 MB node). This ensures that nm2's heartbeat does not
>   reach the RM first.
> * Submit the application. As soon as the RM receives nm1's heartbeat, it
>   tries to reserve 2048 MB for the AM container, even though nm1 has only
>   1024 MB of memory.
> * Start nm2 (2048 MB).
> The application then hangs forever. There are two underlying issues:
> * The scheduler should not reserve memory on a node manager that can never
>   satisfy the request: nm1's total capability is 1024 MB, yet 2048 MB is
>   reserved on it.
> * If 2048 MB is reserved on nm1 and nm2 comes back with 2048 MB available,
>   and the original request had no locality constraint, the scheduler should
>   unreserve on nm1 and allocate the requested 2048 MB container on nm2.
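For clarity, the first issue amounts to a missing guard: a reservation should only be placed on a node whose total capability could ever satisfy the request, regardless of what is currently free. A minimal, self-contained sketch of that check follows; the class and method names (SchedulerNode, shouldReserve) are illustrative only, not the actual CapacityScheduler API:

{code:java}
// Minimal, self-contained sketch of the capability check described in the
// first bullet of the report. Names are illustrative, not YARN's real API.
public final class ReservationGuard {

    /** Memory reported by a node manager, in MB. */
    static final class SchedulerNode {
        final long totalMemoryMB;   // maximum the node can ever offer
        long availableMemoryMB;     // currently free

        SchedulerNode(long totalMemoryMB, long availableMemoryMB) {
            this.totalMemoryMB = totalMemoryMB;
            this.availableMemoryMB = availableMemoryMB;
        }
    }

    /**
     * Reserve only on a node that could EVER satisfy the request: the
     * requested memory must not exceed the node's total capability.
     */
    static boolean shouldReserve(SchedulerNode node, long requestedMemoryMB) {
        return requestedMemoryMB <= node.totalMemoryMB;
    }

    public static void main(String[] args) {
        SchedulerNode nm1 = new SchedulerNode(1024, 1024); // nm1 from the report
        SchedulerNode nm2 = new SchedulerNode(2048, 2048); // nm2 from the report

        System.out.println(shouldReserve(nm1, 2048)); // false: never reserve here
        System.out.println(shouldReserve(nm2, 2048)); // true
    }
}
{code}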
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130904.1.patch
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130830.1.patch
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Arun C Murthy updated YARN-957:
-------------------------------
    Target Version/s: 2.1.1-beta

Let's simplify this fix to just not reserving in excess. The other fix (masterContainer) is hiding a couple of other bugs:
# Reservation exchange
# Excess reservation
Let's track those separately.
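For reference, the "reservation exchange" behavior deferred above corresponds to the second bullet of the report: when a request with no locality constraint sits reserved on a node that can never satisfy it, the reservation should move to a node that can allocate immediately. The sketch below illustrates the idea only; all names are hypothetical, and this is not the CapacityScheduler implementation:

{code:java}
import java.util.List;

// Illustrative sketch of "reservation exchange": unreserve on a node that
// cannot satisfy the request and allocate on one that can. Hypothetical names.
public final class ReservationExchangeSketch {

    static final class Node {
        final String name;
        final long totalMemoryMB;
        long availableMemoryMB;
        long reservedMemoryMB;

        Node(String name, long totalMemoryMB, long availableMemoryMB) {
            this.name = name;
            this.totalMemoryMB = totalMemoryMB;
            this.availableMemoryMB = availableMemoryMB;
        }
    }

    /** Move the reservation off reservedOn to any node that can allocate now. */
    static Node exchangeReservation(Node reservedOn, long requestMB, List<Node> nodes) {
        for (Node candidate : nodes) {
            if (candidate != reservedOn && candidate.availableMemoryMB >= requestMB) {
                reservedOn.reservedMemoryMB -= requestMB; // unreserve on the old node
                candidate.availableMemoryMB -= requestMB; // allocate on the new node
                return candidate;
            }
        }
        return null; // nothing better yet; keep the existing reservation
    }

    public static void main(String[] args) {
        Node nm1 = new Node("nm1", 1024, 1024);
        Node nm2 = new Node("nm2", 2048, 2048);
        nm1.reservedMemoryMB = 2048; // the over-reservation from the bug report

        Node allocatedOn = exchangeReservation(nm1, 2048, List.of(nm1, nm2));
        System.out.println(allocatedOn == null ? "kept" : allocatedOn.name); // prints nm2
    }
}
{code}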
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Arun C Murthy updated YARN-957:
-------------------------------
    Priority: Blocker  (was: Major)
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130731.1.patch
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130730.3.patch
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130730.2.patch
[jira] [Updated] (YARN-957) Capacity Scheduler tries to reserve more memory than the node manager reports.
Omkar Vinit Joshi updated YARN-957:
-----------------------------------
    Attachment: YARN-957-20130730.1.patch