[jira] [Updated] (YARN-10473) Implement application group manager and related API
[ https://issues.apache.org/jira/browse/YARN-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-10473:
-------------------------------
    Attachment: Implement-application-group-feature for hadoop 2.9.2.patch

> Implement application group manager and related API
> ---------------------------------------------------
>
>                 Key: YARN-10473
>                 URL: https://issues.apache.org/jira/browse/YARN-10473
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.9.2
>            Reporter: jialei weng
>            Priority: Major
>        Attachments: 0001-Implement-application-group-feature.patch, Implement-application-group-feature for hadoop 2.9.2.patch
>
> Implement the application group manager and the create, list and terminate application-group APIs.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
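The create/list/terminate API proposed above could be sketched as a minimal in-memory group manager. This is only an illustration of the idea; the class and method names here are hypothetical and do not come from the attached patch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of an application group manager; the real patch's
// class and method names may differ.
public class AppGroupManagerSketch {
    private final Map<String, Set<String>> groups = new ConcurrentHashMap<>();

    // Create a new, empty application group; returns false if the id exists.
    public boolean createGroup(String groupId) {
        return groups.putIfAbsent(groupId, ConcurrentHashMap.newKeySet()) == null;
    }

    // Register a running application under an existing group.
    public void addApplication(String groupId, String appId) {
        Set<String> apps = groups.get(groupId);
        if (apps == null) {
            throw new IllegalArgumentException("unknown group: " + groupId);
        }
        apps.add(appId);
    }

    // List the applications currently tracked under a group.
    public Set<String> listGroup(String groupId) {
        return Collections.unmodifiableSet(
            groups.getOrDefault(groupId, Collections.emptySet()));
    }

    // Terminate the whole group as one unit: returns the app ids that a real
    // implementation would pass to the RM's kill-application path.
    public Set<String> terminateGroup(String groupId) {
        Set<String> apps = groups.remove(groupId);
        return apps == null ? Collections.emptySet() : apps;
    }
}
```

The key property, matching the umbrella proposal, is that terminating the group removes all parent-child applications in one call instead of one by one.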
[jira] [Commented] (YARN-10431) [Umbrella] Job group management
[ https://issues.apache.org/jira/browse/YARN-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240613#comment-17240613 ]

jialei weng commented on YARN-10431:
------------------------------------

[~epayne], this design works at an inner YARN level to organize different jobs, and it can unify jobs submitted to YARN from different platforms.

> [Umbrella] Job group management
> -------------------------------
>
>                 Key: YARN-10431
>                 URL: https://issues.apache.org/jira/browse/YARN-10431
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.9.2
>            Reporter: jialei weng
>            Priority: Major
>        Attachments: YarnJobGroupImpl design.pdf
>
> In current YARN job management, we don't have an efficient mechanism to manage several jobs together. For example, one batch job may trigger several sub-jobs that run at the same time, such as one job to process the data and another to monitor job metrics. And when we want to cancel these jobs, we have to kill them one by one in the current design. I propose a job-group concept to handle such parent-child jobs as one unit.
[jira] [Created] (YARN-10473) Implement application group manager and related API
jialei weng created YARN-10473:
-------------------------------

             Summary: Implement application group manager and related API
                 Key: YARN-10473
                 URL: https://issues.apache.org/jira/browse/YARN-10473
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: jialei weng

Implement the application group manager and the create, list and terminate application-group APIs.
[jira] [Updated] (YARN-10431) [Umbrella] Job group management
[ https://issues.apache.org/jira/browse/YARN-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-10431:
-------------------------------
    Attachment: YarnJobGroupImpl design.pdf

> [Umbrella] Job group management
> -------------------------------
>
>                 Key: YARN-10431
>                 URL: https://issues.apache.org/jira/browse/YARN-10431
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.9.2
>            Reporter: jialei weng
>            Priority: Major
>        Attachments: YarnJobGroupImpl design.pdf
>
> In current YARN job management, we don't have an efficient mechanism to manage several jobs together. For example, one batch job may trigger several sub-jobs that run at the same time, such as one job to process the data and another to monitor job metrics. And when we want to cancel these jobs, we have to kill them one by one in the current design. I propose a job-group concept to handle such parent-child jobs as one unit.
[jira] [Updated] (YARN-10431) [Umbrella] Job group management
[ https://issues.apache.org/jira/browse/YARN-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-10431:
-------------------------------
    Attachment: YarnJobObjectImpl Design.pdf

> [Umbrella] Job group management
> -------------------------------
>
>                 Key: YARN-10431
>                 URL: https://issues.apache.org/jira/browse/YARN-10431
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.9.2
>            Reporter: jialei weng
>            Priority: Major
>
> In current YARN job management, we don't have an efficient mechanism to manage several jobs together. For example, one batch job may trigger several sub-jobs that run at the same time, such as one job to process the data and another to monitor job metrics. And when we want to cancel these jobs, we have to kill them one by one in the current design. I propose a job-group concept to handle such parent-child jobs as one unit.
[jira] [Updated] (YARN-10431) [Umbrella] Job group management
[ https://issues.apache.org/jira/browse/YARN-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-10431:
-------------------------------
    Attachment: (was: YarnJobObjectImpl Design.pdf)

> [Umbrella] Job group management
> -------------------------------
>
>                 Key: YARN-10431
>                 URL: https://issues.apache.org/jira/browse/YARN-10431
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.9.2
>            Reporter: jialei weng
>            Priority: Major
>
> In current YARN job management, we don't have an efficient mechanism to manage several jobs together. For example, one batch job may trigger several sub-jobs that run at the same time, such as one job to process the data and another to monitor job metrics. And when we want to cancel these jobs, we have to kill them one by one in the current design. I propose a job-group concept to handle such parent-child jobs as one unit.
[jira] [Created] (YARN-10431) [Umbrella] Job group management
jialei weng created YARN-10431:
-------------------------------

             Summary: [Umbrella] Job group management
                 Key: YARN-10431
                 URL: https://issues.apache.org/jira/browse/YARN-10431
             Project: Hadoop YARN
          Issue Type: New Feature
    Affects Versions: 2.9.2
            Reporter: jialei weng

In current YARN job management, we don't have an efficient mechanism to manage several jobs together. For example, one batch job may trigger several sub-jobs that run at the same time, such as one job to process the data and another to monitor job metrics. And when we want to cancel these jobs, we have to kill them one by one in the current design. I propose a job-group concept to handle such parent-child jobs as one unit.
[jira] [Commented] (YARN-9608) DecommissioningNodesWatcher should get lists of running applications on node from RMNode.
[ https://issues.apache.org/jira/browse/YARN-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860614#comment-16860614 ]

jialei weng commented on YARN-9608:
-----------------------------------

Thanks, [~abmodi]. Good to learn more.

> DecommissioningNodesWatcher should get lists of running applications on node from RMNode.
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-9608
>                 URL: https://issues.apache.org/jira/browse/YARN-9608
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Abhishek Modi
>            Assignee: Abhishek Modi
>            Priority: Major
>        Attachments: YARN-9608.001.patch
>
> At present, DecommissioningNodesWatcher tracks the list of running applications and triggers decommission of nodes when all the applications that ran on the node complete. This Jira proposes to solve the following problems:
> # DecommissioningNodesWatcher skips tracking application containers on a particular node before the node is in DECOMMISSIONING state. It only tracks containers once the node is in DECOMMISSIONING state. This can lead to shuffle-data loss for apps whose containers ran on this node before it was moved to DECOMMISSIONING state.
> # It is keeping track of running apps. We can leverage this directly from RMNode.
[jira] [Comment Edited] (YARN-9608) DecommissioningNodesWatcher should get lists of running applications on node from RMNode.
[ https://issues.apache.org/jira/browse/YARN-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860470#comment-16860470 ] jialei weng edited comment on YARN-9608 at 6/11/19 1:58 AM: {color:#33}This solution provides an idea to extend life-cycle of node local data to the whole application running time. A small question here, if the application is long running job, the node decommission time will also take longer? And rely on the time-out? [~abmodi] Please correct me if I misunderstand.{color} was (Author: wjlei): {color:#33}This solution provides an idea to extend life-cycle of {color:#33}node local data to the whole application running time. A small question here, if the application is long running job, the node decommission time will also take longer? And rely on the time-out? Please correct me if I misunderstand.{color}{color} > DecommissioningNodesWatcher should get lists of running applications on node > from RMNode. > - > > Key: YARN-9608 > URL: https://issues.apache.org/jira/browse/YARN-9608 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9608.001.patch > > > At present, DecommissioningNodesWatcher tracks list of running applications > and triggers decommission of nodes when all the applications that ran on the > node completes. This Jira proposes to solve following problem: > # DecommissioningNodesWatcher skips tracking application containers on a > particular node before the node is in DECOMMISSIONING state. It only tracks > containers once the node is in DECOMMISSIONING state. This can lead to > shuffle data loss of apps whose containers ran on this node before it was > moved to decommissioning state. > # It is keeping track of running apps. We can leverage this directly from > RMNode. 
[jira] [Commented] (YARN-9608) DecommissioningNodesWatcher should get lists of running applications on node from RMNode.
[ https://issues.apache.org/jira/browse/YARN-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860470#comment-16860470 ]

jialei weng commented on YARN-9608:
-----------------------------------

This solution provides an idea to extend the life-cycle of node-local data to the whole application running time. A small question here: if the application is a long-running job, will the node decommission also take longer and rely on the time-out? Please correct me if I misunderstand.

> DecommissioningNodesWatcher should get lists of running applications on node from RMNode.
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-9608
>                 URL: https://issues.apache.org/jira/browse/YARN-9608
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Abhishek Modi
>            Assignee: Abhishek Modi
>            Priority: Major
>        Attachments: YARN-9608.001.patch
>
> At present, DecommissioningNodesWatcher tracks the list of running applications and triggers decommission of nodes when all the applications that ran on the node complete. This Jira proposes to solve the following problems:
> # DecommissioningNodesWatcher skips tracking application containers on a particular node before the node is in DECOMMISSIONING state. It only tracks containers once the node is in DECOMMISSIONING state. This can lead to shuffle-data loss for apps whose containers ran on this node before it was moved to DECOMMISSIONING state.
> # It is keeping track of running apps. We can leverage this directly from RMNode.
[jira] [Updated] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
[ https://issues.apache.org/jira/browse/YARN-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-8690:
------------------------------
    Description:

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path like hdfs://hostname/path for local resource allocation, because it will be resolved to hdfs://hostname:0/path. We have to add the port 443 to the path, like hdfs://hostname:443/path, to make it work. It isn't a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]

Handling of the resource location path in 2.7:

    public static Path getPathFromYarnURL(URL url) throws URISyntaxException {
      String scheme = url.getScheme() == null ? "" : url.getScheme();
      String authority = "";
      if (url.getHost() != null) {
        authority = url.getHost();
        if (url.getUserInfo() != null) {
          authority = url.getUserInfo() + "@" + authority;
        }
        if (url.getPort() > 0) {
          authority += ":" + url.getPort();
        }
      }
      return new Path(
          (new URI(scheme, authority, url.getFile(), null, null)).normalize());
    }

Handling of the resource location logic in 2.9:

    public Path toPath() throws URISyntaxException {
      return new Path(new URI(getScheme(), getUserInfo(),
          getHost(), getPort(), getFile(), null, null));
    }

> Currently path not consistent in LocalResourceRequest to yarn 2.7
> -----------------------------------------------------------------
>
>                 Key: YARN-8690
>                 URL: https://issues.apache.org/jira/browse/YARN-8690
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.1
>            Reporter: jialei weng
>            Assignee: Wangda Tan
>            Priority: Major
>
> With the YARN-1953 change, in YARN 2.9.1 we cannot use a path like hdfs://hostname/path for local resource allocation, because it will be resolved to hdfs://hostname:0/path. We have to add the port 443 to the path, like hdfs://hostname:443/path, to make it work. It isn't a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]
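The inconsistency described in YARN-8690 can be reproduced with plain `java.net.URI` (Hadoop's `Path` wraps a `URI`, so the effect carries over). This is a hedged sketch of the two styles, not the actual Hadoop methods: the 2.7-style code builds the authority by hand and drops a non-positive port, while the 2.9-style code passes the port straight into the multi-argument `URI` constructor, where 0 is a legal port number and is kept.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrates why an unset port (0) surfaces as ":0" in YARN 2.9 but
// disappears in 2.7-style path handling. Method names are illustrative.
public class YarnUrlPortDemo {
    // 2.7-style: append the port to the authority only when it is positive.
    static URI pathStyle27(String scheme, String host, int port, String file)
            throws URISyntaxException {
        String authority = host;
        if (port > 0) {
            authority += ":" + port;
        }
        return new URI(scheme, authority, file, null, null).normalize();
    }

    // 2.9-style: hand the port to the 7-argument URI constructor as-is;
    // only -1 means "no port", so 0 is rendered into the authority.
    static URI pathStyle29(String scheme, String host, int port, String file)
            throws URISyntaxException {
        return new URI(scheme, null, host, port, file, null, null);
    }
}
```

Calling `pathStyle27("hdfs", "hostname", 0, "/path")` keeps the clean `hdfs://hostname/path`, while `pathStyle29` with the same arguments yields `hdfs://hostname:0/path`, which is exactly the behavior change the issue reports.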
[jira] [Updated] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
[ https://issues.apache.org/jira/browse/YARN-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-8690:
------------------------------
    Description:

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path like hdfs://hostname/path for local resource allocation, because it will be resolved to hdfs://hostname:0/path. We have to add the port 443 to the path, like hdfs://hostname:443/path, to make it work. It isn't a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]

Handling of the resource location path in 2.7:

    public static Path getPathFromYarnURL(URL url) throws URISyntaxException {
      String scheme = url.getScheme() == null ? "" : url.getScheme();
      String authority = "";
      if (url.getHost() != null) {
        authority = url.getHost();
        if (url.getUserInfo() != null) {
          authority = url.getUserInfo() + "@" + authority;
        }
        if (url.getPort() > 0) {
          authority += ":" + url.getPort();
        }
      }
      return new Path(
          (new URI(scheme, authority, url.getFile(), null, null)).normalize());
    }

Handling of the resource location logic in 2.9:

    public Path toPath() throws URISyntaxException {
      return new Path(new URI(getScheme(), getUserInfo(),
          getHost(), getPort(), getFile(), null, null));
    }

> Currently path not consistent in LocalResourceRequest to yarn 2.7
> -----------------------------------------------------------------
>
>                 Key: YARN-8690
>                 URL: https://issues.apache.org/jira/browse/YARN-8690
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.1
>            Reporter: jialei weng
>            Assignee: Wangda Tan
>            Priority: Major
>
> With the YARN-1953 change, in YARN 2.9.1 we cannot use a path like hdfs://hostname/path for local resource allocation, because it will be resolved to hdfs://hostname:0/path. We have to add the port 443 to the path, like hdfs://hostname:443/path, to make it work. It isn't a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]
[jira] [Created] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
jialei weng created YARN-8690:
------------------------------

             Summary: Currently path not consistent in LocalResourceRequest to yarn 2.7
                 Key: YARN-8690
                 URL: https://issues.apache.org/jira/browse/YARN-8690
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.9.1
            Reporter: jialei weng
            Assignee: Wangda Tan

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path like hdfs://hostname/path for local resource allocation, because it will be resolved to hdfs://hostname:0/path. We have to add the port 443 to the path, like hdfs://hostname:443/path, to make it work. It isn't a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]
[jira] [Comment Edited] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005920#comment-16005920 ]

jialei weng edited comment on YARN-6266 at 5/11/17 5:46 AM:
------------------------------------------------------------

Not the same as anti-affinity scheduling; this just brings one way to manage ports as a resource in YARN.

was (Author: wjlei):
Not the same as anti-affinity scheduling; this just brings one way to manage ports as a resource.

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>        Attachments: YARN-6266.001.patch
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Commented] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005920#comment-16005920 ]

jialei weng commented on YARN-6266:
-----------------------------------

Not the same as anti-affinity scheduling; this just brings one way to manage ports as a resource.

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>        Attachments: YARN-6266.001.patch
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Assigned] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng reassigned YARN-6266:
---------------------------------
    Assignee: jialei weng

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>            Assignee: jialei weng
>        Attachments: YARN-6266.001.patch
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Assigned] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng reassigned YARN-6266:
---------------------------------
    Assignee: (was: jialei weng)

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>        Attachments: YARN-6266.001.patch
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Updated] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jialei weng updated YARN-6266:
------------------------------
    Attachment: YARN-6266.001.patch

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>        Attachments: YARN-6266.001.patch
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Commented] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002379#comment-16002379 ]

jialei weng commented on YARN-5606:
-----------------------------------

Agree with Bin. [~Naganarasimha], can you consider Bin's suggestion?

> Support multi-label merge into one node label
> ---------------------------------------------
>
>                 Key: YARN-5606
>                 URL: https://issues.apache.org/jira/browse/YARN-5606
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>            Reporter: jialei weng
>              Labels: oct16-hard
>        Attachments: YARN-5606.001.patch, YARN-5606.002.patch
>
> Support merging multiple labels into one node label.
> 1. We want to support multiple labels like SSD, GPU, FPGA merged into a single machine label, joined by &, e.g. SSD, GPU, FPGA -> SSD&GPU&FPGA.
> 2. We add wildcard matching to extend the job request. We define a wildcard like *GPU*; it will match all node labels that have GPU as part of the multi-label merged label. For example, *GPU* will match SSD&GPU&FPGA: we define SSD&GPU&FPGA = {SSD, GPU, FPGA}, and GPU is one of {SSD, GPU, FPGA}, so the job can run on an SSD&GPU&FPGA node.
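The wildcard matching described in the issue could work roughly like the sketch below: split the merged node label on `&` and treat a `*GPU*` request as "GPU is one of the components". This is only an illustration under that reading of the description; the class name and exact matching rules are assumptions, not taken from the patches.

```java
import java.util.Arrays;

// Illustrative matcher for merged node labels: "*GPU*" matches any
// &-joined label whose components include GPU. Names are hypothetical.
public class MergedLabelMatcher {
    static boolean matches(String request, String nodeLabel) {
        if (request.startsWith("*") && request.endsWith("*")
                && request.length() > 2) {
            // Wildcard request: strip the asterisks and look for the
            // wanted label among the merged label's components.
            String wanted = request.substring(1, request.length() - 1);
            return Arrays.asList(nodeLabel.split("&")).contains(wanted);
        }
        // No wildcard: require an exact label match.
        return request.equals(nodeLabel);
    }
}
```

Under this sketch a job requesting `*GPU*` can land on a node labeled `SSD&GPU&FPGA`, while a plain `GPU` request would not, which matches the example given in the description.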
[jira] [Comment Edited] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893891#comment-15893891 ]

jialei weng edited comment on YARN-6266 at 3/3/17 8:11 AM:
-----------------------------------------------------------

Hi [~grey], can you give more detailed information about anti-affinity scheduling? Currently we are using version 2.7.0; in which version is it published?

was (Author: wjlei):
Hi [~grey], can you give more detailed information about anti-affinity scheduling? Currently we are using version 2.7.0; in which version is it published?

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Comment Edited] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893891#comment-15893891 ]

jialei weng edited comment on YARN-6266 at 3/3/17 8:10 AM:
-----------------------------------------------------------

Hi [~grey], can you give more detailed information about anti-affinity scheduling? Currently we are using version 2.7.0; in which version is it published?

was (Author: wjlei):
[~grey] Hi Lei, can you give more detailed information about anti-affinity scheduling? Currently we are using version 2.7.0; in which version is it published?

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Commented] (YARN-6266) Extend the resource class to support ports management
[ https://issues.apache.org/jira/browse/YARN-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893891#comment-15893891 ]

jialei weng commented on YARN-6266:
-----------------------------------

[~grey] Hi Lei, can you give more detailed information about anti-affinity scheduling? Currently we are using version 2.7.0; in which version is it published?

> Extend the resource class to support ports management
> -----------------------------------------------------
>
>                 Key: YARN-6266
>                 URL: https://issues.apache.org/jira/browse/YARN-6266
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: jialei weng
>
> Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
[jira] [Created] (YARN-6266) Extend the resource class to support ports management
jialei weng created YARN-6266:
------------------------------

             Summary: Extend the resource class to support ports management
                 Key: YARN-6266
                 URL: https://issues.apache.org/jira/browse/YARN-6266
             Project: Hadoop YARN
          Issue Type: New Feature
            Reporter: jialei weng

Just like vcores and memory, ports are an important resource for jobs to allocate. We should add ports-management logic to YARN. It can support allocating two jobs (with the same port requirement) to different machines.
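The placement property asked for here can be sketched as follows: treat a node's in-use ports as part of its resource state, and place a job requesting a fixed port only on a node where that port is still free. This is a minimal illustrative model of the idea, not the YARN-6266 patch; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of ports as a schedulable resource: two jobs that
// require the same port end up on different machines.
public class PortAwarePlacementSketch {
    private final Map<String, Set<Integer>> usedPorts = new HashMap<>();

    public void addNode(String node) {
        usedPorts.putIfAbsent(node, new HashSet<>());
    }

    // Returns the chosen node and marks the port as used there, or
    // returns null if every node already has the port taken.
    public String place(int requestedPort) {
        for (Map.Entry<String, Set<Integer>> e : usedPorts.entrySet()) {
            if (!e.getValue().contains(requestedPort)) {
                e.getValue().add(requestedPort);
                return e.getKey();
            }
        }
        return null;
    }
}
```

With two nodes registered, two placements for port 8080 land on distinct nodes, and a third request for 8080 cannot be satisfied until a node frees the port, which is the scheduling behavior the issue motivates.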
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475752#comment-15475752 ] jialei weng commented on YARN-4948: Thanks, Wangda, I get your point.
> Support node labels store in zookeeper
> ---
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: jialei weng
> Assignee: jialei weng
> Attachments: YARN-4948.001.patch, YARN-4948.002.patch, YARN-4948.003.patch, YARN-4948.006.patch, YARN-4948.007.patch
>
> Support storing node labels in ZooKeeper. The main scenario is to decouple YARN from HDFS: node labels are critical data for YARN, so if HDFS goes down, YARN fails to start as well. Storing them in ZooKeeper makes YARN more independent for users who run both YARN and HDFS.
[jira] [Commented] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470038#comment-15470038 ] jialei weng commented on YARN-5606: Yes; the current node-label design divides the cluster into several sub-clusters. Once the Constraint Label feature is ready, it will help with this, so this is a temporary solution for the current situation. Welcome to update this.
> Support multi-label merge into one node label
> ---
>
> Key: YARN-5606
> URL: https://issues.apache.org/jira/browse/YARN-5606
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: jialei weng
> Attachments: YARN-5606.001.patch, YARN-5606.002.patch
>
> Support merging multiple labels into one node label:
> 1. We want to merge multiple labels such as SSD, GPU and FPGA into a single node label, joined by &.
> 2. We add wildcard matching to extend the job request: a wildcard such as *GPU* matches every merged node label that contains GPU as a component, so a job requesting *GPU* can run on a node whose merged label was built from {SSD, GPU, FPGA}.
[jira] [Issue Comment Deleted] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-5606: Comment: was deleted (was: yes, as current node label will divide the current cluster into several sub-clusters. If Constrain Label feature ready, it will help on this. So it is a temporary solution to solve the current situation. Welcome to update this.)
[jira] [Commented] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470037#comment-15470037 ] jialei weng commented on YARN-5606: Yes; the current node-label design divides the cluster into several sub-clusters. Once the Constraint Label feature is ready, it will help with this, so this is a temporary solution for the current situation. Welcome to update this.
[jira] [Updated] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-5606: Attachment: YARN-5606.002.patch
[jira] [Updated] (YARN-5606) Support multi-label merge into one node label
[ https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-5606: Attachment: YARN-5606.001.patch
[jira] [Created] (YARN-5606) Support multi-label merge into one node label
jialei weng created YARN-5606:
Summary: Support multi-label merge into one node label
Key: YARN-5606
URL: https://issues.apache.org/jira/browse/YARN-5606
Project: Hadoop YARN
Issue Type: New Feature
Reporter: jialei weng
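The merge-and-wildcard scheme of YARN-5606 — several labels joined by `&` into one node label, with a request wildcard such as `*GPU*` matching any merged label that contains GPU — can be sketched as below. This is an illustrative sketch, not the patch itself; the function names and the sorted join order are assumptions.

```python
import fnmatch

def merge_labels(labels):
    """Merge multiple node labels into one, joined by '&' (sorted for a stable name)."""
    return "&".join(sorted(labels))

def matches(request, merged_label):
    """A wildcard request such as '*GPU*' matches any merged label
    containing GPU as a component (case-sensitive glob match)."""
    return fnmatch.fnmatchcase(merged_label, request)

merged = merge_labels({"SSD", "GPU", "FPGA"})
print(merged)                     # FPGA&GPU&SSD
print(matches("*GPU*", merged))   # True: a job requesting *GPU* can run here
print(matches("*TPU*", merged))   # False: no TPU component in the merged label
```

In this sketch a single merged label stands in for the machine's label set, which matches the JIRA's intent of keeping one node label per machine while still letting requests target individual components.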
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395812#comment-15395812 ] jialei weng commented on YARN-4948: Hi [~Naganarasimha], do you have any update on this, or should I add some documentation?
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369090#comment-15369090 ] jialei weng commented on YARN-4948: Sure, [~Naganarasimha], just ask me if anything is confusing. The change targets cloud use cases, and ZooKeeper is a good tool for failover and high reliability.
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.007.patch
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.006.patch
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367065#comment-15367065 ] jialei weng commented on YARN-4948: Thanks [~vinodkv], I will follow the rule.
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: (was: YARN-4948.005.patch)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Description: Support storing node labels in ZooKeeper. The main scenario is to decouple YARN from HDFS: node labels are critical data for YARN, so if HDFS goes down, YARN fails to start as well. Storing them in ZooKeeper makes YARN more independent for users who run both YARN and HDFS. (was: Support node labels store in zookeeper)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.005.patch
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: (was: YARN-4948.004.patch)
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366158#comment-15366158 ] jialei weng commented on YARN-4948: Hi [~Naganarasimha], thanks for the reply. The main scenario is to decouple YARN from HDFS: node labels are critical data for YARN, so if HDFS goes down, YARN fails to start as well. I think this change makes YARN more independent for users who run both YARN and HDFS. I found a way to get beyond the 1 MB znode storage limitation. In the unit tests it supports 10 NodeManagers, and recovering the node labels for all 10 nodes took 4 seconds, so it is efficient and reliable.
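The comment above mentions a way to get beyond ZooKeeper's default 1 MB znode limit. Since the patch itself is not shown here, the sketch below is an assumption about one common approach: split the serialized node-label blob across numbered child znodes and reassemble them on recovery. A plain dict stands in for the ZooKeeper client, and the paths are hypothetical.

```python
# Sketch (assumption, not the actual patch): chunk a serialized blob so each
# piece stays under ZooKeeper's default 1 MB znode limit, store each chunk in
# its own numbered child znode, and concatenate the chunks on recovery.

CHUNK_SIZE = 1000 * 1000 - 1024  # stay safely under the 1 MB default

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def store(zk_store: dict, base_path: str, data: bytes):
    """Write each chunk under a numbered child path, e.g. /labels/chunk-0000."""
    for i, chunk in enumerate(split_into_chunks(data)):
        zk_store[f"{base_path}/chunk-{i:04d}"] = chunk

def recover(zk_store: dict, base_path: str) -> bytes:
    """Read the chunks back in order and reassemble the original blob."""
    paths = sorted(p for p in zk_store if p.startswith(base_path + "/chunk-"))
    return b"".join(zk_store[p] for p in paths)

zk = {}                            # stand-in for a ZooKeeper client
blob = b"x" * (3 * 1000 * 1000)    # ~3 MB of serialized node-label data
store(zk, "/yarn/node-labels", blob)
assert all(len(c) <= CHUNK_SIZE for c in zk.values())
assert recover(zk, "/yarn/node-labels") == blob
```

With a real ZooKeeper client the same shape applies: each chunk write goes to a child znode, and recovery lists the children, sorts them, and concatenates their data.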
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.004.patch
[jira] [Comment Edited] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365648#comment-15365648 ] jialei weng edited comment on YARN-4948 at 7/7/16 9:06 AM:
Hi [~leftnoteasy] and [~Naganarasimha], what is the progress on bringing this patch into trunk? I mean getting trunk to take this change.
was (Author: wjlei): Hi, [~leftnoteasy] adn [~Naganarasimha], what is the progress to bring this patch to trunk? i mean make the trunk take this change.
[jira] [Reopened] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng reopened YARN-4948: need to get trunk to take this change
[jira] [Comment Edited] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365648#comment-15365648 ] jialei weng edited comment on YARN-4948 at 7/7/16 5:48 AM:
Hi, [~leftnoteasy] adn [~Naganarasimha], what is the progress to bring this patch to trunk? i mean make the trunk take this change.
was (Author: wjlei): Hi, [~leftnoteasy] adn [~Naganarasimha] what is the progress to bring this patch to trunk? i mean make the trunk take this change.
[jira] [Comment Edited] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365648#comment-15365648 ] jialei weng edited comment on YARN-4948 at 7/7/16 5:48 AM:
Hi, [~leftnoteasy] adn [~Naganarasimha] what is the progress to bring this patch to trunk? i mean make the trunk take this change.
was (Author: wjlei): Hi, [~leftnoteasy], what is the progress to bring this patch to trunk? i mean make the trunk take this change.
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365648#comment-15365648 ] jialei weng commented on YARN-4948: Hi, [~leftnoteasy], what is the progress to bring this patch to trunk? i mean make the trunk take this change.
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.003.patch
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242296#comment-15242296 ] jialei weng commented on YARN-4948: Could you tell me how to trigger the QA run, or help me trigger it? Thanks.
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.002.patch
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240883#comment-15240883 ] jialei weng commented on YARN-4948: How can I trigger Hadoop QA to run and test my patch again?
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.001.patch
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: (was: YARN-4948.001.patch)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: YARN-4948.001.patch
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: (was: YARN-4948.001.patch)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: Attachment: (was: YARN-4948-branch-2.7.0.001.patch)
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238763#comment-15238763 ] jialei weng commented on YARN-4948: Ok, I will.
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: YARN-4948-branch-2.7.0.001.patch > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > Attachments: YARN-4948-branch-2.7.0.001.patch, YARN-4948.001.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: YARN-4948.001.patch > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > Attachments: YARN-4948.001.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: (was: Node-labels-store-in-zookeeper.patch) > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: (was: Node-labels-store-in-zookeeper-2.7.0.patch) > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > Attachments: Node-labels-store-in-zookeeper.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: Node-labels-store-in-zookeeper.patch > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > Attachments: Node-labels-store-in-zookeeper.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4948: -- Attachment: Node-labels-store-in-zookeeper-2.7.0.patch > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: jialei weng > Attachments: Node-labels-store-in-zookeeper-2.7.0.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4948) Support node labels store in zookeeper
jialei weng created YARN-4948: - Summary: Support node labels store in zookeeper Key: YARN-4948 URL: https://issues.apache.org/jira/browse/YARN-4948 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 2.7.0 Reporter: jialei weng Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037226#comment-15037226 ] jialei weng commented on YARN-4361: --- Yes, I checked the patch; it also fixes the issue. Thanks. > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > Attachments: YARN-4361v1.patch > > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Sometimes the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng resolved YARN-4361. --- Resolution: Duplicate > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > Attachments: YARN-4361v1.patch > > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Sometimes the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
jialei weng created YARN-4361: - Summary: Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model Key: YARN-4361 URL: https://issues.apache.org/jira/browse/YARN-4361 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: jialei weng Total resource count mistake: NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the scheduler handle events on different dispatcher queues, the remove-update-add operation is not guaranteed to execute in sequence. Usually the total resource is reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4361: -- Attachment: 0001-Fix-Total-resource-count-mistake-NodeRemovedSchedule.patch An appropriate way to handle this issue is to simply remove the 'if' logic. > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Usually the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4361: -- Attachment: (was: 0001-Fix-Total-resource-count-mistake-NodeRemovedSchedule.patch) > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Usually the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4361: -- Attachment: YARN-4361v1.patch An appropriate way to solve the issue is to simply remove the 'if' logic. > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > Attachments: YARN-4361v1.patch > > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Usually the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4361) Total resource count mistake:NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model
[ https://issues.apache.org/jira/browse/YARN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-4361: -- Description: Total resource count mistake: NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the scheduler handle events on different dispatcher queues, the remove-update-add operation is not guaranteed to execute in sequence. Sometimes the total resource is reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. was: Total resource count mistake: NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the scheduler handle events on different dispatcher queues, the remove-update-add operation is not guaranteed to execute in sequence. Usually the total resource is reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. > Total resource count mistake:NodeRemovedSchedulerEvent in > ReconnectNodeTransition will reduce the newNode.getTotalCapability() in > Multi-thread model > > > Key: YARN-4361 > URL: https://issues.apache.org/jira/browse/YARN-4361 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.2 >Reporter: jialei weng > Labels: patch > Attachments: YARN-4361v1.patch > > > Total resource count mistake: > NodeRemovedSchedulerEvent in ReconnectNodeTransition will reduce the > newNode.getTotalCapability() in Multi-thread model. Since the RMNode and the > scheduler handle events on different dispatcher queues, the remove-update-add > operation is not guaranteed to execute in sequence. Sometimes the total resource is > reduced by newNode.getTotalCapability() when handling NodeRemovedSchedulerEvent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
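The ordering bug YARN-4361 describes can be illustrated with a minimal, self-contained sketch. The class and method names below are purely illustrative, not the actual YARN ResourceManager code: the point is only that if the node object's capability field is mutated to the reconnected value before the asynchronous scheduler dispatcher processes the NodeRemovedSchedulerEvent, the remove handler subtracts the new capability instead of the old one, permanently under-counting the cluster total.

```java
// Hypothetical sketch of the YARN-4361 race (illustrative names, not real
// YARN classes). The "scheduler" tracks cluster-total memory and handles
// node add/remove events; events are dispatched asynchronously in YARN.
class ReconnectRaceSketch {
    static class Scheduler {
        int totalMemoryMB = 0;
        void nodeAdded(int capabilityMB)   { totalMemoryMB += capabilityMB; }
        void nodeRemoved(int capabilityMB) { totalMemoryMB -= capabilityMB; }
    }

    public static void main(String[] args) {
        int oldCapability = 8192;   // node registered with 8 GB
        int newCapability = 16384;  // node reconnects reporting 16 GB

        // Intended order on reconnect: remove old capability, add new one.
        Scheduler correct = new Scheduler();
        correct.nodeAdded(oldCapability);
        correct.nodeRemoved(oldCapability);  // subtract the OLD 8 GB
        correct.nodeAdded(newCapability);    // add the new 16 GB
        System.out.println("expected total: " + correct.totalMemoryMB); // 16384

        // Buggy interleaving: the node's capability field was already updated
        // to the new value before the remove event was handled, so the handler
        // reads newNode.getTotalCapability() and subtracts 16 GB instead of 8.
        Scheduler buggy = new Scheduler();
        buggy.nodeAdded(oldCapability);
        buggy.nodeRemoved(newCapability);    // subtracts 16 GB, not 8 GB
        buggy.nodeAdded(newCapability);
        System.out.println("buggy total: " + buggy.totalMemoryMB); // 8192
    }
}
```

In the buggy interleaving the cluster total ends at 8 GB instead of 16 GB, i.e. it is reduced by exactly newNode.getTotalCapability() minus the old capability, which matches the symptom reported above.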