[jira] [Commented] (YARN-1609) Add Service Container type to NodeManager in YARN
[ https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934675#comment-13934675 ]

Wangda Tan commented on YARN-1609:
----------------------------------

I'm leaving my current company next week and am no longer involved in YARN-1609. One of my colleagues will take over this JIRA.

Add Service Container type to NodeManager in YARN
-------------------------------------------------

Key: YARN-1609
URL: https://issues.apache.org/jira/browse/YARN-1609
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Affects Versions: 2.2.0
Reporter: Wangda Tan
Attachments: Add Service Container type to NodeManager in YARN-V1.pdf

From our work to support running OpenMPI on YARN (MAPREDUCE-2911), we found that it’s important to have a framework-specific daemon process manage the tasks on each node directly. The daemon process, most likely similar in other frameworks as well, provides critical services to tasks running on that node (for example “wireup”, spawning user processes in large numbers at once, etc.). In YARN, it’s hard, if not impossible, to have those processes managed by YARN.

We propose to extend the container model on the NodeManager side to support a “Service Container” to run and manage such framework daemon/service processes. We believe this would be very useful to other application framework developers as well.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1609) Add Service Container type to NodeManager in YARN
[ https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875625#comment-13875625 ]

Michael Lv commented on YARN-1609:
----------------------------------

Hi [~hitesh], thanks for the detailed comments. The essence of the proposal is to let a server process be managed by the NodeManager while still allowing the AM/RM to control and monitor resource consumption via the existing AM-NM and AM-RM protocols, with minimal backwards-compatible extensions. In many frameworks (e.g. OMPI/ORTE), the server process running on each node manages its child processes internally, and that's hard to change.

From a performance/scalability perspective, since it's scenario dependent IMO, I'll comment on your questions for the Hamster case:

{quote}Is the NM flow too slow for launching a new container or killing one?{quote}
Today there is no difference between containers: they all go through the same life cycle and are independent of each other. It's more like running many tasks directed by the AM (the MRv2 framework comes to mind). If the task containers could be managed by server containers (the slave daemon in Hamster, or the TaskTracker in MR1?), the server daemon could launch processes (in batch or on demand) directly and faster. So yes, in the Hamster case it can be faster if we can bypass some steps (e.g. container localization), but not by much, since the per-node scale is small. We'll have more data points as this patch is implemented further in our work.

{quote}Is it difficult to monitor the status of a container?{quote}
Yes, especially when the server container wants to control the lifecycle of its child processes and sync on child process state. We are using shared storage for that purpose for now (a workaround; for a larger number of processes it can be a problem).

{quote}Can the NM not handle launching/managing a large no. of containers on a single machine?{quote}
Not really.
[jira] [Commented] (YARN-1609) Add Service Container type to NodeManager in YARN
[ https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875418#comment-13875418 ]

Hitesh Shah commented on YARN-1609:
-----------------------------------

[~lvm] I think I understand what the goals are, though I am still unclear on the specifics of what the real problems were when trying to launch processes directly via the NM, as compared to using the service container.
- Is the NM flow too slow for launching a new container or killing one?
- Is it difficult to monitor the status of a container?
- Can the NM not handle launching/managing a large no. of containers on a single machine?

It would also be helpful to know what kind of workarounds/hacks you needed, so as to understand the underlying problems that you faced.
[jira] [Commented] (YARN-1609) Add Service Container type to NodeManager in YARN
[ https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873523#comment-13873523 ]

Hitesh Shah commented on YARN-1609:
-----------------------------------

bq. The daemon process, most likely similar in other frameworks as well, provides critical services to tasks running on that node (for example “wireup”, spawning user processes in large numbers at once, etc.). In YARN, it’s hard, if not impossible, to have those processes managed by YARN.

Could you add more details about what is missing or hard to do, and why it cannot be done in a performant manner in YARN today? Also, any other analysis or perf benchmarks that you have done to make the case for the above would be useful to look at.
[jira] [Commented] (YARN-1609) Add Service Container type to NodeManager in YARN
[ https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874338#comment-13874338 ]

Michael Lv commented on YARN-1609:
----------------------------------

Hi [~hitesh], what is missing IMO is that YARN, specifically the NodeManager, currently has no good way to let a userland daemon manage its own processes while those processes, daemon or not, are also managed by the ContainerManager. The key here is to allow a daemon service container to own the lifecycle of the tasks/processes it spawns and monitors.

Those daemons or node-level services could be implemented as auxiliary services, but then they would be outside of YARN's resource allocation and execution/tracking. If those daemon processes can be supported in the NM as first-class citizens, much of today's workaround/hacky code won't be needed.