[ https://issues.apache.org/jira/browse/YARN-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205252#comment-16205252 ]
Allen Wittenauer edited comment on YARN-7127 at 10/15/17 7:10 PM:
------------------------------------------------------------------

I thought some more about this topic this morning and had two more things to add:

1) I think an AM should have a way to tell the RM about any extra capabilities it might have. This feature isn't particularly useful for the RM, but it would be beneficial for any clients. For example, the MR AM might tag itself as "jobtracker" to note that it supports the extra features that the 'mapred' command uses. A Slider AM might tag itself as 'slider' or 'native' or whatever to signify that it supports those extensions, and so on. That would make extending the yarn application subcommand MUCH easier and potentially even open the door for extensions/plug-ins to that command from third parties. For example, turning the extra mapred subcommands into a hook off of yarn application would allow us to ultimately kill the mapred command once the timeline server is capable of doing everything that the history server can.

2) A large part of the discussion here is fueled by contradicting views on this project's place within Hadoop. If one takes the belief that it's "just another framework, like MapReduce," then creating separate sub-commands, documentation, daemons, etc. seems logical. If one takes the view that it's "part of YARN," then adding new sub-commands, a separate documentation section, and a ton of new daemons does not make sense.

But it doesn't appear that either of those choices has been made. Portions of the code base are in the separate framework type of mold, but other changes are to core YARN functionality, even if we push aside "obviously part of YARN" bits like RegistryDNS. It seems as though the folks working on this branch need to make that decision and drive it to completion: is it part of YARN or is it not?

If it's the former, then that means full integration: no more separate API daemon, no different subcommand structure, etc., etc.
If it's the latter, then that means total separation: it needs to be a separate subproject, no shared code base, new top-level command, etc., etc.

Having a foot in both is what is ultimately driving this disagreement and will eventually confuse users.

> Merge yarn-native-service branch into trunk
> -------------------------------------------
>
>                 Key: YARN-7127
>                 URL: https://issues.apache.org/jira/browse/YARN-7127
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-7127.01.patch, YARN-7127.02.patch, YARN-7127.03.patch, YARN-7127.04.patch
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)