[ 
https://issues.apache.org/jira/browse/YARN-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205252#comment-16205252
 ] 

Allen Wittenauer edited comment on YARN-7127 at 10/15/17 7:10 PM:
------------------------------------------------------------------

I thought some more about this topic this morning and had two more things to 
add:

1) I think an AM should have a way to tell the RM about any extra capabilities 
it might have.  This feature isn't particularly useful for the RM, but it would 
be beneficial for any clients.  For example, the MR AM might tag itself as 
"jobtracker" to note that it supports the extra features that the 'mapred' 
command uses.  A Slider AM might tag itself as 'slider' or 'native' or whatever 
to signify that it supports those extensions, and so on.  That would make 
extending the yarn application subcommand MUCH easier and potentially even open 
the door for extensions/plug-ins to that command from third parties. For 
example, turning the extra mapred subcommands into a hook off of yarn 
application would allow us to ultimately kill the mapred command once the 
timeline server is capable of doing everything that the history server can.

2) A large part of the discussion here is fueled by conflicting views on this 
project's place within Hadoop.  If one believes that it's "just another 
framework, like MapReduce," then creating separate sub-commands, documentation, 
daemons, etc. seems logical.   If one takes the view that it's "part of YARN," 
then adding new sub-commands, a separate documentation section, and a ton of 
new daemons does not make sense.

But it doesn't appear that either of those choices has been made. Portions of 
the code base fit the separate-framework mold, but other changes are 
to core YARN functionality, even if we push aside "obviously part of YARN" bits 
like RegistryDNS.

It seems as though the folks working on this branch need to make that decision 
and drive it to completion:  is it part of YARN or is it not?  If it's the 
former, then that means full integration: no more separate API daemon, no 
different subcommand structure, etc., etc.  If it's the latter, then that means 
total separation: it needs to be a separate subproject, no shared code base, 
new top-level command, etc., etc.

Having a foot in both is what is ultimately driving this disagreement and will 
eventually confuse users.  
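
To make point 1 concrete, here is a minimal sketch of capability tagging. 
Nothing in it exists in YARN today: CapabilityRegistry, AppSubcommands, the 
method names, and the "app-1" style ids are all hypothetical, and the 
in-memory map only stands in for whatever the RM would actually store when an 
AM registers.  It shows an AM advertising tags and a client-side subcommand 
hook that runs only when the target application's AM advertises the tag the 
hook needs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch: none of these types are part of YARN.
final class CapabilityRegistry {
    // Application id -> capability tags its AM advertised.
    private final Map<String, Set<String>> tagsByApp = new HashMap<>();

    // Stand-in for the AM telling the RM which extensions it supports.
    void register(String appId, String... tags) {
        Set<String> set = new TreeSet<>();
        Collections.addAll(set, tags);
        tagsByApp.put(appId, set);
    }

    // Client-side check: does this application's AM advertise the tag?
    boolean supports(String appId, String tag) {
        return tagsByApp.getOrDefault(appId, Collections.emptySet()).contains(tag);
    }
}

// Hypothetical plug-in table for an extensible "application" subcommand.
final class AppSubcommands {
    private final CapabilityRegistry registry;
    // Required capability tag -> handler that takes an application id.
    private final Map<String, Function<String, String>> plugins = new HashMap<>();

    AppSubcommands(CapabilityRegistry registry) {
        this.registry = registry;
    }

    void addPlugin(String requiredTag, Function<String, String> handler) {
        plugins.put(requiredTag, handler);
    }

    // Runs the plug-in only if one is installed for the tag and the
    // target application's AM advertises that tag.
    String run(String requiredTag, String appId) {
        Function<String, String> handler = plugins.get(requiredTag);
        if (handler == null || !registry.supports(appId, requiredTag)) {
            return "unsupported";
        }
        return handler.apply(appId);
    }
}

final class CapabilityDemo {
    public static void main(String[] args) {
        CapabilityRegistry registry = new CapabilityRegistry();
        registry.register("app-1", "jobtracker"); // e.g. an MR AM
        registry.register("app-2", "slider");     // e.g. a Slider AM

        AppSubcommands commands = new AppSubcommands(registry);
        commands.addPlugin("jobtracker", id -> "job history for " + id);

        System.out.println(commands.run("jobtracker", "app-1"));
        System.out.println(commands.run("jobtracker", "app-2"));
    }
}
```

In this sketch the RM never interprets the tags; it only stores them and hands 
them back, which matches the point that the feature mainly benefits clients.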


> Merge yarn-native-service branch into trunk
> -------------------------------------------
>
>                 Key: YARN-7127
>                 URL: https://issues.apache.org/jira/browse/YARN-7127
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-7127.01.patch, YARN-7127.02.patch, 
> YARN-7127.03.patch, YARN-7127.04.patch
>
>



