[ 
https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217705#comment-16217705
 ] 

Allen Wittenauer edited comment on HADOOP-14976 at 10/24/17 9:10 PM:
---------------------------------------------------------------------

bq. since the calling script always knows what is necessary? 

I'd need to be convinced this is true.  A lot of the work done in the shell 
script rewrite and follow-on work was to make the "front end" scripts as dumb 
as possible in order to centralize the program logic.  This gave huge benefits 
in the form of script consistency, testing, and more.

Besides, EXECNAME is used for *very* specific things, e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported. 
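
For context, the execname those scripts report comes from a couple of bash 
parameter expansions (a minimal sketch of the pattern shown in the issue 
description, not the full bin script):

{code}
# Sketch: how the bin scripts derive HADOOP_SHELL_EXECNAME.
MYNAME="${BASH_SOURCE-$0}"              # e.g. /usr/local/bin/hdfs
HADOOP_SHELL_EXECNAME="${MYNAME##*/}"   # strip the path -> "hdfs"

# Downstream code can then report exactly what the user invoked:
echo "Usage: ${HADOOP_SHELL_EXECNAME} [OPTIONS] SUBCOMMAND"
{code}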

... and that's even before third-party add-ons that rely on 
HADOOP_SHELL_EXECNAME behaving as it does today.


If distributions really are renaming the scripts (which is extremely 
problematic for lots of reasons), there isn't much of a reason they couldn't 
just tuck them away in a non-PATH directory and use the same names or even just 
rewrite the scripts directly.  (See above about removing as much logic as 
possible.)
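
A hedged sketch of that alternative (the paths here are hypothetical): the 
renamed entry point is just a thin wrapper, while the unmodified script keeps 
its standard name in a non-PATH directory, so its own 
{{HADOOP_SHELL_EXECNAME}} derivation still yields "hdfs":

{code}
#!/usr/bin/env bash
# Hypothetical wrapper installed as, say, /usr/bin/vendor-hdfs.
# The real script lives under a private, non-PATH directory with its
# standard name, so its "${BASH_SOURCE-$0}" logic is unaffected.
exec /opt/vendor/hadoop/bin/hdfs "$@"
{code}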

I've had in my head a "vendor" version of hadoop-user-functions.sh, but I'm not 
sure if even that would help here.  It really depends upon why the bin 
scripts are getting renamed, and whether the problem being solved is actually 
more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune from mucking with HADOOP_SHELL_EXECNAME, 
though.



> Allow overriding HADOOP_SHELL_EXECNAME
> --------------------------------------
>
>                 Key: HADOOP-14976
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14976
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Arpit Agarwal
>
> Some Hadoop shell scripts infer their own name using this bit of shell magic:
> {code}
>  18     MYNAME="${BASH_SOURCE-$0}"
>  19     HADOOP_SHELL_EXECNAME="${MYNAME##*/}"
> {code}
> e.g. see the 
> [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18]
>  script.
> The inferred shell script name is later passed to _hadoop-functions.sh_ which 
> uses it to construct the names of some environment variables. E.g. when 
> invoking _hdfs datanode_, the options variable name is inferred as follows:
> {code}
> # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS
> {code}
> This works well if the calling script name is the standard {{hdfs}} or 
> {{yarn}}. If a distribution renames the script to something like foo.bar, 
> then the variable name will be inferred as {{FOO.BAR_DATANODE_OPTS}}. This 
> is not a valid bash variable name.
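> A simplified sketch of the failure (the upper-casing below stands in for 
> what _hadoop-functions.sh_ does internally; it is not its exact code):
> {code}
> # Simplified: the execname is upper-cased and glued into a variable name.
> execname="foo.bar"
> uvar=$(echo "${execname}" | tr '[:lower:]' '[:upper:]')
> varname="${uvar}_DATANODE_OPTS"   # -> FOO.BAR_DATANODE_OPTS
> # Bash identifiers may only contain [A-Za-z0-9_], so indirection such
> # as "${!varname}" or declare "${varname}=..." fails on the ".".
> {code}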



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
