[ https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184540#comment-13184540 ]
Peter Linnell commented on HADOOP-7939:
---------------------------------------

This first patch looks good and, putting my packager's hat on, it is a good first step toward adding some sanity. While I am new to Apache and Hadoop, FWIW, I've been building RPMs for years and I maintain (or have maintained) a number of packages and repos for openSUSE and Fedora.

My temptation is to give this a strong +1, with a couple of comments:

Can you extend the comment in line 513 to add: 'Once disabled, you might need to reboot your machine'? This is surely the case with SLES/openSUSE.

I do not understand why libexec is getting moved or even used. I realize it's not your fault, but I am not a fan of doing that:
1. It is not consistent across distros.
2. There is a move afoot to move all binaries to /usr, see: https://fedoraproject.org/wiki/Features/UsrMove
openSUSE 12.2 is currently planned to do this as well.

As for https://issues.apache.org/jira/browse/HADOOP-6255, well, it's sub-optimal. That surely would never pass the smell test to get into a major Linux distro as is. I might get ambitious and provide a patch, if it would be accepted. I do not see "integration" in the base OS, as so many standard macros and locations are overridden by unneeded defines.

I do not see HADOOP-4868 as relevant, and in fact I think this is a more elegant solution, sourcing environment variables in one place.

Roman, thanks for this.

> Improve Hadoop subcomponent integration in Hadoop 0.23
> ------------------------------------------------------
>
> Key: HADOOP-7939
> URL: https://issues.apache.org/jira/browse/HADOOP-7939
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build, conf, documentation, scripts
> Affects Versions: 0.23.0
> Reporter: Roman Shaposhnik
> Assignee: Roman Shaposhnik
> Fix For: 0.23.1
>
> Attachments: HADOOP-7939.patch.txt, hadoop-layout.sh
>
>
> h1. Introduction
> For the rest of this proposal it is assumed that the current set
> of Hadoop subcomponents is:
> * hadoop-common
> * hadoop-hdfs
> * hadoop-yarn
> * hadoop-mapreduce
> It must be noted that this is an open-ended list, though. For example,
> implementations of additional frameworks on top of yarn (e.g. MPI) would
> also be considered a subcomponent.
> h1. Problem statement
> Currently there's an unfortunate coupling and hard-coding present at the
> level of launcher scripts, configuration scripts and Java implementation
> code that prevents us from treating all subcomponents of Hadoop independently
> of each other. In a lot of places it is assumed that bits and pieces
> from individual subcomponents *must* be located at predefined places
> and that they cannot be dynamically registered/discovered at runtime.
> This prevents a truly flexible deployment of Hadoop 0.23.
> h1. Proposal
> NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255.
> The goal here is to keep as much of that layout in place as possible,
> while permitting different deployment layouts.
> The aim of this proposal is to introduce the needed level of indirection and
> flexibility in order to accommodate the current assumed layout of Hadoop
> tarball deployments and all the other styles of deployments as well. To this
> end the following set of environment variables needs to be uniformly used in
> all of the subcomponents' launcher scripts, configuration scripts and Java code
> (<SC> stands for a literal name of a subcomponent). These variables are
> expected to be defined by <SC>-env.sh scripts and sourcing those files is
> expected to have the desired effect of setting the environment up correctly.
> # HADOOP_<SC>_HOME
> ## root of the subtree in a filesystem where a subcomponent is expected to be installed
> ## default value: $0/..
> # HADOOP_<SC>_JARS
> ## a subdirectory with all of the jar files comprising the subcomponent's implementation
> ## default value: $(HADOOP_<SC>_HOME)/share/hadoop/$(<SC>)
> # HADOOP_<SC>_EXT_JARS
> ## a subdirectory with all of the jar files needed for extended functionality of the subcomponent (nonessential for correct work of the basic functionality)
> ## default value: $(HADOOP_<SC>_HOME)/share/hadoop/$(<SC>)/ext
> # HADOOP_<SC>_NATIVE_LIBS
> ## a subdirectory with all the native libraries that the component requires
> ## default value: $(HADOOP_<SC>_HOME)/share/hadoop/$(<SC>)/native
> # HADOOP_<SC>_BIN
> ## a subdirectory with all of the launcher scripts specific to the client side of the component
> ## default value: $(HADOOP_<SC>_HOME)/bin
> # HADOOP_<SC>_SBIN
> ## a subdirectory with all of the launcher scripts specific to the server/system side of the component
> ## default value: $(HADOOP_<SC>_HOME)/sbin
> # HADOOP_<SC>_LIBEXEC
> ## a subdirectory with all of the launcher scripts that are internal to the implementation and should *not* be invoked directly
> ## default value: $(HADOOP_<SC>_HOME)/libexec
> # HADOOP_<SC>_CONF
> ## a subdirectory containing configuration files for a subcomponent
> ## default value: $(HADOOP_<SC>_HOME)/conf
> # HADOOP_<SC>_DATA
> ## a subtree in the local filesystem for storing the component's persistent state
> ## default value: $(HADOOP_<SC>_HOME)/data
> # HADOOP_<SC>_LOG
> ## a subdirectory for the subcomponent's log files to be stored
> ## default value: $(HADOOP_<SC>_HOME)/log
> # HADOOP_<SC>_RUN
> ## a subdirectory with runtime system specific information
> ## default value: $(HADOOP_<SC>_HOME)/run
> # HADOOP_<SC>_TMP
> ## a subdirectory with temporary files
> ## default value: $(HADOOP_<SC>_HOME)/tmp
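To make the variable scheme in the quoted proposal concrete, here is a minimal sketch of what the defaulting logic in an <SC>-env.sh script might look like. It is not taken from the attached patch: the helper names set_default and hadoop_sc_defaults are hypothetical, the paths /opt/hadoop and /etc/hadoop/conf are only illustrative, and it assumes that <SC> is upper-cased in the variable name (HADOOP_HDFS_HOME) but used as-is in the share/hadoop/<SC> path. Only the variable names and default locations come from the proposal.

```shell
# Sketch only: set_default and hadoop_sc_defaults are hypothetical helpers,
# not part of Hadoop or of the attached patch.

# set_default NAME VALUE: assign NAME only if it is unset or empty, then export it.
# NAME must be a plain identifier because it is passed through eval.
set_default() {
  eval "[ -n \"\${$1:-}\" ] || $1=\$2"
  export "$1"
}

# hadoop_sc_defaults SC HOME: apply the proposal's defaults for one subcomponent.
# SC is a short name such as "hdfs"; HOME is the fallback for HADOOP_<SC>_HOME
# (the proposal's default is $0/.., which a real script would compute itself).
hadoop_sc_defaults() {
  _sc=$1
  # Assumption: <SC> is upper-cased when it appears inside a variable name.
  _uc=$(printf '%s' "$_sc" | tr '[:lower:]' '[:upper:]')
  _p="HADOOP_${_uc}"

  set_default "${_p}_HOME" "$2"
  # Read back the effective home, which may have been preset by the caller.
  eval "_home=\${${_p}_HOME}"

  set_default "${_p}_JARS"        "$_home/share/hadoop/$_sc"
  set_default "${_p}_EXT_JARS"    "$_home/share/hadoop/$_sc/ext"
  set_default "${_p}_NATIVE_LIBS" "$_home/share/hadoop/$_sc/native"

  # The remaining variables all default to $(HADOOP_<SC>_HOME)/<lowercase name>.
  for _d in bin sbin libexec conf data log run tmp; do
    _du=$(printf '%s' "$_d" | tr '[:lower:]' '[:upper:]')
    set_default "${_p}_${_du}" "$_home/$_d"
  done

  unset _sc _uc _p _home _d _du
}

# Example: a packager overrides one location; everything else falls back
# to the defaults relative to the (illustrative) home directory.
HADOOP_HDFS_CONF=/etc/hadoop/conf
hadoop_sc_defaults hdfs /opt/hadoop
```

With this shape, a distro package could preset only the locations it needs to move (conf, log, run and so on), while a tarball deployment would source the same file with nothing preset and get the layout relative to its own home directory.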