[
https://issues.apache.org/jira/browse/HADOOP-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661089#action_12661089
]
Sharad Agarwal commented on HADOOP-4868:
----------------------------------------
bq. I would prefer that the included scripts were not directly in the bin/
directory, but rather in lib/ or a subdirectory. The bin/ directory should
ideally only contain end-user commands.
+1. We can have a bin/includes directory.
bq. Also, once we split the projects, we'd like the combination of core &
mapred and core & hdfs to be as simple as possible. Copying multiple scripts
into directories seems fragile. Ideally we'd have a single shell script to
bootstrap things and then get everything else from jars on the classpath, since
we need to combine libraries (core, hdfs, & mapred) together on the classpath
anyway.
For combining I see these options:
1. Install core separately before installing mapred or hdfs, and refer to it via an
environment variable, say HADOOP_CORE_HOME.
2. Bundle the core jar in mapred's and hdfs' lib. There could be a target in the build
file, say 'setup', which would unpack the core jar into a subdirectory of mapred and
hdfs, say core-release, referred to via an environment variable, say
HADOOP_CORE_RELEASE. By default it would point to mapred/core-release.
3. Bundle the core jar in mapred's and hdfs' lib, with a 'setup' build target that
unpacks the core jar so that the contents of its lib, conf and bin are copied into the
corresponding directories of mapred and hdfs.
Option 1 is clearly not preferable, as users would have to download and install two
releases.
For option 2, we would need to explicitly invoke scripts from core and also explicitly
add its libraries to the classpath. There would be multiple conf folders, one for core
and another for mapred/hdfs, which would need to be handled.
Option 3 looks simpler. The hadoop script can add all the libraries present in the lib
folder to the classpath, so it doesn't need to care which component a jar came from. We
have a single conf folder, so most things stay as they are in terms of passing a
different conf folder. This looks to be a good option; the only constraints are that the
folder structure must remain the same for core, mapred and hdfs, and that there are no
filename clashes.
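To make option 3 concrete, a minimal sketch of how the merged script could assemble the
classpath from the shared folders (variable names follow the current bin/hadoop; the
exact layout after unpacking is an assumption):
{code}
# Sketch for option 3: everything unpacked by core, mapred and hdfs lands in
# the same conf/ and lib/ folders, so the script just picks up whatever is
# there. HADOOP_HOME / HADOOP_CONF_DIR are assumed to be set as today.
CLASSPATH="${HADOOP_CONF_DIR:-$HADOOP_HOME/conf}"
for f in "$HADOOP_HOME"/hadoop-*.jar "$HADOOP_HOME"/lib/*.jar; do
  CLASSPATH="${CLASSPATH}:${f}"
done
{code}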
bq. Might it be simpler if the command dispatch were in Java? We might have a
CoreCommand, plus MapredCommand and HdfsCommand subclasses.
Do you mean we would have only one hadoop script and wouldn't need hadoop-mapred and
hadoop-hdfs? In that case the bin/hadoop script would need to know which one to call:
CoreCmdDispatcher, MapredCmdDispatcher or HDFSCmdDispatcher. One way could be a
variable, say CMD_DISPATCHER_CLASS, which gets overridden in mapred and hdfs. I'm not
sure how; perhaps this variable could be set by the unpack script itself.
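Something along these lines, purely as an illustration (CMD_DISPATCHER_CLASS and the
dispatcher class names are only proposals at this point, they don't exist yet):
{code}
# In bin/hadoop (hypothetical): default to the core dispatcher, but allow
# mapred/hdfs to override it, e.g. via a line the unpack target appends to
# conf/hadoop-env.sh:
#   export CMD_DISPATCHER_CLASS=org.apache.hadoop.mapred.MapredCmdDispatcher
CMD_DISPATCHER_CLASS=${CMD_DISPATCHER_CLASS:-org.apache.hadoop.CoreCmdDispatcher}
exec "$JAVA" -classpath "$CLASSPATH" $CMD_DISPATCHER_CLASS "$@"
{code}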
Another way could be for CoreCmdDispatcher itself to look for the presence of
MapredCmdDispatcher and HDFSCmdDispatcher on the classpath and, if found, delegate to
them. But this would mean a reverse dependency, although not a compile-time one.
bq. The bin/hadoop script (from core) might, when invoked with 'bin/hadoop foo
...' run something like org.apache.hadoop.foo.FooCommand. Then we wouldn't need
the core.sh, mapred.sh and hdfs.sh include scripts.
This is similar to the current functionality of bin/hadoop <CLASSNAME>, no? Just having
this won't be sufficient, as we also need to print help messages listing all the
available commands.
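For reference, the dispatch part itself would be easy enough in the script, something
like this (illustrative only; assumes $JAVA and $CLASSPATH are set up earlier in the
script as they are today):
{code}
# 'bin/hadoop foo ...' -> org.apache.hadoop.foo.FooCommand: capitalize the
# first letter of the command name to form the class name.
COMMAND="$1"; shift
FIRST=$(printf '%s' "$COMMAND" | cut -c1 | tr '[:lower:]' '[:upper:]')
REST=$(printf '%s' "$COMMAND" | cut -c2-)
exec "$JAVA" -classpath "$CLASSPATH" \
  "org.apache.hadoop.${COMMAND}.${FIRST}${REST}Command" "$@"
{code}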
> Split the hadoop script into 3 parts
> ------------------------------------
>
> Key: HADOOP-4868
> URL: https://issues.apache.org/jira/browse/HADOOP-4868
> Project: Hadoop Core
> Issue Type: Improvement
> Components: scripts
> Reporter: Sharad Agarwal
> Assignee: Sharad Agarwal
> Attachments: 4868_v1.patch, 4868_v2.patch
>
>
> We need to split the bin/hadoop into 3 parts for core, mapred and hdfs. This
> will enable us to distribute the individual scripts with each component.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.