[ 
http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12427608 ] 
            
Doug Cutting commented on HADOOP-435:
-------------------------------------

I think moving much of the logic from bin/hadoop into Java is a good idea.  We 
still need scripts to manage daemons, but client operation would be simplified.

I don't see your start and stop scripts here.

Specifying a conf directory could be done by manipulating the class loader or 
classpath, rather than adding a lot of special case code.
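To illustrate the classloader approach: a minimal sketch (class and method names here are illustrative, not the actual Hadoop API) of resolving config resources through a child loader that has the conf directory on its search path, instead of special-casing a directory argument throughout the code:

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class ConfClasspath {
    /** Returns a loader that can also resolve resources (e.g. hadoop-site.xml)
     *  from the given directory, without any special-case lookup code. */
    static ClassLoader withConfDir(File confDir) throws Exception {
        // toURI().toURL() on a directory yields a URL ending in '/',
        // which URLClassLoader treats as a directory to search.
        URL dirUrl = confDir.toURI().toURL();
        return new URLClassLoader(new URL[] { dirUrl },
                                  ConfClasspath.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        ClassLoader cl = withConfDir(new File(args.length > 0 ? args[0] : "."));
        // Configuration-style lookup goes through the loader, not the filesystem.
        System.out.println(cl.getResource("hadoop-site.xml"));
    }
}
```

Note that URLClassLoader delegates to its parent first, so resources already on the main classpath would win; making the conf directory override jar defaults would need the directory placed ahead on the search path.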

Shouldn't we standardize on commons-cli for command-line processing?  Shouldn't 
this somehow integrate with the Tool and ToolBase stuff?
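For the integration question, the dispatch in the proposed main class could be built around a Tool-like interface; the sketch below is a standalone illustration (it defines its own minimal Command interface rather than Hadoop's actual Tool/ToolBase classes) of how each subcommand becomes one run() implementation behind a single entry point:

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopMain {
    /** Tool-style contract: each subcommand parses its own args and
     *  returns an exit code. Mirrors the shape of a Tool.run(String[]). */
    interface Command { int run(String[] args) throws Exception; }

    // Command registry; real entries would be dfs, jar, job, fsck, etc.
    static final Map<String, Command> COMMANDS = new HashMap<>();
    static {
        COMMANDS.put("conf", args -> { System.out.println("dump configuration"); return 0; });
        COMMANDS.put("fsck", args -> { System.out.println("run fsck"); return 0; });
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0 || !COMMANDS.containsKey(args[0])) {
            System.err.println("USAGE: hadoop [-l logdir] command");
            System.exit(1);
        }
        String[] rest = new String[args.length - 1];
        System.arraycopy(args, 1, rest, 0, rest.length);
        System.exit(COMMANDS.get(args[0]).run(rest));
    }
}
```

With this shape, commons-cli parsing would slot into each run() implementation rather than into the dispatcher.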

> Encapsulating startup scripts and jars in a single Jar file.
> ------------------------------------------------------------
>
>                 Key: HADOOP-435
>                 URL: http://issues.apache.org/jira/browse/HADOOP-435
>             Project: Hadoop
>          Issue Type: New Feature
>    Affects Versions: 0.5.0
>            Reporter: Benjamin Reed
>         Attachments: hadoopit.patch
>
>
> Currently, hadoop is a set of scripts, configurations, and jar files. It 
> makes it a pain to install on compute and datanodes. It also makes it a pain 
> to set up clients so that they can use hadoop. Every time things are updated 
> the pain begins again.
> I suggest that we should be able to build a single Jar file that has a 
> Main-Class defined with the configuration built in so that we can distribute 
> that one file to nodes and clients on updates. One nice thing that I haven't 
> done would be to make the jarfile downloadable from the JobTracker webpage so 
> that clients can easily submit the jobs.
> I currently use such a setup on my small cluster. To start the job tracker I 
> use "java -jar hadoop.jar -l /tmp/log jobtracker", and to submit a job I use 
> "java -jar hadoop.jar jar wordcount.jar". I use the client on my Linux and 
> Mac OS X machines, and all I need installed is Java and the hadoop.jar file.
> hadoop.jar helps with logfiles and configurations. The default of pulling the 
> config files from the jar file can be overridden by specifying a config 
> directory so that you can easily have machine specific configs and still have 
> the same hadoop.jar on all machines.
> Here are the available commands from hadoop.jar:
> USAGE: hadoop [-l logdir] command
>   User commands:
>     dfs          run a DFS admin client
>     jar          run a JAR file
>     job          manipulate MapReduce jobs
>     fsck         run a DFS filesystem check utility
>   Runtime startup commands:
>     datanode     run a DFS datanode
>     jobtracker   run the MapReduce job Tracker node
>     namenode     run the DFS namenode (namenode -format formats the FS)
>     tasktracker  run a MapReduce task Tracker node
>   HadoopLoader commands:
>     buildJar     builds the HadoopLoader jar file
>     conf         dump hadoop configuration
> Note, I don't have the classes for hadoop streaming built into this jar file, 
> but if I did, that would also be an option (the loader checks for the needed 
> classes before displaying an option). It makes it very easy for users who just 
> write scripts to use hadoop straight from their machines.
> I'm also attaching the start.sh and stop.sh scripts that I use. These are the 
> only scripts I use to start up the daemons. They are very simple, and the 
> start.sh script uses the config file to figure out whether or not to start 
> the jobtracker and the nameserver.
> The attached patch adds the HadoopIt patch, modifies the Configuration class 
> to find the config files correctly, and modifies the build to make a fully 
> contained hadoop.jar. To update the configuration in a hadoop.jar you simply 
> use "zip hadoop.jar hadoop-site.xml".
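The override order the description proposes (an explicit config directory wins; otherwise fall back to the hadoop-site.xml packed into hadoop.jar) can be sketched as follows; the class and method names are illustrative, not from the patch:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

public class ConfigLookup {
    /** Machine-specific config directory takes precedence; the copy added to
     *  the jar (e.g. via "zip hadoop.jar hadoop-site.xml") is the default. */
    static InputStream openSiteConfig(String confDir) throws Exception {
        if (confDir != null) {
            File f = new File(confDir, "hadoop-site.xml");
            if (f.exists()) {
                return new FileInputStream(f);  // explicit override
            }
        }
        // Fallback: resource bundled inside hadoop.jar (null if absent).
        return ConfigLookup.class.getClassLoader()
                .getResourceAsStream("hadoop-site.xml");
    }
}
```

This keeps one hadoop.jar identical across all machines while still allowing per-machine configs.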

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
