[ 
http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12428000 ] 
            
Yoram Arnon commented on HADOOP-435:
------------------------------------

Any start/stop mechanism must allow separate control of DFS and MapReduce, and 
separate nodes for the namenode, jobtracker, datanodes, and tasktrackers.
Support for JVM options, pid management, etc. is also required.

A trivial way to start up a toy Hadoop cluster would be very welcome and would 
lower the bar for new users, but let's be careful before removing any 
functionality or changing the behavior of the hadoop script.
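For illustration, the separate per-daemon control described above could look 
something like the following minimal sketch; PID_DIR, start_daemon, and 
stop_daemon are hypothetical names, not taken from the attached scripts or patch:

```shell
#!/bin/sh
# Hypothetical sketch: per-daemon start/stop with pid files, so the DFS
# daemons (namenode, datanode) and the MapReduce daemons (jobtracker,
# tasktracker) can be controlled independently of each other.

PID_DIR="${PID_DIR:-/tmp/hadoop-pids}"

start_daemon() {  # start_daemon <name> <command ...>
    name="$1"; shift
    mkdir -p "$PID_DIR"
    "$@" &                           # launch with whatever JVM options the caller passed
    echo $! > "$PID_DIR/$name.pid"   # record the pid for later control
}

stop_daemon() {   # stop_daemon <name>
    pidfile="$PID_DIR/$1.pid"
    [ -f "$pidfile" ] && kill "$(cat "$pidfile")" && rm -f "$pidfile"
}
```

With this shape, something like "start_daemon namenode java -Xmx512m -jar 
hadoop.jar namenode" and "stop_daemon jobtracker" would manage the DFS and 
MapReduce daemons independently, with JVM options and pid files handled per daemon.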

> Encapsulating startup scripts and jars in a single Jar file.
> ------------------------------------------------------------
>
>                 Key: HADOOP-435
>                 URL: http://issues.apache.org/jira/browse/HADOOP-435
>             Project: Hadoop
>          Issue Type: New Feature
>    Affects Versions: 0.5.0
>            Reporter: Benjamin Reed
>         Attachments: hadoopit.patch, start.sh, stop.sh
>
>
> Currently, hadoop is a set of scripts, configurations, and jar files. That 
> makes it a pain to install on compute and data nodes, and a pain to set up 
> clients so that they can use hadoop. Every time things are updated, the pain 
> begins again.
> I suggest that we should be able to build a single jar file with a 
> Main-Class defined and the configuration built in, so that we can distribute 
> that one file to nodes and clients on updates. One nice thing that I haven't 
> done yet would be to make the jar file downloadable from the JobTracker web 
> page so that clients can easily submit jobs.
> I currently use such a setup on my small cluster. To start the job tracker I 
> use "java -jar hadoop.jar -l /tmp/log jobtracker"; to submit a job I use 
> "java -jar hadoop.jar jar wordcount.jar". I use the client on my Linux and 
> Mac OS X machines, and all I need installed is Java and the hadoop.jar file.
> hadoop.jar helps with logfiles and configurations. The default of pulling the 
> config files from the jar file can be overridden by specifying a config 
> directory so that you can easily have machine specific configs and still have 
> the same hadoop.jar on all machines.
> Here are the available commands from hadoop.jar:
> USAGE: hadoop [-l logdir] command
>   User commands:
>     dfs          run a DFS admin client
>     jar          run a JAR file
>     job          manipulate MapReduce jobs
>     fsck         run a DFS filesystem check utility
>   Runtime startup commands:
>     datanode     run a DFS datanode
>     jobtracker   run the MapReduce job Tracker node
>     namenode     run the DFS namenode (namenode -format formats the FS)
>     tasktracker  run a MapReduce task Tracker node
>   HadoopLoader commands:
>     buildJar     builds the HadoopLoader jar file
>     conf         dump hadoop configuration
> Note: I don't have the classes for Hadoop streaming built into this jar 
> file, but if I did, streaming would also appear as an option (the loader 
> checks for the needed classes before displaying an option). This makes it 
> very easy for users who just write scripts to use hadoop straight from their 
> machines.
> I'm also attaching the start.sh and stop.sh scripts that I use. These are 
> the only scripts I use to start up the daemons. They are very simple, and 
> the start.sh script uses the config file to figure out whether or not to 
> start the jobtracker and the namenode.
> The attached patch adds the HadoopIt code, modifies the Configuration class 
> to find the config files correctly, and modifies the build to produce a 
> fully self-contained hadoop.jar. To update the configuration in a 
> hadoop.jar, you simply run "zip hadoop.jar hadoop-site.xml".

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
