[ http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12427621 ]

Doug Cutting commented on HADOOP-435:
-------------------------------------
> To start the cluster I simply do "for i in `cat machines`; do ssh $i hadoop/start.sh [...]"

Why is this better than the bin/slaves.sh script?

> Encapsulating startup scripts and jars in a single Jar file.
> ------------------------------------------------------------
>
>                 Key: HADOOP-435
>                 URL: http://issues.apache.org/jira/browse/HADOOP-435
>             Project: Hadoop
>          Issue Type: New Feature
>    Affects Versions: 0.5.0
>            Reporter: Benjamin Reed
>         Attachments: hadoopit.patch, start.sh, stop.sh
>
>
> Currently, hadoop is a set of scripts, configurations, and jar files. It
> makes it a pain to install on compute and datanodes. It also makes it a
> pain to set up clients so that they can use hadoop. Every time things are
> updated, the pain begins again.
> I suggest that we should be able to build a single Jar file that has a
> Main-Class defined, with the configuration built in, so that we can
> distribute that one file to nodes and clients on updates. One nice thing
> that I haven't done would be to make the jar file downloadable from the
> JobTracker web page so that clients can easily submit jobs.
> I currently use such a setup on my small cluster. To start the job tracker
> I use "java -jar hadoop.jar -l /tmp/log jobtracker"; to submit a job I use
> "java -jar hadoop.jar jar wordcount.jar". I use the client on my Linux and
> Mac OS X machines, and all I need installed is Java and the hadoop.jar
> file. hadoop.jar helps with log files and configurations. The default of
> pulling the config files from the jar file can be overridden by specifying
> a config directory, so that you can easily have machine-specific configs
> and still have the same hadoop.jar on all machines.
> Here are the available commands from hadoop.jar:
> USAGE: hadoop [-l logdir] command
> User commands:
>   dfs          run a DFS admin client
>   jar          run a JAR file
>   job          manipulate MapReduce jobs
>   fsck         run a DFS filesystem check utility
> Runtime startup commands:
>   datanode     run a DFS datanode
>   jobtracker   run the MapReduce job tracker node
>   namenode     run the DFS namenode (namenode -format formats the FS)
>   tasktracker  run a MapReduce task tracker node
> HadoopLoader commands:
>   buildJar     builds the HadoopLoader jar file
>   conf         dump hadoop configuration
> Note, I don't have the classes for hadoop streaming built into this Jar
> file, but if I did, that would also be an option (it checks for needed
> classes before displaying an option). It makes it very easy for users who
> just write scripts to use hadoop straight from their machines.
> I'm also attaching the start.sh and stop.sh scripts that I use. These are
> the only scripts I use to start up the daemons. They are very simple, and
> the start.sh script uses the config file to figure out whether or not to
> start the jobtracker and the namenode.
> The attached patch adds the HadoopLoader, modifies the Configuration class
> to find the config files correctly, and modifies the build to make a fully
> self-contained hadoop.jar. To update the configuration in a hadoop.jar you
> simply use "zip hadoop.jar hadoop-site.xml".
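
For concreteness, a minimal sketch of how such a self-contained jar could be
assembled. The entry-point class name and directory layout here are
assumptions for illustration, not necessarily what hadoopit.patch does; the
zip step at the end is the one quoted from the description above:

  # Hypothetical build step: bundle the compiled classes, a Main-Class
  # manifest, and the default config into one executable jar.
  echo "Main-Class: org.apache.hadoop.HadoopLoader" > manifest.txt
  jar cfm hadoop.jar manifest.txt -C build/classes . -C conf hadoop-default.xml

  # Later, refresh the embedded site config in place without rebuilding,
  # as the description suggests:
  zip hadoop.jar hadoop-site.xml

Because hadoop-default.xml sits at the root of the jar, the Configuration
class can load it as a classpath resource unless a config directory is given
on the command line.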
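
And a rough sketch of what a start.sh in this style might look like. This is
an illustration of the idea described above (every node starts the worker
daemons; the master daemons start only where the config says they live), not
the attached script itself, and grepping the config for the local hostname is
a deliberate simplification:

  #!/bin/sh
  # Illustrative only; the attached start.sh may differ.
  cd `dirname $0`

  # Every node runs a datanode and a tasktracker.
  java -jar hadoop.jar -l /tmp/log datanode &
  java -jar hadoop.jar -l /tmp/log tasktracker &

  # Start the master daemons only if this host appears in the config.
  if grep -q `hostname` hadoop-site.xml 2>/dev/null; then
    java -jar hadoop.jar -l /tmp/log namenode &
    java -jar hadoop.jar -l /tmp/log jobtracker &
  fi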