Dear all, I was wondering if it is possible to format the HDFS at boot time. I have some VM's that are pre-set and pre-configured with Hadoop (datanodes [slaves] and a namenode [master]), and I'm looking for a way to obtain a cluster from them "out of the box", as they're launched (including the namenode).
I currently have the following boot-time init script on my namenode: #!/bin/bash # # # Starts a Hadoop Master # # chkconfig: 2345 90 10 # description: Hadoop master . /etc/rc.status . /home/oneadmin/mountpoint/hadoop-0.20.2/conf/hadoop-env.sh export HPATH=/home/oneadmin/mountpoint/hadoop-0.20.2 export HLOCK=/tmp RETVAL=0 PIDFILE=$HLOCK/hadoop-hdfs-master.pid desc="Hadoop Master daemon" start() { echo -n $"Starting $desc (hadoop): " /sbin/startproc -u 1001 $HPATH/bin/hadoop namenode -format /sbin/startproc -u 1001 $HPATH/bin/start-dfs.sh $1 /sbin/startproc -u 1001 $HPATH/bin/start-mapred.sh $1 RETVAL=$? echo [ $RETVAL -eq 0 ] && touch $HLOCK/hadoop-master return $RETVAL } stop() { echo -n $"Stopping $desc (hadoop): " /sbin/startproc -u 1001 $HPATH/bin/stop-all.sh RETVAL=$? sleep 5 echo [ $RETVAL -eq 0 ] && rm -f $HLOCK/hadoop-master $PIDFILE } checkstatus(){ jps |grep NameNode } restart() { stop start } format() { /sbin/startproc -u 1001 $HPATH/bin/hadoop master -format } case "$1" in start) start ;; upgrade) upgrade ;; format) format ;; stop) stop ;; status) checkstatus ;; restart) restart ;; *) echo $"Usage: $0 {start|stop|status|restart|try-restart}" exit 1 esac exit $RETVAL As you can see, I included the format command in the starting function, too, however it is not working. All the Hadoop processes except NameNode start, and the HDFS isn't being formatted. Is it possible to obtain such functionality that I'm looking for? Any suggestions would be highly appreciated. Thank you, Lehel Biro.