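The SecondaryNameNode (SNN) is necessary for automatic maintenance in long-running clusters (read: production), where it periodically checkpoints the NameNode's metadata by merging the edit log into the fsimage. It is not necessary for, nor tied into, the basic functions/operations of HDFS.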
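To illustrate, HDFS can be brought up with only the NameNode and DataNodes running. The following is a minimal sketch, not the stock startup script: it reuses the two hadoop-daemon.sh / hadoop-daemons.sh invocations quoted from the 1.1.1 startup script further down and leaves out the secondarynamenode one. The function name, the `run` helper, the `DRY_RUN` switch and the `/opt/hadoop` default are assumptions made for illustration. The bin/ location matches the 1.x layout; 2.x ships these scripts under sbin/.

```shell
#!/usr/bin/env bash
# Sketch: start HDFS without the SecondaryNameNode (1.x layout assumed).
# /opt/hadoop is a hypothetical default; set HADOOP_HOME for a real install.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/conf}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_hdfs_without_snn() {
  # NameNode on the local host.
  run "$HADOOP_HOME/bin/hadoop-daemon.sh" --config "$HADOOP_CONF_DIR" start namenode
  # DataNodes on the configured worker hosts.
  run "$HADOOP_HOME/bin/hadoop-daemons.sh" --config "$HADOOP_CONF_DIR" start datanode
  # Deliberately no "start secondarynamenode" step.
}

start_hdfs_without_snn
```

With the default `DRY_RUN=1` the script only prints the two commands, so it can be checked before being run with `DRY_RUN=0` on a real node.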
On 1.x, you can remove the script's startup of SNN by removing its host entry from the conf/masters file. On 2.x, you can selectively start the NN and DNs by using the hadoop-daemon.sh script commands.

On Mon, Dec 17, 2012 at 10:34 PM, Ivan Ryndin <iryn...@gmail.com> wrote:
> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode
>
> So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only (and
> machines in cluster do not have much RAM and disk space), and thus don't
> want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>

--
Harsh J