*Step-wise Details (Ubuntu 10.x): go through it carefully and run the steps one by one; this should solve your problem. (You can change the paths, IPs and host names as you like.)*
---------------------------------------------------------------------------------------------------------
1. Start the terminal
2. Disable IPv6 on all machines
pico /etc/sysctl.conf
Add these lines at the end of the file:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
3. Reboot the system
sudo reboot
4. Install Java
sudo apt-get install openjdk-6-jdk openjdk-6-jre
5. Check if ssh is installed; if not, install it:
sudo apt-get install openssh-server openssh-client
6. Create a group and a user, both called hadoop
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop
7. Give the hadoop user sudo permissions
sudo visudo
Add the following line in the file:
hadoop ALL=(ALL) ALL
8. Set up passwordless ssh for the hadoop user
su hadoop
ssh-keygen -t rsa -P ""
Press Enter when asked.
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
Copy the server's RSA public key from the server into the authorized_keys file on all nodes, as in the step above (see the sketch below).
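The guide does not spell out the command for pushing the key to the other nodes. A minimal sketch, assuming the hadoop user already exists on each slave and using <IP of slave> as a placeholder:

    # run as the hadoop user on the server; copies the public key into the slave's authorized_keys
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@<IP of slave>
    # if ssh-copy-id is not available, append the key manually instead
    cat $HOME/.ssh/id_rsa.pub | ssh hadoop@<IP of slave> "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
    # check that the login no longer asks for a password
    ssh hadoop@<IP of slave>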
9. Make the hadoop installation directory:
sudo mkdir /usr/local/hadoop
10. Download hadoop:
cd /usr/local/hadoop
sudo wget -c http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u2.tar.gz
11. Unpack the tar:
sudo tar -zxvf /usr/local/hadoop/hadoop-0.20.2-cdh3u2.tar.gz
12. Change the permissions on the hadoop folder by granting it all to the hadoop user:
sudo chown -R hadoop:hadoop /usr/local/hadoop
sudo chmod -R 750 /usr/local/hadoop
13. Create the HDFS directory inside /usr/local/hadoop, and make sure the hadoop user owns it:
cd /usr/local/hadoop
sudo mkdir hadoop-datastore
sudo mkdir hadoop-datastore/hadoop-hadoop
sudo chown -R hadoop:hadoop hadoop-datastore
14. Add the binaries path and the hadoop home in the environment file, then reload it (see the sketch after this step):
sudo pico /etc/environment
source /etc/environment
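A minimal sketch of what to set in /etc/environment, assuming the install path used above (adjust it if your version differs); /etc/environment takes plain KEY="value" lines, so the paths are written out in full:

    HADOOP_HOME="/usr/local/hadoop/hadoop-0.20.2-cdh3u2"
    PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/hadoop/hadoop-0.20.2-cdh3u2/bin"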
15. Configure the hadoop-env.sh file
cd /usr/local/hadoop/hadoop-0.20.2-cdh3u2/
sudo pico conf/hadoop-env.sh
Add the following lines in there:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"
16. Configure the core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-datastore/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<IP of namenode>:54310</value>
    <description>Location of the Namenode</description>
  </property>
</configuration>
17. Configure the hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication.</description>
  </property>
</configuration>
18. Configure the mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value><IP of job tracker>:54311</value>
    <description>Host and port of the jobtracker.</description>
  </property>
</configuration>
19. Add all the IP addresses in the conf/slaves file
sudo pico /usr/local/hadoop/hadoop-0.20.2-cdh3u2/conf/slaves
Add the list of IP addresses that will host data nodes in this file, one per line.
---------------------------------------------------------------------------------------------------------
*Hadoop Commands: Now start the hadoop cluster*
start-all.sh / stop-all.sh
start-dfs.sh / stop-dfs.sh
start-mapred.sh / stop-mapred.sh
hadoop dfs -ls /<virtual dfs path>
hadoop dfs -copyFromLocal <local path> <dfs path>
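A minimal first-run sketch as the hadoop user (the file names are just examples); on a fresh cluster the HDFS metadata has to be formatted once on the namenode before the daemons are started:

    # format HDFS once, on the namenode only -- this wipes any existing HDFS data
    hadoop namenode -format
    # start the HDFS and MapReduce daemons
    start-all.sh
    # copy a local file into HDFS and list it back
    hadoop dfs -copyFromLocal /home/hadoop/test.txt /test.txt
    hadoop dfs -ls /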