Hi,
I feel I am not doing something right. I have switched between several
AMIs and am still unable to see Hadoop running on the EC2 instance or to
run the basic "hadoop fs -ls /" command on it. Here are my configs and
outputs:
Andrei suggested that I use 64-bit 10.04 LTS, but I saw a similar issue
there too. Also, I was not able to find a 64-bit 10.04 LTS AMI that runs
on m1.small (I have seen some built for t1.micro and larger instances).
# 32-bit 10.04 LTS EBS
whirr.image-id=us-east-1/ami-ab36fbc2
whirr.hardware-id=m1.small
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
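For completeness, the rest of my properties file is a sketch along the lines of the standard Whirr quick start; the cluster layout matches what `list-cluster` shows below (one namenode+jobtracker, two datanode+tasktracker nodes), but the other values here are assumptions, not verified output:

```
# sketch of the remaining properties, per the Whirr quick start
whirr.cluster-name=HadoopCluster
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
# may be "ec2" or "aws-ec2" depending on the Whirr version
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
```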
===========================================
sri@PeriyaData:~$ ssh -i ~/.ssh/id_rsa
ec2-174-129-113-79.compute-1.amazonaws.com
The authenticity of host
'ec2-174-129-113-79.compute-1.amazonaws.com (174.129.113.79)' can't be
established.
RSA key fingerprint is 0b:33:c6:f2:5f:0e:a2:97:8a:75:1c:be:37:2f:c2:85.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added
'ec2-174-129-113-79.compute-1.amazonaws.com,174.129.113.79'
(RSA) to the list of known hosts.
Linux domU-12-31-39-09-9D-E4 2.6.32-318-ec2 #38-Ubuntu SMP Thu Sep 1
17:54:33 UTC 2011 i686 GNU/Linux
Ubuntu 10.04.3 LTS
Welcome to Ubuntu!
* Documentation: https://help.ubuntu.com/
System information as of Mon Dec 5 03:29:47 UTC 2011
System load: 0.33 Processes: 65
Usage of /: 13.6% of 7.87GB Users logged in: 0
Memory usage: 14% IP address for eth0: 10.210.162.18
Swap usage: 0%
Graph this data and manage this system at https://landscape.canonical.com/
---------------------------------------------------------------------
At the moment, only the core of the system is installed. To tune the
system to your needs, you can choose to install one or more
predefined collections of software by running the following
command:
sudo tasksel --section server
---------------------------------------------------------------------
Get cloud support with Ubuntu Advantage Cloud Guest
http://www.ubuntu.com/business/services/cloud
Last login: Mon Dec 5 03:28:47 2011 from
108-90-42-72.lightspeed.sntcca.sbcglobal.net
sri@domU-12-31-39-09-9D-E4:~$
sri@domU-12-31-39-09-9D-E4:~$
sri@domU-12-31-39-09-9D-E4:~$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
sri@domU-12-31-39-09-9D-E4:~$ hadoop version
hadoop: command not found
sri@domU-12-31-39-09-9D-E4:~$
sri@PeriyaData:~$ whirr list-cluster --config ~/pd/hadoop-cfg.properties
us-east-1/i-63615b00 us-east-1/ami-ab36fbc2 174.129.113.79
10.210.162.18 RUNNING us-east-1a hadoop-namenode,hadoop-jobtracker
us-east-1/i-e3615b80 us-east-1/ami-ab36fbc2 50.19.22.60
10.214.6.244 RUNNING us-east-1a hadoop-datanode,hadoop-tasktracker
us-east-1/i-e1615b82 us-east-1/ami-ab36fbc2 50.19.6.250
10.254.79.245 RUNNING us-east-1a hadoop-datanode,hadoop-tasktracker
sri@PeriyaData:~$
sri@PeriyaData:~$
sri@PeriyaData:~$ export HADOOP_CONF_DIR=~/.whirr/HadoopCluster/
sri@PeriyaData:~$
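For what it's worth, this is roughly how I sanity-check the Whirr-generated client config on the laptop before pointing hadoop at it (the grep pattern is just my assumption about how the addresses appear in the XML):

```shell
# Sanity-check the client config Whirr wrote on the laptop.
conf="$HOME/.whirr/HadoopCluster"
if [ -d "$conf" ]; then
    ls "$conf"    # expect hadoop-site.xml and the hadoop-proxy.sh script
    # pull out the EC2 addresses (namenode/jobtracker) it points at
    grep -o 'ec2-[^<"]*' "$conf"/*.xml 2>/dev/null | head -3
else
    echo "no Whirr config at $conf"
fi
```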
sri@domU-12-31-39-09-9D-E4:~$
*This is on the instance*
sri@domU-12-31-39-09-9D-E4:~$ hadoop fs -ls /
hadoop: command not found
sri@domU-12-31-39-09-9D-E4:~$ hadoop version
hadoop: command not found
sri@domU-12-31-39-09-9D-E4:~$
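If the CDH install had actually succeeded, I would expect the launcher to be in one of the usual package locations. A quick check I could run on the instance (these paths are my assumption about where a CDH package or tarball install lands, not verified output):

```shell
# Look for the hadoop launcher in the places a CDH install normally puts it.
for d in /usr/bin /usr/lib/hadoop/bin /usr/lib/hadoop-0.20/bin /usr/local/hadoop/bin; do
    if [ -x "$d/hadoop" ]; then
        echo "found: $d/hadoop"
    fi
done
# On a Debian/Ubuntu image, the package manager can also confirm the install:
# dpkg -l | grep -i hadoop
echo "search complete"
```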
Now, this is in my local laptop:
sri@PeriyaData:~$ hadoop fs -ls /
11/12/04 19:47:02 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
core-default.xml, mapred-default.xml and hdfs-default.xml respectively
11/12/04 19:47:04 INFO ipc.Client: Retrying connect to server:
ec2-174-129-113-79.compute-1.amazonaws.com/174.129.113.79:8020. Already
tried 0 time(s).
[... identical retry messages for attempts 1 through 8 ...]
11/12/04 19:47:13 INFO ipc.Client: Retrying connect to server:
ec2-174-129-113-79.compute-1.amazonaws.com/174.129.113.79:8020. Already
tried 9 time(s).
*Bad connection to FS. command aborted. exception: Call to
ec2-174-129-113-79.compute-1.amazonaws.com/174.129.113.79:8020 failed on
local exception: java.net.SocketException: Connection refused*
sri@PeriyaData:~$
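To separate a security-group/firewall problem from a namenode that simply is not listening, something like this quick TCP probe of port 8020 from the laptop might help (a minimal sketch using bash's /dev/tcp redirection; the host is the namenode from the listing above):

```shell
# Probe the namenode RPC port from the laptop.
host=ec2-174-129-113-79.compute-1.amazonaws.com
port=8020
if timeout 5 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "port $port reachable"
else
    echo "port $port unreachable (security group, or nothing listening)"
fi
```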
*This is on the instance*
srivathsan@domU-12-31-39-09-9D-E4:/tmp/logs$ more stdout.log
Reading package lists...
hadoop-0.20.2.tar.gz: OK
srivathsan@domU-12-31-39-09-9D-E4:/tmp/logs$
============================================
*Questions:*
1. Assuming everything is fine, where does Hadoop get installed on the
EC2 instance? What is the path?
2. Even if Hadoop is successfully installed on the EC2 instance, are the
environment variables set properly on that instance? For example, PATH must
be updated in its .bashrc or .bash_profile, right?
3. Am I missing any important step here that is not documented?
4. The stdout.log file on the instance says "Reading package lists...". I
do not see logs about Hadoop getting installed, as I do for Java
("setting up sun-java6-jdk" ...). Is there a way to enable verbose logging?
I am using m1.small hardware, so I am sure it will have enough space to
install and run Hadoop.
5. If you know of any Ubuntu AMI on which you have consistently run Hadoop,
please let me know. I will definitely try it.
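Regarding question 2: if the binaries turn out to be installed but just not on PATH, I imagine the fix on the instance would be something like this (the install directory here is an assumption, since I have not been able to confirm where CDH puts it):

```shell
# Hypothetical fix: make the hadoop launcher visible on PATH.
HADOOP_HOME=/usr/lib/hadoop-0.20    # assumed install directory
export PATH="$PATH:$HADOOP_HOME/bin"
# Persist it for future logins:
# echo 'export PATH="$PATH:/usr/lib/hadoop-0.20/bin"' >> ~/.bashrc
echo "$PATH" | grep -q "$HADOOP_HOME/bin" && echo "PATH updated"
```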
I am asking the above questions because I feel I am not looking in the
right place. If I still see the same behavior after switching several
AMIs, I must be looking in the wrong places.
I am doing something stupid here, but I am not sure what. I am properly
exporting the Hadoop conf dir, and the SSH key pairs are good. I do not
know why the connection gets refused, and I do not understand the last
line (highlighted in yellow). Am I missing any important step?
Also, the funny thing is this: I am able to see the dfshealth.jsp page in
my Firefox browser (after running the proxy shell script). But when I
click the link to browse the filesystem, it cannot display it: the same
connection-to-server problem!
Any suggestions/best practices?
Thanks,
PD