Here are the properties. Please note that I also tried without specifying whirr.hadoop-install-runurl and got the same problem.
whirr.service-name=hadoop
whirr.cluster-name=relevancycluster
whirr.instance-templates=1 jt+nn,1 dn+tt
whirr.provider=cloudservers
whirr.identity=<rackspace-id>
whirr.credential=<rackspace-api-password>
#whirr.private-key-file=/home/hadoop/.ssh/id_rsa
#whirr.public-key-file=/home/hadoop/.ssh/id_rsa.pub

# Uncomment these lines to run CDH
whirr.hadoop-install-runurl=cloudera/cdh/install
whirr.hadoop-configure-runurl=cloudera/cdh/post-configure

# The size of the instance to use. See http://www.rackspacecloud.com/cloud_hosting_products/serv$
# id 3: 1GB, 1 virtual core
# id 4: 2GB, 2 virtual cores
# id 5: 4GB, 2 virtual cores
# id 6: 8GB, 4 virtual cores
# id 7: 15.5GB, 4 virtual cores
whirr.hardware-id=4

# Ubuntu 10.04 LTS Lucid
whirr.image-id=49

________________________________________
From: ext Tom White [...@cloudera.com]
Sent: Monday, January 10, 2011 7:03 PM
To: Peddi Praveen (Nokia-MS/Boston)
Cc: ham...@cloudera.com; whirr-user@incubator.apache.org
Subject: Re: Dynamic creation and destroying hadoop on Rackspace

Can you post your Whirr properties file please (with credentials removed)?

Thanks
Tom

On Mon, Jan 10, 2011 at 3:59 PM, <praveen.pe...@nokia.com> wrote:
> I am using the latest Whirr. For hadoop, I actually specified the Cloudera URL in
> the properties file, but on the master hadoop machine I saw references to
> hadoop-0.20. The OS of my client is CentOS, but I am going with the default OS for
> hadoop, which is Ubuntu 10.04.
>
> On Jan 10, 2011, at 6:38 PM, "ext Tom White" <t...@cloudera.com> wrote:
>
>> On Mon, Jan 10, 2011 at 2:22 PM, <praveen.pe...@nokia.com> wrote:
>>> It looks like Hadoop was installed but never started on the master node. There
>>> were no files under /var/log/hadoop on the master node either.
>>>
>>> r...@hadoop-master:~# netstat -a | grep 50030 returns nothing
>>>
>>> Does Whirr install and start Hadoop as "root"? Is that the problem?
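(For context on the root question: Hadoop's daemon scripts abort when started as root unless each daemon is told which user to run as. A sketch of the hadoop-env.sh additions that satisfy that check; the `hadoop` user name and the config path are assumptions, not taken from this thread:)

```shell
# Sketch: additions to hadoop-env.sh (e.g. /etc/hadoop/conf/hadoop-env.sh)
# so that start-all.sh run as root drops to an unprivileged user per daemon.
# The "hadoop" user is an assumption; use whichever user owns the Hadoop dirs.
export HADOOP_NAMENODE_USER=hadoop
export HADOOP_SECONDARYNAMENODE_USER=hadoop
export HADOOP_DATANODE_USER=hadoop
export HADOOP_JOBTRACKER_USER=hadoop
export HADOOP_TASKTRACKER_USER=hadoop
```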
>>> When I try to start Hadoop manually from the hadoop master, I see the following:
>>>
>>> --------------------------------
>>> r...@hadoop-master:~# /etc/alternatives/hadoop-lib/bin/start-all.sh
>>> starting namenode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-namenode-184-106-96-62.static.cloud-ips.com.out
>>> May not run daemons as root. Please specify HADOOP_NAMENODE_USER
>>
>> That's the problem. Which versions of Whirr and Hadoop, and which OS, are you using?
>>
>> Tom
>>
>>> The authenticity of host 'localhost (127.0.0.1)' can't be established.
>>> RSA key fingerprint is d4:3c:55:4d:76:62:3d:b2:e1:74:a7:6f:bf:92:ab:3d.
>>> Are you sure you want to continue connecting (yes/no)? yes
>>> localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
>>> localhost: starting datanode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-datanode-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_DATANODE_USER
>>> localhost: starting secondarynamenode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-secondarynamenode-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_SECONDARYNAMENODE_USER
>>> starting jobtracker, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-jobtracker-184-106-96-62.static.cloud-ips.com.out
>>> May not run daemons as root. Please specify HADOOP_JOBTRACKER_USER
>>> localhost: starting tasktracker, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-tasktracker-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_TASKTRACKER_USER
>>> --------------------------------
>>>
>>> Praveen
>>>
>>> -----Original Message-----
>>> From: ext Tom White [mailto:t...@cloudera.com]
>>> Sent: Monday, January 10, 2011 5:08 PM
>>> To: Peddi Praveen (Nokia-MS/Boston)
>>> Cc: ham...@cloudera.com; whirr-user@incubator.apache.org
>>> Subject: Re: Dynamic creation and destroying hadoop on Rackspace
>>>
>>> Can you connect to the jobtracker UI? It's running on the master, port
>>> 50030. You can also ssh into the machine and look at the logs under
>>> /var/log/hadoop to see if there are any errors.
>>>
>>> Tom
>>>
>>> On Mon, Jan 10, 2011 at 12:33 PM, <praveen.pe...@nokia.com> wrote:
>>>> Hi Tom,
>>>> Thank you very much for your response. We were able to figure out how to
>>>> launch and destroy the cluster using the command-line tool. We haven't
>>>> tried the Java client yet (we will do so soon). But with the command-line tool,
>>>> we could not access hadoop fs or any of the hadoop commands. We also ran the
>>>> proxy script. Here is the error I am getting. My client node is not able
>>>> to talk to the hadoop master node. We tried as the hadoop user and as root,
>>>> but no luck. Do you think we are missing anything?
>>>>
>>>> [r...@hadoop-master ~]# /usr/local/software/hadoop/bin/hadoop fs -lsr /
>>>> 11/01/10 20:29:17 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
>>>> 11/01/10 20:29:18 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 0 time(s).
>>>> 11/01/10 20:29:19 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 1 time(s).
>>>> 11/01/10 20:29:20 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 2 time(s).
>>>> 11/01/10 20:29:21 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 3 time(s).
>>>> 11/01/10 20:29:22 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 4 time(s).
>>>> 11/01/10 20:29:23 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 5 time(s).
>>>> 11/01/10 20:29:24 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 6 time(s).
>>>> 11/01/10 20:29:25 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 7 time(s).
>>>> 11/01/10 20:29:26 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 8 time(s).
>>>> 11/01/10 20:29:27 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 9 time(s).
>>>>
>>>> I should say Whirr is cool so far!
>>>>
>>>> Thanks again
>>>> Praveen
>>>>
>>>> -----Original Message-----
>>>> From: ext Tom White [mailto:t...@cloudera.com]
>>>> Sent: Monday, January 10, 2011 2:23 PM
>>>> To: Peddi Praveen (Nokia-MS/Boston)
>>>> Cc: whirr-user@incubator.apache.org; ham...@cloudera.com
>>>> Subject: Re: Dynamic creation and destroying hadoop on Rackspace
>>>>
>>>> Hi Praveen,
>>>>
>>>> You should be able to do exactly this using Whirr. There's not a lot of
>>>> documentation describing what you want to do, but I recommend you start
>>>> by having a look at http://incubator.apache.org/whirr/.
>>>> The Hadoop unit tests will show you how to start and stop a cluster
>>>> from Java and submit a job. E.g.
>>>>
>>>> http://svn.apache.org/repos/asf/incubator/whirr/trunk/services/hadoop/src/test/java/org/apache/whirr/service/hadoop/integration/HadoopServiceController.java
>>>>
>>>> Finally, check out the recipes for advice on setting configuration for Rackspace:
>>>> http://svn.apache.org/repos/asf/incubator/whirr/trunk/recipes/hadoop-rackspace.properties
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>> On Mon, Jan 10, 2011 at 10:27 AM, <praveen.pe...@nokia.com> wrote:
>>>>> Hello all,
>>>>> We have a few Hadoop jobs that we are running on the Rackspace cloud.
>>>>> The jobs run for a total of 3 to 5 hours a day. Currently I have
>>>>> manually installed and configured Hadoop on Rackspace, which is a
>>>>> laborious process (especially given that we have about 10
>>>>> environments that we need to configure). So my question is about
>>>>> automatic creation and destruction of a Hadoop cluster using a program
>>>>> (preferably Java).
>>>>> Here is my current deployment:
>>>>>
>>>>> Glassfish (Node 1)
>>>>> MySQL (Node 2)
>>>>> Hadoop with 1 master and 5 slaves (Nodes 3 to 8)
>>>>>
>>>>> We can install Glassfish and MySQL manually, but we would like to
>>>>> dynamically create/install the Hadoop cluster, start the servers, run
>>>>> jobs, and then destroy the cluster on the cloud. The primary purpose of
>>>>> doing this is to make deployment easy and save costs. Since the jobs
>>>>> run for only a few hours a day, we don't want to have Hadoop running on
>>>>> the cloud for the whole day.
>>>>>
>>>>> Jeff Hammerbacher from Cloudera had suggested I look at Whirr, and he
>>>>> was positive that I can do the above steps using Whirr. Has anyone
>>>>> done this using Whirr on Rackspace? I could not find any examples of
>>>>> how to dynamically install a Hadoop cluster on Rackspace. Any
>>>>> information on this task would be greatly appreciated.
>>>>>
>>>>> Thanks
>>>>> Praveen
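(Pulling the thread together: the launch / proxy / fs-access / destroy flow discussed above would look roughly like the following with the Whirr CLI. This is a sketch assuming a Whirr 0.x incubating checkout and the cluster name `relevancycluster` from the properties file earlier in the thread; exact command flags and the `~/.whirr` output location should be checked against your Whirr release:)

```shell
# 1. Launch the cluster defined in the properties file (assumed to be
#    saved as hadoop.properties):
bin/whirr launch-cluster --config hadoop.properties

# 2. After launch, Whirr writes client-side configuration (hadoop-site.xml
#    and an SSH SOCKS proxy script) under ~/.whirr/<cluster-name>/.
#    Run the proxy in the background and point the local Hadoop client at
#    the generated config before using `hadoop fs` -- otherwise the client
#    keeps retrying the namenode on port 8020, as in the log above:
sh ~/.whirr/relevancycluster/hadoop-proxy.sh &
export HADOOP_CONF_DIR=~/.whirr/relevancycluster
hadoop fs -lsr /

# 3. When the day's jobs finish, destroy the cluster so the Rackspace
#    nodes stop accruing charges:
bin/whirr destroy-cluster --config hadoop.properties
```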