Here are the properties. Please note that I also tried without specifying whirr.hadoop-install-runurl and got the same problem.
whirr.service-name=hadoop
whirr.cluster-name=relevancycluster
whirr.instance-templates=1 jt+nn,1 dn+tt
whirr.provider=cloudservers
whirr.identity=<rackspace-id>
whirr.credential=<rackspace-api-password>
#whirr.private-key-file=/home/hadoop/.ssh/id_rsa
#whirr.public-key-file=/home/hadoop/.ssh/id_rsa.pub

# Uncomment these lines to run CDH
whirr.hadoop-install-runurl=cloudera/cdh/install
whirr.hadoop-configure-runurl=cloudera/cdh/post-configure

# The size of the instance to use. See http://www.rackspacecloud.com/cloud_hosting_products/serv$
# id 3: 1GB, 1 virtual core
# id 4: 2GB, 2 virtual cores
# id 5: 4GB, 2 virtual cores
# id 6: 8GB, 4 virtual cores
# id 7: 15.5GB, 4 virtual cores
whirr.hardware-id=4

# Ubuntu 10.04 LTS Lucid
whirr.image-id=49

________________________________________
From: ext Tom White [...@cloudera.com]
Sent: Monday, January 10, 2011 7:03 PM
To: Peddi Praveen (Nokia-MS/Boston)
Cc: ham...@cloudera.com; whirr-user@incubator.apache.org
Subject: Re: Dynamic creation and destroying hadoop on Rackspace

Can you post your Whirr properties file please (with credentials removed)?

Thanks
Tom

On Mon, Jan 10, 2011 at 3:59 PM, <praveen.pe...@nokia.com> wrote:
> I am using the latest Whirr. For hadoop, I actually specified the Cloudera URL in
> the properties file, but on the master hadoop machine I saw references to
> hadoop-0.20. The OS of my client is CentOS, but I am going with the default OS for
> hadoop, which is Ubuntu 10.04.
>
> On Jan 10, 2011, at 6:38 PM, "ext Tom White" <t...@cloudera.com> wrote:
>
>> On Mon, Jan 10, 2011 at 2:22 PM, <praveen.pe...@nokia.com> wrote:
>>> It looks like Hadoop was installed but never started on the master node. There
>>> were no files under /var/log/hadoop on the master node either.
>>>
>>> r...@hadoop-master:~# netstat -a | grep 50030 returns nothing
>>>
>>> Does Whirr install and start Hadoop as "root"? Is that the problem?
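(For context on the root question: Hadoop's daemon scripts abort when started as root unless each daemon is told which user to run as. A sketch of the hadoop-env.sh additions that satisfy that check; the `hadoop` user name and the config path are assumptions, not taken from this thread:)

```shell
# Sketch: additions to hadoop-env.sh (e.g. /etc/hadoop/conf/hadoop-env.sh)
# so that start-all.sh run as root drops to an unprivileged user per daemon.
# The "hadoop" user is an assumption; use whichever user owns the Hadoop dirs.
export HADOOP_NAMENODE_USER=hadoop
export HADOOP_SECONDARYNAMENODE_USER=hadoop
export HADOOP_DATANODE_USER=hadoop
export HADOOP_JOBTRACKER_USER=hadoop
export HADOOP_TASKTRACKER_USER=hadoop
```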
>>> When I try to start Hadoop manually from the hadoop master, I see the following:
>>>
>>> --------------------------------
>>> r...@hadoop-master:~# /etc/alternatives/hadoop-lib/bin/start-all.sh
>>> starting namenode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-namenode-184-106-96-62.static.cloud-ips.com.out
>>> May not run daemons as root. Please specify HADOOP_NAMENODE_USER
>>
>> That's the problem. Which versions of Whirr and Hadoop, and which OS, are you using?
>>
>> Tom
>>
>>> The authenticity of host 'localhost (127.0.0.1)' can't be established.
>>> RSA key fingerprint is d4:3c:55:4d:76:62:3d:b2:e1:74:a7:6f:bf:92:ab:3d.
>>> Are you sure you want to continue connecting (yes/no)? yes
>>> localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
>>> localhost: starting datanode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-datanode-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_DATANODE_USER
>>> localhost: starting secondarynamenode, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-secondarynamenode-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_SECONDARYNAMENODE_USER
>>> starting jobtracker, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-jobtracker-184-106-96-62.static.cloud-ips.com.out
>>> May not run daemons as root. Please specify HADOOP_JOBTRACKER_USER
>>> localhost: starting tasktracker, logging to /etc/alternatives/hadoop-lib/bin/../logs/hadoop-root-tasktracker-184-106-96-62.static.cloud-ips.com.out
>>> localhost: May not run daemons as root. Please specify HADOOP_TASKTRACKER_USER
>>> --------------------------------
>>>
>>> Praveen
>>>
>>> -----Original Message-----
>>> From: ext Tom White [mailto:t...@cloudera.com]
>>> Sent: Monday, January 10, 2011 5:08 PM
>>> To: Peddi Praveen (Nokia-MS/Boston)
>>> Cc: ham...@cloudera.com; whirr-user@incubator.apache.org
>>> Subject: Re: Dynamic creation and destroying hadoop on Rackspace
>>>
>>> Can you connect to the jobtracker UI? It's running on the master, port
>>> 50030. You can also ssh into the machine and look at the logs under
>>> /var/log/hadoop to see if there are any errors.
>>>
>>> Tom
>>>
>>> On Mon, Jan 10, 2011 at 12:33 PM, <praveen.pe...@nokia.com> wrote:
>>>> Hi Tom,
>>>> Thank you very much for your response. We were able to figure out how to
>>>> launch and destroy the cluster using the command-line tool. We haven't
>>>> tried the Java client yet (we will do so soon). But with the command-line tool,
>>>> we could not access hadoop fs or any of the hadoop commands. We also ran the
>>>> proxy script. Here is the error I am getting. My client node is not able
>>>> to talk to the hadoop master node. We tried as the hadoop user and as root,
>>>> but no luck. Do you think we are missing anything?
>>>>
>>>> [r...@hadoop-master ~]# /usr/local/software/hadoop/bin/hadoop fs -lsr /
>>>> 11/01/10 20:29:17 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
>>>> 11/01/10 20:29:18 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 0 time(s).
>>>> 11/01/10 20:29:19 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 1 time(s).
>>>> 11/01/10 20:29:20 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 2 time(s).
>>>> 11/01/10 20:29:21 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 3 time(s).
>>>> 11/01/10 20:29:22 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 4 time(s).
>>>> 11/01/10 20:29:23 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 5 time(s).
>>>> 11/01/10 20:29:24 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 6 time(s).
>>>> 11/01/10 20:29:25 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 7 time(s).
>>>> 11/01/10 20:29:26 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 8 time(s).
>>>> 11/01/10 20:29:27 INFO ipc.Client: Retrying connect to server: 184-106-158-27.static.cloud-ips.com/184.106.158.27:8020. Already tried 9 time(s).
>>>>
>>>> I should say Whirr is cool so far!
>>>>
>>>> Thanks again
>>>> Praveen
>>>>
>>>> -----Original Message-----
>>>> From: ext Tom White [mailto:t...@cloudera.com]
>>>> Sent: Monday, January 10, 2011 2:23 PM
>>>> To: Peddi Praveen (Nokia-MS/Boston)
>>>> Cc: whirr-user@incubator.apache.org; ham...@cloudera.com
>>>> Subject: Re: Dynamic creation and destroying hadoop on Rackspace
>>>>
>>>> Hi Praveen,
>>>>
>>>> You should be able to do exactly this using Whirr. There's not a lot of
>>>> documentation describing what you want to do, but I recommend you start
>>>> by having a look at http://incubator.apache.org/whirr/.
>>>> The Hadoop unit tests will show you how to start and stop a cluster
>>>> from Java and submit a job. E.g.
>>>>
>>>> http://svn.apache.org/repos/asf/incubator/whirr/trunk/services/hadoop/src/test/java/org/apache/whirr/service/hadoop/integration/HadoopServiceController.java
>>>>
>>>> Finally, check out the recipes for advice on setting configuration for Rackspace:
>>>> http://svn.apache.org/repos/asf/incubator/whirr/trunk/recipes/hadoop-rackspace.properties
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>> On Mon, Jan 10, 2011 at 10:27 AM, <praveen.pe...@nokia.com> wrote:
>>>>> Hello all,
>>>>> We have a few Hadoop jobs that we are running on the Rackspace cloud.
>>>>> The jobs run for a total of 3 to 5 hours a day. Currently I have
>>>>> manually installed and configured Hadoop on Rackspace, which is a
>>>>> laborious process (especially given that we have about 10
>>>>> environments that we need to configure). So my question is about
>>>>> automatic creation and destruction of a Hadoop cluster using a program
>>>>> (preferably Java).
>>>>> Here is my current deployment:
>>>>>
>>>>> Glassfish (Node 1)
>>>>> MySQL (Node 2)
>>>>> Hadoop with 1 master and 5 slaves (Nodes 3 to 8)
>>>>>
>>>>> We can install Glassfish and MySQL manually, but we would like to
>>>>> dynamically create/install the Hadoop cluster, start the servers, run
>>>>> jobs, and then destroy the cluster on the cloud. The primary purpose of
>>>>> doing this is to make deployment easy and save costs. Since the jobs
>>>>> run for only a few hours a day, we don't want to have Hadoop running on
>>>>> the cloud for the whole day.
>>>>>
>>>>> Jeff Hammerbacher from Cloudera had suggested I look at Whirr, and he
>>>>> was positive that I can do the above steps using Whirr. Has anyone
>>>>> done this using Whirr on Rackspace? I could not find any examples of
>>>>> how to dynamically install a Hadoop cluster on Rackspace. Any
>>>>> information on this task would be greatly appreciated.
>>>>>
>>>>> Thanks
>>>>> Praveen
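(Pulling the thread together: the launch / proxy / fs-access / destroy flow discussed above would look roughly like the following with the Whirr CLI. This is a sketch assuming a Whirr 0.x incubating checkout and the cluster name `relevancycluster` from the properties file earlier in the thread; exact command flags and the `~/.whirr` output location should be checked against your Whirr release:)

```shell
# 1. Launch the cluster defined in the properties file (assumed to be
#    saved as hadoop.properties):
bin/whirr launch-cluster --config hadoop.properties

# 2. After launch, Whirr writes client-side configuration (hadoop-site.xml
#    and an SSH SOCKS proxy script) under ~/.whirr/<cluster-name>/.
#    Run the proxy in the background and point the local Hadoop client at
#    the generated config before using `hadoop fs` -- otherwise the client
#    keeps retrying the namenode on port 8020, as in the log above:
sh ~/.whirr/relevancycluster/hadoop-proxy.sh &
export HADOOP_CONF_DIR=~/.whirr/relevancycluster
hadoop fs -lsr /

# 3. When the day's jobs finish, destroy the cluster so the Rackspace
#    nodes stop accruing charges:
bin/whirr destroy-cluster --config hadoop.properties
```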