Hi Prasen,

The second script (2) is now in the Hadoop Common repository, in src/contrib/cloud. This is where the development effort is focused, and the older bash scripts (1) will be deprecated over time (HADOOP-6403). The new cloud scripts are designed to support multiple cloud providers, as well as advanced features like Amazon's EBS, which the older scripts never supported. Indeed, these features would be difficult to support in bash, which is why the new scripts are written in Python. Being able to take advantage of libcloud (http://incubator.apache.org/libcloud/) makes it feasible to support more providers through a uniform interface.

It's true that boto is a dependency (although one that is straightforward to install), but you shouldn't need to install simplejson unless you are using EBS (I haven't checked whether this is actually the case, but if it isn't, that should be considered a bug). As for the error you are getting, you seem to have set the environment variable correctly, so I wonder whether it has to do with the version of boto you are using. I have only used the scripts with version 1.8d; 1.9b came out recently, and I haven't tried them with that version.

Cheers,
Tom

On Sun, Jan 17, 2010 at 10:23 PM, Chandraprakash Bhagtani <cpbhagt...@gmail.com> wrote:

> you need to set following environment variables
>
> - AWS_ACCESS_KEY_ID - Your AWS Access Key ID
> - AWS_SECRET_ACCESS_KEY - Your AWS Secret Access Key
>
> On Mon, Jan 18, 2010 at 11:12 AM, prasenjit mukherjee <prasen....@gmail.com> wrote:
>
>> Thanks for the suggestion. Now I am getting the following error with
>> cloudera's distro. I have set AWS_SECRET_KEY appropriately though. Any
>> pointers :
>>
>> pmukher...@ubuntu:~/apps/cloudera-for-hadoop-on-ec2-py-0.2.0-beta$ echo $AWS_SECRET_ACCESS_KEY
>> <.........snipped......................................>
>> pmukher...@ubuntu:~/apps/cloudera-for-hadoop-on-ec2-py-0.2.0-beta$ ./hadoop-ec2 list
>> Traceback (most recent call last):
>>   File "./hadoop-ec2", line 124, in <module>
>>     list_all()
>>   File "/home/pmukherjee/apps/cloudera-for-hadoop-on-ec2-py-0.2.0-beta/hadoop/ec2/commands.py", line 43, in list_all
>>     clusters = get_clusters_with_role(MASTER)
>>   File "/home/pmukherjee/apps/cloudera-for-hadoop-on-ec2-py-0.2.0-beta/hadoop/ec2/cluster.py", line 29, in get_clusters_with_role
>>     all = EC2Connection().get_all_instances()
>>   File "/usr/lib/python2.5/site-packages/boto/ec2/connection.py", line 69, in __init__
>>     self.region.endpoint, debug, https_connection_factory, path)
>>   File "/usr/lib/python2.5/site-packages/boto/connection.py", line 446, in __init__
>>     debug, https_connection_factory, path)
>>   File "/usr/lib/python2.5/site-packages/boto/connection.py", line 169, in __init__
>>     self.hmac = hmac.new(self.aws_secret_access_key, digestmod=sha)
>> AttributeError: EC2Connection instance has no attribute 'aws_secret_access_key'
>>
>> -Prasen
>>
>> On Mon, Jan 18, 2010 at 10:47 AM, Zak Stone <zst...@gmail.com> wrote:
>> > In my experience, the Cloudera distributions are excellent, actively
>> > developed, and well-supported.
>> >
>> > Zak
>> >
>> > On Mon, Jan 18, 2010 at 12:01 AM, Mark Kerzner <markkerz...@gmail.com> wrote:
>> >> My personal experience led me to prefer cloudera. Can't talk for every
>> >> situation, but for me the hadoop distro had many bugs and was unreliable.
>> >>
>> >> Mark
>> >>
>> >> On Sun, Jan 17, 2010 at 10:58 PM, prasenjit <prasen....@gmail.com> wrote:
>> >>
>> >>> It seems there are 2 hadoop-ec2 scripts:
>> >>>
>> >>> 1) One which comes along with the hadoop distro :
>> >>> <hadoop>/src/contrib/ec2/bin/hadoop-ec2
>> >>>
>> >>> 2) Another which is downloadable from
>> >>> http://cloudera-packages.s3.amazonaws.com/cloudera-for-hadoop-on-ec2-py-0.3.0-beta.tar.gz
>> >>> and is from cloudera folks.
>> >>>
>> >>> I prefer using the base hadoop, as I want to avoid dependencies on
>> >>> boto/simplejson which is required for (2). My question is are they planned
>> >>> to kept in sync. Which one is under active development and hence suggested
>> >>> ( given my preference for hadoop's contrib package ) for stable use ?
>> >>>
>> >>> -Thanks,
>> >>> Prasen
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>> http://old.nabble.com/Which-instance-type-on-Amazon-EC2--tp25667297p27206207.html
>> >>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
> --
> Thanks & Regards,
> Chandra Prakash Bhagtani,
> Impetus Infotech (india) Pvt Ltd.
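
P.S. One thing that may be worth ruling out for the AttributeError in the quoted traceback: echo also prints variables that are set in the shell but not exported, and those are not visible to the Python process that runs hadoop-ec2. This is only a guess, not something confirmed in this thread. A minimal sketch that reports which of the two variables listed above are missing from Python's own environment (the helper name missing_aws_vars is made up for illustration and is not part of boto or the cloud scripts):

```python
import os

# The two variable names mentioned earlier in this thread.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None):
    """Return the names in REQUIRED that are absent or empty in environ.

    Hypothetical helper for this email only; it does not call boto.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Not visible to Python: " + ", ".join(missing))
    else:
        print("Both AWS credential variables are visible to Python")
```

If this reports a variable as missing even though echo shows a value in the same shell, exporting the variable before running ./hadoop-ec2 list would be the first thing to try.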