Super! Thanks very much, Andrei. I appreciate the detailed response and pointers.

PD

On Wed, Nov 30, 2011 at 2:22 AM, Andrei Savu <[email protected]> wrote:

>
> On Wed, Nov 30, 2011 at 9:19 AM, Periya.Data <[email protected]> wrote:
>
>> Hi,
>> I am exploring options to deploy a small Hadoop cluster on EC2. I
>> read about Whirr and would like to try it out. I have a few questions
>> before I dive into this:
>>
>>
>> 1. My laptop currently runs Ubuntu 11.10 (Oneiric Ocelot). I am
>> running Hadoop 0.20.2+923.142 - CDH3u2 on my laptop. Is there a compatible
>> Whirr for this Hadoop release? I read this:
>>
>> http://ashenfad.blogspot.com/2011/01/hadoop-cluster-on-ec2-using-cloudera.html.
>>
>>
> That should be fine. You can get the latest Whirr release from here:
> http://www.apache.org/dyn/closer.cgi/incubator/whirr/
>
> To start a CDH3u2 Hadoop cluster customise recipes/hadoop-cdh.properties
> as needed. Make sure you uncomment the following lines:
>
> # Uncomment out these lines to run CDH
> #whirr.hadoop.install-function=install_cdh_hadoop
> #whirr.hadoop.configure-function=configure_cdh_hadoop
>
>
>> 2. Is it really necessary to have the same Hadoop version running on
>> my laptop as what the Whirr instance is using?
>>
>>
> Nope. You can use the cluster by ssh-ing into the remote nodes with no
> extra software on your local machine.
>
>>
>> 3. Is there an example Whirr config file that shows how to create an
>> EC2 instance with Hadoop 0.20.2, Hive, Sqoop and Flume? I guess I can
>> configure Whirr to download and install all the latest Hadoop ecosystem
>> tools and then create a custom AMI out of that first instance.
>>
>>
> We don't support Hive, Sqoop and Flume as Whirr services yet. Having a
> custom AMI is not really that useful for Whirr - the scripts are designed
> to work with a vanilla OS install.
>
>
>>
>>
>> Please let me know the caveats and other fine points I need to know to
>> use all the latest packages and Whirr. What should I keep in mind while I
>> begin to use Whirr?
>>
> Before doing more advanced things I recommend that you take a quick
> look at the following doc pages:
>
> * http://whirr.apache.org/docs/0.6.0/whirr-in-5-minutes.html
> * http://whirr.apache.org/docs/0.6.0/quick-start-guide.html
> * http://whirr.apache.org/docs/0.6.0/configuration-guide.html
> * http://www.oscon.com/oscon2011/public/schedule/detail/19214
>
> Also the following GitHub repos could be interesting for you:
>
> * https://github.com/tomwhite/whirr-service-example (experimental support
> for Flume)
> * https://github.com/tomwhite/whirr-scm (adds the ability to use Cloudera
> SCM to set up the cluster)
>
>> many thanks,
>>
>> PD/
>>
> Feel free to ask any questions. We can assist you as needed.
>
> Cheers,
> Andrei
>
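
For reference, a minimal sketch of how the CDH lines in recipes/hadoop-cdh.properties would look once uncommented as described above. Only these two property names come from the thread; the rest of the recipe is assumed to be left as shipped and is not shown.

```properties
# These two lines are uncommented to run CDH
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
```

Assuming the standard Whirr CLI layout from the quick-start guide linked above, the cluster would then typically be launched with something like `bin/whirr launch-cluster --config recipes/hadoop-cdh.properties`.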
