I have a 20GB EBS volume with the code and Hadoop already checked out. I just
start any instance, install svn, mvn, and java (2-3 minutes), and start Hadoop.


Robin


On Mon, Jan 11, 2010 at 8:11 AM, Ted Dunning <[email protected]> wrote:

> On Sun, Jan 10, 2010 at 6:08 PM, Grant Ingersoll <[email protected]
> >wrote:
>
> > Can you share the script, obviously removing the part for your prop.
> > software?
>
>
> Sure.  Apologies for *really* ugly code.  Below is the launch script, with
> some boring and *secret* bits expunged.  The only real drawback of this sort
> of approach is that the startup script needs to have stuff injected into it
> for different kinds of servers.  Doing it over, I would make a completely
> static boot script that looks to zookeeper to find out what tasks need
> doing.  Basically, by making the client-boot scripts dynamic, I was trying
> to inject configuration management via an inappropriate mechanism.  Since I
> had ZK running in the cloud already, I should have just used a real
> configuration management system instead of gross scripting.
>
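To make the "completely static boot script" idea concrete, here is a minimal sketch, not anything from the original setup. The function names fetch_tasks and run_task and the task names are all hypothetical. In particular, fetch_tasks is a stand-in that prints a fixed list; a real version would ask ZooKeeper which tasks this node should run.

```shell
#!/bin/sh
# Hypothetical placeholder: a real version would query ZooKeeper for the
# tasks assigned to this node. Here it just prints a fixed list.
fetch_tasks() {
    printf '%s\n' install-java start-datanode
}

# Hypothetical task handler: each task name maps to a fixed action.
run_task() {
    case "$1" in
        install-java)   echo "would install java" ;;
        start-datanode) echo "would start datanode" ;;
        *)              echo "unknown task: $1" >&2; return 1 ;;
    esac
}

# The boot script itself never changes; only the task list does.
# Task names are assumed to contain no whitespace.
boot() {
    for task in $(fetch_tasks); do
        run_task "$task" || return 1
    done
}
```

The point of the split is that nothing needs to be injected into the script per server type: every node boots the same file, and the differences live in whatever fetch_tasks returns.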
> You need to make sure you have a command line program on the client to
> receive any secret keys because user-data is not considered secure.
> We do this with something like this:
>
>  # Send in secrets via stdin instead of command line to avoid snoopers.
>  echo
>  echo "Running remote script..."
>  echo "Process $$ : $CLOUD_SERVER:$ZK_PORT $VOLUME $ZK_INTERNAL $INT_HOST_NAME $VOLUME_ZONE"
>
>  ssh -n -o StrictHostKeyChecking=false -i $ADMIN_KEY_RSA r...@$i \
>    "echo $AWS_API_KEY $AWS_API_SECRET | /home/client/$script $ENV $REV \
>    $CLOUD_SERVER:$ZK_PORT $VOLUME $ZK_INTERNAL $INT_HOST_NAME $VOLUME_ZONE"
>
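For the receiving end, a minimal sketch of how the remote script might pick the two values up from stdin. The function name read_secrets is hypothetical, and the sketch assumes the key and secret arrive on one line separated by a space, as in the echo above.

```shell
#!/bin/sh
# Read "KEY SECRET" from the first line of stdin, so the values are not
# part of this script's own argument list.
read_secrets() {
    IFS=' ' read -r AWS_API_KEY AWS_API_SECRET || return 1
    # Fail if either value is missing.
    [ -n "$AWS_API_KEY" ] && [ -n "$AWS_API_SECRET" ]
}
```

The remote script would call read_secrets once near the top, before anything else consumes stdin, and stop if it returns non-zero.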
>
> Here is a less than complete excerpt of the launch script.  It should have
> most of the bits you need.  It won't run as it stands because a fair bit of
> stuff has been expunged.
>
> #!/bin/sh
>
> #########
>
> # ASSUMPTIONS:
>
> # ---- keys are available and referenced
> # cert and secret key are available and correct perms
> # client-boot.sh has been constructed to download and install all
> # necessary software
> # file named cloud-key in the current directory contains the key
>
> # obtained using
> #
> # ec2-add-keypair cloud-admin-key
>
> # ---- environment variables
> # EC2_PRIVATE_KEY=~/.ec2/pk-xx.pem
> # EC2_CERT=~/.ec2/cert-xx.pem
> # EC2_HOME points to EC2 distro directory
> # AWS_API_KEY=xx
>
> # AWS_API_SECRET=xx/+yy/zz
> # path includes $EC2_HOME/bin
>
>
> #########
> # DEFINITIONS:
> ami=ami-1c5db975
>
> . ./.cloud_client_env_settings
>
> if [ $# -eq 5 -a "$5" = "-large" ]
>
> then
>    ami="ami-b1fe19d8 -t m1.large"
> fi
>
> #########
> # This script accepts two arguments:
> # Usage: client_cloud_launch.sh 3 namenode.sh
> # the first parameter is the number of instances to launch
>
> # the second parameter is the script to start on each instance
> #
> # it will create a node start script and then launch a bunch of instances
>
> #START
>
> echo $ADMIN_KEY
> echo $ADMIN_KEY_RSA
> echo $ZK_ADMIN_KEY
>
> echo $ENV
> echo $REV
>
> # please pay attention to this key pair, used for creating instances;
> # you should have the correct $ADMIN_KEY_RSA to go with this key pair
> KEY_PAIR=cloud-admin-key
>
>
> ZK_GROUP_NAME=zk_cluster
>
> CLIENT_GROUP_NAME=zk_client
> cluster_size=$1
> ZK_PORT=4099
>
> VOLUME_ZONE=us-east-1a
> VOLUME=...
> TIMEOUT=600
>
> start=$(date +%s)
> echo started at $(date)
>
> # try to do ec2-describe-instances to get information on what ZK
> # servers are available for use
>
> # assumptions are that ZK_GROUP_NAME is the group with which one ZK
> # cluster will be started, otherwise we would not know
> # which one is which
>
> ec2-describe-instances  > zk_instances.tmp.$$
>
> ... really silly code to do what should just be grep deleted here.
> all it does is hack zk_instances.tmp.$$ into better form in
> zk_instances.$$ ...
>
> if [ "$ALREADYREAD" = 1 ]
> then
>  sed -n "$NEXTLINE","$lineno"p zk_instances.tmp.$$ >> zk_instances.$$
> fi
>
> ZK_EXTERNAL=$(grep INSTANCE zk_instances.$$ | grep running |
>   grep $ZK_ADMIN_KEY | cut -f4 | tr '\n' '~' |
>   sed -e 's/~/:2181,/g' -e 's/,$//')
>
> ZK_INTERNAL=$(grep INSTANCE zk_instances.$$ | grep running |
>   grep $ZK_ADMIN_KEY | cut -f5 | tr '\n' '~' |
>   sed -e 's/~/:2181,/g' -e 's/,$//')
>
> CLOUD_SERVER=$(grep INSTANCE zk_instances.$$ | grep running |
>   grep $ZK_ADMIN_KEY | cut -f5 | head -1)
>
> echo $CLOUD_SERVER
>
> # launch client nodes.  This also causes client-boot.sh to be run on
> # each node.  Somebody else should have built client-boot.sh for us
> echo starting $cluster_size instances now...
> ins_start_time=$(date +%s)
>
> ec2-run-instances $ami -g $CLIENT_GROUP_NAME -k $ADMIN_KEY \
>   -f client-boot.sh -z $VOLUME_ZONE -n $cluster_size > client_instances.tmp.$$
>
> cat client_instances.tmp.$$ | grep INSTANCE | cut -f2 > client_instances.$$
>
> T1=0
> # this factor is 90% of total cluster we want to start,
> # once the number of running instance reaches this factor,
> # we will continue our job, killing rather than waiting for the last 10%
> factor=`awk -v x=$cluster_size BEGIN'{printf "%d\n",x*0.9+0.5 }'`
>
> while [ "$T1" != $cluster_size ]
> do
>        rm -f client_instances.tmp.$$
>        ec2-describe-instances | grep INSTANCE | grep running > current_running.$$
>        all_instance=`cat client_instances.$$`
>
>        T1=0
>
>        for inst in $all_instance; do
>                if [ -z "$inst" ]; then
>                        continue;
>                fi
>
>                ok=`cat current_running.$$ | grep $inst`
>                if [ -z "$ok" ]; then
>                        echo Wait a moment, $inst is not ready yet.
>
>                else
>                        T1=`expr $T1 + 1`
>                        echo $ok >> client_instances.tmp.$$
>                fi
>        done
>
>        # check timeout or not
>        ins_curr_time=$(date +%s)
>        elapse=`expr $ins_curr_time - $ins_start_time`
>
>        if [ $elapse -gt $TIMEOUT ]; then
>
>        # if at least 90% of the instances have started, we can stop
>        # waiting and continue with the rest of the process
>        if [ ! $T1 -lt $factor ]; then
>            echo We have had $T1 instances started, kill the unstarted ones...
>
>            #should KILL the unstarted instances here
>            for everyinst in $all_instance; do
>                isrunning=`cat current_running.$$ | grep $everyinst`
>                if [ -z "$isrunning" ]; then
>
>                    ec2-terminate-instances $everyinst
>                fi
>            done
>            break
>        fi
>
>                echo "We have waited for $elapse seconds, but only $T1 started," \
>                     "will not wait any more. Program will exit now!"
>
>                # before exit we need to stop all instances we planned to start
>                for inst in $all_instance; do
>                        ec2-terminate-instances $inst
>                done
>                exit 1
>        fi
>
>        echo "Waiting for most instances to be running... $T1/$cluster_size so far."
>
> done
>
> rm -f current_running.$$
>
> ins_curr_time=$(date +%s)
> elapse=`expr $ins_curr_time - $ins_start_time`
> echo "Congratulations! We have $T1 of $cluster_size running instances in $elapse seconds!"
>
>
> chmod 600 $ADMIN_KEY_RSA
>
> chmod 700 $script
>
>
>
> # I need to record the instances public name and ZK_EXTERNAL in a file
> # for uploading fasta files
> echo $ZK_EXTERNAL > .externalzk
>
>
> finished=$(date +%s)
>
> echo  completed after $(expr $finished - $start) seconds
>
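Two of the small computations in the script above are easier to follow (and to test) when pulled out into functions. This is only a sketch: the function names zk_connect_string and ready_threshold are hypothetical, and the connect-string helper uses awk rather than the tr/sed pipeline in the script, though it should produce the same host:port,host:port form.

```shell
#!/bin/sh
# Join host names (one per line on stdin) into host:port,host:port form.
# The port defaults to 2181, the value hard-coded in the launch script.
zk_connect_string() {
    port=${1:-2181}
    awk -v port="$port" \
        'NF { printf "%s%s:%s", sep, $1, port; sep = "," } END { print "" }'
}

# 90% of the requested cluster size, rounded to the nearest integer,
# using the same formula as the "factor" line in the launch script.
ready_threshold() {
    awk -v x="$1" 'BEGIN { printf "%d\n", x * 0.9 + 0.5 }'
}
```

With these, the ZK_EXTERNAL line would reduce to the grep/cut part piped into zk_connect_string, and factor would become $(ready_threshold $cluster_size).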
