I have a 20GB EBS volume with the code and Hadoop already checked out. I just start any instance, install svn, mvn, and java (2-3 mins), and start Hadoop.
Robin

On Mon, Jan 11, 2010 at 8:11 AM, Ted Dunning <[email protected]> wrote:

> On Sun, Jan 10, 2010 at 6:08 PM, Grant Ingersoll <[email protected]> wrote:
>
> > Can you share the script, obviously removing the part for your prop.
> > software?
>
> Sure. Apologies for *really* ugly code. Below is the launch script, with
> some boring and *secret* bits expunged. The only real downfall of this
> sort of approach is that the startup script needs to have stuff injected
> into it for different kinds of servers. Doing it over, I would make a
> completely static boot script that looks to zookeeper to find out what
> tasks need doing. Basically, by making the client-boot scripts dynamic,
> I was trying to inject configuration management via an inappropriate
> mechanism. Since I had ZK running in the cloud already, I should have
> just used a real configuration management system instead of gross
> scripting.
>
> You need to make sure you have a command line program on the client to
> receive any secret keys because user-data is not considered secure.
> We do this with something like this:
>
> # Send in secrets via stdin instead of command line to avoid snoopers.
> echo
> echo "Running remote script..."
> echo "Process $$ : $CLOUD_SERVER:$ZK_PORT $VOLUME $ZK_INTERNAL $INT_HOST_NAME $VOLUME_ZONE"
>
> ssh -n -o StrictHostKeyChecking=false -i $ADMIN_KEY_RSA r...@$i "echo $AWS_API_KEY $AWS_API_SECRET | /home/client/$script $ENV $REV "$CLOUD_SERVER":"$ZK_PORT" $VOLUME $ZK_INTERNAL $INT_HOST_NAME $VOLUME_ZONE"
>
> Here is a less than complete excerpt of the launch script. It should have
> most of the bits you need. It won't run as it stands because a fair bit of
> stuff has been expunged.
>
> #!/bin/sh
>
> #########
>
> # ASSUMPTIONS:
>
> # ---- keys are available and referenced
> #    cert and secret key are available and correct perms
> #    client-boot.sh has been constructed to download and install all
> #    necessary software
> #    file named cloud-key in the current directory contains the key
> #    obtained using
> #
> #        ec2-add-keypair cloud-admin-key
>
> # ---- environment variables
> #    EC2_PRIVATE_KEY=~/.ec2/pk-xx.pem
> #    EC2_CERT=~/.ec2/cert-xx.pem
> #    EC2_HOME points to EC2 distro directory
> #    AWS_API_KEY=xx
> #    AWS_API_SECRET=xx/+yy/zz
> #    path includes $EC2_HOME/bin
>
> #########
> # DEFINITIONS:
> ami=ami-1c5db975
>
> . ./.cloud_client_env_settings
>
> if [ $# -eq 5 -a "$5" = "-large" ]
> then
>     ami="ami-b1fe19d8 -t m1.large"
> fi
>
> #########
> # This script will accept two arguments:
> # Usage: client_cloud_launch.sh 3 namenode.sh
> # the first parameter is the number of instances to launch
> # the second parameter is the script to start on each instance
> #
> # it will create a node start script and then launch a bunch of instances
>
> #START
>
> echo $ADMIN_KEY
> echo $ADMIN_KEY_RSA
> echo $ZK_ADMIN_KEY
>
> echo $ENV
> echo $REV
>
> # please pay attention to this key-pair, used for creating instances,
> # you should have the correct $ADMIN_KEY_RSA to go with this key pair
> KEY_PAIR=cloud-admin-key
>
> ZK_GROUP_NAME=zk_cluster
>
> CLIENT_GROUP_NAME=zk_client
> cluster_size=$1
> ZK_PORT=4099
>
> VOLUME_ZONE=us-east-1a
> VOLUME=...
> TIMEOUT=600
>
> start=$(date +%s)
> echo started at $(date)
>
> # try to do ec2-describe-instances to get information on what ZK
> # servers are available for use
>
> # assumptions are that ZK_GROUP_NAME is the group with which one ZK
> # cluster will be started, otherwise we would not know
> # which one is which
>
> ec2-describe-instances > zk_instances.tmp.$$
>
> ... really silly code to do what should just be grep deleted here.
> all it does is hack zk_instances.tmp.$$ into better form in
> zk_instances.$$ ...
>
> if [ "$ALREADYREAD" = 1 ]
> then
>     sed -n "$NEXTLINE","$lineno"p zk_instances.tmp.$$ >> zk_instances.$$
> fi
>
> ZK_EXTERNAL=$(grep INSTANCE zk_instances.$$ | grep running |
>     grep $ZK_ADMIN_KEY | cut -f4 | tr '\n' '~' |
>     sed -e 's/~/:2181,/g' -e 's/,$//')
>
> ZK_INTERNAL=$(grep INSTANCE zk_instances.$$ | grep running |
>     grep $ZK_ADMIN_KEY | cut -f5 | tr '\n' '~' |
>     sed -e 's/~/:2181,/g' -e 's/,$//')
>
> CLOUD_SERVER=$(grep INSTANCE zk_instances.$$ | grep running |
>     grep $ZK_ADMIN_KEY | cut -f5 | head -1)
>
> echo $CLOUD_SERVER
>
> # launch client nodes. This also causes client-boot.sh to be run on
> # each node. Somebody else should have built client-boot.sh for us
> echo starting $cluster_size instances now...
> ins_start_time=$(date +%s)
>
> ec2-run-instances $ami -g $CLIENT_GROUP_NAME -k $ADMIN_KEY -f client-boot.sh \
>     -z $VOLUME_ZONE -n $cluster_size > client_instances.tmp.$$
>
> cat client_instances.tmp.$$ | grep INSTANCE | cut -f2 > client_instances.$$
>
> T1=0
> # this factor is 90% of total cluster we want to start,
> # once the number of running instance reaches this factor,
> # we will continue our job, killing rather than waiting for the last 10%
> factor=`awk -v x=$cluster_size BEGIN'{printf "%d\n",x*0.9+0.5 }'`
>
> while [ "$T1" != $cluster_size ]
> do
>     rm -f client_instances.tmp.$$
>     ec2-describe-instances | grep INSTANCE | grep running > current_running.$$
>     all_instance=`cat client_instances.$$`
>
>     T1=0
>
>     for inst in $all_instance; do
>         if [ -z "$inst" ]; then
>             continue;
>         fi
>
>         ok=`cat current_running.$$ | grep $inst`
>         if [ -z "$ok" ]; then
>             echo Wait a moment, $inst is not ready yet.
>         else
>             T1=`expr $T1 + 1`
>             echo $ok >> client_instances.tmp.$$
>         fi
>     done
>
>     # check timeout or not
>     ins_curr_time=$(date +%s)
>     elapse=`expr $ins_curr_time - $ins_start_time`
>
>     if [ $elapse -gt $TIMEOUT ]; then
>
>         # if we have had 90% instances started, we can stop waiting
>         # and continue the following process
>         if [ ! $T1 -lt $factor ]; then
>             echo We have had $T1 instances started, kill the unstarted ones...
>
>             #should KILL the unstarted instances here
>             for everyinst in $all_instance; do
>                 isrunning=`cat current_running.$$ | grep $everyinst`
>                 if [ -z "$isrunning" ]; then
>                     ec2-terminate-instances $everyinst
>                 fi
>             done
>             break
>         fi
>
>         echo We have waited for $elapse seconds, but only $T1 started, will not wait any more. Program will exit now!
>
>         #before exit we need to stop all instances we planned to start
>         for inst in $all_instance; do
>             ec2-terminate-instances $inst
>         done
>         exit 1
>     fi
>
>     echo "Waiting for most instances to be running... $T1/$cluster_size so far."
>
> done
>
> rm -f current_running.$$
>
> ins_curr_time=$(date +%s)
> elapse=`expr $ins_curr_time - $ins_start_time`
> echo Congratulations! We have $T1 of $cluster_size running instances in $elapse seconds!
>
> chmod 600 $ADMIN_KEY_RSA
> chmod 700 $script
>
> # I need to record the instances public name and ZK_EXTERNAL in a file
> # for uploading fasta files
> echo $ZK_EXTERNAL > .externalzk
>
> finished=$(date +%s)
>
> echo completed after $(expr $finished - $start) seconds
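
The 90% cutoff in the wait loop is easy to lose in the noise, so here is a minimal sketch of just that arithmetic pulled out into two helpers. The function names min_running and enough_running are hypothetical (they are not in Ted's script); the awk expression is the same rounding the script uses for $factor.

```sh
#!/bin/sh
# Hypothetical helper (not part of the original script): prints 90% of the
# requested cluster size, rounded to the nearest integer. This is the same
# awk rounding the launch script uses to compute $factor.
min_running() {
    awk -v x="$1" 'BEGIN { printf "%d\n", x * 0.9 + 0.5 }'
}

# Hypothetical helper: succeeds when the running count ($1) has reached
# the 90% threshold for the requested cluster size ($2).
enough_running() {
    [ "$1" -ge "$(min_running "$2")" ]
}

# Example: with 10 instances requested, 9 running is enough to continue.
if enough_running 9 10; then
    echo "enough instances are up, continuing"
fi
```

After the timeout, a check like this would decide between terminating the stragglers and carrying on, or terminating everything and exiting.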

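
The quoted mail shows only the sending side of the secrets hand-off (the echo piped over ssh). As a rough sketch of what the receiving client script might do, assuming the two secrets arrive space-separated on a single line of stdin as in that echo: the function name read_secrets is hypothetical, and the real client script is not shown in the thread.

```sh
#!/bin/sh
# Hypothetical receiving side (the real client script is not shown in the
# thread). Reads "KEY SECRET" from one line of stdin so that neither value
# ever appears on a command line, where ps could expose it.
read_secrets() {
    # -r keeps backslashes literal; the IFS setting applies to this read only.
    IFS=' ' read -r AWS_API_KEY AWS_API_SECRET
    # Fail unless both values were actually supplied.
    [ -n "$AWS_API_KEY" ] && [ -n "$AWS_API_SECRET" ]
}

# Typical use near the top of the client script (left commented out here
# so the sketch does not block waiting for stdin):
#
#   read_secrets || { echo "missing AWS credentials on stdin" >&2; exit 1; }
```

The other values ($ENV, $REV, and so on) are not secret, so they can stay as ordinary positional arguments, as in the ssh command Ted quotes.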