[ https://issues.apache.org/jira/browse/WHIRR-88?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051517#comment-13051517 ]

Tibor Kiss edited comment on WHIRR-88 at 6/18/11 2:06 PM:
----------------------------------------------------------

I've experimented with a few image creation methods. There is no straightforward 
approach currently, so I started experimenting with my CDH service.
At first I used a purely manual image creation method. Starting from a freshly 
launched minimal cluster, I worked only on the master node, where I manually 
installed all the software packages that may exist on any of the instance 
template types in that cluster. Then I disabled all the services, such as 
hadoop-0.20-*, with chkconfig ... off, and manually cleaned up the instance:
{quote}
# stop any running Hadoop daemons
for service in /etc/init.d/hadoop-0.20-*; do sudo $service stop; done
# make sure the worker packages are present as well
yum install hadoop-0.20-datanode
yum install hadoop-0.20-tasktracker
# disable auto-start for every daemon; roles are enabled later at configure time
chkconfig hadoop-0.20-datanode off
chkconfig hadoop-0.20-jobtracker off
chkconfig hadoop-0.20-namenode off
chkconfig hadoop-0.20-tasktracker off
# scrub shell history and logs so they don't leak into the image
rm -f /root/.*hist* $HOME/.*hist*
rm -f /var/log/*.gz
find /var/log -name mysql -prune -o -type f -print | while read i; do sudo cp /dev/null $i; done
rm -f /var/log/oozie/*
rm -f /var/log/hadoop/*
rm -rf /var/log/hadoop/history
rm -rf /var/log/hadoop/userlogs
{quote}

Then I made some changes to /etc/sudoers to allow login as ec2-user 
(I used Amazon Linux). Note that the Whirr scripts overwrite this file, 
so we need to add the entry again at this step.
{quote}
# carry the cluster's authorized key over to ec2-user and let it sudo without a password
cat /home/users/web/.ssh/authorized_keys >> /home/ec2-user/.ssh/authorized_keys
echo "ec2-user ALL = (ALL) NOPASSWD: ALL" >> /etc/sudoers
{quote}

Then I logged in as ec2-user, became root with sudo su - root, and removed the cluster-specific state:
{quote}
# remove the whirr-created user and cluster-specific configuration
userdel --remove web
rm -rf /etc/hadoop/conf.dist
rm -rf /mnt/perf
rm -rf /data/perf
rm -f /home/ec2-user/setup-web
# clear temporary files left behind by Jetty, the JVM and jclouds
rm -rf /tmp/Jetty*
rm -rf /tmp/hsperf*
rm -rf /tmp/jclouds*
rm -rf /tmp/logs
rm -f /tmp/*
{quote}

While creating the AMI, I excluded 
/root/.ssh, /home/ec2-user/.ssh, /data, /data0 and /data1 (in fact all /data*).
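
For an instance-store image this exclusion can be expressed with the EC2 AMI tools' exclude option; a sketch of the bundling call, with placeholder key/certificate paths (the account id here matches the jclouds.ec2.ami-owners value below):
{quote}
ec2-bundle-vol -k /tmp/pk.pem -c /tmp/cert.pem -u 123456789012 \
  -e /root/.ssh,/home/ec2-user/.ssh,/data,/data0,/data1
{quote}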

Then I planned to switch from 
{quote}
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
{quote}
to
{quote}
whirr.hadoop-install-function=prepare_cdh_hadoop
whirr.hadoop-configure-function=reconfigure_cdh_hadoop
whirr.image-id=us-east-1/ami-12345678
jclouds.ec2.ami-owners=123456789012
whirr.login-user=ec2-user
{quote}

Of course, that change required rewriting most of the Hadoop-related scripts in 
order to get such a separation (install/prepare and configure/reconfigure).
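
To illustrate the split, here is a rough sketch of what the two functions do; the function names come from the configuration above, but the bodies are hypothetical sketches, not the actual patch:
{quote}
# "install" phase on a pre-baked image: the packages are already there,
# so just verify the image and fail fast if it doesn't match
function prepare_cdh_hadoop() {
  rpm -q hadoop-0.20 > /dev/null || { echo "image lacks hadoop-0.20" >&2; return 1; }
}

# "configure" phase: write the configs and enable only the daemons
# that this instance's role requires
function reconfigure_cdh_hadoop() {
  local role=$1  # e.g. namenode, jobtracker, datanode, tasktracker
  # ... generate /etc/hadoop/conf from the cluster metadata ...
  chkconfig hadoop-0.20-$role on
  service hadoop-0.20-$role start
}
{quote}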

Unfortunately, I had to modify not only the functions of the service in 
question, but also functions from core. In core I made 
core/src/main/resources/functions/install_java.sh idempotent: both functions, 
install_java_rpm and install_java_deb, got a guard, {quote}if 
[ ! -e $JDK_INSTALL_PATH ]{quote} and {quote}if [ ! -e 
/usr/lib/jvm/java-6-sun ]{quote} respectively, closed just before the {quote}echo "export 
JAVA_HOME{quote} line.
In the same way, 
services/hadoop/src/main/resources/functions/install_hadoop.sh got a 
guard {quote}if [ ! -e $HADOOP_HOME ]{quote} immediately after the line 
{quote}HADOOP_HOME=/usr/local/$(basename $HADOOP_TAR_URL .tar.gz){quote}. (Note 
that I don't really understand why the Apache version is unpacked while I am 
installing the CDH-based services. Maybe it is just a bug!)
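
Concretely, the guard in install_hadoop.sh looks roughly like this (the body inside the if is an illustrative stand-in for the script's existing download-and-unpack logic):
{quote}
HADOOP_HOME=/usr/local/$(basename $HADOOP_TAR_URL .tar.gz)
# skip the download on a pre-baked image where the directory already exists
if [ ! -e $HADOOP_HOME ]; then
  # illustrative stand-in for the existing steps
  curl -L "$HADOOP_TAR_URL" | tar xz -C /usr/local
fi
{quote}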

In the end, my functions directory contained the following files:
{quote}configure_cdh_hadoop.sh  install_cdh_hadoop.sh  install_hadoop.sh  
install_java.sh  prepare_cdh_hadoop.sh  reconfigure_cdh_hadoop.sh{quote}

With the configuration switched as described earlier, I was able to fire up my 
CDH-based Hadoop cluster from my manually created private AMI.
I haven't described all of the changes in detail here, but in the end I got a 
cluster up and running without reinstalling everything from scratch.

In my opinion this is not really a solid approach, and not just because the 
image was created manually. For example, I removed the web user just to cope 
with the tricky startup steps Whirr performs: it starts with the default 
ec2-user, then runs a setup-web script which creates the web user. Without 
rewriting this tricky startup, which is implemented in core, my only option was 
to remove the web user, so that it gets recreated when the instance is started. 
I don't really understand all the motivations behind this initialization, but 
I'm sure it complicates any attempt to create private images.

From a higher perspective, my opinion regarding image creation is that we don't 
need a new command that builds an image from scratch the way cluster creation 
does, but a different approach: fire up the cluster as it is now, run your 
tests against it, and if you like it, persist the cluster. Later on you can 
choose to start your cluster from your own image or by building from scratch. 
Maybe we need to add a whirr.hadoop-cleanup-function which would be used when 
cleaning up. Most of the changes can be made in the scripts: either each entry 
function gets two entry points, or all of them are made idempotent with respect 
to whether the install and configure steps have already been applied.
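
Such a cleanup function would essentially automate the manual scrubbing described above; a minimal sketch, assuming the property and function name are adopted as proposed:
{quote}
function cleanup_cdh_hadoop() {
  # stop the daemons and disable auto-start so the persisted image boots neutral
  for service in /etc/init.d/hadoop-0.20-*; do
    $service stop
    chkconfig $(basename $service) off
  done
  # scrub state that must not leak into the image
  rm -rf /var/log/hadoop/* /tmp/jclouds* /tmp/hsperf* /tmp/Jetty*
}
{quote}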

The process of producing a single image that contains all the software from 
every template combination is getting complicated; in the case of Hadoop, for 
instance, we would want to persist every kind of Hadoop daemon preinstalled but 
not set to auto-start. Who knows what rules may arise? And there may be 
scenarios where more than one common private image needs to be created.
Anyway, this request raises several questions.

> Support image creation
> ----------------------
>
>                 Key: WHIRR-88
>                 URL: https://issues.apache.org/jira/browse/WHIRR-88
>             Project: Whirr
>          Issue Type: New Feature
>          Components: core
>            Reporter: Tom White
>
> Much of the time taken to start a cluster is in installing the software on 
> the instances. By allowing users to build their own images it would make 
> cluster launches faster. The way this could work is by having a create image 
> step that brings up an instance and runs the install scripts on it before 
> creating an image from it. The resulting image would then be used in 
> subsequent launches.
