[
https://issues.apache.org/jira/browse/HADOOP-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tom White updated HADOOP-2409:
------------------------------
Attachment: hadoop-2409.patch
I've written a patch to make it easy to install Hadoop on startup, while still
supporting builds of AMIs that have it pre-installed.
* The default behavior is to download the specified version of Hadoop from S3
and install it on startup. (Hadoop releases will be mirrored in a publicly
readable S3 bucket.) The download takes around 1 second, so there is no
performance overhead in doing this. This makes it easy for people to use
patched versions of Hadoop by uploading them to their own S3 bucket and
changing a config setting.
* The scripts for creating an AMI retain the ability to install Hadoop so you
don't have to install it on start up. I propose that the publicly available
AMIs do not have Hadoop installed, and install it on start up, so they don't
have to be regenerated on each Hadoop release. This also makes it feasible to
install other Hadoop software in the future, and avoid the combinatorial
explosion of version numbers.
* I've created AMIs that work with these changes
(hadoop-images/hadoop-base-20090210-i386.manifest.xml and
hadoop-images/hadoop-base-20090210-x86_64.manifest.xml)
* I have also fixed HADOOP-5006 as a part of this change, which allowed me to
see Ganglia pages.
> Make EC2 image independent of Hadoop version
> --------------------------------------------
>
> Key: HADOOP-2409
> URL: https://issues.apache.org/jira/browse/HADOOP-2409
> Project: Hadoop Core
> Issue Type: Improvement
> Components: contrib/ec2
> Reporter: Tom White
> Attachments: hadoop-2409.patch, HADOOP-2409.patch
>
>
> Instead of building a new image for each released version of Hadoop, install
> Hadoop on instance start up. Since it is a small download this would not add
> significantly to startup time. Hadoop releases would be mirrored on S3 for
> scalability (and to avoid bandwidth costs). The version to install would be
> found from the instance metadata - this would be a download URL.
> More generally, the instance could retrieve a script to run on start up from
> a URL specified in the metadata. The script would install and configure
> Hadoop, but it could be extended to do cluster-specific set up.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.