Hi,

I am glad to announce the availability of a new deployment method for Apache Hadoop and related projects in Apache Bigtop (incubating). Anyone can now download an image for their favourite virtualization technology with Apache Hadoop pre-installed and configured.

This may be useful in a lot of cases, such as:

* You want to get familiar with Apache Hadoop or related projects without worrying about the setup
* You want to run some tests in a reproducible environment without being afraid of breaking your system
* You want to develop and test your project against some specific environments
* You quickly want to see what is coming up in the Apache Hadoop 0.23 branch
* You want to maintain some specific images at your cloud provider (e.g. AWS, ElasticHosts)
* You use an operating system which is not compatible with Apache Hadoop but would like to write/test some jobs against Apache Hadoop or related projects
* You quickly want to get familiar with Apache Hadoop or related projects in a real distributed mode across multiple VMs

Currently Apache Bigtop only provides a base appliance with Apache Hadoop running in pseudo-distributed mode on CentOS 6. Several jobs have been created on our Jenkins instance to generate images based on the released version of Apache Hadoop 0.20.205 as well as the in-development branches of Apache Hadoop 0.22 (which needs to be updated to the released bits) and Apache Hadoop 0.23. Each of these jobs creates images for the following virtualization technologies:

* KVM (libvirtd)
* VirtualBox
* VMware

Under the hood we use BoxGrinder (http://boxgrinder.org/), a fantastic tool for generating all these VMs. Currently it can create VMs for RHEL/CentOS/Fedora/Scientific Linux, but could handle more OSes through plugins. The list of supported virtualization technologies (KVM, VirtualBox, VMware, EC2) and delivery methods (local, sftp, s3, ebs, elastichosts, and soon local/remote libvirtd) is also pretty large.

The BoxGrinder appliance definition format is also pretty simple to understand, and the current appliance should be easy to modify and extend. There is also a convenient way to make an appliance inherit from one or several other appliances.
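
To illustrate the inheritance just mentioned, a derived appliance definition might look roughly like the sketch below. This is not the definition shipped in Apache Bigtop: the appliance names, the package names and the memory value are all hypothetical placeholders, and the key names should be checked against the documentation of the BoxGrinder version in use.

```yaml
# Hypothetical derived appliance: every name and value below is a placeholder.
name: bigtop-extras            # hypothetical appliance name
summary: Base Bigtop appliance plus a few extra packages
appliances:
  - bigtop-base                # hypothetical name of the parent appliance to inherit from
hardware:
  memory: 2048                 # in MB; an untested guess, adjust to your workload
packages:
  - extra-package-one          # placeholder: replace with real Apache Bigtop package names
  - extra-package-two
```

Assuming the definition is saved as bigtop-extras.appl next to its parent, it could then be built for a given platform with something like `boxgrinder-build bigtop-extras.appl -p virtualbox`.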

This inheritance mechanism explains why the current appliance in Apache Bigtop is quite small: one could easily create an appliance with all the packages provided by Apache Bigtop by just listing the additional packages.

If you wish to download the VMs built by the Apache Bigtop (incubating) Jenkins instance, here are the links:

* Apache Bigtop 0.2.0 (incubating), includes Apache Hadoop 0.20.205:
  - KVM: http://bit.ly/tlPZCz
  - VMware: http://bit.ly/s1V42p
  - VirtualBox: http://bit.ly/s5vuj8
* Branch hadoop-0.22 in Apache Bigtop (incubating):
  - KVM: http://bit.ly/sMSupy
  - VMware: http://bit.ly/tSfN6E
  - VirtualBox: http://bit.ly/sc0wvL
* Branch hadoop-0.23 in Apache Bigtop (incubating):
  - KVM: http://bit.ly/sBdEVX
  - VMware: http://bit.ly/vj22mo
  - VirtualBox: http://bit.ly/ttwm5k
* The Jenkins job creating these images is located here: http://bit.ly/vM3tLP

Once the chosen artefact is downloaded and expanded, you just need to tell your virtualization tool to import an existing disk. Also be careful about the RAM allocated to your VM. Apache Hadoop 0.23 has become quite memory hungry and will not be able to run the pi example on a VM with 1024MB of RAM.

Please don't hesitate to share your feedback, ideas or issues on the Apache Bigtop (incubating) mailing list or the ticket tracker.

Thanks,
Bruno Mahé

PS: I CCed the Apache Hadoop general list since this email may interest a few folks focused on that mailing list.
