Paco NATHAN wrote:
We use an EC2 image onto which we install Java, Ant, Hadoop, etc. To
keep it simple, we pull those from S3 buckets. That provides a more
flexible pattern for managing the frameworks involved than re-doing
the EC2 image whenever you want to apply a patch to Hadoop.

Given that approach, you can add your Hadoop application code
similarly. Just upload the current stable build out of SVN, Git,
whatever, to an S3 bucket.
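
To make that concrete, here is a minimal sketch of the boot-time side
of that pattern, assuming the boto3 library and made-up bucket and key
names; the real buckets, package versions, and install steps would be
your own:

    # Pull the framework tarballs from S3 at first boot and unpack them.
    # Bucket and key names below are illustrative placeholders.
    import subprocess
    import boto3

    BUCKET = "my-cluster-artifacts"
    PACKAGES = ["jdk.tar.gz", "ant.tar.gz", "hadoop-0.19.0.tar.gz"]

    s3 = boto3.client("s3")
    for key in PACKAGES:
        local = "/tmp/" + key
        s3.download_file(BUCKET, key, local)
        subprocess.run(["tar", "xzf", local, "-C", "/opt"], check=True)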

Nice. Your CI tool could upload the latest release tagged as good and the machines could pull it down.
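
One hedged sketch of that CI side, again assuming boto3 and
illustrative names: after a green build, upload the artifact, then
overwrite a well-known "current stable" key that the nodes always
fetch at boot.

    # Upload the tagged release, then repoint the stable key at it.
    # Bucket, key, and artifact names are placeholders, not a real setup.
    import boto3

    s3 = boto3.client("s3")
    s3.upload_file("build/myapp-1.4.2.jar",
                   "my-cluster-artifacts", "releases/myapp-1.4.2.jar")
    s3.copy_object(Bucket="my-cluster-artifacts",
                   CopySource={"Bucket": "my-cluster-artifacts",
                               "Key": "releases/myapp-1.4.2.jar"},
                   Key="current/myapp.jar")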

The goal of cluster management is to make adding or removing a node an O(1) problem: you edit one entry in one place to increment or decrement the number of machines, and that's it.
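
For example, that "one entry" could be nothing more than a node count
in a single config file that the launch script reads. This sketch
assumes boto3 and invented names, and it elides reconciling against
instances that are already running.

    # Grow or shrink the cluster by editing node_count in cluster.ini only.
    # Expected contents of cluster.ini (illustrative):
    #   [cluster]
    #   ami = ami-12345678
    #   instance_type = m1.large
    #   node_count = 100
    import configparser
    import boto3

    cfg = configparser.ConfigParser()
    cfg.read("cluster.ini")
    count = cfg.getint("cluster", "node_count")

    ec2 = boto3.client("ec2")
    ec2.run_instances(ImageId=cfg.get("cluster", "ami"),
                      InstanceType=cfg.get("cluster", "instance_type"),
                      MinCount=count, MaxCount=count)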

If you find you have lots of images to keep alive, then your maintenance costs go up. Keep the number of images down to one and you will stay in control.


We use a set of Python scripts to manage a daily, (mostly) automated
launch of 100+ EC2 nodes for a Hadoop cluster.  We also run a listener
on a local server, so that the Hadoop job can send a notification when
it completes and the local server can initiate the download of the
results.  Overall, that minimizes the need for a sysadmin dedicated to
the Hadoop jobs -- a small dev team can handle it while focusing on
algorithm development and testing.
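
A bare-bones sketch of what such a listener might look like, assuming
Hadoop's job-end notification URL (job.end.notification.url) is pointed
at this host with $jobId/$jobStatus substituted into the query string;
the port, parameter names, and download script are illustrative only.

    # Tiny HTTP listener: Hadoop hits this URL when a job finishes,
    # and a successful status kicks off the results download locally.
    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs

    class JobEndHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            params = parse_qs(urlparse(self.path).query)
            job_id = params.get("jobid", ["unknown"])[0]
            status = params.get("jobstatus", ["UNKNOWN"])[0]
            self.send_response(200)
            self.end_headers()
            if status == "SUCCEEDED":
                # download_results.py is a placeholder for your own fetch logic.
                subprocess.Popen(["python", "download_results.py", job_id])

    HTTPServer(("", 8080), JobEndHandler).serve_forever()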

1. We have some components that use Google Talk to relay messages to local boxes behind the firewall. I could imagine hooking up Hadoop status events to that too (a rough sketch follows below).

2. There's an old paper of mine, "Making Web Services that Work", in which I talk about deployment-centric development:
http://www.hpl.hp.com/techreports/2002/HPL-2002-274.html

The idea is that, right from the outset, the dev team works on a cluster that resembles production; the CI server builds to it automatically, and changes get pushed out to production semi-automatically (you tag the version you want pushed out in SVN, and the CI server does the release). The article is focused on services exported to third parties, not back-end work, so it may not all apply to Hadoop deployments.
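
Regarding the Google Talk relay in point 1, here is a rough sketch of
sending a job-status line over XMPP, assuming the slixmpp library; the
JIDs, credentials, and message text are placeholders.

    # Send one Hadoop status message to a box behind the firewall via XMPP.
    from slixmpp import ClientXMPP

    class StatusNotifier(ClientXMPP):
        def __init__(self, jid, password, recipient, message):
            super().__init__(jid, password)
            self.recipient = recipient
            self.message = message
            self.add_event_handler("session_start", self.start)

        async def start(self, event):
            self.send_presence()
            await self.get_roster()
            self.send_message(mto=self.recipient, mbody=self.message,
                              mtype="chat")
            self.disconnect()

    xmpp = StatusNotifier("notifier@example.com", "secret",
                          "ops@example.com", "Hadoop job complete: SUCCEEDED")
    xmpp.connect()
    xmpp.process(forever=False)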

-steve


