Thanks, Steve -

Another flexible approach to handling messages across firewalls,
between the jt and worker nodes, etc., would be to place an AMQP
message broker on the jobtracker and another inside our local network.
We're experimenting with RabbitMQ for that.
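Roughly, the relay pattern we have in mind looks like the sketch below. It's a toy, using plain Python stdlib queues as stand-ins for the two brokers (the names `jt_broker`, `local_broker`, and the helper functions are all made up for illustration); in a real deployment the two would be RabbitMQ instances with a shovel or federation link carrying messages across the firewall boundary:

```python
import queue

# Stand-ins for the two brokers: one co-located with the jobtracker
# (EC2 side), one inside the local network. In practice these would
# be RabbitMQ instances; plain queues just illustrate the pattern.
jt_broker = queue.Queue()
local_broker = queue.Queue()

def publish_status(broker, event):
    """Hadoop-side hook publishes a status event to its nearby broker."""
    broker.put(event)

def shovel(src, dst):
    """Relay everything queued on src over to dst -- the role the
    shovel/federation link plays across the firewall."""
    while not src.empty():
        dst.put(src.get())

def drain(broker):
    """Local listener consumes relayed events, e.g. to kick off a
    results download when the job completes."""
    events = []
    while not broker.empty():
        events.append(broker.get())
    return events

publish_status(jt_broker, "job_started")
publish_status(jt_broker, "job_completed")
shovel(jt_broker, local_broker)
print(drain(local_broker))  # ['job_started', 'job_completed']
```

The point is that neither side needs inbound access to the other's boxes: each process only ever talks to its nearby broker.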


On Tue, Sep 16, 2008 at 4:03 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:

>> We use a set of Python scripts to manage a daily, (mostly) automated
>> launch of 100+ EC2 nodes for a Hadoop cluster.  We also run a listener
>> on a local server, so that the Hadoop job can send notification when
>> it completes, and allow the local server to initiate download of
>> results.  Overall, that minimizes the need for having a sysadmin
>> dedicated to the Hadoop jobs -- a small dev team can handle it, while
>> focusing on algorithm development and testing.
>
> 1. We have some components that use google talk to relay messages to local
> boxes behind the firewall. I could imagine hooking up hadoop status events
> to that too.
>
> 2. There's an old paper of mine, "Making Web Services that Work", in which I
> talk about deployment centric development:
> http://www.hpl.hp.com/techreports/2002/HPL-2002-274.html
>
> The idea is that right from the outset, the dev team work on a cluster that
> resembles production, the CI server builds to it automatically, changes get
> pushed out to production semi-automatically (you tag the version you want
> pushed out in SVN, the CI server does the release). The article is focused
> on services exported to third parties, not back end stuff, so it may not all
> apply to hadoop deployments.
>
> -steve