Colin Freas wrote:
I've wondered about this using single or dual quad-core machines with one spindle per core, and partitioning them out into 2, 4, 8, whatever virtual machines, possibly marking each physical box as a "rack".
Or just host VM images with multi-CPU support and make each one 'machine' with the appropriate number of task trackers.
There would be some initial and ongoing sysadmin costs. But could this increase throughput on a small cluster (2 or 3 boxes with 16 or 24 cores) running many jobs, by limiting the number of cores each job runs on to, say, 8? Has anyone tried such a setup?
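(For context, the per-node parallelism being described here can also be capped without VMs, via the task tracker slot settings in mapred-site.xml. The values below are purely illustrative, assuming a Hadoop 0.x/1.x-era install:)

```xml
<!-- mapred-site.xml: cap concurrent tasks per TaskTracker.
     Example values for an 8-core node; tune to your workload. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```

Note this caps slots per node, not per job, so it limits how hard any one box is driven rather than fencing jobs off from each other the way separate VMs would.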
I can see reasons for virtualisation (better isolation and security) but don't think performance would improve. More likely anything clock-sensitive will get confused if, under load, some VMs get less CPU time than they expect.
A better approach could be to improve the schedulers in Hadoop to put work where it is best, and maybe even move it if things are taking too long, etc.
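(Worth noting: Hadoop's speculative execution already covers part of the "move it if things are taking too long" idea, by launching duplicate attempts of slow tasks on other nodes. It is on by default and controlled in mapred-site.xml; property names below are from the Hadoop 0.x/1.x configuration:)

```xml
<!-- mapred-site.xml: speculative execution re-runs straggler
     tasks elsewhere and keeps whichever attempt finishes first. -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
```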