I once asked a wise man in charge of a rather large multi-datacenter
service, "Have you ever considered virtualization?" He replied, "All
the CPUs here are pegged at 100%."

There may be applications for this type of processing. I have thought
about systems like this from time to time, and the thinking goes in
circles: Hadoop is designed to spread storage and processing across
separate machines, while virtualization lets you split one machine
into several sub-systems.
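To make that concrete, the "storage across separate machines" part is
largely configuration. A minimal hdfs-site.xml sketch for a two-node
cluster (dfs.replication is a standard Hadoop property; the value of 2
is just an assumption so that each block is kept on both nodes):

    <configuration>
      <property>
        <!-- Assumption: two replicas, so every block lives on both nodes -->
        <name>dfs.replication</name>
        <value>2</value>
      </property>
    </configuration>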

Virtualization is great for a proof of concept.
For example, I have deployed this myself: I installed VMware with two
Linux guests on my Windows host and followed a Hadoop multi-node
tutorial, running on the two VMware nodes. I was able to get the word
count application working. I also confirmed that blocks were indeed
being stored on both virtual systems and that processing was being
shared via MapReduce.
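
For reference, the job was essentially the stock WordCount from the
Hadoop documentation; a minimal sketch (class names follow the newer
org.apache.hadoop.mapreduce API, so details may differ on older
releases):

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map phase: emit (word, 1) for every token in the input split.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reduce phase: sum the counts collected for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Checking block placement is easy too: "hadoop fsck /some/input/path
-files -blocks -locations" (the path here is just an example) lists
which datanode holds each block.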

The processing, however, was slow. Of course, much of this is the
fault of VMware: full virtualization of this kind carries a high
emulation overhead. Xen has less overhead. Linux-VServer and OpenVZ
use OS-level (container) virtualization and have very little, almost
no, overhead. Regardless of how much, though, overhead is overhead.
Personally, I find that VMware falls short of its promises.
