Re: Problems with HOD and HDFS

Steve Loughran Mon, 14 Jun 2010 08:50:10 -0700

Edward Capriolo wrote:


I have not used it much, but I think HOD is pretty cool. I guess most people
who are looking to (spin up, run job, transfer off, spin down) are using
EC2. HOD does something like make private hadoop clouds on your hardware and
many probably do not have that use case. As schedulers advance and get
better HOD becomes less attractive, but I can always see a place for it.

I don't know who is using it, or maintaining it; we've been bringing up short-lived Hadoop clusters differently.

I think I should write a little article on the topic; I presented about it at Berlin Buzzwords last week.

Short-lived Hadoop clusters on VMs are fine if you don't have enough data or CPU load to justify a set of dedicated physical machines, and are a good way of experimenting with Hadoop at scale. You can maybe lock down the network better too, though that depends on your VM infrastructure.

Where VMs are weak is in disk IO performance, but there's no reason why the VM infrastructure can't take a list of filenames/directories as a hint for VM placement (placement is the new scheduling, incidentally), and virtualized IO can only improve. If you can run Hadoop MapReduce directly against SAN-mounted storage then you can stop worrying about locality of data and still gain from parallelisation of the operations.



-steve
