Andrew,
     I was able to tag my nodes so that each Hadoop node deploys to the same
compute node as its storage.  I was also able to use a network topology
script to make sure that the data is distributed across at least two hosts.
The local storage pool is what is giving me trouble at the moment.  To make
all of my available local storage usable by CloudStack I had to set it up as
a single volume, and as a result that volume is a bottleneck for the three
slave nodes on each host: it is already slowed down by the extra abstraction
over the drives, and it has three VMs trying to hit it at once.
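
For anyone following along, the topology script approach can be sketched
roughly as below.  This is only a minimal illustration, not the exact script
in use here: the IP addresses and the /host1, /host2 names are hypothetical,
and it assumes the script is registered through Hadoop's topology script
property (topology.script.file.name in Hadoop 1.x,
net.topology.script.file.name in 2.x).  Hadoop calls the script with one or
more addresses as arguments and expects one rack path per address on stdout.
Reporting each physical host as its own "rack" is what pushes replicas onto
at least two hosts.

```python
#!/usr/bin/env python
"""Sketch of a Hadoop topology script that treats each physical host as a rack."""
import sys

# Hypothetical mapping from slave VM IP to the physical host it runs on.
# Replace these with the real addresses of the slave VMs.
VM_TO_HOST = {
    "10.0.0.11": "/host1",
    "10.0.0.12": "/host1",
    "10.0.0.13": "/host1",
    "10.0.0.21": "/host2",
    "10.0.0.22": "/host2",
    "10.0.0.23": "/host2",
}

# Fallback for addresses that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve(addresses):
    """Return one rack path per address, in the same order as the input."""
    return [VM_TO_HOST.get(addr, DEFAULT_RACK) for addr in addresses]


if __name__ == "__main__":
    # Hadoop passes the addresses as command-line arguments and reads
    # whitespace-separated rack paths from stdout.
    print(" ".join(resolve(sys.argv[1:])))
```

With a table like this, VMs on the same physical host report the same rack,
so the default block placement policy will not keep every replica of a block
on VMs that share one host.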
Thanks,
David Ortiz

> From: andrew.ba...@gmail.com
> Date: Tue, 4 Jun 2013 12:05:32 -0700
> Subject: Re: Hadoop cluster running in cloudstack
> To: users@cloudstack.apache.org
> 
> This is a very interesting question - I know I've asked for better control
> over where VMs end up going, specifically to be able to ensure rack
> locality for Hadoop nodes, but I don't know what the progress has been on
> that, nor do I know whether there's a way of doing multiple storage pools
> on a single host short of some really silly jumping through hoops, like
> running multiple cloud agents on a single host. But I could be wrong - I'm
> not as in touch with the internals as others may be.
> 
> A.
> 
> On Tue, Jun 4, 2013 at 11:43 AM, David Ortiz <dpor...@outlook.com> wrote:
> 
> > Hello,
> >     Has anyone tried running a hadoop cluster in a cloudstack environment?
> >  I have set one up, but I am finding that I am having some IO contention
> > between slave nodes on each host since they all share one local storage
> > pool.  As I understand it, there is not currently a method for using
> > multiple local storage pools with VMs through cloudstack.  Has anyone found
> > a workaround for this by any chance?
> > Thanks,     David Ortiz
                                          
