Hi Cole:

That link you posted refers to our work at ISI. We're currently running LXC as 
the hypervisor on our SGI UV. Other than performance, one of the issues with 
KVM is that it currently has a hard-coded limit on how many vCPUs you can run 
in a single instance, so we can't run, say, a 256-vCPU instance.

Some of the LXC-related issues we've run into:

- The CPU affinity issue on LXC you mention. Running LXC with OpenStack, you 
don't get proper "space sharing" out of the box: each instance actually sees 
all of the available CPUs. It's possible to restrict this, but that 
functionality doesn't seem to be exposed through libvirt, so it would have to 
be implemented in nova.

- LXC doesn't currently support volume attachment through libvirt. We were able 
to implement a workaround by invoking "lxc-attach" inside of OpenStack instead 
(e.g., see 
<https://github.com/usc-isi/nova/blob/hpc-testing/nova/virt/libvirt/connection.py#L482>).
But to be able to use lxc-attach, we had to upgrade the Linux kernel in 
RHEL 6.1 from 2.6.32 to 2.6.38. This kernel isn't supported by SGI, which means 
that we aren't able to load the SGI NUMA-related kernel modules.
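
To illustrate the space-sharing point above, here is a minimal sketch of the 
kind of bookkeeping nova would need: hand each instance a disjoint set of host 
CPUs and emit the matching cpuset restriction. This is not our actual code. The 
names CpuAllocator, format_cpuset and lxc_config_line are hypothetical, the 
sketch ignores NUMA topology entirely, and it assumes the restriction is 
applied through the "lxc.cgroup.cpuset.cpus" key in the container config.

```python
def format_cpuset(cpus):
    """Collapse CPU ids into cpuset list syntax, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ranges = []
    start = prev = None
    for cpu in sorted(set(cpus)):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)


class CpuAllocator(object):
    """Hypothetical helper: tracks which host CPUs belong to which instance."""

    def __init__(self, total_cpus):
        self.free = set(range(total_cpus))
        self.assigned = {}

    def allocate(self, instance_id, vcpus):
        # Refuse to oversubscribe: space sharing means no CPU is handed out twice.
        if instance_id in self.assigned:
            raise ValueError("instance %s already has CPUs" % instance_id)
        if vcpus > len(self.free):
            raise RuntimeError("not enough free CPUs for %s" % instance_id)
        chosen = sorted(self.free)[:vcpus]
        self.free.difference_update(chosen)
        self.assigned[instance_id] = chosen
        return chosen

    def release(self, instance_id):
        # Return the instance's CPUs to the free pool when it is destroyed.
        self.free.update(self.assigned.pop(instance_id))


def lxc_config_line(cpus):
    """Build the container config line (assumed key: lxc.cgroup.cpuset.cpus)."""
    return "lxc.cgroup.cpuset.cpus = " + format_cpuset(cpus)
```

For example, on an 8-CPU host, allocating 4 CPUs to one instance and 2 to a 
second would give the two containers the cpusets "0-3" and "4-5" respectively, 
so neither sees the other's CPUs.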

Take care,

Lorin
--
Lorin Hochstein, Computer Scientist
USC Information Sciences Institute
703.812.3710
http://www.east.isi.edu/~lorin




On Dec 3, 2011, at 5:08 PM, Cole wrote:

> First and foremost: 
> http://wiki.openstack.org/HeterogeneousSgiUltraVioletSupport
> 
> With NUMA and lightweight container technology (LXC / OpenVZ) you can achieve 
> very close to real hardware performance for certain HPC applications.  The 
> problem with technologies like LXC is that there isn't a ton of logic to address 
> the CPU affinity that other hypervisors offer (which generally wouldn't be 
> ideal for HPC).
> 
> On the interconnect side, there are plenty of 
> Open-MX (http://open-mx.gforge.inria.fr/) HPC applications running on 
> everything from single-channel 1 gig to bonded 10 gig.
> 
> This is an area I'm personally interested in and have done some testing and 
> will be doing more.  If you are going to try HPC with ethernet, Arista makes 
> the lowest latency switches in the business.
> 
> Cole
> Nebula
> 
> On Sat, Dec 3, 2011 at 11:11 AM, Tim Bell <tim.b...@cern.ch> wrote:
> At CERN, we are also faced with similar thoughts as we look to the cloud on 
> how to match the VM creation performance (typically O(minutes)) with the 
> required batch job system rates for a single program (O(sub-second)).
> 
> Data locality (aiming to run each job close to its source data) makes this 
> more difficult, as does fair share: aligning the priority of the jobs to 
> achieve the agreed quota between competing requests for limited and shared 
> resources.  The classic IaaS model of 'have credit card, will compute' does 
> not apply for some private cloud use cases/users.
> 
> We would be interested to discuss further with other sites.  There is further 
> background from OpenStack Boston at http://vimeo.com/31678577.
> 
> Tim
> tim.b...@cern.ch
> 
> 
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
