On Thursday, 19 September 2013, 12:33:21, Daniel P. Berrange wrote:
> On Thu, Sep 19, 2013 at 01:26:52PM +0200, David Weber wrote:
> > On Wednesday, 11 September 2013, 11:27:30, Daniel P. Berrange wrote:
> > > On Wed, Sep 11, 2013 at 10:47:08AM +0200, David Weber wrote:
> > > > On Friday, 6 September 2013, 12:10:04, Daniel P. Berrange wrote:
> > > > > On Tue, Aug 27, 2013 at 09:09:25AM +0200, David Weber wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > > We are trying to use vcpu pinning on a 2-socket server with Intel Xeon
> > > > > > > E5620 CPUs, HT enabled and 2*6*16GiB RAM, but we run into problems when
> > > > > > > we try to start a guest on the second socket:
> > > > > > error: Failed to start domain test
> > > > > > error: internal error: process exited while connecting to monitor:
> > > > > > kvm_init_vcpu failed: Cannot allocate memory
> > > > 
> > > > # virsh freecell 0
> > > > 0: 86071624 KiB
> > > > 
> > > > # virsh freecell 1
> > > > 1: 75258628 KiB
> > > > 
> > > > # virsh edit test
> > > > <domain type='kvm'>
> > > >   <name>test</name>
> > > >   <uuid>08cdc389-78bf-450c-89f4-b4728edabdbf</uuid>
> > > >   <memory unit='KiB'>1048576</memory>
> > > >   <currentMemory unit='KiB'>1048576</currentMemory>
> > > >   <vcpu placement='static' cpuset='4-7'>1</vcpu>
> > > >   <numatune>
> > > >     <memory mode='strict' nodeset='1'/>
> > > >   </numatune>
> > > >   <os>
> > > >     <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
> > > >     <boot dev='hd'/>
> > > >   </os>
> > > >   <features>
> > > >     <acpi/>
> > > >     <apic/>
> > > >     <pae/>
> > > >   </features>
> > > >   <clock offset='utc'/>
> > > >   <on_poweroff>destroy</on_poweroff>
> > > >   <on_reboot>restart</on_reboot>
> > > >   <on_crash>restart</on_crash>
> > > >   <devices>
> > > >     <emulator>/usr/bin/qemu-kvm</emulator>
> > > >     <controller type='usb' index='0'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
> > > >     </controller>
> > > >     <controller type='pci' index='0' model='pci-root'/>
> > > >     <controller type='ide' index='0'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
> > > >     </controller>
> > > >     <input type='mouse' bus='ps2'/>
> > > >     <graphics type='vnc' port='-1' autoport='yes'/>
> > > >     <video>
> > > >       <model type='cirrus' vram='9216' heads='1'/>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
> > > >     </video>
> > > >     <memballoon model='virtio'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> > > >     </memballoon>
> > > >   </devices>
> > > > </domain>
> > > > 
> > > > # virsh start test
> > > > error: Failed to start domain test
> > > > error: internal error: process exited while connecting to monitor:
> > > > kvm_init_vcpu failed: Cannot allocate memory
> > > > 
> > > > Allocating memory on this node with numactl works fine:
> > > > # numactl --cpubind=1 --membind=1 -- dd if=/dev/zero of=/dev/null bs=2G count=1
> > > > 0+1 records in
> > > > 0+1 records out
> > > > 2147479552 bytes (2.1 GB) copied, 0.60816 s, 3.5 GB/s
> > > 
> > > Hmm, this makes no sense at all to me. Your configuration looks totally
> > > valid and you have plenty of memory in both nodes.
> > 
> > After reading a bit more about cgroups, I now think I know what's going on.
> > 
> > Let's assume we have a 2-node dual-core system and start a guest named 'test'
> > without CPU or memory pinning.
> > 
> > * libvirt creates a cgroup under cpuset/machine/test.libvirt-qemu:
> > cpuset/machine/test.libvirt-qemu/cpuset.cpus -> 0-3
> > cpuset/machine/test.libvirt-qemu/cpuset.mems -> 0-1
> > * libvirt creates a cgroup for every vcpu:
> > cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.cpus -> 0-3
> > cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.mems -> 0-1
> > * libvirt creates a cgroup for qemu:
> > cpuset/machine/test.libvirt-qemu/emulator/cpuset.cpus -> 0-3
> > cpuset/machine/test.libvirt-qemu/emulator/cpuset.mems -> 0-1
> > 
> > Now we want to pin the guest to the second node:
> > virsh # numatune test --nodeset 1
> > error: Unable to change numa parameters
> > error: Unable to write to '/sys/fs/cgroup/cpuset/machine/Ubuntu.libvirt-qemu/cpuset.mems': Device or resource busy
> > 
> > What happens is that libvirt tries to set
> > cpuset/machine/test.libvirt-qemu/cpuset.mems to 1, but this is not possible
> > because cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.mems and
> > cpuset/machine/test.libvirt-qemu/emulator/cpuset.mems still contain 0-1.
> > libvirt has to change these child values first!
> 
> Oooh, interesting hypothesis. I wonder if this is a kernel behaviour
> change. I'm fairly sure that in the past if you removed a cpu from the
> cpuset mask, it would automagically purge it from all children.
> 
> Please file a bug about this - it should be possible to make libvirt
> do the right thing and purge child masks explicitly first.
> 
Done:
https://bugzilla.redhat.com/show_bug.cgi?id=1009880

I have also tested Linux 3.2.51, so if this is a kernel behaviour change it
must have happened quite some time ago.
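
To illustrate the ordering problem described above, here is a minimal sketch in Python. It is a toy in-memory model only, not libvirt's code and not the real cgroup interface: the class `Cpuset`, the helpers `shrink_mems` and `build_guest`, and the single `vcpu0` child are hypothetical names made up for this example, and the EBUSY rule simply mirrors the error shown in the numatune output.

```python
import errno


class Cpuset:
    """Toy model of one cpuset cgroup directory (hypothetical, illustration only)."""

    def __init__(self, name, mems, parent=None):
        self.name = name
        self.mems = set(mems)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def set_mems(self, new_mems):
        """Emulate a write to cpuset.mems under the rules assumed in this thread."""
        new_mems = set(new_mems)
        # A child may not hold memory nodes that its parent does not have.
        if self.parent is not None and not new_mems <= self.parent.mems:
            raise PermissionError(f"{self.name}: not a subset of parent")
        # Assumed rule: a parent cannot drop a node that a child still lists.
        for child in self.children:
            if not child.mems <= new_mems:
                raise OSError(errno.EBUSY, "Device or resource busy", self.name)
        self.mems = new_mems


def shrink_mems(node, new_mems):
    """Narrow a whole subtree bottom-up: children first, then the node itself."""
    new_mems = set(new_mems)
    for child in node.children:
        shrink_mems(child, child.mems & new_mems)
    node.set_mems(new_mems)


def build_guest():
    """Build the hierarchy from the example: guest cgroup plus vcpu and emulator."""
    root = Cpuset("test.libvirt-qemu", {0, 1})
    Cpuset("vcpu0", {0, 1}, parent=root)
    Cpuset("emulator", {0, 1}, parent=root)
    return root


if __name__ == "__main__":
    guest = build_guest()
    try:
        guest.set_mems({1})      # parent first: rejected in this model
    except OSError as exc:
        print("parent-first write failed:", exc.strerror)
    shrink_mems(guest, {1})      # children first: accepted in this model
    print("mems after bottom-up shrink:", sorted(guest.mems))
```

Writing to the parent first fails in this model because the children still list node 0, while the bottom-up order succeeds. Widening a mask would need the opposite order (parent first), which the sketch does not cover.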

Cheers,
David

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
