Nearly as efficient as isolcpus, but it can be changed dynamically, at runtime:

- Use nohz_full / rcu_nocbs to offload all RCU callbacks from your VM cores to your OS-only cores.
- Use cgroups: when you start the VM, leave only x cores to the OS; when you shut it down, let the OS have all cores back.
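A minimal sketch of both pieces, assuming a 4-core host that keeps core 0 for the OS, a cgroup v1 cpuset mounted at /sys/fs/cgroup/cpuset, and a hypothetical group named "host" (the real shieldbuild/shieldbreak scripts are linked below); the privileged parts are only defined here, not run:

```shell
#!/bin/sh
# Sketch only: core 0 stays with the OS, cores 1-3 go to the VM.
# The group name "host" and the cgroup v1 mount point are assumptions.
OS_CPUS="0"
VM_CPUS="1-3"

# 1) Boot-time part: keep the tick and RCU callbacks off the VM cores.
CMDLINE="nohz_full=${VM_CPUS} rcu_nocbs=${VM_CPUS}"
echo "add to kernel cmdline: ${CMDLINE}"

# 2) Runtime part (needs root): confine all host tasks to core 0 when
#    the VM starts, and give the cores back when it stops.
CG=/sys/fs/cgroup/cpuset/host
shieldbuild() {
    mkdir -p "$CG"
    echo "$OS_CPUS" > "$CG/cpuset.cpus"
    echo 0 > "$CG/cpuset.mems"
    # move every running task into the restricted group
    while read -r pid; do
        echo "$pid" > "$CG/tasks" 2>/dev/null
    done < /sys/fs/cgroup/cpuset/tasks
}
shieldbreak() {
    # host may use all cores again
    echo "${OS_CPUS},${VM_CPUS}" > "$CG/cpuset.cpus"
}
```

On cgroup v2 the paths and file names differ (cpuset.cpus lives under a unified hierarchy), so treat the paths above as placeholders.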
If the VM is started and you need a power boost on Linux, just run "echo $$ | sudo tee /cgroups/cgroup.procs" and everything launched from this shell will have all cores. :)

- Linux only: all cores (but cores 1,2,3 are in nohz mode, offloaded to core 0)
- Linux + Windows: 1 core to Linux, 3 cores to Windows
- Need a boost on Linux: the little command line above for this shell

Example of cgroup usage: https://github.com/qdel/scripts/tree/master/vfio/scripts => shieldbuild / shieldbreak
which are called through qemu hooks: https://github.com/qdel/scripts/tree/master/vfio/hooks

I do not configure my IO; I let qemu manage it.

One fun behavior: while idle, I sit completely steady at ~1000μs; if I run a game, it goes down to a completely steady 500μs.
Example: http://b.qdel.fr/test.png (sorry for the quality - VNC to a 4k screen from 1080p and all that...)

--
Deldycke Quentin

On 29 February 2016 at 10:55, Rokas Kupstys <[email protected]> wrote:
> Yes, currently I am actually booted with the vanilla Arch Linux kernel, no
> NO_HZ and other stuff.
>
>> Why is keeping 2 cores for the host unacceptable? Do you plan to run
>> heavy workloads while gaming?
>
> The problem with isolcpus is that it exempts cores from the Linux CPU
> scheduler. This means they stand idle even when the VM is offline. While I
> don't do anything on the host while gaming, I do plenty when not gaming,
> and just throwing away 6 cores of an already disadvantaged AMD CPU is a
> real waste.
>
>> This config is not good, actually.
>
> Well... it indeed looks bad on paper, but it is the only one that yields
> bearable DPC latency. I tried what you mentioned, in various combinations:
> cores 0,2,4,6 to the VM, 1,3 to the emulator, 5,7 for IO / cores 1,3,5,7
> to the VM, 0,2 to the emulator, 4,6 for IO / cores 0,1,2,3 to the VM, 4,5
> to the emulator, 6,7 for IO / cores 4,5,6,7 to the VM, 0,1 to the
> emulator, 2,3 for IO. All of them yield terrible latency.
>
> It would be interesting to hear from someone with an AMD build, how (or
> whether) they solved this.
> On 2016.02.29 11:10, Bronek Kozicki wrote:
>> Two things you can improve, IMO:
>>
>> * disable NO_HZ
>> * use isolcpus to dedicate your pinned CPUs to the guest only - this
>>   will also ensure they are not used for guest IO.
>>
>> B.
>>
>> On 29/02/2016 08:45, Rokas Kupstys wrote:
>>> Yesterday I figured out my latency problem. All the things listed
>>> everywhere on the internet failed. The last thing I tried was pinning
>>> one vcpu to two physical cores, and it brought latency down. I have an
>>> FX-8350 CPU, which has a shared FPU for each pair of cores, so maybe
>>> that's why. With just this pinning, latency is now just above 1000μs
>>> most of the time. However, latency increases under load. I threw out
>>> iothreads and emulator pinning and it did not change much. Superior
>>> latency could be achieved using isolcpus=2-7, but leaving just two
>>> cores to the host is unacceptable. With that setting, latency was
>>> around 500μs without load. The good part is that Battlefield 3 no
>>> longer lags, although I observed increased texture loading times
>>> compared to bare metal. The not-so-good part is that there is still
>>> minor sound skipping/crackling, since latency spikes under load. That
>>> is very disappointing. I also tried two VM cores pinned to 4 host
>>> cores - bf3 lagged enough to be unplayable. 3 VM cores pinned to 6
>>> host cores was already playable, but sound was still crackling. I
>>> noticed little difference between that and 4 VM cores pinned to 8 host
>>> cores. It would be nice if the sound could be cleaned up. If anyone
>>> has any ideas, I'm all ears.
>>> Libvirt xml I use now:
>>>
>>>   <vcpu placement='static'>4</vcpu>
>>>   <cputune>
>>>     <vcpupin vcpu='0' cpuset='0-1'/>
>>>     <vcpupin vcpu='1' cpuset='2-3'/>
>>>     <vcpupin vcpu='2' cpuset='4-5'/>
>>>     <vcpupin vcpu='3' cpuset='6-7'/>
>>>   </cputune>
>>>   <features>
>>>     <acpi/>
>>>     <apic/>
>>>     <pae/>
>>>     <hap/>
>>>     <viridian/>
>>>     <hyperv>
>>>       <relaxed state='on'/>
>>>       <vapic state='on'/>
>>>       <spinlocks state='on' retries='8191'/>
>>>     </hyperv>
>>>     <kvm>
>>>       <hidden state='on'/>
>>>     </kvm>
>>>     <pvspinlock state='on'/>
>>>   </features>
>>>   <cpu mode='host-passthrough'>
>>>     <topology sockets='1' cores='4' threads='1'/>
>>>   </cpu>
>>>   <clock offset='utc'>
>>>     <timer name='rtc' tickpolicy='catchup'/>
>>>     <timer name='pit' tickpolicy='delay'/>
>>>     <timer name='hpet' present='no'/>
>>>     <timer name='hypervclock' present='yes'/>
>>>   </clock>
>>>
>>> Kernel configs:
>>>   CONFIG_NO_HZ_FULL=y
>>>   CONFIG_RCU_NOCB_CPU_ALL=y
>>>   CONFIG_HZ_1000=y
>>>   CONFIG_HZ=1000
>>>
>>> I am not convinced the 1000 Hz tick rate is needed. The default one
>>> (300) seems to perform equally well judging by the latency charts. I
>>> did not get a chance to test it with bf3 yet, however.
>>>
>>> On 2016.01.12 11:12, thibaut noah wrote:
>>>> [cut]
>>
>> _______________________________________________
>> vfio-users mailing list
>> [email protected]
>> https://www.redhat.com/mailman/listinfo/vfio-users
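The vcpu-to-core-pair pinning in the XML above can be derived from sysfs: on an FX-8350, the two cores of a module that share an FPU should show up as thread siblings. A read-only sketch (the output format here is illustrative, not from the thread):

```shell
#!/bin/sh
# Print which host cores are siblings (share an FX module / SMT pair),
# to help pick the cpuset='X-Y' values for <vcpupin>. No root needed.
PAIRS=$(
    for f in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        cpu=${f#/sys/devices/system/cpu/}
        printf '%s -> siblings %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
)
echo "$PAIRS"
```

On an FX-8350 this would list pairs like 0-1, 2-3, 4-5, 6-7, matching the cpusets in the XML; on other machines the sibling layout (and hence sensible pinning) differs.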
_______________________________________________ vfio-users mailing list [email protected] https://www.redhat.com/mailman/listinfo/vfio-users
