On Wed, Dec 15, 2010 at 12:14 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> On Wed, Dec 15, 2010 at 11:42:12AM +0000, Stefan Hajnoczi wrote:
>> On Mon, Dec 13, 2010 at 6:52 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> > On Mon, Dec 13, 2010 at 05:57:28PM +0000, Stefan Hajnoczi wrote:
>> >> On Mon, Dec 13, 2010 at 4:28 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> >> > On Mon, Dec 13, 2010 at 4:12 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> >> >> On Mon, Dec 13, 2010 at 03:27:06PM +0000, Stefan Hajnoczi wrote:
>> >> >>> On Mon, Dec 13, 2010 at 1:36 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> >> >>> > On Mon, Dec 13, 2010 at 03:35:38PM +0200, Michael S. Tsirkin wrote:
>> >> >>> >> On Mon, Dec 13, 2010 at 01:11:27PM +0000, Stefan Hajnoczi wrote:
>> >> >>> >> > Fresh results:
>> >> >>> >> >
>> >> >>> >> > 192.168.0.1 - host (runs netperf)
>> >> >>> >> > 192.168.0.2 - guest (runs netserver)
>> >> >>> >> >
>> >> >>> >> > host$ src/netperf -H 192.168.0.2 -- -m 200
>> >> >>> >> >
>> >> >>> >> > ioeventfd=on
>> >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> >> >>> >> > 192.168.0.2 (192.168.0.2) port 0 AF_INET
>> >> >>> >> > Recv   Send    Send
>> >> >>> >> > Socket Socket  Message  Elapsed
>> >> >>> >> > Size   Size    Size     Time     Throughput
>> >> >>> >> > bytes  bytes   bytes    secs.    10^6bits/sec
>> >> >>> >> >
>> >> >>> >> > 87380  16384   200      10.00    1759.25
>> >> >>> >> >
>> >> >>> >> > ioeventfd=off
>> >> >>> >> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> >> >>> >> > 192.168.0.2 (192.168.0.2) port 0 AF_INET
>> >> >>> >> > Recv   Send    Send
>> >> >>> >> > Socket Socket  Message  Elapsed
>> >> >>> >> > Size   Size    Size     Time     Throughput
>> >> >>> >> > bytes  bytes   bytes    secs.    10^6bits/sec
>> >> >>> >> >
>> >> >>> >> > 87380  16384   200      10.00    1757.15
>> >> >>> >> >
>> >> >>> >> > The results vary approx. +/- 3% between runs.
>> >> >>> >> >
>> >> >>> >> > Invocation:
>> >> >>> >> > $ x86_64-softmmu/qemu-system-x86_64 -m 4096 -enable-kvm -netdev
>> >> >>> >> > type=tap,id=net0,ifname=tap0,script=no,downscript=no -device
>> >> >>> >> > virtio-net-pci,netdev=net0,ioeventfd=on|off -vnc :0 -drive
>> >> >>> >> > if=virtio,cache=none,file=$HOME/rhel6-autobench-raw.img
>> >> >>> >> >
>> >> >>> >> > I am running qemu.git with the v5 patches, based off
>> >> >>> >> > 36888c6335422f07bbc50bf3443a39f24b90c7c6.
>> >> >>> >> >
>> >> >>> >> > Host:
>> >> >>> >> > 1 Quad-Core AMD Opteron(tm) Processor 2350 @ 2 GHz
>> >> >>> >> > 8 GB RAM
>> >> >>> >> > RHEL 6 host
>> >> >>> >> >
>> >> >>> >> > Next I will try the patches on latest qemu-kvm.git
>> >> >>> >> >
>> >> >>> >> > Stefan
>> >> >>> >>
>> >> >>> >> One interesting thing is that I put virtio-net earlier on the
>> >> >>> >> command line.
>> >> >>> >
>> >> >>> > Sorry, I meant I put it after the disk; you put it before.
>> >> >>>
>> >> >>> I can't find a measurable difference when swapping -drive and -netdev.
>> >> >>
>> >> >> One other concern I have is that we are apparently using
>> >> >> ioeventfd for all VQs. E.g. for virtio-net we probably should not
>> >> >> use it for the control VQ - it's a waste of resources.
>> >> >
>> >> > One option is a per-device (block, net, etc.) bitmap that masks out
>> >> > virtqueues. Is that something you'd like to see?
>> >> >
>> >> > I'm tempted to mask out the RX vq too and see how that affects the
>> >> > qemu-kvm.git-specific issue.
>> >>
>> >> As expected, the rx virtqueue is involved in the degradation. I
>> >> enabled ioeventfd only for the TX virtqueue and got the same good
>> >> results as userspace virtio-net.
>> >>
>> >> When I enable only the rx virtqueue, performance decreases as we've
>> >> seen above.
>> >>
>> >> Stefan
>> >
>> > Interesting. In particular this implies something's wrong with the
>> > queue: we should not normally be getting notifications from the rx
>> > queue at all. Is it running low on buffers?
Does it help to increase the vq
>> > size? Any other explanation?
>>
>> I made a mistake: it is the *tx* vq that causes reduced performance on
>> short packets with ioeventfd. I double-checked the results and the rx
>> vq doesn't affect performance.
>>
>> Initially I thought the fix would be to adjust the tx mitigation
>> mechanism, since ioeventfd does its own mitigation of sorts: multiple
>> eventfd signals are coalesced into one qemu-kvm event handler call if
>> qemu-kvm didn't have a chance to handle the first event before the
>> eventfd was signalled again.
>>
>> I added tx=immediate to -device virtio-net-pci to flush the TX queue
>> immediately instead of scheduling a BH or timer. Unfortunately this
>> had little measurable effect and performance stayed the same. This
>> suggests most of the latency is between the guest's pio write and
>> qemu-kvm getting around to handling the event.
>>
>> You mentioned that vhost-net has the same performance issue on this
>> benchmark. I guess a solution for vhost-net may help virtio-ioeventfd
>> and vice versa.
>>
>> Are you happy with this patchset if I remove virtio-net-pci
>> ioeventfd=on|off so only virtio-blk-pci has ioeventfd=on|off (with
>> default on)? For block we've found it to be a win, and the initial
>> results looked good for net too.
>>
>> Stefan
>
> I'm concerned that the tests were done on qemu.git.
> Could you check block with qemu-kvm too, please?
The following results show qemu-kvm with virtio-ioeventfd v3 for both
aio=native and aio=threads:

http://wiki.qemu.org/Features/VirtioIoeventfd

Stefan