Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Wed, Jan 16, 2013 at 10:14:33AM -0600, Anthony Liguori wrote:
> "Michael S. Tsirkin" writes:
>
> > On Wed, Jan 16, 2013 at 09:09:49AM -0600, Anthony Liguori wrote:
> >> Jason Wang writes:
> >>
> >> > On 01/15/2013 03:44 AM, Anthony Liguori wrote:
> >> >> Jason Wang writes:
> >> >>
> >> >>> Hello all:
> >> >>>
> >> >>> This series is an update of the last version of multiqueue virtio-net
> >> >>> support.
> >> >>>
> >> >>> Linux tap recently gained multiqueue support. This series implements
> >> >>> basic support for multiqueue tap, nic and vhost, then uses it as an
> >> >>> infrastructure to enable multiqueue support for virtio-net.
> >> >>>
> >> >>> Both vhost and userspace multiqueue were implemented for virtio-net,
> >> >>> but userspace could not get much benefit since a dataplane-like
> >> >>> parallelized mechanism was not implemented.
> >> >>>
> >> >>> Users can start a multiqueue virtio-net card by adding a "queues"
> >> >>> parameter to tap.
> >> >>>
> >> >>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
> >> >>>
> >> >>> Management tools such as libvirt can pass multiple pre-created fds
> >> >>> through
> >> >>>
> >> >>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
> >> >> I'm confused/frightened that this syntax works. You shouldn't be
> >> >> allowed to have two values for the same property. Better to have a
> >> >> syntax like fd[0]=X,fd[1]=Y or something along those lines.
> >> >
> >> > Yes, but this is how a StringList type currently works on the command
> >> > line. Some other parameters such as dnssearch, hostfwd and guestfwd
> >> > already work this way. Looks like your suggestion needs some
> >> > extension of the QemuOpts visitor; maybe we can do this on top.
> >>
> >> It's a silly syntax and breaks compatibility. This is valid syntax:
> >>
> >> -net tap,fd=3,fd=4
> >>
> >> In this case, it means 'fd=4' because the last fd overwrites the first
> >> one.
> >>
> >> Now you've changed it to mean something else. Having one thing mean
> >> something in one context, but something else in another context is
> >> terrible interface design.
> >>
> >> Regards,
> >>
> >> Anthony Liguori
> >
> > Aha so just renaming the field 'fds' would address this issue?
>
> No, you still have the problem of different meanings.
>
> -netdev tap,fd=X,fd=Y
>
> -netdev tap,fds=X,fds=Y
>
> Would have wildly different behavior.

I think even caring about -net tap,fd=1,fd=2 is a bit silly. If this
resulted in fd=2 by mistake, I don't think it was ever intentionally
legal.

As Jason points out, we have list support, and for better or worse it
currently uses repeated options, e.g. with dnssearch, hostfwd and
guestfwd.

Isn't it better to be consistent?

> Just do:
>
> -netdev tap,fds=X:Y
>
> And then we're staying consistent wrt the interpretation of multiple
> properties of the same name.
>
> Regards,
>
> Anthony Liguori

This introduces ':' as a special character. However, fds can be fd names
passed in with getfd, where ':' is a legal character.

--
MST
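The objection to ':' as a separator can be illustrated with a minimal sketch. This is plain Python, not QEMU code: the helper `split_fds` and the fd names `tap:0` and `tap:1` are hypothetical, and the only thing taken from the thread is that an fd name passed in with getfd may itself contain ':'.

```python
def split_fds(value):
    """Split a colon-separated value, as in the proposed fds=X:Y spelling."""
    return value.split(":")


# Numeric fds split cleanly into two entries.
numeric = split_fds("3:4")

# Two named fds, "tap:0" and "tap:1" (hypothetical names that contain ':'),
# cannot be recovered: the split produces four pieces instead of two.
named = split_fds("tap:0:tap:1")

print(numeric)
print(named)
```

With a separator that is also legal inside a value, the parser would need an escaping rule or a restriction to numeric fds, which is the trade-off being argued about here.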
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Wed, Jan 16, 2013 at 10:14:33AM -0600, Anthony Liguori wrote:
> "Michael S. Tsirkin" writes:
>
> > On Wed, Jan 16, 2013 at 09:09:49AM -0600, Anthony Liguori wrote:
> >> Jason Wang writes:
> >>
> >> > On 01/15/2013 03:44 AM, Anthony Liguori wrote:
> >> >> Jason Wang writes:
> >> >>
> >> >>> Hello all:
> >> >>>
> >> >>> This series is an update of the last version of multiqueue virtio-net
> >> >>> support.
> >> >>>
> >> >>> Linux tap recently gained multiqueue support. This series implements
> >> >>> basic support for multiqueue tap, nic and vhost, then uses it as an
> >> >>> infrastructure to enable multiqueue support for virtio-net.
> >> >>>
> >> >>> Both vhost and userspace multiqueue were implemented for virtio-net,
> >> >>> but userspace could not get much benefit since a dataplane-like
> >> >>> parallelized mechanism was not implemented.
> >> >>>
> >> >>> Users can start a multiqueue virtio-net card by adding a "queues"
> >> >>> parameter to tap.
> >> >>>
> >> >>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
> >> >>>
> >> >>> Management tools such as libvirt can pass multiple pre-created fds
> >> >>> through
> >> >>>
> >> >>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
> >> >> I'm confused/frightened that this syntax works. You shouldn't be
> >> >> allowed to have two values for the same property. Better to have a
> >> >> syntax like fd[0]=X,fd[1]=Y or something along those lines.
> >> >
> >> > Yes, but this is how a StringList type currently works on the command
> >> > line. Some other parameters such as dnssearch, hostfwd and guestfwd
> >> > already work this way. Looks like your suggestion needs some
> >> > extension of the QemuOpts visitor; maybe we can do this on top.
> >>
> >> It's a silly syntax and breaks compatibility. This is valid syntax:
> >>
> >> -net tap,fd=3,fd=4
> >>
> >> In this case, it means 'fd=4' because the last fd overwrites the first
> >> one.
> >>
> >> Now you've changed it to mean something else. Having one thing mean
> >> something in one context, but something else in another context is
> >> terrible interface design.
> >>
> >> Regards,
> >>
> >> Anthony Liguori
> >
> > Aha so just renaming the field 'fds' would address this issue?
>
> No, you still have the problem of different meanings.
>
> -netdev tap,fd=X,fd=Y
>
> -netdev tap,fds=X,fds=Y
>
> Would have wildly different behavior.

fd=X,fd=Y is more a bug than a feature. It could have failed just as
well.

> Just do:
>
> -netdev tap,fds=X:Y
>
> And then we're staying consistent wrt the interpretation of multiple
> properties of the same name.
>
> Regards,
>
> Anthony Liguori

The issue is that ':' would only work for a list of numbers. As Jason
points out, StringList is already used. Do we really want to invent yet
another list syntax that will work only for this case?

--
MST
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
"Michael S. Tsirkin" writes:

> On Wed, Jan 16, 2013 at 09:09:49AM -0600, Anthony Liguori wrote:
>> Jason Wang writes:
>>
>> > On 01/15/2013 03:44 AM, Anthony Liguori wrote:
>> >> Jason Wang writes:
>> >>
>> >>> Hello all:
>> >>>
>> >>> This series is an update of the last version of multiqueue virtio-net
>> >>> support.
>> >>>
>> >>> Linux tap recently gained multiqueue support. This series implements
>> >>> basic support for multiqueue tap, nic and vhost, then uses it as an
>> >>> infrastructure to enable multiqueue support for virtio-net.
>> >>>
>> >>> Both vhost and userspace multiqueue were implemented for virtio-net,
>> >>> but userspace could not get much benefit since a dataplane-like
>> >>> parallelized mechanism was not implemented.
>> >>>
>> >>> Users can start a multiqueue virtio-net card by adding a "queues"
>> >>> parameter to tap.
>> >>>
>> >>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>> >>>
>> >>> Management tools such as libvirt can pass multiple pre-created fds
>> >>> through
>> >>>
>> >>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
>> >> I'm confused/frightened that this syntax works. You shouldn't be
>> >> allowed to have two values for the same property. Better to have a
>> >> syntax like fd[0]=X,fd[1]=Y or something along those lines.
>> >
>> > Yes, but this is how a StringList type currently works on the command
>> > line. Some other parameters such as dnssearch, hostfwd and guestfwd
>> > already work this way. Looks like your suggestion needs some
>> > extension of the QemuOpts visitor; maybe we can do this on top.
>>
>> It's a silly syntax and breaks compatibility. This is valid syntax:
>>
>> -net tap,fd=3,fd=4
>>
>> In this case, it means 'fd=4' because the last fd overwrites the first
>> one.
>>
>> Now you've changed it to mean something else. Having one thing mean
>> something in one context, but something else in another context is
>> terrible interface design.
>>
>> Regards,
>>
>> Anthony Liguori
>
> Aha so just renaming the field 'fds' would address this issue?

No, you still have the problem of different meanings.

-netdev tap,fd=X,fd=Y

-netdev tap,fds=X,fds=Y

Would have wildly different behavior.

Just do:

-netdev tap,fds=X:Y

And then we're staying consistent wrt the interpretation of multiple
properties of the same name.

Regards,

Anthony Liguori
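The two readings of a repeated key that this exchange turns on can be sketched as follows. This is a toy model in Python, not the QemuOpts parser: the helper names `split_pairs`, `parse_last_wins` and `parse_as_list` are hypothetical, and escaping and validation are deliberately left out.

```python
def split_pairs(optstr):
    """Split 'a=1,b=2' into [key, value] pairs; bare words such as 'tap' are skipped."""
    return [item.split("=", 1) for item in optstr.split(",") if "=" in item]


def parse_last_wins(optstr):
    """Scalar reading: a later occurrence of a key overwrites the earlier one."""
    opts = {}
    for key, value in split_pairs(optstr):
        opts[key] = value
    return opts


def parse_as_list(optstr):
    """List reading: every occurrence of a key is collected in order."""
    opts = {}
    for key, value in split_pairs(optstr):
        opts.setdefault(key, []).append(value)
    return opts


print(parse_last_wins("tap,fd=3,fd=4"))
print(parse_as_list("tap,fd=3,fd=4"))
```

The same option string yields a single fd under the first reading and two fds under the second, which is the change in meaning being objected to.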
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Wed, Jan 16, 2013 at 09:09:49AM -0600, Anthony Liguori wrote:
> Jason Wang writes:
>
> > On 01/15/2013 03:44 AM, Anthony Liguori wrote:
> >> Jason Wang writes:
> >>
> >>> Hello all:
> >>>
> >>> This series is an update of the last version of multiqueue virtio-net
> >>> support.
> >>>
> >>> Linux tap recently gained multiqueue support. This series implements
> >>> basic support for multiqueue tap, nic and vhost, then uses it as an
> >>> infrastructure to enable multiqueue support for virtio-net.
> >>>
> >>> Both vhost and userspace multiqueue were implemented for virtio-net,
> >>> but userspace could not get much benefit since a dataplane-like
> >>> parallelized mechanism was not implemented.
> >>>
> >>> Users can start a multiqueue virtio-net card by adding a "queues"
> >>> parameter to tap.
> >>>
> >>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
> >>>
> >>> Management tools such as libvirt can pass multiple pre-created fds
> >>> through
> >>>
> >>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
> >> I'm confused/frightened that this syntax works. You shouldn't be
> >> allowed to have two values for the same property. Better to have a
> >> syntax like fd[0]=X,fd[1]=Y or something along those lines.
> >
> > Yes, but this is how a StringList type currently works on the command
> > line. Some other parameters such as dnssearch, hostfwd and guestfwd
> > already work this way. Looks like your suggestion needs some
> > extension of the QemuOpts visitor; maybe we can do this on top.
>
> It's a silly syntax and breaks compatibility. This is valid syntax:
>
> -net tap,fd=3,fd=4
>
> In this case, it means 'fd=4' because the last fd overwrites the first
> one.
>
> Now you've changed it to mean something else. Having one thing mean
> something in one context, but something else in another context is
> terrible interface design.
>
> Regards,
>
> Anthony Liguori

Aha so just renaming the field 'fds' would address this issue?
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
Jason Wang writes:

> On 01/15/2013 03:44 AM, Anthony Liguori wrote:
>> Jason Wang writes:
>>
>>> Hello all:
>>>
>>> This series is an update of the last version of multiqueue virtio-net
>>> support.
>>>
>>> Linux tap recently gained multiqueue support. This series implements
>>> basic support for multiqueue tap, nic and vhost, then uses it as an
>>> infrastructure to enable multiqueue support for virtio-net.
>>>
>>> Both vhost and userspace multiqueue were implemented for virtio-net,
>>> but userspace could not get much benefit since a dataplane-like
>>> parallelized mechanism was not implemented.
>>>
>>> Users can start a multiqueue virtio-net card by adding a "queues"
>>> parameter to tap.
>>>
>>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>>>
>>> Management tools such as libvirt can pass multiple pre-created fds
>>> through
>>>
>>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
>> I'm confused/frightened that this syntax works. You shouldn't be
>> allowed to have two values for the same property. Better to have a
>> syntax like fd[0]=X,fd[1]=Y or something along those lines.
>
> Yes, but this is how a StringList type currently works on the command
> line. Some other parameters such as dnssearch, hostfwd and guestfwd
> already work this way. Looks like your suggestion needs some
> extension of the QemuOpts visitor; maybe we can do this on top.

It's a silly syntax and breaks compatibility. This is valid syntax:

-net tap,fd=3,fd=4

In this case, it means 'fd=4' because the last fd overwrites the first
one.

Now you've changed it to mean something else. Having one thing mean
something in one context, but something else in another context is
terrible interface design.

Regards,

Anthony Liguori

>
> Thanks
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>> You can fetch and try the code from:
>>> git://github.com/jasowang/qemu.git
>>>
>>> Patch 1 adds a generic method of creating multiqueue taps and implements
>>> the linux part.
>>> Patch 2-4 introduce some helpers which could be used to refactor the nic
>>> emulation code to support multiqueue.
>>> Patch 5 introduces multiqueue support for qemu networking code: each peer
>>> of NetClientState is abstracted as a queue. Through this, most of the
>>> code can be reused without change.
>>> Patch 6 adds basic multiqueue support for vhost, which lets vhost handle
>>> just a subset of all virtqueues.
>>> Patch 7-8 introduce new virtio helpers which are needed by multiqueue
>>> virtio-net.
>>> Patch 9-12 implement the multiqueue support of virtio-net.
>>>
>>> Changes from RFC v2:
>>> - rebase the code to latest qemu
>>> - align the multiqueue virtio-net implementation to the virtio spec
>>> - split the patches into smaller patches
>>> - set_link and hotplug support
>>>
>>> Changes from RFC v1:
>>> - rebase to the latest
>>> - fix memory leak in parse_netdev
>>> - fix guest notifiers assignment/de-assignment
>>> - change the command line to:
>>>    qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2
>>>
>>> Reference:
>>> v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
>>> v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481
>>>
>>> Perf Numbers:
>>>
>>> Two Intel Xeon 5620 with directly connected intel 82599EB
>>> Host/Guest kernel: David net tree
>>> vhost enabled
>>>
>>> - lots of improvements in both latency and cpu utilization in the
>>>   request-response test
>>> - a regression when the guest sends small packets, because TCP tends to
>>>   batch less when the latency is improved
>>>
>>> 1q/2q/4q
>>> TCP_RR
>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12
>>> 1 20 72162.1 2214.24 129880.22 2456.13 196949.81 2298.13
>>> 1 50 107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
>>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943
>>> 64 1 9453.42 632.33 9371.37 616.13 9338.19 615.97
>>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32
>>> 64 50 106966 2448.29 146518.67 2514.47 242134.07 2720.91
>>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
>>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1
>>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53
>>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44
>>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83
>>> TCP_CRR
>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47
>>> 1 20 23434.5 562.11 31057.43 531.07 49488.28 564.41
>>> 1 50 28514.88 582.17 40494.23 605.92 60113.35 654.97
>>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56
>>> 64 1 2780.08 159.4 2201.07 127.96 2006.8 117.63
>>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13
>>> 64 50 28585.72 582.54 40576.7
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On 01/15/2013 03:44 AM, Anthony Liguori wrote:
> Jason Wang writes:
>
>> Hello all:
>>
>> This series is an update of the last version of multiqueue virtio-net
>> support.
>>
>> Linux tap recently gained multiqueue support. This series implements
>> basic support for multiqueue tap, nic and vhost, then uses it as an
>> infrastructure to enable multiqueue support for virtio-net.
>>
>> Both vhost and userspace multiqueue were implemented for virtio-net,
>> but userspace could not get much benefit since a dataplane-like
>> parallelized mechanism was not implemented.
>>
>> Users can start a multiqueue virtio-net card by adding a "queues"
>> parameter to tap.
>>
>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>>
>> Management tools such as libvirt can pass multiple pre-created fds
>> through
>>
>> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0
> I'm confused/frightened that this syntax works. You shouldn't be
> allowed to have two values for the same property. Better to have a
> syntax like fd[0]=X,fd[1]=Y or something along those lines.

Yes, but this is how a StringList type currently works on the command
line. Some other parameters such as dnssearch, hostfwd and guestfwd
already work this way. Looks like your suggestion needs some extension
of the QemuOpts visitor; maybe we can do this on top.

Thanks
>
> Regards,
>
> Anthony Liguori
>
>> You can fetch and try the code from:
>> git://github.com/jasowang/qemu.git
>>
>> Patch 1 adds a generic method of creating multiqueue taps and implements
>> the linux part.
>> Patch 2-4 introduce some helpers which could be used to refactor the nic
>> emulation code to support multiqueue.
>> Patch 5 introduces multiqueue support for qemu networking code: each peer
>> of NetClientState is abstracted as a queue. Through this, most of the
>> code can be reused without change.
>> Patch 6 adds basic multiqueue support for vhost, which lets vhost handle
>> just a subset of all virtqueues.
>> Patch 7-8 introduce new virtio helpers which are needed by multiqueue
>> virtio-net.
>> Patch 9-12 implement the multiqueue support of virtio-net.
>>
>> Changes from RFC v2:
>> - rebase the code to latest qemu
>> - align the multiqueue virtio-net implementation to the virtio spec
>> - split the patches into smaller patches
>> - set_link and hotplug support
>>
>> Changes from RFC v1:
>> - rebase to the latest
>> - fix memory leak in parse_netdev
>> - fix guest notifiers assignment/de-assignment
>> - change the command line to:
>>    qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2
>>
>> Reference:
>> v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
>> v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481
>>
>> Perf Numbers:
>>
>> Two Intel Xeon 5620 with directly connected intel 82599EB
>> Host/Guest kernel: David net tree
>> vhost enabled
>>
>> - lots of improvements in both latency and cpu utilization in the
>>   request-response test
>> - a regression when the guest sends small packets, because TCP tends to
>>   batch less when the latency is improved
>>
>> 1q/2q/4q
>> TCP_RR
>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12
>> 1 20 72162.1 2214.24 129880.22 2456.13 196949.81 2298.13
>> 1 50 107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943
>> 64 1 9453.42 632.33 9371.37 616.13 9338.19 615.97
>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32
>> 64 50 106966 2448.29 146518.67 2514.47 242134.07 2720.91
>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1
>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53
>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44
>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83
>> TCP_CRR
>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47
>> 1 20 23434.5 562.11 31057.43 531.07 49488.28 564.41
>> 1 50 28514.88 582.17 40494.23 605.92 60113.35 654.97
>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56
>> 64 1 2780.08 159.4 2201.07 127.96 2006.8 117.63
>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13
>> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56
>> 64 100 28747.37 584.17 49081.87 667.87 60612.94 662
>> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45
>> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22
>> 256 50 28354.7 579.85 40578.31 607 60261.71 657.87
>> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
>> TCP_STREAM guest receiving
>> size #sessions throughput norm throughput norm throughput norm
>> 1 1 16.27 1.33 16.1 1.12 16.13 0.99
>> 1 2 33.04 2.08 32.96 2.1
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
Jason Wang writes:

> Hello all:
>
> This series is an update of the last version of multiqueue virtio-net
> support.
>
> Linux tap recently gained multiqueue support. This series implements
> basic support for multiqueue tap, nic and vhost, then uses it as an
> infrastructure to enable multiqueue support for virtio-net.
>
> Both vhost and userspace multiqueue were implemented for virtio-net,
> but userspace could not get much benefit since a dataplane-like
> parallelized mechanism was not implemented.
>
> Users can start a multiqueue virtio-net card by adding a "queues"
> parameter to tap.
>
> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>
> Management tools such as libvirt can pass multiple pre-created fds
> through
>
> ./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0

I'm confused/frightened that this syntax works. You shouldn't be
allowed to have two values for the same property. Better to have a
syntax like fd[0]=X,fd[1]=Y or something along those lines.

Regards,

Anthony Liguori

>
> You can fetch and try the code from:
> git://github.com/jasowang/qemu.git
>
> Patch 1 adds a generic method of creating multiqueue taps and implements
> the linux part.
> Patch 2-4 introduce some helpers which could be used to refactor the nic
> emulation code to support multiqueue.
> Patch 5 introduces multiqueue support for qemu networking code: each peer
> of NetClientState is abstracted as a queue. Through this, most of the
> code can be reused without change.
> Patch 6 adds basic multiqueue support for vhost, which lets vhost handle
> just a subset of all virtqueues.
> Patch 7-8 introduce new virtio helpers which are needed by multiqueue
> virtio-net.
> Patch 9-12 implement the multiqueue support of virtio-net.
>
> Changes from RFC v2:
> - rebase the code to latest qemu
> - align the multiqueue virtio-net implementation to the virtio spec
> - split the patches into smaller patches
> - set_link and hotplug support
>
> Changes from RFC v1:
> - rebase to the latest
> - fix memory leak in parse_netdev
> - fix guest notifiers assignment/de-assignment
> - change the command line to:
>    qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2
>
> Reference:
> v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
> v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481
>
> Perf Numbers:
>
> Two Intel Xeon 5620 with directly connected intel 82599EB
> Host/Guest kernel: David net tree
> vhost enabled
>
> - lots of improvements in both latency and cpu utilization in the
>   request-response test
> - a regression when the guest sends small packets, because TCP tends to
>   batch less when the latency is improved
>
> 1q/2q/4q
> TCP_RR
> size #sessions trans.rate norm trans.rate norm trans.rate norm
> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12
> 1 20 72162.1 2214.24 129880.22 2456.13 196949.81 2298.13
> 1 50 107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943
> 64 1 9453.42 632.33 9371.37 616.13 9338.19 615.97
> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32
> 64 50 106966 2448.29 146518.67 2514.47 242134.07 2720.91
> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1
> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53
> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44
> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83
> TCP_CRR
> size #sessions trans.rate norm trans.rate norm trans.rate norm
> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47
> 1 20 23434.5 562.11 31057.43 531.07 49488.28 564.41
> 1 50 28514.88 582.17 40494.23 605.92 60113.35 654.97
> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56
> 64 1 2780.08 159.4 2201.07 127.96 2006.8 117.63
> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13
> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56
> 64 100 28747.37 584.17 49081.87 667.87 60612.94 662
> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45
> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22
> 256 50 28354.7 579.85 40578.31 607 60261.71 657.87
> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
> TCP_STREAM guest receiving
> size #sessions throughput norm throughput norm throughput norm
> 1 1 16.27 1.33 16.1 1.12 16.13 0.99
> 1 2 33.04 2.08 32.96 2.19 32.75 1.98
> 1 4 66.62 6.83 68.3 5.56 66.14 2.65
> 64 1 896.55 56.67 914.02 58.14 898.9 61.56
> 64 2 1830.46 91.02 1812.02 64.59 1835.57 66.26
> 64 4 3626.61 142.55 3636.25 100.64 3607.46 75.03
> 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39
> 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4
> 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94
> 512 1 3592.43 165.24 3603.12 167.19 35
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On 01/10/2013 07:49 PM, Stefan Hajnoczi wrote:
> On Thu, Jan 10, 2013 at 05:34:14PM +0800, Jason Wang wrote:
>> On 01/10/2013 04:44 PM, Stefan Hajnoczi wrote:
>>> On Wed, Jan 09, 2013 at 11:33:25PM +0800, Jason Wang wrote:
>>>> On 01/09/2013 11:32 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote:
>>>>>> On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote:
>>>>>>> Perf Numbers:
>>>>>>>
>>>>>>> Two Intel Xeon 5620 with direct connected intel 82599EB
>>>>>>> Host/Guest kernel: David net tree
>>>>>>> vhost enabled
>>>>>>>
>>>>>>> - lots of improvements of both latency and cpu utilization in
>>>>>>>   request-response test
>>>>>>> - get regression of guest sending small packets which because TCP
>>>>>>>   tends to batch less when the latency were improved
>>>>>>>
>>>>>>> 1q/2q/4q
>>>>>>> TCP_RR
>>>>>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>>>>>>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12
>>>>>>> 1 20 72162.1 2214.24 129880.22 2456.13 196949.81 2298.13
>>>>>>> 1 50 107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
>>>>>>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943
>>>>>>> 64 1 9453.42 632.33 9371.37 616.13 9338.19 615.97
>>>>>>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32
>>>>>>> 64 50 106966 2448.29 146518.67 2514.47 242134.07 2720.91
>>>>>>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
>>>>>>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1
>>>>>>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53
>>>>>>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44
>>>>>>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83
>>>>>>> TCP_CRR
>>>>>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
>>>>>>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47
>>>>>>> 1 20 23434.5 562.11 31057.43 531.07 49488.28 564.41
>>>>>>> 1 50 28514.88 582.17 40494.23 605.92 60113.35 654.97
>>>>>>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56
>>>>>>> 64 1 2780.08 159.4 2201.07 127.96 2006.8 117.63
>>>>>>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13
>>>>>>> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56
>>>>>>> 64 100 28747.37 584.17 49081.87 667.87 60612.94 662
>>>>>>> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45
>>>>>>> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22
>>>>>>> 256 50 28354.7 579.85 40578.31 607 60261.71 657.87
>>>>>>> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
>>>>>>> TCP_STREAM guest receiving
>>>>>>> size #sessions throughput norm throughput norm throughput norm
>>>>>>> 1 1 16.27 1.33 16.1 1.12 16.13 0.99
>>>>>>> 1 2 33.04 2.08 32.96 2.19 32.75 1.98
>>>>>>> 1 4 66.62 6.83 68.3 5.56 66.14 2.65
>>>>>>> 64 1 896.55 56.67 914.02 58.14 898.9 61.56
>>>>>>> 64 2 1830.46 91.02 1812.02 64.59 1835.57 66.26
>>>>>>> 64 4 3626.61 142.55 3636.25 100.64 3607.46 75.03
>>>>>>> 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39
>>>>>>> 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4
>>>>>>> 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94
>>>>>>> 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57
>>>>>>> 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3
>>>>>>> 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33
>>>>>>> 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49
>>>>>>> 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67
>>>>>>> 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49
>>>>>>> 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76
>>>>>>> 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57
>>>>>>> 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67
>>>>>>> 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73
>>>>>>> 16384 2 7436.57 255.81 9381.86 220.85 9392 220.36
>>>>>>> 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57
>>>>>>> TCP_MAERTS guest sending
>>>>>>> size #sessions throughput norm throughput norm throughput norm
>>>>>>> 1 1 15.94 0.62 15.55 0.61 15.13 0.59
>>>>>>> 1 2 36.11 0.83 32.46 0.69 32.28 0.69
>>>>>>> 1 4 71.59 1 68.91 0.94 61.52 0.77
>>>>>>> 64 1 630.71 22.52 622.11 22.35 605.09 21.84
>>>>>>> 64 2 1442.36 30.57 1292.15 25.82 1282.67 25.55
>>>>>>> 64 4 3186.79 42.59 2844.96 36.03 2529.69 30.06
>>>>>>> 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19
>>>>>>> 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45
>>>>>>> 256 4 9324.63 145.74 8956.49 116.39 6720.17 73.86
>>>>>>> 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57
>>>>>>> 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99
>>>>>>> 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42
>>>>>>> 1024 1 3620.71 109.88 3625.54 109.83 3606.61 112.35
>>>>>>> 1024 2 9429 258.32 7082.79 120.55 7403.53 134.78
>>>>>>> 1024 4 9430.66 290.44 9499.29 232.
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Thu, Jan 10, 2013 at 05:34:14PM +0800, Jason Wang wrote:
> On 01/10/2013 04:44 PM, Stefan Hajnoczi wrote:
> > On Wed, Jan 09, 2013 at 11:33:25PM +0800, Jason Wang wrote:
> >> On 01/09/2013 11:32 PM, Michael S. Tsirkin wrote:
> >>> On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote:
> >>>> On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote:
> >>>>> Perf Numbers:
> >>>>>
> >>>>> Two Intel Xeon 5620 with direct connected intel 82599EB
> >>>>> Host/Guest kernel: David net tree
> >>>>> vhost enabled
> >>>>>
> >>>>> - lots of improvements of both latency and cpu utilization in
> >>>>>   request-response test
> >>>>> - get regression of guest sending small packets which because TCP
> >>>>>   tends to batch less when the latency were improved
> >>>>>
> >>>>> 1q/2q/4q
> >>>>> TCP_RR
> >>>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
> >>>>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12
> >>>>> 1 20 72162.1 2214.24 129880.22 2456.13 196949.81 2298.13
> >>>>> 1 50 107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
> >>>>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943
> >>>>> 64 1 9453.42 632.33 9371.37 616.13 9338.19 615.97
> >>>>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32
> >>>>> 64 50 106966 2448.29 146518.67 2514.47 242134.07 2720.91
> >>>>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
> >>>>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1
> >>>>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53
> >>>>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44
> >>>>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83
> >>>>> TCP_CRR
> >>>>> size #sessions trans.rate norm trans.rate norm trans.rate norm
> >>>>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47
> >>>>> 1 20 23434.5 562.11 31057.43 531.07 49488.28 564.41
> >>>>> 1 50 28514.88 582.17 40494.23 605.92 60113.35 654.97
> >>>>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56
> >>>>> 64 1 2780.08 159.4 2201.07 127.96 2006.8 117.63
> >>>>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13
> >>>>> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56
> >>>>> 64 100 28747.37 584.17 49081.87 667.87 60612.94 662
> >>>>> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45
> >>>>> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22
> >>>>> 256 50 28354.7 579.85 40578.31 607 60261.71 657.87
> >>>>> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
> >>>>> TCP_STREAM guest receiving
> >>>>> size #sessions throughput norm throughput norm throughput norm
> >>>>> 1 1 16.27 1.33 16.1 1.12 16.13 0.99
> >>>>> 1 2 33.04 2.08 32.96 2.19 32.75 1.98
> >>>>> 1 4 66.62 6.83 68.3 5.56 66.14 2.65
> >>>>> 64 1 896.55 56.67 914.02 58.14 898.9 61.56
> >>>>> 64 2 1830.46 91.02 1812.02 64.59 1835.57 66.26
> >>>>> 64 4 3626.61 142.55 3636.25 100.64 3607.46 75.03
> >>>>> 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39
> >>>>> 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4
> >>>>> 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94
> >>>>> 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57
> >>>>> 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3
> >>>>> 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33
> >>>>> 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49
> >>>>> 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67
> >>>>> 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49
> >>>>> 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76
> >>>>> 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57
> >>>>> 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67
> >>>>> 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73
> >>>>> 16384 2 7436.57 255.81 9381.86 220.85 9392 220.36
> >>>>> 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57
> >>>>> TCP_MAERTS guest sending
> >>>>> size #sessions throughput norm throughput norm throughput norm
> >>>>> 1 1 15.94 0.62 15.55 0.61 15.13 0.59
> >>>>> 1 2 36.11 0.83 32.46 0.69 32.28 0.69
> >>>>> 1 4 71.59 1 68.91 0.94 61.52 0.77
> >>>>> 64 1 630.71 22.52 622.11 22.35 605.09 21.84
> >>>>> 64 2 1442.36 30.57 1292.15 25.82 1282.67 25.55
> >>>>> 64 4 3186.79 42.59 2844.96 36.03 2529.69 30.06
> >>>>> 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19
> >>>>> 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45
> >>>>> 256 4 9324.63 145.74 8956.49 116.39 6720.17 73.86
> >>>>> 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57
> >>>>> 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99
> >>>>> 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42
> >>>>> 1024 1 3620.71 109.88 3625.54 109.83 3606.61 112.35
> >>>>> 1024 2 9429 258.32 7082.79 120.55 7403.53 134.78
> >>>>> 1024 4 9430.66 290.44 9499.29 232.31 9414.6 190.92
> >>>>> 4096 1 9339.28 296.48 9
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On 01/10/2013 04:44 PM, Stefan Hajnoczi wrote: > On Wed, Jan 09, 2013 at 11:33:25PM +0800, Jason Wang wrote: >> On 01/09/2013 11:32 PM, Michael S. Tsirkin wrote: >>> On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote: On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote: > Perf Numbers: > > Two Intel Xeon 5620 with direct connected intel 82599EB > Host/Guest kernel: David net tree > vhost enabled > > - lots of improvents of both latency and cpu utilization in > request-reponse test > - get regression of guest sending small packets which because TCP tends > to batch > less when the latency were improved > > 1q/2q/4q > TCP_RR > size #sessions trans.rate norm trans.rate norm trans.rate norm > 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12 > 1 2072162.1 2214.24 129880.22 2456.13 196949.81 2298.13 > 1 50107513.38 2653.99 139721.93 2490.58 259713.82 2873.57 > 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943 > 64 19453.42 632.33 9371.37 616.13 9338.19 615.97 > 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32 > 64 50 1069662448.29 146518.67 2514.47 242134.07 2720.91 > 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41 > 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1 > 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53 > 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44 > 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83 > TCP_CRR > size #sessions trans.rate norm trans.rate norm trans.rate norm > 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47 > 1 2023434.5 562.11 31057.43 531.07 49488.28 564.41 > 1 5028514.88 582.17 40494.23 605.92 60113.35 654.97 > 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56 > 64 12780.08 159.4 2201.07 127.96 2006.8 117.63 > 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13 > 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56 > 64 100 28747.37 584.17 49081.87 667.87 60612.94 662 > 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45 > 256 20 23086.35 559.8 
30929.09 528.16 48454.9 555.22 > 256 50 28354.7 579.85 40578.31 60760261.71 657.87 > 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72 > TCP_STREAM guest receiving > size #sessions throughput norm throughput norm throughput norm > 1 1 16.27 1.33 16.11.12 16.13 0.99 > 1 2 33.04 2.08 32.96 2.19 32.75 1.98 > 1 4 66.62 6.83 68.35.56 66.14 2.65 > 64 1896.55 56.67 914.02 58.14 898.9 61.56 > 64 21830.46 91.02 1812.02 64.59 1835.57 66.26 > 64 43626.61 142.55 3636.25 100.64 3607.46 75.03 > 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39 > 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4 > 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94 > 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57 > 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3 > 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33 > 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49 > 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67 > 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49 > 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76 > 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57 > 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67 > 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73 > 16384 2 7436.57 255.81 9381.86 220.85 9392220.36 > 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57 > TCP_MAERTS guest sending > size #sessions throughput norm throughput norm throughput norm > 1 1 15.94 0.62 15.55 0.61 15.13 0.59 > 1 2 36.11 0.83 32.46 0.69 32.28 0.69 > 1 4 71.59 1 68.91 0.94 61.52 0.77 > 64 1630.71 22.52 622.11 22.35 605.09 21.84 > 64 21442.36 30.57 1292.15 25.82 1282.67 25.55 > 64 43186.79 42.59 2844.96 36.03 2529.69 30.06 > 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19 > 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45 > 256 4 9324.63 145.74 8956.49 116.39 6720.17 73.86 > 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57 > 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99 > 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42 > 1024 1 3620.71 109.88 3625.54 109.83 3606.61 112.35 > 1024 2 
9429258.32 7082.79 120.55 7403.53 134.78
> 1024 4 9430.66 290.44 9499.29 232.31 9414.6 190.92
> 4096 1 9339.28 296.48 9374.23 372.88 9348.76 298.49
> 4096 2 9410.53 378.69 9412.61 286.18 9409.75 278.31
> 4096 4 9487.35 374.1 9556.91 288.81 9441.94 221.64
> 16384 1 9380.43 403.8 9379.78 399.13 9382.42 393.55
> 16384 2 9367.69 406.93
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Wed, Jan 09, 2013 at 11:33:25PM +0800, Jason Wang wrote: > On 01/09/2013 11:32 PM, Michael S. Tsirkin wrote: > > On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote: > >> On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote: > >>> Perf Numbers: > >>> > >>> Two Intel Xeon 5620 with direct connected intel 82599EB > >>> Host/Guest kernel: David net tree > >>> vhost enabled > >>> > >>> - lots of improvents of both latency and cpu utilization in > >>> request-reponse test > >>> - get regression of guest sending small packets which because TCP tends > >>> to batch > >>> less when the latency were improved > >>> > >>> 1q/2q/4q > >>> TCP_RR > >>> size #sessions trans.rate norm trans.rate norm trans.rate norm > >>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12 > >>> 1 2072162.1 2214.24 129880.22 2456.13 196949.81 2298.13 > >>> 1 50107513.38 2653.99 139721.93 2490.58 259713.82 2873.57 > >>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943 > >>> 64 19453.42 632.33 9371.37 616.13 9338.19 615.97 > >>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32 > >>> 64 50 1069662448.29 146518.67 2514.47 242134.07 2720.91 > >>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41 > >>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1 > >>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53 > >>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44 > >>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83 > >>> TCP_CRR > >>> size #sessions trans.rate norm trans.rate norm trans.rate norm > >>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47 > >>> 1 2023434.5 562.11 31057.43 531.07 49488.28 564.41 > >>> 1 5028514.88 582.17 40494.23 605.92 60113.35 654.97 > >>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56 > >>> 64 12780.08 159.4 2201.07 127.96 2006.8 117.63 > >>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13 > >>> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56 > >>> 64 100 28747.37 584.17 49081.87 
667.87 60612.94 662 > >>> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45 > >>> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22 > >>> 256 50 28354.7 579.85 40578.31 60760261.71 657.87 > >>> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72 > >>> TCP_STREAM guest receiving > >>> size #sessions throughput norm throughput norm throughput norm > >>> 1 1 16.27 1.33 16.11.12 16.13 0.99 > >>> 1 2 33.04 2.08 32.96 2.19 32.75 1.98 > >>> 1 4 66.62 6.83 68.35.56 66.14 2.65 > >>> 64 1896.55 56.67 914.02 58.14 898.9 61.56 > >>> 64 21830.46 91.02 1812.02 64.59 1835.57 66.26 > >>> 64 43626.61 142.55 3636.25 100.64 3607.46 75.03 > >>> 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39 > >>> 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4 > >>> 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94 > >>> 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57 > >>> 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3 > >>> 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33 > >>> 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49 > >>> 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67 > >>> 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49 > >>> 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76 > >>> 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57 > >>> 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67 > >>> 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73 > >>> 16384 2 7436.57 255.81 9381.86 220.85 9392220.36 > >>> 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57 > >>> TCP_MAERTS guest sending > >>> size #sessions throughput norm throughput norm throughput norm > >>> 1 1 15.94 0.62 15.55 0.61 15.13 0.59 > >>> 1 2 36.11 0.83 32.46 0.69 32.28 0.69 > >>> 1 4 71.59 1 68.91 0.94 61.52 0.77 > >>> 64 1630.71 22.52 622.11 22.35 605.09 21.84 > >>> 64 21442.36 30.57 1292.15 25.82 1282.67 25.55 > >>> 64 43186.79 42.59 2844.96 36.03 2529.69 30.06 > >>> 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19 > >>> 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45 > >>> 256 4 9324.63 
145.74 8956.49 116.39 6720.17 73.86
> >>> 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57
> >>> 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99
> >>> 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42
> >>> 1024 1 3620.71 109.88 3625.54 109.83 3606.61 112.35
> >>> 1024 2 9429258.32 7082.79 120.55 7403.53 134.78
> >>> 1024 4 9430.66 290.44 9499.29 232.31 9414.6 190.92
> >>> 4096 1 9339.28 296.48 9374.23 372.88 9348.76 298.49
> >>> 4096 2 9410.53 378.69 9412.61 286.18 9409.75 278.31
> >>> 4096 4 9487.35 374.1 9556.91 288.81 9441.94 221.64
> >>> 16384 1 9380.43 403.8 9379.78 399.13 9382.42 393.55
> >>> 16384 2 9367.69 406.93 9415.04 312.68 9409.29 300.9
> >>> 16384 4 9391.9
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote: > On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote: > > Perf Numbers: > > > > Two Intel Xeon 5620 with direct connected intel 82599EB > > Host/Guest kernel: David net tree > > vhost enabled > > > > - lots of improvents of both latency and cpu utilization in request-reponse > > test > > - get regression of guest sending small packets which because TCP tends to > > batch > > less when the latency were improved > > > > 1q/2q/4q > > TCP_RR > > size #sessions trans.rate norm trans.rate norm trans.rate norm > > 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12 > > 1 2072162.1 2214.24 129880.22 2456.13 196949.81 2298.13 > > 1 50107513.38 2653.99 139721.93 2490.58 259713.82 2873.57 > > 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943 > > 64 19453.42 632.33 9371.37 616.13 9338.19 615.97 > > 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32 > > 64 50 1069662448.29 146518.67 2514.47 242134.07 2720.91 > > 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41 > > 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1 > > 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53 > > 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44 > > 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83 > > TCP_CRR > > size #sessions trans.rate norm trans.rate norm trans.rate norm > > 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47 > > 1 2023434.5 562.11 31057.43 531.07 49488.28 564.41 > > 1 5028514.88 582.17 40494.23 605.92 60113.35 654.97 > > 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56 > > 64 12780.08 159.4 2201.07 127.96 2006.8 117.63 > > 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13 > > 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56 > > 64 100 28747.37 584.17 49081.87 667.87 60612.94 662 > > 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45 > > 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22 > > 256 50 28354.7 579.85 40578.31 60760261.71 657.87 > > 256 
100 28844.55 585.67 48541.86 659.08 61941.07 676.72 > > TCP_STREAM guest receiving > > size #sessions throughput norm throughput norm throughput norm > > 1 1 16.27 1.33 16.11.12 16.13 0.99 > > 1 2 33.04 2.08 32.96 2.19 32.75 1.98 > > 1 4 66.62 6.83 68.35.56 66.14 2.65 > > 64 1896.55 56.67 914.02 58.14 898.9 61.56 > > 64 21830.46 91.02 1812.02 64.59 1835.57 66.26 > > 64 43626.61 142.55 3636.25 100.64 3607.46 75.03 > > 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39 > > 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4 > > 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94 > > 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57 > > 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3 > > 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33 > > 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49 > > 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67 > > 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49 > > 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76 > > 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57 > > 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67 > > 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73 > > 16384 2 7436.57 255.81 9381.86 220.85 9392220.36 > > 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57 > > TCP_MAERTS guest sending > > size #sessions throughput norm throughput norm throughput norm > > 1 1 15.94 0.62 15.55 0.61 15.13 0.59 > > 1 2 36.11 0.83 32.46 0.69 32.28 0.69 > > 1 4 71.59 1 68.91 0.94 61.52 0.77 > > 64 1630.71 22.52 622.11 22.35 605.09 21.84 > > 64 21442.36 30.57 1292.15 25.82 1282.67 25.55 > > 64 43186.79 42.59 2844.96 36.03 2529.69 30.06 > > 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19 > > 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45 > > 256 4 9324.63 145.74 8956.49 116.39 6720.17 73.86 > > 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57 > > 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99 > > 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42 > > 1024 1 3620.71 109.88 3625.54 109.83 3606.61 112.35 > > 1024 2 9429258.32 
7082.79 120.55 7403.53 134.78
> > 1024 4 9430.66 290.44 9499.29 232.31 9414.6 190.92
> > 4096 1 9339.28 296.48 9374.23 372.88 9348.76 298.49
> > 4096 2 9410.53 378.69 9412.61 286.18 9409.75 278.31
> > 4096 4 9487.35 374.1 9556.91 288.81 9441.94 221.64
> > 16384 1 9380.43 403.8 9379.78 399.13 9382.42 393.55
> > 16384 2 9367.69 406.93 9415.04 312.68 9409.29 300.9
> > 16384 4 9391.96 405.17 9695.12 310.54 9423.76 223.47
>
> Trying to understand the performance results:
>
> What is the host device configuration?  tap + bridge?
>
> Did you use host CPU affinity for the vhost threads?
>
> Can multiqueue tap take advantage of multiqueue host NICs or is
> virtio-net m
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On 01/09/2013 11:32 PM, Michael S. Tsirkin wrote: > On Wed, Jan 09, 2013 at 03:29:24PM +0100, Stefan Hajnoczi wrote: >> On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote: >>> Perf Numbers: >>> >>> Two Intel Xeon 5620 with direct connected intel 82599EB >>> Host/Guest kernel: David net tree >>> vhost enabled >>> >>> - lots of improvents of both latency and cpu utilization in request-reponse >>> test >>> - get regression of guest sending small packets which because TCP tends to >>> batch >>> less when the latency were improved >>> >>> 1q/2q/4q >>> TCP_RR >>> size #sessions trans.rate norm trans.rate norm trans.rate norm >>> 1 1 9393.26 595.64 9408.18 597.34 9375.19 584.12 >>> 1 2072162.1 2214.24 129880.22 2456.13 196949.81 2298.13 >>> 1 50107513.38 2653.99 139721.93 2490.58 259713.82 2873.57 >>> 1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943 >>> 64 19453.42 632.33 9371.37 616.13 9338.19 615.97 >>> 64 20 70620.03 2093.68 125155.75 2409.15 191239.91 2253.32 >>> 64 50 1069662448.29 146518.67 2514.47 242134.07 2720.91 >>> 64 100 117046.35 2394.56 190153.09 2696.82 238881.29 2704.41 >>> 256 1 8733.29 736.36 8701.07 680.83 8608.92 530.1 >>> 256 20 69279.89 2274.45 115103.07 2299.76 144555.16 1963.53 >>> 256 50 97676.02 2296.09 150719.57 2522.92 254510.5 3028.44 >>> 256 100 150221.55 2949.56 197569.3 2790.92 300695.78 3494.83 >>> TCP_CRR >>> size #sessions trans.rate norm trans.rate norm trans.rate norm >>> 1 1 2848.37 163.41 2230.39 130.89 2013.09 120.47 >>> 1 2023434.5 562.11 31057.43 531.07 49488.28 564.41 >>> 1 5028514.88 582.17 40494.23 605.92 60113.35 654.97 >>> 1 100 28827.22 584.73 48813.25 661.6 61783.62 676.56 >>> 64 12780.08 159.4 2201.07 127.96 2006.8 117.63 >>> 64 20 23318.51 564.47 30982.44 530.24 49734.95 566.13 >>> 64 50 28585.72 582.54 40576.7 610.08 60167.89 656.56 >>> 64 100 28747.37 584.17 49081.87 667.87 60612.94 662 >>> 256 1 2772.08 160.51 2231.84 131.05 2003.62 113.45 >>> 256 20 23086.35 559.8 30929.09 528.16 48454.9 555.22 >>> 256 
50 28354.7 579.85 40578.31 60760261.71 657.87 >>> 256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72 >>> TCP_STREAM guest receiving >>> size #sessions throughput norm throughput norm throughput norm >>> 1 1 16.27 1.33 16.11.12 16.13 0.99 >>> 1 2 33.04 2.08 32.96 2.19 32.75 1.98 >>> 1 4 66.62 6.83 68.35.56 66.14 2.65 >>> 64 1896.55 56.67 914.02 58.14 898.9 61.56 >>> 64 21830.46 91.02 1812.02 64.59 1835.57 66.26 >>> 64 43626.61 142.55 3636.25 100.64 3607.46 75.03 >>> 256 1 2619.49 131.23 2543.19 129.03 2618.69 132.39 >>> 256 2 5136.58 203.02 5163.31 141.11 5236.51 149.4 >>> 256 4 7063.99 242.83 9365.4 208.49 9421.03 159.94 >>> 512 1 3592.43 165.24 3603.12 167.19 3552.5 169.57 >>> 512 2 7042.62 246.59 7068.46 180.87 7258.52 186.3 >>> 512 4 6996.08 241.49 9298.34 206.12 9418.52 159.33 >>> 1024 1 4339.54 192.95 4370.2 191.92 4211.72 192.49 >>> 1024 2 7439.45 254.77 9403.99 215.24 9120.82 222.67 >>> 1024 4 7953.86 272.11 9403.87 208.23 9366.98 159.49 >>> 4096 1 7696.28 272.04 7611.41 270.38 7778.71 267.76 >>> 4096 2 7530.35 261.1 8905.43 246.27 8990.18 267.57 >>> 4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67 >>> 16384 1 7795.73 268.54 7780.94 267.2 7634.26 260.73 >>> 16384 2 7436.57 255.81 9381.86 220.85 9392220.36 >>> 16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57 >>> TCP_MAERTS guest sending >>> size #sessions throughput norm throughput norm throughput norm >>> 1 1 15.94 0.62 15.55 0.61 15.13 0.59 >>> 1 2 36.11 0.83 32.46 0.69 32.28 0.69 >>> 1 4 71.59 1 68.91 0.94 61.52 0.77 >>> 64 1630.71 22.52 622.11 22.35 605.09 21.84 >>> 64 21442.36 30.57 1292.15 25.82 1282.67 25.55 >>> 64 43186.79 42.59 2844.96 36.03 2529.69 30.06 >>> 256 1 1760.96 58.07 1738.44 57.43 1695.99 56.19 >>> 256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45 >>> 256 4 9324.63 145.74 8956.49 116.39 6720.17 73.86 >>> 512 1 2678.03 84.1 2630.68 82.93 2636.54 82.57 >>> 512 2 9368.17 195.61 9408.82 204.53 5316.3 92.99 >>> 512 4 9186.34 209.68 9358.72 183.82 9489.29 160.42 >>> 1024 1 3620.71 
109.88 3625.54 109.83 3606.61 112.35
>>> 1024 2 9429258.32 7082.79 120.55 7403.53 134.78
>>> 1024 4 9430.66 290.44 9499.29 232.31 9414.6 190.92
>>> 4096 1 9339.28 296.48 9374.23 372.88 9348.76 298.49
>>> 4096 2 9410.53 378.69 9412.61 286.18 9409.75 278.31
>>> 4096 4 9487.35 374.1 9556.91 288.81 9441.94 221.64
>>> 16384 1 9380.43 403.8 9379.78 399.13 9382.42 393.55
>>> 16384 2 9367.69 406.93 9415.04 312.68 9409.29 300.9
>>> 16384 4 9391.96 405.17 9695.12 310.54 9423.76 223.47
>> Trying to understand the performance results:
>>
>> What is the host device configuration?  tap + bridge?

Yes.

>>
>> Did you use host CPU affinity for the vhost threads?

I use numactl to pin cpu t
Re: [Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
On Fri, Dec 28, 2012 at 06:31:52PM +0800, Jason Wang wrote:
> Perf Numbers:
>
> Two Intel Xeon 5620 with direct connected intel 82599EB
> Host/Guest kernel: David net tree
> vhost enabled
>
> - large improvements in both latency and CPU utilization in the
>   request-response tests
> - a regression when the guest sends small packets, because TCP tends to
>   batch less when latency improves
>
> 1q/2q/4q
> TCP_RR
> size  #sessions  trans.rate norm     trans.rate norm     trans.rate norm
> 1     1          9393.26    595.64   9408.18    597.34   9375.19    584.12
> 1     20         72162.1    2214.24  129880.22  2456.13  196949.81  2298.13
> 1     50         107513.38  2653.99  139721.93  2490.58  259713.82  2873.57
> 1     100        126734.63  2676.54  145553.5   2406.63  265252.68  2943
> 64    1          9453.42    632.33   9371.37    616.13   9338.19    615.97
> 64    20         70620.03   2093.68  125155.75  2409.15  191239.91  2253.32
> 64    50         1069662448.29       146518.67  2514.47  242134.07  2720.91
> 64    100        117046.35  2394.56  190153.09  2696.82  238881.29  2704.41
> 256   1          8733.29    736.36   8701.07    680.83   8608.92    530.1
> 256   20         69279.89   2274.45  115103.07  2299.76  144555.16  1963.53
> 256   50         97676.02   2296.09  150719.57  2522.92  254510.5   3028.44
> 256   100        150221.55  2949.56  197569.3   2790.92  300695.78  3494.83
> TCP_CRR
> size  #sessions  trans.rate norm     trans.rate norm     trans.rate norm
> 1     1          2848.37    163.41   2230.39    130.89   2013.09    120.47
> 1     20         23434.5    562.11   31057.43   531.07   49488.28   564.41
> 1     50         28514.88   582.17   40494.23   605.92   60113.35   654.97
> 1     100        28827.22   584.73   48813.25   661.6    61783.62   676.56
> 64    1          2780.08    159.4    2201.07    127.96   2006.8     117.63
> 64    20         23318.51   564.47   30982.44   530.24   49734.95   566.13
> 64    50         28585.72   582.54   40576.7    610.08   60167.89   656.56
> 64    100        28747.37   584.17   49081.87   667.87   60612.94   662
> 256   1          2772.08    160.51   2231.84    131.05   2003.62    113.45
> 256   20         23086.35   559.8    30929.09   528.16   48454.9    555.22
> 256   50         28354.7    579.85   40578.31   60760261.71         657.87
> 256   100        28844.55   585.67   48541.86   659.08   61941.07   676.72
> TCP_STREAM guest receiving
> size  #sessions  throughput norm     throughput norm     throughput norm
> 1     1          16.27      1.33     16.11.12            16.13      0.99
> 1     2          33.04      2.08     32.96      2.19     32.75      1.98
> 1     4          66.62      6.83     68.35.56            66.14      2.65
> 64    1          896.55     56.67    914.02     58.14    898.9      61.56
> 64    2          1830.46    91.02    1812.02    64.59    1835.57    66.26
> 64    4          3626.61    142.55   3636.25    100.64   3607.46    75.03
> 256   1          2619.49    131.23   2543.19    129.03   2618.69    132.39
> 256   2          5136.58    203.02   5163.31    141.11   5236.51    149.4
> 256   4          7063.99    242.83   9365.4     208.49   9421.03    159.94
> 512   1          3592.43    165.24   3603.12    167.19   3552.5     169.57
> 512   2          7042.62    246.59   7068.46    180.87   7258.52    186.3
> 512   4          6996.08    241.49   9298.34    206.12   9418.52    159.33
> 1024  1          4339.54    192.95   4370.2     191.92   4211.72    192.49
> 1024  2          7439.45    254.77   9403.99    215.24   9120.82    222.67
> 1024  4          7953.86    272.11   9403.87    208.23   9366.98    159.49
> 4096  1          7696.28    272.04   7611.41    270.38   7778.71    267.76
> 4096  2          7530.35    261.1    8905.43    246.27   8990.18    267.57
> 4096  4          7121.6     247.02   9411.75    206.71   9654.96    184.67
> 16384 1          7795.73    268.54   7780.94    267.2    7634.26    260.73
> 16384 2          7436.57    255.81   9381.86    220.85   9392220.36
> 16384 4          7199.07    247.81   9420.96    205.87   9373.69    159.57
> TCP_MAERTS guest sending
> size  #sessions  throughput norm     throughput norm     throughput norm
> 1     1          15.94      0.62     15.55      0.61     15.13      0.59
> 1     2          36.11      0.83     32.46      0.69     32.28      0.69
> 1     4          71.59      1        68.91      0.94     61.52      0.77
> 64    1          630.71     22.52    622.11     22.35    605.09     21.84
> 64    2          1442.36    30.57    1292.15    25.82    1282.67    25.55
> 64    4          3186.79    42.59    2844.96    36.03    2529.69    30.06
> 256   1          1760.96    58.07    1738.44    57.43    1695.99    56.19
> 256   2          4834.23    95.19    3524.85    64.21    3511.94    64.45
> 256   4          9324.63    145.74   8956.49    116.39   6720.17    73.86
> 512   1          2678.03    84.1     2630.68    82.93    2636.54    82.57
> 512   2          9368.17    195.61   9408.82    204.53   5316.3     92.99
> 512   4          9186.34    209.68   9358.72    183.82   9489.29    160.42
> 1024  1          3620.71    109.88   3625.54    109.83   3606.61    112.35
> 1024  2          9429258.32          7082.79    120.55   7403.53    134.78
> 1024  4          9430.66    290.44   9499.29    232.31   9414.6     190.92
> 4096  1          9339.28    296.48   9374.23    372.88   9348.76    298.49
> 4096  2          9410.53    378.69   9412.61    286.18   9409.75    278.31
> 4096  4          9487.35    374.1    9556.91    288.81   9441.94    221.64
> 16384 1          9380.43    403.8    9379.78    399.13   9382.42    393.55
> 16384 2          9367.69    406.93   9415.04    312.68   9409.29    300.9
> 16384 4          9391.96    405.17   9695.12    310.54   9423.76    223.47

Trying to understand the performance results:

What is the host device configuration?  tap + bridge?

Did you use host CPU affinity for the vhost threads?

Can multiqueue tap take advantage of multiqueue host NICs or is
virtio-net multiqueue unaware of the physical NIC multiqueue
capabilities?

The results seem pretty mixed - as a user it's not obvious what to
choose as a good all-round setting.  Any observations on how multiqueue
should be configured?

What is the "norm" statistic?

St
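One way to make the "pretty mixed" picture concrete is to divide each multiqueue rate by the single-queue rate from the same row. The following is a minimal Python sketch and not part of the series: `parse_row`, `scaling` and `SAMPLES` are hypothetical names, the three sample rows are copied from the quoted tables, and the "norm" columns are carried along without interpretation because their meaning is not stated at this point in the thread.

```python
from typing import NamedTuple, Tuple


class Row(NamedTuple):
    """One table row: packet size, session count, then rate/norm for 1q, 2q, 4q."""
    size: int
    sessions: int
    rate_1q: float
    norm_1q: float
    rate_2q: float
    norm_2q: float
    rate_4q: float
    norm_4q: float


def parse_row(line: str) -> Row:
    """Parse 'size sessions rate norm rate norm rate norm'.

    Rows whose columns were fused in the archive do not have 8 fields
    and are rejected instead of being silently misread.
    """
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    values = [float(f) for f in fields[2:]]
    return Row(int(fields[0]), int(fields[1]), *values)


def scaling(row: Row) -> Tuple[float, float]:
    """Return (2q/1q, 4q/1q) for the rate or throughput column."""
    return row.rate_2q / row.rate_1q, row.rate_4q / row.rate_1q


# Sample rows copied verbatim from the quoted benchmark tables.
SAMPLES = {
    "TCP_RR": "1 100 126734.63 2676.54 145553.5 2406.63 265252.68 2943",
    "TCP_STREAM": "4096 4 7121.6 247.02 9411.75 206.71 9654.96 184.67",
    "TCP_MAERTS": "256 2 4834.23 95.19 3524.85 64.21 3511.94 64.45",
}

for test, line in SAMPLES.items():
    row = parse_row(line)
    x2, x4 = scaling(row)
    print(f"{test:11s} size={row.size:<5d} sessions={row.sessions:<3d} "
          f"2q/1q={x2:.2f} 4q/1q={x4:.2f}")
```

A ratio below 1 is a regression relative to one queue. The TCP_MAERTS sample (256-byte packets, 2 sessions) comes out below 1 for both 2q and 4q, which is the small-packet guest-sending regression mentioned in the cover letter, while the TCP_RR and TCP_STREAM samples come out above 1.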
[Qemu-devel] [PATCH 00/12] Multiqueue virtio-net
Hello all:

This series is an update of the last version of multiqueue virtio-net
support.

Linux tap recently gained multiqueue support. This series implements
basic support for multiqueue tap, nic and vhost, then uses it as the
infrastructure to enable multiqueue support for virtio-net.

Both vhost and userspace multiqueue are implemented for virtio-net, but
userspace gets little benefit, since no dataplane-like parallelized
mechanism is implemented.

A user can start a multiqueue virtio-net card by adding a "queues"
parameter to tap:

./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0

Management tools such as libvirt can pass multiple pre-created fds
through:

./qemu -netdev tap,id=hn0,queues=2,fd=X,fd=Y -device virtio-net-pci,netdev=hn0

You can fetch and try the code from:
git://github.com/jasowang/qemu.git

Patch 1 adds a generic method of creating multiqueue taps and implements
the Linux part.
Patch 2 - 4 introduce helpers that can be used to refactor the nic
emulation code to support multiqueue.
Patch 5 introduces multiqueue support for the qemu networking code: each
peer of a NetClientState is abstracted as a queue. Through this, most of
the code can be reused without change.
Patch 6 adds basic multiqueue support for vhost, which lets vhost handle
just a subset of all virtqueues.
Patch 7-8 introduce new virtio helpers needed by multiqueue virtio-net.
Patch 9-12 implement multiqueue support for virtio-net.

Changes from RFC v2:
- rebase the code to the latest qemu
- align the multiqueue virtio-net implementation to the virtio spec
- split the patches into smaller patches
- set_link and hotplug support

Changes from RFC V1:
- rebase to the latest
- fix memory leak in parse_netdev
- fix guest notifiers assignment/de-assignment
- change the command line to:
  qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2

Reference:
v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481

Perf Numbers:

Two Intel Xeon 5620 with direct connected intel 82599EB
Host/Guest kernel: David net tree
vhost enabled

- large improvements in both latency and CPU utilization in the
  request-response tests
- a regression when the guest sends small packets, because TCP tends to
  batch less when latency improves

1q/2q/4q
TCP_RR
size  #sessions  trans.rate norm     trans.rate norm     trans.rate norm
1     1          9393.26    595.64   9408.18    597.34   9375.19    584.12
1     20         72162.1    2214.24  129880.22  2456.13  196949.81  2298.13
1     50         107513.38  2653.99  139721.93  2490.58  259713.82  2873.57
1     100        126734.63  2676.54  145553.5   2406.63  265252.68  2943
64    1          9453.42    632.33   9371.37    616.13   9338.19    615.97
64    20         70620.03   2093.68  125155.75  2409.15  191239.91  2253.32
64    50         1069662448.29       146518.67  2514.47  242134.07  2720.91
64    100        117046.35  2394.56  190153.09  2696.82  238881.29  2704.41
256   1          8733.29    736.36   8701.07    680.83   8608.92    530.1
256   20         69279.89   2274.45  115103.07  2299.76  144555.16  1963.53
256   50         97676.02   2296.09  150719.57  2522.92  254510.5   3028.44
256   100        150221.55  2949.56  197569.3   2790.92  300695.78  3494.83
TCP_CRR
size  #sessions  trans.rate norm     trans.rate norm     trans.rate norm
1     1          2848.37    163.41   2230.39    130.89   2013.09    120.47
1     20         23434.5    562.11   31057.43   531.07   49488.28   564.41
1     50         28514.88   582.17   40494.23   605.92   60113.35   654.97
1     100        28827.22   584.73   48813.25   661.6    61783.62   676.56
64    1          2780.08    159.4    2201.07    127.96   2006.8     117.63
64    20         23318.51   564.47   30982.44   530.24   49734.95   566.13
64    50         28585.72   582.54   40576.7    610.08   60167.89   656.56
64    100        28747.37   584.17   49081.87   667.87   60612.94   662
256   1          2772.08    160.51   2231.84    131.05   2003.62    113.45
256   20         23086.35   559.8    30929.09   528.16   48454.9    555.22
256   50         28354.7    579.85   40578.31   60760261.71         657.87
256   100        28844.55   585.67   48541.86   659.08   61941.07   676.72
TCP_STREAM guest receiving
size  #sessions  throughput norm     throughput norm     throughput norm
1     1          16.27      1.33     16.11.12            16.13      0.99
1     2          33.04      2.08     32.96      2.19     32.75      1.98
1     4          66.62      6.83     68.35.56            66.14      2.65
64    1          896.55     56.67    914.02     58.14    898.9      61.56
64    2          1830.46    91.02    1812.02    64.59    1835.57    66.26
64    4          3626.61    142.55   3636.25    100.64   3607.46    75.03
256   1          2619.49    131.23   2543.19    129.03   2618.69    132.39
256   2          5136.58    203.02   5163.31    141.11   5236.51    149.4
256   4          7063.99    242.83   9365.4     208.49   9421.03    159.94
512   1          3592.43    165.24   3603.12    167.19   3552.5     169.57
512   2          7042.62    246.59   7068.46    180.87   7258.52    186.3
512   4          6996.08    241.49   9298.34    206.12   9418.52    159.33
1024  1          4339.54    192.95   4370.2     191.92   4211.72    192.49
1024  2          7439.45    254.77   9403.99    215.24   9120.82    222.67
1024  4          7953.86    272.11   9403.87    208.23   9366.98    159.49
4096  1          7696.28    272.04   7611.41    270.38   7778.71    267.76
4096  2          7530.35    261.1    8905.43    246.27   8990.18    267.57
4096  4          7121.6     247.02   9411.75    206.71   9654.96    184.67
16384 1          7795.