Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
On 01/29/2013 01:36 PM, Wanlong Gao wrote:
> On 01/28/2013 12:24 PM, Jason Wang wrote:
>> On 01/28/2013 11:27 AM, Wanlong Gao wrote:
>>> On 01/25/2013 06:35 PM, Jason Wang wrote:
>>>> Hello all:
>>>>
>>>> This seires is an update of last version of multiqueue virtio-net support.
>>>>
>>>> This series tries to brings multiqueue support to virtio-net through a
>>>> multiqueue support tap backend and multiple vhost threads.
>>>>
>>>> To support this, multiqueue nic support were added to qemu. This is done by
>>>> introducing an array of NetClientStates in NICState, and make each pair of peers
>>>> to be an queue of the nic. This is done in patch 1-7.
>>>>
>>>> Tap were also converted to be able to create a multiple queue
>>>> backend. Currently, only linux support this by issuing TUNSETIFF N times with
>>>> the same device name to create N queues. Each fd returned by TUNSETIFF were a
>>>> queue supported by kernel. Three new command lines were introduced, "queues"
>>>> were used to tell how many queues will be created by qemu; "fds" were used to
>>>> pass multiple pre-created tap file descriptors to qemu; "vhostfds" were used to
>>>> pass multiple pre-created vhost descriptors to qemu. This is done in patch 8-13.
>>>>
>>>> A method of deleting a queue and queue_index were also introduce for virtio,
>>>> this is done in patch 14-15.
>>>>
>>>> Vhost were also changed to support multiqueue by introducing a start vq index
>>>> which tracks the first virtqueue that will be used by vhost instead of the
>>>> assumption that the vhost always use virtqueue from index 0. This is done in
>>>> patch 16.
>>>>
>>>> The last part is the multiqueue userspace changes, this is done in patch 17-20.
>>>>
>>>> With this changes, user could start a multiqueue virtio-net device through
>>>>
>>>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>>>>
>>>> Management tools such as libvirt can pass multiple pre-created fds/vhostfds
>>>> through
>>>>
>>>> ./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
>>>>
>>>> No git tree this round since github is unavailable in China...
>>> I saw that github had already been opened again. I can use it.
>> Thanks for reminding, I've pushed the new bits to
>> git://github.com/jasowang/qemu.git.
> I got host kernel oops here using your qemu tree and 3.8-rc5 kernel on host,
>
> [31499.754779] BUG: unable to handle kernel NULL pointer dereference at (null)
> [31499.757098] IP: [] _raw_spin_lock_irqsave+0x1f/0x40
> [31499.758304] PGD 0
> [31499.759498] Oops: 0002 [#1] SMP
> [31499.760704] Modules linked in: tcp_lp fuse xt_CHECKSUM lockd ipt_MASQUERADE sunrpc bnep bluetooth rfkill bridge stp llc iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack snd_hda_codec_realtek snd_hda_intel snd_hda_codec vhost_net tun snd_hwdep macvtap snd_seq macvlan coretemp kvm_intel snd_seq_device kvm snd_pcm crc32c_intel r8169 snd_page_alloc snd_timer ghash_clmulni_intel snd mei iTCO_wdt mii microcode iTCO_vendor_support uinput serio_raw wmi i2c_i801 lpc_ich soundcore pcspkr mfd_core i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: ip6t_REJECT]
> [31499.766412] CPU 2
> [31499.766426] Pid: 18742, comm: vhost-18728 Not tainted 3.8.0-rc5 #1 LENOVO QiTianM4300/To be filled by O.E.M.
> [31499.769340] RIP: 0010:[] [] _raw_spin_lock_irqsave+0x1f/0x40
> [31499.770861] RSP: 0018:8801b2f9dd08 EFLAGS: 00010086
> [31499.772380] RAX: 0286 RBX: RCX:
> [31499.773916] RDX: 0100 RSI: 0286 RDI:
> [31499.775394] RBP: 8801b2f9dd08 R08: 880132ed4368 R09:
> [31499.776923] R10: 0001 R11: 0001 R12: 880132ed8590
> [31499.778466] R13: 880232a6c290 R14: 880132ed42b0 R15: 880132ed0078
> [31499.780012] FS: () GS:88023fb0() knlGS:
> [31499.781574] CS: 0010 DS: ES: CR0: 80050033
> [31499.783126] CR2: CR3: 000132d9c000 CR4: 000427e0
> [31499.784696] DR0: DR1: DR2:
> [31499.786267] DR3: DR6: 0ff0 DR7: 0400
> [31499.787822] Process vhost-18728 (pid: 18742, threadinfo 8801b2f9c000, task 880036959740)
> [31499.788821] Stack:
> [31499.790392] 8801b2f9dd38 81082534 0001
> [31499.792029] 880132ed 880232a6c290 8801b2f9dd48 a023fab6
> [31499.793677] 8801b2f9de28 a0242f64 8801b2f9ddb8 8109e0e0
> [31499.795332] Call Trace:
> [31499.796974] [] remove_wait_queue+0x24/0x50
> [31499.798641] [] vhost_poll_stop+0x16/0x20 [vhost_net]
> [31499.800313]
Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
On 01/28/2013 12:24 PM, Jason Wang wrote:
> On 01/28/2013 11:27 AM, Wanlong Gao wrote:
>> On 01/25/2013 06:35 PM, Jason Wang wrote:
>>> Hello all:
>>>
>>> This seires is an update of last version of multiqueue virtio-net support.
>>>
>>> This series tries to brings multiqueue support to virtio-net through a
>>> multiqueue support tap backend and multiple vhost threads.
>>>
>>> To support this, multiqueue nic support were added to qemu. This is done by
>>> introducing an array of NetClientStates in NICState, and make each pair of peers
>>> to be an queue of the nic. This is done in patch 1-7.
>>>
>>> Tap were also converted to be able to create a multiple queue
>>> backend. Currently, only linux support this by issuing TUNSETIFF N times with
>>> the same device name to create N queues. Each fd returned by TUNSETIFF were a
>>> queue supported by kernel. Three new command lines were introduced, "queues"
>>> were used to tell how many queues will be created by qemu; "fds" were used to
>>> pass multiple pre-created tap file descriptors to qemu; "vhostfds" were used to
>>> pass multiple pre-created vhost descriptors to qemu. This is done in patch 8-13.
>>>
>>> A method of deleting a queue and queue_index were also introduce for virtio,
>>> this is done in patch 14-15.
>>>
>>> Vhost were also changed to support multiqueue by introducing a start vq index
>>> which tracks the first virtqueue that will be used by vhost instead of the
>>> assumption that the vhost always use virtqueue from index 0. This is done in
>>> patch 16.
>>>
>>> The last part is the multiqueue userspace changes, this is done in patch 17-20.
>>>
>>> With this changes, user could start a multiqueue virtio-net device through
>>>
>>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>>>
>>> Management tools such as libvirt can pass multiple pre-created fds/vhostfds
>>> through
>>>
>>> ./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
>>>
>>> No git tree this round since github is unavailable in China...
>> I saw that github had already been opened again. I can use it.
>
> Thanks for reminding, I've pushed the new bits to
> git://github.com/jasowang/qemu.git.

I got host kernel oops here using your qemu tree and 3.8-rc5 kernel on host,

[31499.754779] BUG: unable to handle kernel NULL pointer dereference at (null)
[31499.757098] IP: [] _raw_spin_lock_irqsave+0x1f/0x40
[31499.758304] PGD 0
[31499.759498] Oops: 0002 [#1] SMP
[31499.760704] Modules linked in: tcp_lp fuse xt_CHECKSUM lockd ipt_MASQUERADE sunrpc bnep bluetooth rfkill bridge stp llc iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack snd_hda_codec_realtek snd_hda_intel snd_hda_codec vhost_net tun snd_hwdep macvtap snd_seq macvlan coretemp kvm_intel snd_seq_device kvm snd_pcm crc32c_intel r8169 snd_page_alloc snd_timer ghash_clmulni_intel snd mei iTCO_wdt mii microcode iTCO_vendor_support uinput serio_raw wmi i2c_i801 lpc_ich soundcore pcspkr mfd_core i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: ip6t_REJECT]
[31499.766412] CPU 2
[31499.766426] Pid: 18742, comm: vhost-18728 Not tainted 3.8.0-rc5 #1 LENOVO QiTianM4300/To be filled by O.E.M.
[31499.769340] RIP: 0010:[] [] _raw_spin_lock_irqsave+0x1f/0x40
[31499.770861] RSP: 0018:8801b2f9dd08 EFLAGS: 00010086
[31499.772380] RAX: 0286 RBX: RCX:
[31499.773916] RDX: 0100 RSI: 0286 RDI:
[31499.775394] RBP: 8801b2f9dd08 R08: 880132ed4368 R09:
[31499.776923] R10: 0001 R11: 0001 R12: 880132ed8590
[31499.778466] R13: 880232a6c290 R14: 880132ed42b0 R15: 880132ed0078
[31499.780012] FS: () GS:88023fb0() knlGS:
[31499.781574] CS: 0010 DS: ES: CR0: 80050033
[31499.783126] CR2: CR3: 000132d9c000 CR4: 000427e0
[31499.784696] DR0: DR1: DR2:
[31499.786267] DR3: DR6: 0ff0 DR7: 0400
[31499.787822] Process vhost-18728 (pid: 18742, threadinfo 8801b2f9c000, task 880036959740)
[31499.788821] Stack:
[31499.790392] 8801b2f9dd38 81082534 0001
[31499.792029] 880132ed 880232a6c290 8801b2f9dd48 a023fab6
[31499.793677] 8801b2f9de28 a0242f64 8801b2f9ddb8 8109e0e0
[31499.795332] Call Trace:
[31499.796974] [] remove_wait_queue+0x24/0x50
[31499.798641] [] vhost_poll_stop+0x16/0x20 [vhost_net]
[31499.800313] [] handle_tx+0x4c4/0x680 [vhost_net]
[31499.801995] [] ? idle_balance+0x1b0/0x2f0
[31499.803685] [] handle_tx_kick+0x15/0x20 [vhost_net]
[31499.805128] [] vhost_worker+0xed/0x190 [vhost_net]
[31499.806842] [] ? vhost_work_flus
Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
On 01/28/2013 11:27 AM, Wanlong Gao wrote:
> On 01/25/2013 06:35 PM, Jason Wang wrote:
>> Hello all:
>>
>> This seires is an update of last version of multiqueue virtio-net support.
>>
>> This series tries to brings multiqueue support to virtio-net through a
>> multiqueue support tap backend and multiple vhost threads.
>>
>> To support this, multiqueue nic support were added to qemu. This is done by
>> introducing an array of NetClientStates in NICState, and make each pair of peers
>> to be an queue of the nic. This is done in patch 1-7.
>>
>> Tap were also converted to be able to create a multiple queue
>> backend. Currently, only linux support this by issuing TUNSETIFF N times with
>> the same device name to create N queues. Each fd returned by TUNSETIFF were a
>> queue supported by kernel. Three new command lines were introduced, "queues"
>> were used to tell how many queues will be created by qemu; "fds" were used to
>> pass multiple pre-created tap file descriptors to qemu; "vhostfds" were used to
>> pass multiple pre-created vhost descriptors to qemu. This is done in patch 8-13.
>>
>> A method of deleting a queue and queue_index were also introduce for virtio,
>> this is done in patch 14-15.
>>
>> Vhost were also changed to support multiqueue by introducing a start vq index
>> which tracks the first virtqueue that will be used by vhost instead of the
>> assumption that the vhost always use virtqueue from index 0. This is done in
>> patch 16.
>>
>> The last part is the multiqueue userspace changes, this is done in patch 17-20.
>>
>> With this changes, user could start a multiqueue virtio-net device through
>>
>> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>>
>> Management tools such as libvirt can pass multiple pre-created fds/vhostfds
>> through
>>
>> ./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
>>
>> No git tree this round since github is unavailable in China...
> I saw that github had already been opened again. I can use it.

Thanks for reminding, I've pushed the new bits to
git://github.com/jasowang/qemu.git.
>
> Thanks,
> Wanlong Gao
>
Re: [Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
On 01/25/2013 06:35 PM, Jason Wang wrote:
> Hello all:
>
> This seires is an update of last version of multiqueue virtio-net support.
>
> This series tries to brings multiqueue support to virtio-net through a
> multiqueue support tap backend and multiple vhost threads.
>
> To support this, multiqueue nic support were added to qemu. This is done by
> introducing an array of NetClientStates in NICState, and make each pair of peers
> to be an queue of the nic. This is done in patch 1-7.
>
> Tap were also converted to be able to create a multiple queue
> backend. Currently, only linux support this by issuing TUNSETIFF N times with
> the same device name to create N queues. Each fd returned by TUNSETIFF were a
> queue supported by kernel. Three new command lines were introduced, "queues"
> were used to tell how many queues will be created by qemu; "fds" were used to
> pass multiple pre-created tap file descriptors to qemu; "vhostfds" were used to
> pass multiple pre-created vhost descriptors to qemu. This is done in patch 8-13.
>
> A method of deleting a queue and queue_index were also introduce for virtio,
> this is done in patch 14-15.
>
> Vhost were also changed to support multiqueue by introducing a start vq index
> which tracks the first virtqueue that will be used by vhost instead of the
> assumption that the vhost always use virtqueue from index 0. This is done in
> patch 16.
>
> The last part is the multiqueue userspace changes, this is done in patch 17-20.
>
> With this changes, user could start a multiqueue virtio-net device through
>
> ./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0
>
> Management tools such as libvirt can pass multiple pre-created fds/vhostfds
> through
>
> ./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0
>
> No git tree this round since github is unavailable in China...

I saw that github had already been opened again. I can use it.

Thanks,
Wanlong Gao
[Qemu-devel] [PATCH V2 00/20] Multiqueue virtio-net
Hello all:

This series is an update of the last version of multiqueue virtio-net support.

This series tries to bring multiqueue support to virtio-net through a
multiqueue-capable tap backend and multiple vhost threads.

To support this, multiqueue NIC support was added to qemu. This is done by
introducing an array of NetClientStates in NICState and making each pair of
peers a queue of the NIC. This is done in patches 1-7.

Tap was also converted to be able to create a multiqueue backend. Currently,
only Linux supports this, by issuing TUNSETIFF N times with the same device
name to create N queues. Each fd attached this way is one queue supported by
the kernel. Three new command line options were introduced: "queues" tells
qemu how many queues to create; "fds" passes multiple pre-created tap file
descriptors to qemu; "vhostfds" passes multiple pre-created vhost descriptors
to qemu. This is done in patches 8-13.

A method of deleting a queue and a queue_index were also introduced for virtio;
this is done in patches 14-15.

Vhost was also changed to support multiqueue by introducing a start vq index,
which tracks the first virtqueue that will be used by vhost, instead of
assuming that vhost always uses virtqueues starting from index 0. This is done
in patch 16.

The last part is the multiqueue userspace changes; this is done in patches
17-20.

With these changes, a user can start a multiqueue virtio-net device through

./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0

Management tools such as libvirt can pass multiple pre-created fds/vhostfds
through

./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0

No git tree this round since github is unavailable in China...
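To make two ideas from the description above more concrete (the colon-separated fds/vhostfds lists and the vhost start vq index), here is a minimal C sketch. It is not code from this series: parse_fd_list() and vhost_start_vq_index() are hypothetical names chosen for illustration, and the 2*i mapping assumes that each queue pair owns one receive and one transmit virtqueue, in that order, with pairs numbered from 0.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not QEMU's actual parser): split a colon-separated
 * descriptor list such as "3:4:5" into integers.
 * Returns the number of descriptors parsed, or -1 on malformed input or
 * when the list holds more than max entries.
 */
int parse_fd_list(const char *s, int *fds, int max)
{
    int n = 0;

    if (s == NULL || *s == '\0') {
        return -1;
    }
    for (;;) {
        char *end;
        long v;

        errno = 0;
        v = strtol(s, &end, 10);
        if (end == s || errno != 0 || v < 0 || v > INT_MAX) {
            return -1;      /* empty field, overflow, or negative value */
        }
        if (n == max) {
            return -1;      /* more entries than the caller allows */
        }
        fds[n++] = (int)v;
        if (*end == '\0') {
            return n;       /* reached the end of the list */
        }
        if (*end != ':') {
            return -1;      /* unexpected separator */
        }
        s = end + 1;
    }
}

/*
 * Assumption: queue pair i owns virtqueue 2*i (receive) and 2*i + 1
 * (transmit). Under that layout, the first virtqueue handled by the vhost
 * instance that backs pair i is 2*i, rather than always 0.
 */
int vhost_start_vq_index(int queue_pair)
{
    return queue_pair * 2;
}
```

With fds=X:Y the helper would return 2, one descriptor per queue, and the second queue pair would start at virtqueue index 2 under the stated layout assumption.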
Changes from V1:
- silence checkpatch (Blue)
- use fds/vhostfds instead of fd/vhostfd (Stefan)
- use fds="X:Y:Z" instead of fd=X,fd=Y,fd=Z (Anthony)
- split patches (Stefan)
- typos in commit log (Stefan)
- Warn 'queues=' when fds/vhostfds is used (Stefan)
- rename __net_init_tap to net_init_tap_one (Stefan)
- check the consistency of vnet_hdr of multiple tap fds (Stefan)
- disable multiqueue support for bridge-helper (Stefan)
- rename tap_attach()/tap_detach() to tap_enable()/tap_disable() (Stefan)
- fix booting with legacy guest (WanLong)
- don't bump the version when doing migration (Michael)
- simplify the interface between virtio-net and multiqueue vhost_net (Michael)
- rebase the patches to latest
- re-order the patches so that the net part comes first, to simplify reviewing
- simplify the interface between virtio-net and multiqueue vhost_net
- move the guest notifiers setup from vhost to vhost_net
- fix a build issue of hw/mcf_fce.c

Changes from RFC v2:
- rebase the code to latest qemu
- align the multiqueue virtio-net implementation to virtio spec
- split the patches into smaller patches
- set_link and hotplug support

Changes from RFC V1:
- rebase to the latest
- fix memory leak in parse_netdev
- fix guest notifiers assignment/de-assignment
- change the command lines to:
  qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2

Reference:
V1: http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg03558.html
RFC v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
RFC v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481

Perf Numbers:
- norm is short for normalized result
- trans.rate is short for transaction rate

Two Intel Xeon 5620 with directly connected Intel 82599EB
Host/Guest kernel: David net tree
vhost enabled

- lots of improvements in both latency and cpu utilization in the
  request-response test
- a regression when the guest sends small packets, because TCP tends to batch
  less when the latency is improved

1q/2q/4q
TCP_RR
size  #sessions  trans.rate  norm     trans.rate  norm     trans.rate  norm
1     1          9393.26     595.64   9408.18     597.34   9375.19     584.12
1     20         72162.1     2214.24  129880.22   2456.13  196949.81   2298.13
1     50         107513.38   2653.99  139721.93   2490.58  259713.82   2873.57
1     100        126734.63   2676.54  145553.5    2406.63  265252.68   2943
64    1          9453.42     632.33   9371.37     616.13   9338.19     615.97
64    20         70620.03    2093.68  125155.75   2409.15  191239.91   2253.32
64    50         106966      2448.29  146518.67   2514.47  242134.07   2720.91
64    100        117046.35   2394.56  190153.09   2696.82  238881.29   2704.41
256   1          8733.29     736.36   8701.07     680.83   8608.92     530.1
256   20         69279.89    2274.45  115103.07   2299.76  144555.16   1963.53
256   50         97676.02    2296.09  150719.57   2522.92  254510.5    3028.44
256   100        150221.55   2949.56  197569.3    2790.92  300695.78   3494.83

TCP_CRR
size  #sessions  trans.rate  norm     trans.rate  norm     trans.rate  norm
1     1          2848.37     163.41   2230.39     130.89   2013.09     120.47
1     20         23434.5     562.11   31057.43    531.07   49488.28    564.41
1     50         28514.88    582.17   40494.23    605.92   60113.35    654.97
1     100        28827.22    584.73   48813.25    661.6    61783.62    676.56
64    1          2780.08     159.4    2201.07     127.96   2006.8      117.63
64    20         23318.51    564.47   30982.44    530.