Re: vhost + multiqueue + RSS question.
On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote:
> On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote:
> > Hi Michael,
> >
> > I am playing with the vhost multiqueue capability and have a question
> > about vhost multiqueue and RSS (receive side scaling). My setup has a
> > Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
> > network-related parameters for qemu are:
> >
> >    -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
> >    -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10
> >
> > In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.
> >
> > I am running one tcp stream into the guest using iperf. Since there is
> > only one tcp stream I expect it to be handled by one queue only, but
> > this seems not to be the case. ethtool -S on the host shows that the
> > stream is handled by one queue in the NIC, just as I would expect, but
> > in the guest all 4 virtio-input interrupts are incremented. Am I
> > missing any configuration?
>
> I don't see anything obviously wrong with what you describe.
> Maybe, somehow, the same irqfd got bound to multiple MSI vectors?

It does not look like this is what is happening, judging by the way
interrupts are distributed between queues. They are not distributed
uniformly; often I see one queue get most of the interrupts and the others
get much less, and then it changes.

--
	Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: vhost + multiqueue + RSS question.
On Mon, Nov 17, 2014 at 01:30:06PM +0800, Jason Wang wrote:
> On 11/17/2014 02:56 AM, Michael S. Tsirkin wrote:
> > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote:
> >> Hi Michael,
> >>
> >> I am playing with the vhost multiqueue capability and have a question
> >> about vhost multiqueue and RSS (receive side scaling). My setup has a
> >> Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
> >> network-related parameters for qemu are:
> >>
> >>    -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
> >>    -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10
> >>
> >> In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.
> >>
> >> I am running one tcp stream into the guest using iperf. Since there is
> >> only one tcp stream I expect it to be handled by one queue only, but
> >> this seems not to be the case. ethtool -S on the host shows that the
> >> stream is handled by one queue in the NIC, just as I would expect, but
> >> in the guest all 4 virtio-input interrupts are incremented. Am I
> >> missing any configuration?
> >
> > I don't see anything obviously wrong with what you describe.
> > Maybe, somehow, the same irqfd got bound to multiple MSI vectors?
> > To see, can you try dumping the struct kvm_irqfd that's passed to kvm?
> >
> >> --
> >> 	Gleb.
>
> This sounds like a regression; which kernel/qemu version did you use?

Sorry, I should have mentioned it from the start. The host is Fedora 20
with kernel 3.16.6-200.fc20.x86_64 and qemu-system-x86-1.6.2-9.fc20.x86_64.
The guest is also Fedora 20, but with an older kernel, 3.11.10-301.

--
	Gleb.
Re: nested KVM slower than QEMU with gnumach guest kernel
On 2014-11-16 23:18, Samuel Thibault wrote:
> Hello,
>
> Jan Kiszka, on Wed 12 Nov 2014 00:42:52 +0100, wrote:
>> On 2014-11-11 19:55, Samuel Thibault wrote:
>>> jenkins.debian.net is running inside a KVM VM, and it runs nested
>>> KVM guests for its installation attempts. This goes fine with Linux
>>> kernels, but it is extremely slow with gnumach kernels.
>
>> You can try to catch a trace (ftrace) on the physical host.
>>
>> I suspect the setup forces a lot of instruction emulation, either on L0
>> or L1. And that is slower than QEMU if KVM does not optimize like QEMU
>> does.
>
> Here is a sample of the trace-cmd output dump: the same kind of pattern
> repeats over and over, with EXTERNAL_INTERRUPT happening mostly every
> other microsecond:
>
>  qemu-system-x86-9752 [003] 4106.187755: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>  qemu-system-x86-9752 [003] 4106.187756: kvm_entry: vcpu 0
>  qemu-system-x86-9752 [003] 4106.187757: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>  qemu-system-x86-9752 [003] 4106.187758: kvm_entry: vcpu 0
>  qemu-system-x86-9752 [003] 4106.187759: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>  qemu-system-x86-9752 [003] 4106.187760: kvm_entry: vcpu 0

You may want to turn on more trace events, if not all, to possibly see
what Linux does then. The next level after that is function tracing
(which may require a kernel rebuild or a tracing kernel of the distro).

> The various functions being interrupted are vmx_vcpu_run (0xa02848b1 and
> 0xa0284972), handle_io (0xa027ee62), vmx_get_cpl (0xa027a7de),
> load_vmcs12_host_state (0xa027ea31), native_read_tscp (0x81050a84),
> native_write_msr_safe (0x81050aa6), vmx_decache_cr0_guest_bits
> (0xa027a384), and vmx_handle_external_intr (0xa027a54d).
>
> AIUI, the external interrupt is 0xf6, i.e. Linux's IRQ_WORK_VECTOR. I
> however don't see any of them, neither in L0's /proc/interrupts nor in
> L1's /proc/interrupts...

I suppose this is an SMP host and guest? Does reducing the CPUs to 1
change the picture? If not, it may help to understand cause and effect
more easily.

Jan
Re: vhost + multiqueue + RSS question.
On 11/17/2014 12:54 PM, Venkateswara Rao Nandigam wrote:
> I have a question related to this topic. How do you set the RSS key on
> the Mellanox NIC? I mean from your guest?

I believe it's possible but not implemented currently. The issue is that
the implementation should not be vendor specific. TUN/TAP has its own
automatic flow steering implementation (flow caches).

> If it is being set as part of the host driver, is there a way to set it
> from the guest? I mean my guest will choose an RSS key and will try to
> set it on the physical NIC.

Flow caches can co-operate with RFS/aRFS now, so there is indeed some kind
of co-operation between the host card and the guest, I believe.

> Thanks,
> Venkatesh
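The tun flow caches mentioned above work roughly like this: the transmit path records which queue the guest used for a given flow hash, and the receive path looks the hash up to steer packets back to that queue. The following is a simplified, hypothetical user-space model of that idea — not the actual drivers/net/tun.c code, and the table size is illustrative:

```c
#include <stdint.h>

#define FLOW_TABLE_SIZE 256  /* illustrative size, power of two */

struct flow_entry {
    uint32_t rxhash;     /* flow hash recorded at transmit time */
    unsigned int queue;  /* queue the guest last used for this flow */
    int valid;
};

static struct flow_entry flow_table[FLOW_TABLE_SIZE];

/* Transmit path: remember which queue this flow used. */
static void flow_update(uint32_t rxhash, unsigned int queue)
{
    struct flow_entry *e = &flow_table[rxhash & (FLOW_TABLE_SIZE - 1)];

    e->rxhash = rxhash;
    e->queue = queue;
    e->valid = 1;
}

/* Receive path: steer to the recorded queue, or fall back to a
 * plain hash spread if the flow was never seen on transmit. */
static unsigned int flow_select_queue(uint32_t rxhash, unsigned int numqueues)
{
    struct flow_entry *e = &flow_table[rxhash & (FLOW_TABLE_SIZE - 1)];

    if (e->valid && e->rxhash == rxhash)
        return e->queue % numqueues;
    return rxhash % numqueues;
}
```

The point of this design is that no vendor-specific RSS key ever needs to cross the host/guest boundary: steering follows where the guest itself transmits.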
Re: vhost + multiqueue + RSS question.
On 11/17/2014 02:56 AM, Michael S. Tsirkin wrote:
> On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote:
>> Hi Michael,
>>
>> I am playing with the vhost multiqueue capability and have a question
>> about vhost multiqueue and RSS (receive side scaling). My setup has a
>> Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
>> network-related parameters for qemu are:
>>
>>    -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
>>    -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10
>>
>> In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.
>>
>> I am running one tcp stream into the guest using iperf. Since there is
>> only one tcp stream I expect it to be handled by one queue only, but
>> this seems not to be the case. ethtool -S on the host shows that the
>> stream is handled by one queue in the NIC, just as I would expect, but
>> in the guest all 4 virtio-input interrupts are incremented. Am I
>> missing any configuration?
> I don't see anything obviously wrong with what you describe.
> Maybe, somehow, the same irqfd got bound to multiple MSI vectors?
> To see, can you try dumping the struct kvm_irqfd that's passed to kvm?
>
>> --
>> 	Gleb.

This sounds like a regression; which kernel/qemu version did you use?
RE: vhost + multiqueue + RSS question.
I have a question related to this topic. How do you set the RSS key on the
Mellanox NIC? I mean from your guest?

If it is being set as part of the host driver, is there a way to set it
from the guest? I mean my guest will choose an RSS key and will try to set
it on the physical NIC.

Thanks,
Venkatesh

-----Original Message-----
From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Michael S. Tsirkin
Sent: Monday, November 17, 2014 12:26 AM
To: Gleb Natapov
Cc: kvm@vger.kernel.org; Jason Wang; virtualizat...@lists.linux-foundation.org
Subject: Re: vhost + multiqueue + RSS question.

On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote:
> Hi Michael,
>
> I am playing with the vhost multiqueue capability and have a question
> about vhost multiqueue and RSS (receive side scaling). My setup has a
> Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
> network-related parameters for qemu are:
>
>    -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
>    -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10
>
> In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.
>
> I am running one tcp stream into the guest using iperf. Since there is
> only one tcp stream I expect it to be handled by one queue only, but
> this seems not to be the case. ethtool -S on the host shows that the
> stream is handled by one queue in the NIC, just as I would expect, but
> in the guest all 4 virtio-input interrupts are incremented. Am I
> missing any configuration?

I don't see anything obviously wrong with what you describe.
Maybe, somehow, the same irqfd got bound to multiple MSI vectors?
To see, can you try dumping the struct kvm_irqfd that's passed to kvm?

> --
> 	Gleb.
Re: [PATCH 0/3] KVM: simplification to the memslots code
On 2014/11/14 20:11, Paolo Bonzini wrote:
> Hi Igor and Takuya,
>
> here are a few small patches that simplify __kvm_set_memory_region
> and associated code. Can you please review them?

Ah, already queued. Sorry for being late to respond.

	Takuya

> Thanks,
>
> Paolo
>
> Paolo Bonzini (3):
>   kvm: memslots: track id_to_index changes during the insertion sort
>   kvm: commonize allocation of the new memory slots
>   kvm: simplify update_memslots invocation
>
>  virt/kvm/kvm_main.c | 87 ++---
>  1 file changed, 36 insertions(+), 51 deletions(-)
Re: [RFC][PATCH 2/2] kvm: x86: mmio: fix setting the present bit of mmio spte
On 2014/11/14 18:11, Paolo Bonzini wrote:
> On 14/11/2014 10:31, Tiejun Chen wrote:
>> In the PAE case maxphyaddr may be 52 bits as well, so we also need to
>> disable the mmio page fault. Here we can check MMIO_SPTE_GEN_HIGH_SHIFT
>> directly to determine if we should set the present bit, and bring a
>> little cleanup.
>>
>> Signed-off-by: Tiejun Chen
>> ---
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/mmu.c              | 23 +++++++++++++++++++++++
>>  arch/x86/kvm/x86.c              | 30 ------------------------------
>>  3 files changed, 24 insertions(+), 30 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index dc932d3..667f2b6 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -809,6 +809,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>>  				     struct kvm_memory_slot *slot,
>>  				     gfn_t gfn_offset, unsigned long mask);
>>  void kvm_mmu_zap_all(struct kvm *kvm);
>> +void kvm_set_mmio_spte_mask(void);
>>  void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm);
>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index ac1c4de..8e4be36 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -295,6 +295,29 @@ static bool check_mmio_spte(struct kvm *kvm, u64 spte)
>>  	return likely(kvm_gen == spte_gen);
>>  }
>>
>> +/*
>> + * Set the reserved bits and the present bit of a paging-structure
>> + * entry to generate a page fault with PFER.RSV = 1.
>> + */
>> +void kvm_set_mmio_spte_mask(void)
>> +{
>> +	u64 mask;
>> +	int maxphyaddr = boot_cpu_data.x86_phys_bits;
>> +
>> +	/* Mask the reserved physical address bits. */
>> +	mask = rsvd_bits(maxphyaddr, MMIO_SPTE_GEN_HIGH_SHIFT - 1);
>> +
>> +	/* Magic bits are always reserved for 32bit host. */
>> +	mask |= 0x3ull << 62;
>
> This should be enough to trigger the page fault on PAE systems. The
> problem is specific to non-EPT 64-bit hosts, where the PTEs have no
> reserved bits beyond 51:MAXPHYADDR. On EPT we use WX- permissions to
> trigger an EPT misconfig; on 32-bit systems we have bit 62.

Thanks for your explanation.

>> +	/* Set the present bit to enable mmio page fault. */
>> +	if (maxphyaddr < MMIO_SPTE_GEN_HIGH_SHIFT)
>> +		mask = PT_PRESENT_MASK;
>
> Shouldn't this be "|=" anyway, instead of "="?

Yeah, I just missed this. Thanks a lot, I will fix this in the next
revision.

Thanks
Tiejun
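With the "|=" fix discussed above applied, the mask computation is easy to check in isolation. Below is a stand-alone model of the logic, not kernel code; the constant values (MMIO_SPTE_GEN_HIGH_SHIFT == 52, PT_PRESENT_MASK == bit 0) are assumptions based on this era's mmu.c:

```c
#include <stdint.h>

#define MMIO_SPTE_GEN_HIGH_SHIFT 52        /* assumed value from mmu.c */
#define PT_PRESENT_MASK          (1ULL << 0)

/* Bits s..e (inclusive); 0 when the range is empty (with the guard
 * from the companion patch 1/2). */
static uint64_t rsvd_bits(int s, int e)
{
    if (s > e)
        return 0;
    return ((1ULL << (e - s + 1)) - 1) << s;
}

/* Model of kvm_set_mmio_spte_mask() with the "|=" fix applied. */
static uint64_t mmio_spte_mask(int maxphyaddr)
{
    uint64_t mask;

    /* Reserved physical-address bits above maxphyaddr. */
    mask = rsvd_bits(maxphyaddr, MMIO_SPTE_GEN_HIGH_SHIFT - 1);

    /* Magic bits, always reserved for a 32-bit host. */
    mask |= 0x3ull << 62;

    /* Set the present bit only when reserved bits exist
     * (maxphyaddr < 52); with maxphyaddr == 52 there are no reserved
     * bits left and the mmio page fault must stay disabled. */
    if (maxphyaddr < MMIO_SPTE_GEN_HIGH_SHIFT)
        mask |= PT_PRESENT_MASK;           /* the "|=" discussed above */

    return mask;
}
```

For maxphyaddr == 40 this yields the present bit plus reserved bits 40..51 plus the two magic bits; for maxphyaddr == 52 only the magic bits remain, so a P=1 spte would no longer fault.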
Re: [PATCH 2/3] kvm: commonize allocation of the new memory slots
On 2014/11/14 20:12, Paolo Bonzini wrote:
> The two kmemdup invocations can be unified.  I find that the new
> placement of the comment makes it easier to see what happens.

A lot easier to follow the logic.

Reviewed-by: Takuya Yoshikawa

> Signed-off-by: Paolo Bonzini
> ---
>  virt/kvm/kvm_main.c | 28 +++-
>  1 file changed, 11 insertions(+), 17 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index c8ff99cc0ccb..7bfc842b96d7 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -865,11 +865,12 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  			goto out_free;
>  	}
>
> +	slots = kmemdup(kvm->memslots, sizeof(struct kvm_memslots),
> +			GFP_KERNEL);
> +	if (!slots)
> +		goto out_free;
> +
>  	if ((change == KVM_MR_DELETE) || (change == KVM_MR_MOVE)) {
> -		slots = kmemdup(kvm->memslots, sizeof(struct kvm_memslots),
> -				GFP_KERNEL);
> -		if (!slots)
> -			goto out_free;
>  		slot = id_to_memslot(slots, mem->slot);
>  		slot->flags |= KVM_MEMSLOT_INVALID;
>
> @@ -885,6 +886,12 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  		 *	- kvm_is_visible_gfn (mmu_check_roots)
>  		 */
>  		kvm_arch_flush_shadow_memslot(kvm, slot);
> +
> +		/*
> +		 * We can re-use the old_memslots from above, the only difference
> +		 * from the currently installed memslots is the invalid flag.  This
> +		 * will get overwritten by update_memslots anyway.
> +		 */
>  		slots = old_memslots;
>  	}
>
> @@ -892,19 +899,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  	if (r)
>  		goto out_slots;
>
> -	r = -ENOMEM;
> -	/*
> -	 * We can re-use the old_memslots from above, the only difference
> -	 * from the currently installed memslots is the invalid flag.  This
> -	 * will get overwritten by update_memslots anyway.
> -	 */
> -	if (!slots) {
> -		slots = kmemdup(kvm->memslots, sizeof(struct kvm_memslots),
> -				GFP_KERNEL);
> -		if (!slots)
> -			goto out_free;
> -	}
> -
>  	/* actual memory is freed via old in kvm_free_physmem_slot below */
>  	if (change == KVM_MR_DELETE) {
>  		new.dirty_bitmap = NULL;
Re: [RFC][PATCH 1/2] kvm: x86: mmu: return zero if s > e in rsvd_bits()
On 2014/11/14 18:06, Paolo Bonzini wrote:
> On 14/11/2014 10:31, Tiejun Chen wrote:
>> In some real scenarios 'start' may not be less than 'end', for example
>> when maxphyaddr = 52.
>>
>> Signed-off-by: Tiejun Chen
>> ---
>>  arch/x86/kvm/mmu.h | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>> index bde8ee7..0e98b5e 100644
>> --- a/arch/x86/kvm/mmu.h
>> +++ b/arch/x86/kvm/mmu.h
>> @@ -58,6 +58,8 @@
>>  static inline u64 rsvd_bits(int s, int e)
>>  {
>> +	if (unlikely(s > e))
>> +		return 0;
>>  	return ((1ULL << (e - s + 1)) - 1) << s;
>>  }
>
> s == e + 1 is supported:
>
>     ((1ULL << (e - s + 1)) - 1) << s

Substituting s = e + 1:

      ((1ULL << (e - (e + 1) + 1)) - 1) << s
    = ((1ULL << (e - e - 1 + 1)) - 1) << s
    = ((1ULL << 0) - 1) << s

Am I missing something?

Thanks
Tiejun

>     == ((1ULL << 0) - 1) << s
>     == 0
>
> Is there any case where s is even bigger?
>
> Paolo
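The arithmetic under discussion can be checked in isolation. Here is a stand-alone copy of rsvd_bits with the proposed guard (a hypothetical user-space harness, not the kernel header):

```c
#include <stdint.h>

/* Mask with bits s..e (inclusive) set; 0 when the range is empty.
 * s == e + 1 already yields 0 without the guard, because
 * ((1ULL << 0) - 1) == 0; the guard matters for s > e + 1, where the
 * unguarded expression would shift by a negative count, which is
 * undefined behavior in C. */
static uint64_t rsvd_bits(int s, int e)
{
    if (s > e)
        return 0;
    return ((1ULL << (e - s + 1)) - 1) << s;
}
```

So for maxphyaddr == 52, rsvd_bits(52, 51) is already 0 even without the guard, which is the point of Paolo's question: the guard only changes behavior for s > e + 1.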
Re: nested KVM slower than QEMU with gnumach guest kernel
Hello,

Jan Kiszka, on Wed 12 Nov 2014 00:42:52 +0100, wrote:
> On 2014-11-11 19:55, Samuel Thibault wrote:
>> jenkins.debian.net is running inside a KVM VM, and it runs nested
>> KVM guests for its installation attempts. This goes fine with Linux
>> kernels, but it is extremely slow with gnumach kernels.

> You can try to catch a trace (ftrace) on the physical host.
>
> I suspect the setup forces a lot of instruction emulation, either on L0
> or L1. And that is slower than QEMU if KVM does not optimize like QEMU
> does.

Here is a sample of the trace-cmd output dump: the same kind of pattern
repeats over and over, with EXTERNAL_INTERRUPT happening mostly every
other microsecond:

  qemu-system-x86-9752 [003] 4106.187755: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
  qemu-system-x86-9752 [003] 4106.187756: kvm_entry: vcpu 0
  qemu-system-x86-9752 [003] 4106.187757: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
  qemu-system-x86-9752 [003] 4106.187758: kvm_entry: vcpu 0
  qemu-system-x86-9752 [003] 4106.187759: kvm_exit:  reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
  qemu-system-x86-9752 [003] 4106.187760: kvm_entry: vcpu 0

The various functions being interrupted are vmx_vcpu_run (0xa02848b1 and
0xa0284972), handle_io (0xa027ee62), vmx_get_cpl (0xa027a7de),
load_vmcs12_host_state (0xa027ea31), native_read_tscp (0x81050a84),
native_write_msr_safe (0x81050aa6), vmx_decache_cr0_guest_bits
(0xa027a384), and vmx_handle_external_intr (0xa027a54d).

AIUI, the external interrupt is 0xf6, i.e. Linux's IRQ_WORK_VECTOR. I
however don't see any of them, neither in L0's /proc/interrupts nor in
L1's /proc/interrupts...

Samuel
[PATCH] KVM: x86: Fix lost interrupt on irr_pending race
apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR
is set. If this assumption is broken and apicv is disabled, the injection
of interrupts may be deferred until another interrupt is delivered to the
guest. Ultimately, if no other interrupt should be injected to that vCPU,
the pending interrupt may be lost.

commit 56cc2406d68c ("KVM: nVMX: fix "acknowledge interrupt on exit" when
APICv is in use") changed the behavior of apic_clear_irr so irr_pending is
cleared after setting the APIC_IRR vector. After this commit, if
apic_set_irr and apic_clear_irr run simultaneously, a race may occur,
resulting in an APIC_IRR vector set and irr_pending cleared. In the
following example, assume a single vector is set in IRR prior to calling
apic_clear_irr:

  apic_set_irr                          apic_clear_irr
  ------------------------------------------------------------------
  apic->irr_pending = true;
                                        apic_clear_vector(...);
                                        vec = apic_search_irr(apic);
                                        // => vec == -1
  apic_set_vector(...);
                                        apic->irr_pending = (vec != -1);
                                        // => apic->irr_pending == false

Nonetheless, it appears the race might even occur prior to this commit:

  apic_set_irr                          apic_clear_irr
  ------------------------------------------------------------------
  apic->irr_pending = true;
                                        apic->irr_pending = false;
                                        apic_clear_vector(...);
                                        if (apic_search_irr(apic) != -1)
                                                apic->irr_pending = true;
                                        // => apic->irr_pending == false
  apic_set_vector(...);

Fixing this issue by:
 1. Restoring the previous behavior of apic_clear_irr: clear irr_pending,
    call apic_clear_vector, and then if APIC_IRR is non-zero, set
    irr_pending.
 2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
Signed-off-by: Nadav Amit
---
 arch/x86/kvm/lapic.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6e8ce5a..e0e5642 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -341,8 +341,12 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
 
 static inline void apic_set_irr(int vec, struct kvm_lapic *apic)
 {
-	apic->irr_pending = true;
 	apic_set_vector(vec, apic->regs + APIC_IRR);
+	/*
+	 * irr_pending must be true if any interrupt is pending; set it after
+	 * APIC_IRR to avoid race with apic_clear_irr
+	 */
+	apic->irr_pending = true;
 }
 
 static inline int apic_search_irr(struct kvm_lapic *apic)
@@ -374,13 +378,15 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 
 	vcpu = apic->vcpu;
 
-	apic_clear_vector(vec, apic->regs + APIC_IRR);
-	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) {
 		/* try to update RVI */
+		apic_clear_vector(vec, apic->regs + APIC_IRR);
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
-	else {
-		vec = apic_search_irr(apic);
-		apic->irr_pending = (vec != -1);
+	} else {
+		apic->irr_pending = false;
+		apic_clear_vector(vec, apic->regs + APIC_IRR);
+		if (apic_search_irr(apic) != -1)
+			apic->irr_pending = true;
 	}
 }
-- 
1.9.1
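The interleaving from the commit message can be replayed step by step in a small user-space model. This is a sketch of the race only, not the kernel code; `irr` is a plain 64-bit mask standing in for the APIC_IRR bitmap:

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of the state touched by the race. */
static uint64_t irr;       /* stands in for the APIC_IRR bitmap  */
static bool irr_pending;   /* stands in for apic->irr_pending    */

static int search_irr(void)  /* highest set vector, or -1 if empty */
{
    for (int v = 63; v >= 0; v--)
        if (irr & (1ULL << v))
            return v;
    return -1;
}

/* Replay the buggy interleaving: the clearing side searches IRR
 * before the setting side has set its bit, but writes irr_pending
 * after it. */
static void replay_buggy_interleaving(void)
{
    irr = 1ULL << 10;           /* one vector pending initially    */
    irr_pending = true;

    /* apic_set_irr, step 1 (old code set the flag first) */
    irr_pending = true;

    /* apic_clear_irr runs in between */
    irr &= ~(1ULL << 10);       /* apic_clear_vector(...)          */
    int vec = search_irr();     /* => -1, set_vector not done yet  */

    /* apic_set_irr, step 2 */
    irr |= 1ULL << 20;          /* apic_set_vector(...)            */

    /* apic_clear_irr, final step */
    irr_pending = (vec != -1);  /* => false, although irr != 0     */
}
```

After the replay, `irr` is non-zero while `irr_pending` is false, which is exactly the lost-interrupt state the patch prevents by moving the `irr_pending = true` store in apic_set_irr after apic_set_vector.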
Re: vhost + multiqueue + RSS question.
On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote:
> Hi Michael,
>
> I am playing with the vhost multiqueue capability and have a question
> about vhost multiqueue and RSS (receive side scaling). My setup has a
> Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
> network-related parameters for qemu are:
>
>    -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
>    -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10
>
> In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.
>
> I am running one tcp stream into the guest using iperf. Since there is
> only one tcp stream I expect it to be handled by one queue only, but
> this seems not to be the case. ethtool -S on the host shows that the
> stream is handled by one queue in the NIC, just as I would expect, but
> in the guest all 4 virtio-input interrupts are incremented. Am I
> missing any configuration?

I don't see anything obviously wrong with what you describe.
Maybe, somehow, the same irqfd got bound to multiple MSI vectors?
To see, can you try dumping the struct kvm_irqfd that's passed to kvm?

> --
> 	Gleb.
vhost + multiqueue + RSS question.
Hi Michael,

I am playing with the vhost multiqueue capability and have a question
about vhost multiqueue and RSS (receive side scaling). My setup has a
Mellanox ConnectX-3 NIC which supports multiqueue and RSS. The
network-related parameters for qemu are:

   -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4
   -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10

In the guest I ran "ethtool -L eth0 combined 4" to enable multiqueue.

I am running one tcp stream into the guest using iperf. Since there is
only one tcp stream I expect it to be handled by one queue only, but this
seems not to be the case. ethtool -S on the host shows that the stream is
handled by one queue in the NIC, just as I would expect, but in the guest
all 4 virtio-input interrupts are incremented. Am I missing any
configuration?

--
	Gleb.
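For background on why a single TCP stream should stick to one queue: RSS on NICs like the ConnectX-3 typically computes a Toeplitz hash over the TCP/IP 4-tuple and reduces it to a queue via an indirection table, so a fixed 4-tuple always lands on the same queue. A minimal user-space sketch of that scheme (the key contents and table size here are illustrative, not the Mellanox defaults):

```c
#include <stdint.h>
#include <stddef.h>

/* Toeplitz hash over `len` bytes of input using `key` (the key must
 * be at least len + 4 bytes long). Each input bit XORs a sliding
 * 32-bit window of the key into the result. */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
                              size_t len)
{
    uint32_t hash = 0;
    /* 32-bit window over key bits 0..31 */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | key[3];

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* slide the key window left by one bit */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1;
        }
    }
    return hash;
}

/* Map a flow's hash to an RX queue through an indirection table. */
static unsigned int rss_queue(uint32_t hash, const unsigned int *indir,
                              unsigned int indir_size)
{
    return indir[hash % indir_size];
}
```

For a TCP/IPv4 flow the input would be the 12-byte concatenation of source address, destination address, and the two ports; because the hash is a pure function of that tuple, one iperf stream maps to exactly one hardware queue, which is what ethtool -S shows on the host side.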
Re: Benchmarking for vhost polling patch
On Sun, Nov 16, 2014 at 02:08:49PM +0200, Razya Ladelsky wrote:
> Razya Ladelsky/Haifa/IBM@IBMIL wrote on 29/10/2014 02:38:31 PM:
> > Hi Michael,
> >
> > Following the polling patch thread:
> > http://marc.info/?l=kvm&m=140853271510179&w=2,
> > I changed poll_stop_idle to be counted in microseconds, and carried
> > out experiments using varying sizes of this value.
> >
> > If it makes sense to you, I will continue with the other changes
> > requested for the patch.
>
> Dear Michael,
> I'm still interested in hearing your opinion about these numbers
> http://marc.info/?l=kvm&m=141458631532669&w=2,
> and whether it is worthwhile to continue with the polling patch.
> Thank you,
> Razya

Hi Razya,

On the netperf benchmark, it looks like polling=10 gives a modest but
measurable gain. So from that perspective it might be worth it if it's not
too much code, though we'll need to spend more time checking the macro
effect: we barely moved the needle on the macro benchmark, and that is
suspicious.

Is there a chance you are actually trading latency for throughput? Do you
observe any effect on latency? How about trying some other benchmark, e.g.
NFS?

Also, I am wondering: since the vhost thread is polling in the kernel
anyway, shouldn't we try to poll the host NIC too? That would likely
reduce at least the latency significantly, wouldn't it?

--
MST
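The poll_stop_idle knob under discussion bounds how long the vhost thread busy-polls for new virtqueue work before re-arming guest notifications and sleeping. A hypothetical model of that loop, with time abstracted to an iteration budget so the control flow is easy to follow (names are made up; this is not the actual patch, which counts microseconds):

```c
#include <stdbool.h>

/* Busy-poll the work_pending flag for up to poll_stop_idle
 * iterations. Returns true if work arrived while polling (stay in
 * polling mode); false if the idle budget ran out, meaning the
 * caller should re-enable notifications and block on the eventfd. */
static bool poll_for_work(const int *work_pending,
                          unsigned long poll_stop_idle)
{
    for (unsigned long spins = 0; spins < poll_stop_idle; spins++) {
        if (*work_pending)
            return true;
        /* In the real patch this spot holds a cpu_relax() and an
         * elapsed-time check against the microsecond budget. */
    }
    return false;
}
```

The trade-off Michael raises falls directly out of this structure: a larger budget catches more work without a notification round-trip (lower latency per request, higher throughput for busy queues) at the cost of burning a host CPU while idle — which is why latency needs to be measured alongside throughput.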
Benchmarking for vhost polling patch
Razya Ladelsky/Haifa/IBM@IBMIL wrote on 29/10/2014 02:38:31 PM:

> From: Razya Ladelsky/Haifa/IBM@IBMIL
> To: m...@redhat.com
> Cc: Razya Ladelsky/Haifa/IBM@IBMIL, Alex Glikson/Haifa/IBM@IBMIL,
> Eran Raichstein/Haifa/IBM@IBMIL, Yossi Kuperman1/Haifa/IBM@IBMIL,
> Joel Nider/Haifa/IBM@IBMIL, abel.gor...@gmail.com, kvm@vger.kernel.org
> Date: 29/10/2014 02:38 PM
> Subject: Benchmarking for vhost polling patch
>
> Hi Michael,
>
> Following the polling patch thread:
> http://marc.info/?l=kvm&m=140853271510179&w=2,
> I changed poll_stop_idle to be counted in microseconds, and carried out
> experiments using varying sizes of this value.
>
> If it makes sense to you, I will continue with the other changes
> requested for the patch.
>
> Thank you,
> Razya

Dear Michael,

I'm still interested in hearing your opinion about these numbers
http://marc.info/?l=kvm&m=141458631532669&w=2, and whether it is
worthwhile to continue with the polling patch.

Thank you,
Razya