Re: [PATCH] KVM: APIC: avoid instruction emulation for EOI writes
On 09/10/2011 11:41 AM, ya su wrote:
> 0x80637b85: testl $0x1000, 0xfffe0300
> 0x80637b8f: jne 0x80637b85
> 0x80637b91: mov %ecx, 0xfffe0300
> 0x80637b97: testl $0x1000, 0xfffe0300
> 0x80637ba1: jne 0x80637b97
>
> I wonder why testl operation will also cause a ICR write, from the asm
> code, there should only issue one IPI, but from trace-cmd, it issued 3
> IPI, is there something wrong?

It's a bug in test insn emulation, coincidentally I wrote a patch to fix it yesterday, not imagining that it actually happens in practice.

> Is it also possible to optimize ICR write emulation, from the result,
> winxp vm will produce a lot of ICR writes

Unfortunately not.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: APIC: avoid instruction emulation for EOI writes
On 2011-09-11 09:11, Avi Kivity wrote:
> On 09/10/2011 11:41 AM, ya su wrote:
>> 0x80637b85: testl $0x1000, 0xfffe0300
>> 0x80637b8f: jne 0x80637b85
>> 0x80637b91: mov %ecx, 0xfffe0300
>> 0x80637b97: testl $0x1000, 0xfffe0300
>> 0x80637ba1: jne 0x80637b97
>>
>> I wonder why testl operation will also cause a ICR write, from the asm
>> code, there should only issue one IPI, but from trace-cmd, it issued 3
>> IPI, is there something wrong?
>
> It's a bug in test insn emulation, coincidentally I wrote a patch to fix
> it yesterday, not imagining that it actually happens in practice.
>
>> Is it also possible to optimize ICR write emulation, from the result,
>> winxp vm will produce a lot of ICR writes
>
> Unfortunately not.

I'm just hoping we'll see hardware-assisted APIC, ideally also IOAPIC virtualization soon.

Jan
[PATCH 1/2] KVM: Clean up unneeded void pointer casts
From: Jan Kiszka <jan.kis...@siemens.com>

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 virt/kvm/assigned-dev.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 4e9eaeb..ea1bf5d 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -143,7 +143,7 @@ static void deassign_host_irq(struct kvm *kvm,

 		for (i = 0; i < assigned_dev->entries_nr; i++)
 			free_irq(assigned_dev->host_msix_entries[i].vector,
-				 (void *)assigned_dev);
+				 assigned_dev);

 		assigned_dev->entries_nr = 0;
 		kfree(assigned_dev->host_msix_entries);
@@ -153,7 +153,7 @@ static void deassign_host_irq(struct kvm *kvm,
 		/* Deal with MSI and INTx */
 		disable_irq(assigned_dev->host_irq);

-		free_irq(assigned_dev->host_irq, (void *)assigned_dev);
+		free_irq(assigned_dev->host_irq, assigned_dev);

 		if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSI)
 			pci_disable_msi(assigned_dev->dev);
@@ -237,7 +237,7 @@ static int assigned_device_enable_host_intx(struct kvm *kvm,
 	 * are going to be long delays in accepting, acking, etc.
 	 */
 	if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
-				 IRQF_ONESHOT, dev->irq_name, (void *)dev))
+				 IRQF_ONESHOT, dev->irq_name, dev))
 		return -EIO;
 	return 0;
 }
@@ -256,7 +256,7 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,

 	dev->host_irq = dev->dev->irq;
 	if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
-				 0, dev->irq_name, (void *)dev)) {
+				 0, dev->irq_name, dev)) {
 		pci_disable_msi(dev->dev);
 		return -EIO;
 	}
@@ -283,7 +283,7 @@ static int assigned_device_enable_host_msix(struct kvm *kvm,
 	for (i = 0; i < dev->entries_nr; i++) {
 		r = request_threaded_irq(dev->host_msix_entries[i].vector,
 					 NULL, kvm_assigned_dev_thread,
-					 0, dev->irq_name, (void *)dev);
+					 0, dev->irq_name, dev);
 		if (r)
 			goto err;
 	}
@@ -291,7 +291,7 @@ static int assigned_device_enable_host_msix(struct kvm *kvm,
 	return 0;
 err:
 	for (i -= 1; i >= 0; i--)
-		free_irq(dev->host_msix_entries[i].vector, (void *)dev);
+		free_irq(dev->host_msix_entries[i].vector, dev);
 	pci_disable_msix(dev->dev);
 	return r;
 }
--
1.7.3.4
[PATCH 2/2] KVM: Avoid needless registrations of IRQ ack notifier for assigned devices
From: Jan Kiszka <jan.kis...@siemens.com>

We only perform work in kvm_assigned_dev_ack_irq if the guest IRQ is of
INTx type. This completely avoids the callback invocation in non-INTx
cases by registering the IRQ ack notifier only for INTx.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---

Part of my old INTx sharing series, but actually not depending on it.

 virt/kvm/assigned-dev.c |   18 --
 1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index ea1bf5d..84ead54 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -86,13 +86,9 @@ static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
 /* Ack the irq line for an assigned device */
 static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
 {
-	struct kvm_assigned_dev_kernel *dev;
-
-	if (kian->gsi == -1)
-		return;
-
-	dev = container_of(kian, struct kvm_assigned_dev_kernel,
-			   ack_notifier);
+	struct kvm_assigned_dev_kernel *dev =
+		container_of(kian, struct kvm_assigned_dev_kernel,
+			     ack_notifier);

 	kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 0);

@@ -110,8 +106,9 @@ static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
 static void deassign_guest_irq(struct kvm *kvm,
 			       struct kvm_assigned_dev_kernel *assigned_dev)
 {
-	kvm_unregister_irq_ack_notifier(kvm, &assigned_dev->ack_notifier);
-	assigned_dev->ack_notifier.gsi = -1;
+	if (assigned_dev->ack_notifier.gsi != -1)
+		kvm_unregister_irq_ack_notifier(kvm,
+						&assigned_dev->ack_notifier);

 	kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
 		    assigned_dev->guest_irq, 0);
@@ -404,7 +401,8 @@ static int assign_guest_irq(struct kvm *kvm,

 	if (!r) {
 		dev->irq_requested_type |= guest_irq_type;
-		kvm_register_irq_ack_notifier(kvm, &dev->ack_notifier);
+		if (dev->ack_notifier.gsi != -1)
+			kvm_register_irq_ack_notifier(kvm, &dev->ack_notifier);
 	} else
 		kvm_free_irq_source_id(kvm, dev->irq_source_id);

--
1.7.3.4
[PATCH] KVM: x86 emulator: disable writeback for TEST
The TEST instruction doesn't write its destination operand. This
could cause problems if an MMIO register was accessed using the TEST
instruction. Recently Windows XP was observed to use TEST against
the APIC ICR; this can cause spurious IPIs.

Signed-off-by: Avi Kivity <a...@redhat.com>
---
 arch/x86/kvm/emulate.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index c636ee7..c37f67e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1670,6 +1670,8 @@ static int em_grp3(struct x86_emulate_ctxt *ctxt)
 	switch (ctxt->modrm_reg) {
 	case 0 ... 1:	/* test */
 		emulate_2op_SrcV(ctxt, "test");
+		/* Disable writeback. */
+		ctxt->dst.type = OP_NONE;
 		break;
 	case 2:	/* not */
 		ctxt->dst.val = ~ctxt->dst.val;
@@ -2513,6 +2515,8 @@ static int em_cmp(struct x86_emulate_ctxt *ctxt)
 static int em_test(struct x86_emulate_ctxt *ctxt)
 {
 	emulate_2op_SrcV(ctxt, "test");
+	/* Disable writeback. */
+	ctxt->dst.type = OP_NONE;
 	return X86EMUL_CONTINUE;
 }

--
1.7.6.1
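To illustrate why a stray writeback matters here, the following is a minimal sketch, not the real emulate.c code. The structures `operand` and `mmio_reg`, the helper names, and the value stored by the MOV are all hypothetical; only `OP_NONE` and the idea of clearing the destination type are borrowed from the patch. It replays the guest sequence reported earlier in the thread (poll with TEST, write with MOV, poll with TEST), assuming each polling loop runs exactly once.

```c
#include <stdint.h>

/* Hypothetical, simplified model; not the real emulate.c structures. */
enum op_type { OP_NONE, OP_MEM };

struct operand {
	enum op_type type;
	uint32_t val;
};

struct mmio_reg {
	uint32_t val;
	unsigned int writes;	/* each write to the ICR would send an IPI */
};

/* Commit the destination operand unless writeback is disabled. */
static void writeback(struct mmio_reg *reg, const struct operand *dst)
{
	if (dst->type == OP_NONE)
		return;
	reg->val = dst->val;
	reg->writes++;
}

/* TEST: compute dst & src for the flags only; returns the zero flag. */
static int emulate_test(struct mmio_reg *reg, uint32_t src,
			int disable_writeback)
{
	struct operand dst = { OP_MEM, reg->val };
	int zf = (dst.val & src) == 0;

	if (disable_writeback)
		dst.type = OP_NONE;	/* what the patch does */
	writeback(reg, &dst);
	return zf;
}

/* MOV: a genuine store, always written back. */
static void emulate_mov(struct mmio_reg *reg, uint32_t src)
{
	struct operand dst = { OP_MEM, src };

	writeback(reg, &dst);
}

/* Poll, write, poll; returns how many times the register was written. */
static unsigned int run_sequence(int disable_writeback)
{
	struct mmio_reg icr = { 0, 0 };

	emulate_test(&icr, 0x1000, disable_writeback);
	emulate_mov(&icr, 0x000000fd);	/* arbitrary example value */
	emulate_test(&icr, 0x1000, disable_writeback);
	return icr.writes;
}
```

In this model `run_sequence(0)` counts three register writes and `run_sequence(1)` counts one, which matches the three IPIs seen in the trace against the single IPI the guest code intends.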
Re: About hotplug multifunction
On Sat, Sep 10, 2011 at 02:43:11AM +0900, Isaku Yamahata wrote:
> pci/pcie hot plug needs clean up for multifunction hotplug in long term.
> Only single function device case works. Multifunction case is broken
> somewhat. Especially the current acpi based hotplug should be replaced by
> the standardized hot plug controller in long term.

We'll need to keep supporting Windows XP, which IIUC only supports hotplug through ACPI. So it looks like we'll need both.

--
MST
Re: About hotplug multifunction
On Fri, Sep 09, 2011 at 03:34:26PM -0300, Marcelo Tosatti wrote:
>> something I noted when reading our acpi code: we currently pass eject
>> request for function 0 only: Name (_ADR, nr##)
>> We either need a device per function there (acpi 1.0), send eject
>> request for them all, or use as function number (newer acpi, not sure
>> which version). Need to see which guests (windows,linux) can handle
>> which form.
>> I'd guess we need to change that to .
>
> No need, only make sure function 0 is there and all other functions
> should be removed automatically by the guest on eject notification.

Hmm, the ACPI spec explicitly says: High word = Device #, Low word = Function #. (e.g., device 3, function 2 is 0x00030002). To refer to all the functions on a device #, use a function number of ).

> ACPI PCI hotplug is based on slots, not on functions. It does not support
> addition/removal of individual functions.

Interesting. Is this just based on general logic, reading of the linux driver or the ACPI spec?

The ACPI spec itself seems pretty vague. All tables list devices, where each device has an _ADR entry, which is built up of PCI device # and function #.

--
MST
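The _ADR layout quoted above (high word is the device number, low word is the function number) can be sketched as follows. The function names are hypothetical and chosen only for this illustration. The wildcard function number for "all functions" is deliberately left out because its value is missing from the quoted text.

```c
#include <stdint.h>

/* Pack a PCI device and function number into an ACPI _ADR value:
 * high word = device #, low word = function #. */
static uint32_t acpi_pci_adr(uint16_t device, uint16_t function)
{
	return ((uint32_t)device << 16) | function;
}

/* Extract the device number (high word) from an _ADR value. */
static uint16_t acpi_adr_device(uint32_t adr)
{
	return (uint16_t)(adr >> 16);
}

/* Extract the function number (low word) from an _ADR value. */
static uint16_t acpi_adr_function(uint32_t adr)
{
	return (uint16_t)(adr & 0xffff);
}
```

With these helpers, `acpi_pci_adr(3, 2)` yields 0x00030002, the example given in the spec text quoted in the message.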
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote: On 9/8/11 10:55 PM, Michael S. Tsirkin m...@redhat.com wrote: On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote: Phase 1: Goal: Enable hardware filtering for all macvlan modes - In macvlan passthru mode the single guest virtio-nic connected will receive traffic that he requested for - In macvlan non-passthru mode all guest virtio-nics sharing the physical nic will see all other guest traffic but the filtering at guest virtio-nic I don't think guests currently filter anything. I was referring to Qemu-kvm virtio-net in virtion_net_receive-receive_filter. I think It only passes pkts that the guest OS is interested. It uses the filter table that I am passing to macvtap in this patch. This happens after userspace thread gets woken up and data is copied there. So relying on filtering at that level is going to be very inefficient on a system with multiple active guests. Further, and for that reason, vhost-net doesn't do filtering at all, relying on the backends to pass it correct packets. Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode and for non-PASSTHRU when there is a single guest connected to a VF. For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely better than putting the VF in promiscuous mode. But to address the concern you mention above, in phase 2 when we have more than one guest sharing the VF, It's probably more interesting for a card without SRIOV support. we will have to add filter lookup in macvlan to filter pkts for each guest. Any chance to enable hardware filters for that? This will need some performance tests too. Will start investigating the netlink interface comments for phase 1 first. Thanks! -Roopa -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On Thu, Sep 08, 2011 at 08:00:53PM -0700, Roopa Prabhu wrote:
> On 9/8/11 12:33 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
>>>> I think the main usecase for passthru mode is to assign a SR-IOV VF
>>>> to a single guest.
>>> Yes and for the passthru usecase this patch should be enough to enable
>>> filtering in hw (eventually like I indicated before I need to fix vlan
>>> filtering too).
>> So with filtering in hw, and in sriov VF case, VFs actually share a
>> filtering table. How will that be partitioned?
> AFAIK, though it might maintain a single filter table space in hw, hw
> does know which filter belongs to which VF. And the OS driver does not
> need to do anything special. The VF driver exposes a VF netdev. And any
> uc/mc addresses registered with a VF netdev are registered with the hw
> by the driver. And hw will filter and send only pkts that the VF has
> expressed interest in. No special filter partitioning in hw is required.
>
> Thanks,
> Roopa

Yes, but what I mean is, if the size of the single filter table is limited, we need to decide how many addresses each guest is allowed. If we let one guest ask for as many as it wants, it can lock others out.

--
MST
Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking
On 09/08/2011 08:29 PM, Jeremy Fitzhardinge wrote:
>> I don't think it's that expensive, especially compared to the
>> double-context-switch and vmexit of the spinner going to sleep. On AMD
>> we do have to take an extra vmexit (on IRET) though.
> Fair enough - so if the vcpu blocks itself, it ends up being rescheduled
> in the NMI handler, which then returns to the lock slowpath. And if it's
> a normal hlt, then you can also take interrupts if they're enabled while
> spinning.

Yes. To be clear, just execute 'hlt' and inherit the interrupt enable flag from the environment.

> And if you get nested NMIs (since you can get multiple spurious kicks, or
> from other NMI sources), then one NMI will get latched and any others
> will get dropped?

While we're in the NMI handler, any further NMIs will be collapsed and queued (so one NMI can be in service and just one other queued behind it). We can detect this condition by checking %rip on stack.

>> Well we could have a specialized sleep/wakeup hypercall pair like Xen,
>> but I'd like to avoid it if at all possible.
> Yeah, that's something that just falls out of the existing event channel
> machinery, so it isn't something that I specifically added. But it does
> mean that you simply end up with a hypercall returning on kick, with no
> real complexities.

It also has to return on interrupt, NMI, INIT etc. "No real complexities" is a meaningless phrase on x86, though it is fertile ground for math puns.

--
error compiling committee.c: too many arguments to function
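The NMI behaviour described in this message (one NMI in service, at most one more queued behind it, any further ones collapsed) can be modelled with a small state machine. This is only a sketch of that description; the struct and function names are hypothetical and nothing here is taken from KVM or Xen code.

```c
#include <stdbool.h>

/* Hypothetical model of NMI latching as described in the message. */
struct nmi_state {
	bool in_service;	/* handler running, NMIs blocked until IRET */
	bool latched;		/* at most one further NMI is remembered */
	unsigned int delivered;	/* number of handler invocations */
};

/* An NMI arrives: deliver it, or latch it if one is already in service. */
static void nmi_raise(struct nmi_state *s)
{
	if (s->in_service) {
		s->latched = true;	/* collapses with any latched NMI */
		return;
	}
	s->in_service = true;
	s->delivered++;
}

/* The handler returns (IRET): a latched NMI is delivered immediately. */
static void nmi_iret(struct nmi_state *s)
{
	s->in_service = false;
	if (s->latched) {
		s->latched = false;
		nmi_raise(s);
	}
}

/* Raise one NMI, then `kicks` more while it is in service, and run all
 * handlers to completion. Returns the number of handler invocations. */
static unsigned int nmi_storm(unsigned int kicks)
{
	struct nmi_state s = { false, false, 0 };
	unsigned int i;

	nmi_raise(&s);			/* first kick enters the handler */
	for (i = 0; i < kicks; i++)
		nmi_raise(&s);		/* spurious kicks while in service */
	while (s.in_service)
		nmi_iret(&s);
	return s.delivered;
}
```

In this model, any number of kicks arriving during the first handler results in exactly one additional handler run, so a kick-based wakeup path has to tolerate both lost and spurious kicks.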
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On 9/11/11 2:44 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
>> AFAIK, though it might maintain a single filter table space in hw, hw
>> does know which filter belongs to which VF. And the OS driver does not
>> need to do anything special. The VF driver exposes a VF netdev. And any
>> uc/mc addresses registered with a VF netdev are registered with the hw
>> by the driver. And hw will filter and send only pkts that the VF has
>> expressed interest in. No special filter partitioning in hw is required.
>>
>> Thanks,
>> Roopa
>
> Yes, but what I mean is, if the size of the single filter table is
> limited, we need to decide how many addresses is each guest allowed. If
> we let one guest ask for as many as it wants, it can lock others out.

Yes true. In these cases, i.e. when the number of unicast addresses being registered is more than it can handle, the VF driver will put the VF in promiscuous mode (or at least it's supposed to; I think all drivers do that).

Thanks,
Roopa
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On 9/11/11 2:38 AM, Michael S. Tsirkin m...@redhat.com wrote: On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote: On 9/8/11 10:55 PM, Michael S. Tsirkin m...@redhat.com wrote: On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote: Phase 1: Goal: Enable hardware filtering for all macvlan modes - In macvlan passthru mode the single guest virtio-nic connected will receive traffic that he requested for - In macvlan non-passthru mode all guest virtio-nics sharing the physical nic will see all other guest traffic but the filtering at guest virtio-nic I don't think guests currently filter anything. I was referring to Qemu-kvm virtio-net in virtion_net_receive-receive_filter. I think It only passes pkts that the guest OS is interested. It uses the filter table that I am passing to macvtap in this patch. This happens after userspace thread gets woken up and data is copied there. So relying on filtering at that level is going to be very inefficient on a system with multiple active guests. Further, and for that reason, vhost-net doesn't do filtering at all, relying on the backends to pass it correct packets. Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode and for non-PASSTHRU when there is a single guest connected to a VF. For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely better than putting the VF in promiscuous mode. But to address the concern you mention above, in phase 2 when we have more than one guest sharing the VF, It's probably more interesting for a card without SRIOV support. If its an SRIOV card I am assuming people likely using PASSTHRU mode. Non-SRIOV cards will use any of the non-PASSTHRU mode. we will have to add filter lookup in macvlan to filter pkts for each guest. Any chance to enable hardware filters for that? NAFAIK. Am not sure how you would do it too. Its still a single device from where the host receives traffic from. 
Thanks, Roopa
Re: About hotplug multifunction
On Sun, Sep 11, 2011 at 12:23:57PM +0300, Michael S. Tsirkin wrote:
> On Fri, Sep 09, 2011 at 03:34:26PM -0300, Marcelo Tosatti wrote:
>>> something I noted when reading our acpi code: we currently pass eject
>>> request for function 0 only: Name (_ADR, nr##)
>>> We either need a device per function there (acpi 1.0), send eject
>>> request for them all, or use as function number (newer acpi, not sure
>>> which version). Need to see which guests (windows,linux) can handle
>>> which form.
>>> I'd guess we need to change that to .
>>
>> No need, only make sure function 0 is there and all other functions
>> should be removed automatically by the guest on eject notification.
>
> Hmm, the ACPI spec explicitly says: High word = Device #, Low word =
> Function #. (e.g., device 3, function 2 is 0x00030002). To refer to all
> the functions on a device #, use a function number of ).

Right, but this is the _ADR of the device instance in ACPI. The communication between QEMU and the ACPI DSL code is all based on slots.

>> ACPI PCI hotplug is based on slots, not on functions. It does not
>> support addition/removal of individual functions.
>
> Interesting. Is this just based on general logic, reading of the linux
> driver or the ACPI spec?

It's based on Seabios ACPI DST implementation and its relationship with the QEMU implementation in acpi_piix4.c.

> The ACPI spec itself seems pretty vague. All tables list devices, where
> each device has an _ADR entry, which is built up of PCI device # and
> function #.

Yes, it is vague. Given the mandate from the PCI spec that a device _must contain_ function 0, usage (including hotplug/unplug) of individual functions other than 0 as separate devices is a no-go.
Re: About hotplug multifunction
On Sun, Sep 11, 2011 at 12:01:49PM -0300, Marcelo Tosatti wrote:
> On Sun, Sep 11, 2011 at 12:23:57PM +0300, Michael S. Tsirkin wrote:
>> On Fri, Sep 09, 2011 at 03:34:26PM -0300, Marcelo Tosatti wrote:
>>>> something I noted when reading our acpi code: we currently pass eject
>>>> request for function 0 only: Name (_ADR, nr##)
>>>> We either need a device per function there (acpi 1.0), send eject
>>>> request for them all, or use as function number (newer acpi, not sure
>>>> which version). Need to see which guests (windows,linux) can handle
>>>> which form.
>>>> I'd guess we need to change that to .
>>>
>>> No need, only make sure function 0 is there and all other functions
>>> should be removed automatically by the guest on eject notification.
>>
>> Hmm, the ACPI spec explicitly says: High word = Device #, Low word =
>> Function #. (e.g., device 3, function 2 is 0x00030002). To refer to all
>> the functions on a device #, use a function number of ).
>
> Right, but this is the _ADR of the device instance in ACPI. The
> communication between QEMU and the ACPI DSL code is all based in slots.

It's easy to extend that if we like though.

>>> ACPI PCI hotplug is based on slots, not on functions. It does not
>>> support addition/removal of individual functions.
>>
>> Interesting. Is this just based on general logic, reading of the linux
>> driver or the ACPI spec?
>
> Its based on Seabios ACPI DST implementation and its relationship with
> the QEMU implementation in acpi_piix4.c.
>
>> The ACPI spec itself seems pretty vague. All tables list devices, where
>> each device has an _ADR entry, which is built up of PCI device # and
>> function #.
>
> Yes, it is vague. Given the mandate from the PCI spec a device _must
> contain_ function 0, usage (including hotplug/unplug) of individual
> functions other than 0 as separate devices is a no-go.

It doesn't seem to be a big issue. We could, for example, keep a stub function 0 around.

--
MST
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On Sun, Sep 11, 2011 at 06:18:02AM -0700, Roopa Prabhu wrote: On 9/11/11 2:38 AM, Michael S. Tsirkin m...@redhat.com wrote: On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote: On 9/8/11 10:55 PM, Michael S. Tsirkin m...@redhat.com wrote: On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote: Phase 1: Goal: Enable hardware filtering for all macvlan modes - In macvlan passthru mode the single guest virtio-nic connected will receive traffic that he requested for - In macvlan non-passthru mode all guest virtio-nics sharing the physical nic will see all other guest traffic but the filtering at guest virtio-nic I don't think guests currently filter anything. I was referring to Qemu-kvm virtio-net in virtion_net_receive-receive_filter. I think It only passes pkts that the guest OS is interested. It uses the filter table that I am passing to macvtap in this patch. This happens after userspace thread gets woken up and data is copied there. So relying on filtering at that level is going to be very inefficient on a system with multiple active guests. Further, and for that reason, vhost-net doesn't do filtering at all, relying on the backends to pass it correct packets. Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode and for non-PASSTHRU when there is a single guest connected to a VF. For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely better than putting the VF in promiscuous mode. But to address the concern you mention above, in phase 2 when we have more than one guest sharing the VF, It's probably more interesting for a card without SRIOV support. If its an SRIOV card I am assuming people likely using PASSTHRU mode. Non-SRIOV cards will use any of the non-PASSTHRU mode. we will have to add filter lookup in macvlan to filter pkts for each guest. Any chance to enable hardware filters for that? NAFAIK. Am not sure how you would do it too. Its still a single device from where the host receives traffic from. 
Thanks, Roopa

VMDQ cards might let you program MAC addresses for individual rings.

--
MST
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
> On 9/11/11 2:44 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
>>> AFAIK, though it might maintain a single filter table space in hw, hw
>>> does know which filter belongs to which VF. And the OS driver does not
>>> need to do anything special. The VF driver exposes a VF netdev. And any
>>> uc/mc addresses registered with a VF netdev are registered with the hw
>>> by the driver. And hw will filter and send only pkts that the VF has
>>> expressed interest in. No special filter partitioning in hw is required.
>>>
>>> Thanks,
>>> Roopa
>>
>> Yes, but what I mean is, if the size of the single filter table is
>> limited, we need to decide how many addresses is each guest allowed. If
>> we let one guest ask for as many as it wants, it can lock others out.
>
> Yes true. In these cases ie when the number of unicast addresses being
> registered is more than it can handle, The VF driver will put the VF in
> promiscuous mode (Or at least its supposed to do. I think all drivers do
> that).
>
> Thanks,
> Roopa

Right, so that works at least but likely performs worse than a hardware filter. So we better allocate it in some fair way, as a minimum. Maybe a way for the admin to control that allocation is useful.

--
MST
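One possible reading of "allocate it in some fair way" is a fixed per-VF quota on the shared filter table, with the promiscuous-mode fallback mentioned earlier once a VF exceeds its share. The sketch below is purely illustrative: the table size, the number of VFs, the equal-split policy and every name in it are assumptions, not anything proposed in the patch series or implemented by a real driver.

```c
#include <stdbool.h>

#define FILTER_TABLE_SIZE 16	/* hypothetical hardware limit */
#define NUM_VFS 4		/* hypothetical number of VFs sharing it */

struct vf {
	unsigned int hw_filters;	/* table entries currently used */
	bool promisc;			/* fell back to promiscuous mode */
};

/* Equal split; an admin-controlled policy could replace this. */
static unsigned int vf_quota(void)
{
	return FILTER_TABLE_SIZE / NUM_VFS;
}

/* Register one unicast address for a VF.
 * Returns true if the address got a hardware filter entry. */
static bool vf_add_unicast(struct vf *vf)
{
	if (vf->promisc)
		return false;
	if (vf->hw_filters >= vf_quota()) {
		vf->promisc = true;	/* over quota: stop using the table */
		return false;
	}
	vf->hw_filters++;
	return true;
}

/* Entries left for the other VFs after one VF asks for `requested`
 * addresses. */
static unsigned int entries_left_for_others(unsigned int requested)
{
	struct vf greedy = { 0, false };
	unsigned int i;

	for (i = 0; i < requested; i++)
		vf_add_unicast(&greedy);
	return FILTER_TABLE_SIZE - greedy.hw_filters;
}
```

Under these assumptions a guest that registers 100 addresses still leaves 12 of the 16 entries for the other VFs; it only degrades its own filtering by going promiscuous instead of locking the others out.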
Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
On 9/11/2011 6:18 AM, Roopa Prabhu wrote:
> On 9/11/11 2:44 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
>>> AFAIK, though it might maintain a single filter table space in hw, hw
>>> does know which filter belongs to which VF. And the OS driver does not
>>> need to do anything special. The VF driver exposes a VF netdev. And any
>>> uc/mc addresses registered with a VF netdev are registered with the hw
>>> by the driver. And hw will filter and send only pkts that the VF has
>>> expressed interest in. No special filter partitioning in hw is required.
>>>
>>> Thanks,
>>> Roopa
>>
>> Yes, but what I mean is, if the size of the single filter table is
>> limited, we need to decide how many addresses is each guest allowed. If
>> we let one guest ask for as many as it wants, it can lock others out.
>
> Yes true. In these cases ie when the number of unicast addresses being
> registered is more than it can handle, The VF driver will put the VF in
> promiscuous mode (Or at least its supposed to do. I think all drivers do
> that).

What does putting VF in promiscuous mode mean? How can the NIC decide which set of mac addresses are passed to the VF? Does it mean VF sees all the packets received by the NIC including packets destined for other VFs/PF?

Thanks
Sridhar