[PATCH v6 0/4] irqfd support for arm/arm64
This patch series enables irqfd on arm and arm64.

The irqfd framework makes it possible to inject a virtual IRQ into a guest
upon an eventfd trigger. User space uses the KVM_IRQFD VM ioctl to provide
KVM with a kvm_irqfd struct that associates a VM, an eventfd and a virtual
IRQ number (aka. the gsi). When an actor signals the eventfd (typically a
VFIO platform driver), the KVM irqfd subsystem injects the gsi into the VM.

Resamplefd is also supported for level-sensitive interrupts, i.e. the user
can provide another eventfd that is triggered when the completion of the
virtual IRQ (gsi) is detected by the GIC.

The gsi must correspond to a shared peripheral interrupt (SPI), i.e. the GIC
interrupt ID is gsi + 32.

The rationale behind not supporting PPI irqfd injection is that any device
using a PPI would be a private-to-the-CPU device (a timer for instance), so
its state would have to be context-switched along with the VCPU and would
require in-kernel wiring anyhow. It is not a relevant use case for irqfds.

This patch series enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

No IRQ routing table is used, which makes it possible to remove
CONFIG_HAVE_KVM_IRQCHIP.

The ARM virtual interrupt controller, the VGIC, is dynamically instantiated.
User space may attempt to assign an irqfd before the virtual interrupt
controller is ready. For that reason a check is added in the generic irqfd
code to test whether the virtual interrupt controller is ready.

This work was tested with the Calxeda Midway xgmac main interrupt with
qemu-system-arm and the QEMU VFIO platform device. Irqfd was also proven
functional on several vhost-net prototypes.

Available on ssh://git.linaro.org/people/eric.auger/linux.git
branch irqfd_v6_integrated_official_release

v5 -> v6:
- take into account Christoffer's comments:
  - rename the macro and function used to check the state of the virtual
    interrupt controller (kvm_arch_intc_initialized)
  - kvm_arch_intc_initialized is declared in kvm_host.h whatever the
    architecture support
  - squash v5 patch files 3 & 4
  - KVM_CAP_IRQFD support depends on vgic_present
- add Christoffer's Reviewed-by on the last patch file

v4 -> v5:
- add the capability to check whether the vgic is initialized when assigning
  an irqfd. The objective is to avoid injecting an IRQ before the vgic is
  ready: this corresponds to new patch files 2, 3, 4.
- do not specifically handle early virtual IRQ injections in kvm_set_irq.
  In case of injection when the vgic is not yet ready, simply return an
  error. User space now has the means to force vgic init and gets notified
  if the irqfd assignment takes place too early.
- squash [PATCH v4 2/3] KVM: arm: add irqfd support and
  [PATCH v4 3/3] KVM: arm64: add irqfd support
- add Acked-by's in KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP
- some comment rewording in vgic

v3 -> v4:
- rebase on 3.18rc5
- vgic dynamic instantiation brought new challenges: handling of irqfd
  injection when the vgic is not ready
- unset of CONFIG_HAVE_KVM_IRQCHIP in a separate patch
- add arm64 enable
- vgic.c style modifications according to Christoffer's comments

v2 -> v3:
- removal of irq.h from eventfd.c put in a separate patch to increase
  visibility
- properly expose the KVM_CAP_IRQFD capability in arm.c
- remove CONFIG_HAVE_KVM_IRQCHIP, meaningful only if irq_comm.c is used

v1 -> v2:
- rebase on 3.17rc1
- move of the dist unlock in process_maintenance
- removal of the dist lock in __kvm_vgic_sync_hwstate
- rewording of the commit message (add resamplefd reference)
- remove irq.h

Eric Auger (4):
  KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP
  KVM: introduce kvm_arch_intc_initialized
  KVM: arm/arm64: implement kvm_arch_intc_initialized and use it in irqfd
  KVM: arm/arm64: add irqfd support

 Documentation/virtual/kvm/api.txt |  6 +++-
 arch/arm/include/asm/kvm_host.h   |  2 ++
 arch/arm/include/uapi/asm/kvm.h   |  3 ++
 arch/arm/kvm/Kconfig              |  4 +--
 arch/arm/kvm/Makefile             |  2 +-
 arch/arm/kvm/arm.c                | 10 +++
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/include/uapi/asm/kvm.h |  3 ++
 arch/arm64/kvm/Kconfig            |  3 +-
 arch/arm64/kvm/Makefile           |  2 +-
 include/linux/kvm_host.h          | 14 +
 virt/kvm/arm/vgic.c               | 63 ---
 virt/kvm/eventfd.c                |  3 ++
 13 files changed, 107 insertions(+), 10 deletions(-)

--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
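For readers unfamiliar with the user-space side of the irqfd series above, here is a minimal sketch of how a kvm_irqfd request might be filled in and handed to the KVM_IRQFD VM ioctl. The helper names (gsi_to_gic_intid, fill_irqfd, assign_irqfd) are made up for illustration and are not part of the series; only struct kvm_irqfd, KVM_IRQFD and KVM_IRQFD_FLAG_RESAMPLE come from <linux/kvm.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* GIC IDs 0-15 are SGIs and 16-31 are PPIs, so SPIs start at ID 32. */
#define GIC_SPI_BASE 32u

/* GIC interrupt ID an irqfd gsi maps to (gsi + 32, per the cover letter). */
static inline uint32_t gsi_to_gic_intid(uint32_t gsi)
{
    return gsi + GIC_SPI_BASE;
}

/* Hypothetical helper: fill a request; resample_fd < 0 means "no resamplefd". */
static void fill_irqfd(struct kvm_irqfd *req, int event_fd, uint32_t gsi,
                       int resample_fd)
{
    memset(req, 0, sizeof(*req));
    req->fd = (uint32_t)event_fd;
    req->gsi = gsi;
    if (resample_fd >= 0) {
        /* Level-sensitive case: ask KVM to signal resample_fd on completion. */
        req->flags |= KVM_IRQFD_FLAG_RESAMPLE;
        req->resamplefd = (uint32_t)resample_fd;
    }
}

/*
 * Hypothetical wrapper: vm_fd is assumed to be a VM file descriptor obtained
 * from KVM_CREATE_VM. Returns the ioctl result (negative on failure, e.g. if
 * the assignment is attempted before the interrupt controller is ready).
 */
static int assign_irqfd(int vm_fd, int event_fd, uint32_t gsi, int resample_fd)
{
    struct kvm_irqfd req;

    fill_irqfd(&req, event_fd, gsi, resample_fd);
    return ioctl(vm_fd, KVM_IRQFD, &req);
}
```

The eventfd itself would typically come from eventfd(2); the request is split from the ioctl call here only so the field layout can be inspected without a live VM.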
[PATCH v6 0/4] kvmtool: ARM/ARM64: Misc updates
This patchset updates KVMTOOL to use some of the features supported by
Linux-3.16 KVM ARM/ARM64, such as:
1. Target CPU == Host using KVM_ARM_PREFERRED_TARGET vm ioctl
2. Target CPU type Potenza for using KVMTOOL on X-Gene
3. PSCI v0.2 support for AArch32 and AArch64 guests
4. System event exit reason

Changes since v5:
- Use pr_info() and pr_warning() instead of printf() when handling the
  system event exit reason

Changes since v4:
- Avoid using magic '0' target for kvm arm generic target
- Added comment for why we need the Potenza target in KVMTOOL

Changes since v3:
- Add generic targets for aarch32 and aarch64 which are used by KVMTOOL
  when the target type returned by the KVM_ARM_PREFERRED_TARGET vm ioctl
  is not known to KVMTOOL
- Print more info when handling system reset event

Changes since v2:
- Use the target type returned by the KVM_ARM_PREFERRED_TARGET vm ioctl
  for VCPU init such that we don't need to update KVMTOOL for every new
  host hardware
- Simplify DTB generation for PSCI node

Changes since v1:
- Drop the patch to fix compile error for aarch64
- Fall back to the old method of trying all target types if the
  KVM_ARM_PREFERRED_TARGET vm ioctl fails
- Print more info when handling KVM_EXIT_SYSTEM_EVENT

Anup Patel (4):
  kvmtool: ARM: Use KVM_ARM_PREFERRED_TARGET vm ioctl to determine target cpu
  kvmtool: ARM64: Add target type potenza for aarch64
  kvmtool: Handle exit reason KVM_EXIT_SYSTEM_EVENT
  kvmtool: ARM/ARM64: Provide PSCI-0.2 to guest when KVM supports it

 tools/kvm/arm/aarch32/arm-cpu.c                 |  8 +++
 tools/kvm/arm/aarch64/arm-cpu.c                 | 23 -
 tools/kvm/arm/fdt.c                             | 51 +--
 tools/kvm/arm/include/arm-common/kvm-cpu-arch.h |  2 +
 tools/kvm/arm/kvm-cpu.c                         | 61 +++
 tools/kvm/kvm-cpu.c                             | 21
 6 files changed, 149 insertions(+), 17 deletions(-)

--
1.7.9.5
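The target selection order described in the changelog above (ask KVM for the preferred target, and fall back to trying every known target type only if that query fails) could look roughly like the sketch below. This is not the kvmtool code: the callbacks are hypothetical stand-ins for the KVM_ARM_PREFERRED_TARGET and VCPU init ioctls, and the target numbers in the stubs are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the real ioctl calls. */
typedef int (*query_preferred_fn)(unsigned int *target); /* 0 on success */
typedef int (*vcpu_init_fn)(unsigned int target);        /* 0 on success */

/*
 * Pick a VCPU target. If the preferred-target query works, use its answer
 * as-is; otherwise fall back to the old method of trying each known target.
 * Returns 0 and stores the target in *chosen on success, -1 otherwise.
 */
static int pick_target(query_preferred_fn query, vcpu_init_fn init,
                       const unsigned int *known, size_t n_known,
                       unsigned int *chosen)
{
    unsigned int target;
    size_t i;

    if (query(&target) == 0) {
        if (init(target) != 0)
            return -1;
        *chosen = target;
        return 0;
    }

    for (i = 0; i < n_known; i++) {
        if (init(known[i]) == 0) {
            *chosen = known[i];
            return 0;
        }
    }
    return -1;
}

/* Stubs for illustration only. */
static int query_ok(unsigned int *target) { *target = 7; return 0; }
static int query_fails(unsigned int *target) { (void)target; return -1; }
static int init_any(unsigned int target) { (void)target; return 0; }
static int init_only_3(unsigned int target) { return target == 3 ? 0 : -1; }
```

With query_ok the host's answer is used directly, which is what removes the need to update the tool for every new host CPU; with query_fails the loop over the known list takes over.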
Re: [PATCH v6 0/4] live migration dirty bitmap support for ARMv7
Will do that, I'm sure there will be another iteration :).

On 05/15/2014 11:51 AM, Christoffer Dall wrote:
> On Thu, May 15, 2014 at 11:27:27AM -0700, Mario Smarduch wrote:
>> This is v6 patcheset of live mgiration support for ARMv7.
>
> migration
>
> This is an extremely terse cover letter. It would have been nice with a
> few sentences of which existing features this leverages, which support
> was missing, what the preferred approach is, etc. Also, links to a wiki
> page or just a few notes on how you did the testing below with which
> user space tools etc. would also have been great.
>
>>
>> - Tested on two 4-way A15 hardware, QEMU 2-way/4-way SMP guest upto 2GB
>> - Various dirty data rates tested - 2GB/1s ... 2048 pgs/5ms
>> - validated source/destination memory image integrity
>>
>> [...]
Re: [PATCH v6 0/4] live migration dirty bitmap support for ARMv7
On Thu, May 15, 2014 at 11:27:27AM -0700, Mario Smarduch wrote:
> This is v6 patcheset of live mgiration support for ARMv7.

migration

This is an extremely terse cover letter. It would have been nice with a
few sentences of which existing features this leverages, which support
was missing, what the preferred approach is, etc. Also, links to a wiki
page or just a few notes on how you did the testing below with which
user space tools etc. would also have been great.

>
> - Tested on two 4-way A15 hardware, QEMU 2-way/4-way SMP guest upto 2GB
> - Various dirty data rates tested - 2GB/1s ... 2048 pgs/5ms
> - validated source/destination memory image integrity
>
> [...]
[PATCH v6 0/4] live migration dirty bitmap support for ARMv7
This is the v6 patchset of live migration support for ARMv7.

- Tested on two 4-way A15 hardware, QEMU 2-way/4-way SMP guest up to 2GB
- Various dirty data rates tested - 2GB/1s ... 2048 pgs/5ms
- validated source/destination memory image integrity

Changes since v1:
- add unlock of VM mmu_lock to prevent a deadlock
- moved migration active inside mmu_lock acquire for visibility in 2nd stage
  data abort handler
- Added comments

Changes since v2:
- move initial VM write protect to memory region architecture prepare
  function (needed to make dirty logging function generic)
- added stage2_mark_pte_ro() - to mark ptes ro - Marc's comment
- optimized initial VM memory region write protect to do fewer table
  lookups - applied Marc's comment for walking dirty bitmap mask
- added pud_addr_end() for stage2 tables, to make the walk 4-level
- added kvm_flush_remote_tlbs() to use ARM TLB invalidation, made the
  generic one weak, per Marc's comment, for the generic dirty bitmap log
  function
- optimized walking dirty bitmap mask to skip upper tables - Marc's comment
- deleted x86,arm kvm_vm_ioctl_get_dirty_log(), moved to kvm_main.c, tagged
  the function weak - Marc's comment
- changed Data Abort handler pte index handling - Marc's comment

Changes since v3:
- changed pte updates to reset write bit instead of setting default
  value for existing ptes - Steve's comment
- In addition to PUD add 2nd stage >4GB range functions - Steve's
  suggestion
- Restructured initial memory slot write protect function for PGD, PUD, PMD
  table walking - Steve's suggestion
- Renamed variable types to resemble their use - Steve's suggestion
- Added a couple of pte helpers for 2nd stage tables - Steve's suggestion
- Updated unmap_range() that handles 2nd stage tables and identity mappings
  to handle 2nd stage addresses >4GB. Left ARMv8 unchanged.

Changes since v4:
- rebased to 3.15.0-rc1 - 'next' to pick up p*addr_end patches - Gavin's
  comment
- Update PUD address end function to support 4-level page table walk
- Eliminated 5th patch of the series that fixed unmap_range(), since it was
  fixed by Marc's patches.

Changes since v5:
- Created separate entry point for VMID TLB flush with no param -
  Christoffer's comment
- Update documentation for kvm_flush_remote_tlbs() - Christoffer's comment
- Simplified splitting of huge pages - initial WP and 2nd stage DABT handler
  clear the huge page PMD, and use current code to fault in small pages.
  Removed kvm_split_pmd().

Mario Smarduch (4):
  add ARMv7 HYP API to flush VM TLBs without address param
  live migration support for initial write protect of VM
  live migration support for VM dirty log management
  add 2nd stage page fault handling during live migration

 arch/arm/include/asm/kvm_asm.h  |   1 +
 arch/arm/include/asm/kvm_host.h |  11 ++
 arch/arm/include/asm/kvm_mmu.h  |  10 ++
 arch/arm/kvm/arm.c              |   8 +-
 arch/arm/kvm/interrupts.S       |  11 ++
 arch/arm/kvm/mmu.c              | 292 ++-
 arch/x86/kvm/x86.c              |  86
 virt/kvm/kvm_main.c             |  84 ++-
 8 files changed, 409 insertions(+), 94 deletions(-)

--
1.7.9.5
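As a rough illustration of the dirty-logging flow the changelog above refers to (initial write protect, marking pages dirty from the write fault path, then harvesting the bitmap word by word), here is a toy user-space model. It is only a sketch under stated assumptions: struct slot_model and the slot_* functions are invented for this example and do not correspond to functions in the series, and the real code operates on stage-2 page tables rather than a byte array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define SLOT_PAGES    256
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DIRTY_WORDS   (SLOT_PAGES / BITS_PER_WORD)

/* Toy model of a memory slot: one dirty bit and one "writable" flag per page. */
struct slot_model {
    unsigned long dirty[DIRTY_WORDS];
    unsigned char writable[SLOT_PAGES];
};

/* Start logging: clear the bitmap and write-protect every page. */
static void slot_start_logging(struct slot_model *s)
{
    memset(s->dirty, 0, sizeof(s->dirty));
    memset(s->writable, 0, sizeof(s->writable));
}

/* Write fault on a protected page: mark it dirty and make it writable. */
static void slot_write_fault(struct slot_model *s, size_t gfn)
{
    s->dirty[gfn / BITS_PER_WORD] |= 1UL << (gfn % BITS_PER_WORD);
    s->writable[gfn] = 1;
}

/*
 * Harvest the log: copy out and clear each bitmap word, then write-protect
 * only the pages whose bit was set. Words that are zero are skipped, which
 * is the cheap part of walking the mask. Returns the number of dirty pages.
 */
static size_t slot_get_dirty_log(struct slot_model *s, unsigned long *out)
{
    size_t w, b, count = 0;

    for (w = 0; w < DIRTY_WORDS; w++) {
        unsigned long mask = s->dirty[w];

        out[w] = mask;
        s->dirty[w] = 0;
        if (!mask)
            continue;
        for (b = 0; b < BITS_PER_WORD; b++) {
            if (mask & (1UL << b)) {
                s->writable[w * BITS_PER_WORD + b] = 0;
                count++;
            }
        }
    }
    return count;
}
```

Re-protecting only the pages named in the mask is what lets the next write to each of them fault again and be logged for the following migration pass.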
[PATCH v6 0/4] KVM/ARM Architected Timers support
The following series implements support for the architected generic
timers for KVM/ARM.

This patch series can also be pulled from:
git://github.com/virtualopensystems/linux-kvm-arm.git
branch: kvm-arm-v16-vgic-timers

Changes since v5:
- Renamed sync_{to,from} to {flush,sync}_hwstate
- Removed ISB's in world-switch code
- Avoid add/sub on vcpu pointer in world-switch

Changes since v1-v4:
- Get virtual IRQ number from DT
- Simplify access to cntvoff and cntv_cval
- Remove extraneous bit clearing
- Abstract timer arming/disarming to improve code readability
- Context switch CNTKCTL across world-switches
- Add CPU hotplug notifier

---

Marc Zyngier (4):
  ARM: arch_timers: switch to physical timers if HYP mode is available
  ARM: KVM: arch_timers: Add guest timer core support
  ARM: KVM: arch_timers: Add timer world switch
  ARM: KVM: arch_timers: Wire the init code and config option

 arch/arm/include/asm/kvm_arch_timer.h |  85 ++
 arch/arm/include/asm/kvm_asm.h        |   3
 arch/arm/include/asm/kvm_host.h       |   5 +
 arch/arm/kernel/arch_timer.c          |   7 +
 arch/arm/kernel/asm-offsets.c         |   6 +
 arch/arm/kvm/Kconfig                  |   8 +
 arch/arm/kvm/Makefile                 |   1
 arch/arm/kvm/arch_timer.c             | 271 +
 arch/arm/kvm/arm.c                    |  14 ++
 arch/arm/kvm/coproc.c                 |   4
 arch/arm/kvm/interrupts.S             |   2
 arch/arm/kvm/interrupts_head.S        |  90 +++
 arch/arm/kvm/vgic.c                   |   1
 13 files changed, 495 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_arch_timer.h
 create mode 100644 arch/arm/kvm/arch_timer.c
Re: [PATCH v6 0/4] VFIO-based PCI device assignment
Alex Williamson writes:

> v6:
>   Update patch 4/4 so Makefile just uses CONFIG_LINUX and
>   avoids all the noise in configure.
>
> Also available in git here:
>
> git://github.com/awilliam/qemu-vfio.git
> branch: vfio-for-qemu
> tag: vfio-pci-for-qemu-v6

Applied. Thanks.

Regards,

Anthony Liguori

>
> ---
>
> Alex Williamson (4):
>   vfio: Enable vfio-pci and mark supported
>   vfio: vfio-pci device assignment driver
>   Update Linux kernel headers
>   Update kernel header script to include vfio
>
>
>  MAINTAINERS                     |    5
>  hw/Makefile.objs                |    3
>  hw/vfio_pci.c                   | 1864 +++
>  hw/vfio_pci_int.h               |  114 ++
>  linux-headers/linux/vfio.h      |  368
>  scripts/update-linux-headers.sh |    2
>  6 files changed, 2354 insertions(+), 2 deletions(-)
>  create mode 100644 hw/vfio_pci.c
>  create mode 100644 hw/vfio_pci_int.h
>  create mode 100644 linux-headers/linux/vfio.h
[PATCH v6 0/4] VFIO-based PCI device assignment
v6:
  Update patch 4/4 so Makefile just uses CONFIG_LINUX and
  avoids all the noise in configure.

Also available in git here:

git://github.com/awilliam/qemu-vfio.git
branch: vfio-for-qemu
tag: vfio-pci-for-qemu-v6

---

Alex Williamson (4):
  vfio: Enable vfio-pci and mark supported
  vfio: vfio-pci device assignment driver
  Update Linux kernel headers
  Update kernel header script to include vfio

 MAINTAINERS                     |    5
 hw/Makefile.objs                |    3
 hw/vfio_pci.c                   | 1864 +++
 hw/vfio_pci_int.h               |  114 ++
 linux-headers/linux/vfio.h      |  368
 scripts/update-linux-headers.sh |    2
 6 files changed, 2354 insertions(+), 2 deletions(-)
 create mode 100644 hw/vfio_pci.c
 create mode 100644 hw/vfio_pci_int.h
 create mode 100644 linux-headers/linux/vfio.h
[PATCH v6 0/4] The intro of QEMU block I/O throttling
The main goal of the patch is to effectively cap the disk I/O rate
(bytes/s) or request count (I/Os per second) of a single VM. It is only
a draft, so it unavoidably has some drawbacks; if you catch them, please
let me know.

The patch mainly introduces one block I/O throttling algorithm, one
timer and one block queue for each drive that has I/O limits enabled.
When a block request comes in, the throttling algorithm checks whether
its I/O rate or count exceeds the limits; if so, the request is enqueued
on the block queue, and the timer later handles the queued I/O requests.

The available features are as follows:
(1) global bps limit.
    -drive bps=xxx            in bytes/s
(2) only read bps limit
    -drive bps_rd=xxx         in bytes/s
(3) only write bps limit
    -drive bps_wr=xxx         in bytes/s
(4) global iops limit
    -drive iops=xxx           in ios/s
(5) only read iops limit
    -drive iops_rd=xxx        in ios/s
(6) only write iops limit
    -drive iops_wr=xxx        in ios/s
(7) the combination of some limits.
    -drive bps=xxx,iops=xxx

Known Limitations:
(1) #1 can not coexist with #2, #3
(2) #4 can not coexist with #5, #6
(3) When bps/iops limits are set to a small value such as 511 bytes/s,
    the VM will hang. We are considering how to handle this scenario.

Changes since code V5:
  Mainly fix the aio callback issue for block queue.
  Adjust the code based on Ram Pai's comments.

Zhi Yong Wu (4):
  block: add the command line support
  block: add the block queue support
  block: add block timer and block throttling algorithm
  qmp/hmp: add block_set_io_throttle

 v5: add qmp/hmp support.
     Adjust the code based on stefan's comments
     qmp/hmp: add block_set_io_throttle

 v4: fix memory leaking based on ryan's feedback.

 v3: Added the code for extending slice time, and modified the method to
     compute wait time for the timer.

 v2: The code V2 for QEMU disk I/O limits.
     Modified the code mainly based on stefan's comments.

 v1: Submit the code for QEMU disk I/O limits.
     Only a code draft.

 Makefile.objs     |    2 +-
 block.c           |  324 +++--
 block.h           |    6 +-
 block/blk-queue.c |  226 +
 block/blk-queue.h |   63 ++
 block_int.h       |   30 +
 blockdev.c        |   98
 blockdev.h        |    2 +
 hmp-commands.hx   |   15 +++
 qemu-config.c     |   24
 qemu-options.hx   |    1 +
 qerror.c          |    4 +
 qerror.h          |    3 +
 qmp-commands.hx   |   52 +-
 14 files changed, 837 insertions(+), 13 deletions(-)
 create mode 100644 block/blk-queue.c
 create mode 100644 block/blk-queue.h

--
1.7.6
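To make the throttling description above more concrete, here is a small sketch of two pieces: the option conflict rule from the known limitations, and one possible way to decide whether a request exceeds a bps limit and how long the timer should wait. The second part is an assumption made for illustration (a simple per-slice byte budget), not the algorithm from the patches; struct io_limits, limits_valid and throttle_wait_ns are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Limits as given on the -drive command line; 0 means "not set". */
struct io_limits {
    uint64_t bps, bps_rd, bps_wr;    /* bytes/s */
    uint64_t iops, iops_rd, iops_wr; /* ios/s */
};

/* The global limit cannot coexist with the read-only/write-only limits. */
static bool limits_valid(const struct io_limits *l)
{
    if (l->bps && (l->bps_rd || l->bps_wr))
        return false;
    if (l->iops && (l->iops_rd || l->iops_wr))
        return false;
    return true;
}

/*
 * Assumed budget model: within a slice of slice_ns, at most
 * bps_limit * slice_ns / 1s bytes may be issued. If the request fits,
 * return 0 (dispatch now). Otherwise return how long to wait so that the
 * total bytes would be within the rate, measured from the slice start.
 * Note: the multiplications can overflow for very large byte counts;
 * that is ignored here for brevity.
 */
static uint64_t throttle_wait_ns(uint64_t bps_limit, uint64_t slice_ns,
                                 uint64_t elapsed_ns, uint64_t bytes_done,
                                 uint64_t req_bytes)
{
    uint64_t total = bytes_done + req_bytes;
    uint64_t allowed = bps_limit * slice_ns / NSEC_PER_SEC;
    uint64_t finish_ns;

    if (bps_limit == 0 || total <= allowed)
        return 0;

    finish_ns = total * NSEC_PER_SEC / bps_limit;
    return finish_ns > elapsed_ns ? finish_ns - elapsed_ns : 0;
}
```

A request that gets a non-zero wait would be the one placed on the block queue, with the timer armed for that delay.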
[PATCH v6 0/4] Enable SMEP feature support for KVM
This patchset enables a new CPU feature, SMEP (Supervisor Mode Execution
Protection), in KVM. SMEP prevents the kernel from executing code that
resides in application (user-mode) pages. The updated Intel SDM describes
this CPU feature. The document will be published soon.

This patchset is based on Fenghua's SMEP patch series, as referred to by:
https://lkml.org/lkml/2011/5/17/523

Changes since v5:
  Add kvm_supported_word9_x86_features and mask against it before
  masking against host capability

Changes since v4:
  Update patch 1/4 comment
  Change PT_USER_MASK to ACC_USER_MASK

Changes since v3:
  Add SMEP bit in CR4_RESERVED_BITS while removing cr4_reserved_bits;
  Mask CPUID leaf 7 ebx against host capability word9 in do_cpuid_ent;

Changes since v2:
  Add instruction fetch checking when walking the guest page table.

---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/paging_tmpl.h      |  9 -
 arch/x86/kvm/x86.c              | 30 +++---
 3 files changed, 36 insertions(+), 5 deletions(-)
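The two checks mentioned in the SMEP changelog above can be sketched as follows. This is an illustrative model, not the KVM code: smep_fetch_faults and guest_cpuid7_ebx are hypothetical helpers, and page_user is assumed to mean that the translation is user-accessible at every level of the guest page-table walk. The bit positions (CR4 bit 20, CPUID leaf 7 EBX bit 7) are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_SMEP    (1u << 20) /* CR4.SMEP */
#define CPUID7_EBX_SMEP (1u << 7)  /* CPUID.(EAX=7,ECX=0):EBX.SMEP */

/*
 * With CR4.SMEP set, an instruction fetch made in supervisor mode
 * (CPL < 3) from a user-accessible page must fault. Data accesses and
 * user-mode fetches are not affected by SMEP.
 */
static bool smep_fetch_faults(uint32_t cr4, unsigned int cpl, bool is_fetch,
                              bool page_user)
{
    return (cr4 & X86_CR4_SMEP) && cpl < 3 && is_fetch && page_user;
}

/*
 * Leaf 7 EBX reported to the guest: only bits that KVM supports and that
 * the host CPU actually has.
 */
static uint32_t guest_cpuid7_ebx(uint32_t host_ebx, uint32_t kvm_supported)
{
    return host_ebx & kvm_supported;
}
```

Masking against the KVM-supported set first keeps host features that KVM does not virtualize from leaking into the guest's CPUID view.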
Re: [PATCH V6 0/4 net-next] macvtap/vhost TX zero-copy support
On Thu, 2011-05-26 at 13:31 -0700, Shirley Ma wrote:
> On Thu, 2011-05-26 at 23:28 +0300, Michael S. Tsirkin wrote:
> > On Thu, May 26, 2011 at 01:00:20PM -0700, Shirley Ma wrote:
> > > 3. Add sleep in vhost shutting down instead of busy-wait for
> > >    outstanding DMAs.
> >
> > I still think this is not much better. We need to use a
> > completion structure and wait on it instead.
> > If this gets blocked, conceivably a tx watchdog can fire and save us
> > from blocking forever :)
>
> Ok, I can add a completion structure here.

The code here doesn't block forever during shutdown; it will release all
outstanding userspace buffers anyway, see the vhost_zerocopy_signal_used()
shutdown case.

Thanks
Shirley
Re: [PATCH v6 0/4] rbd improvements
On 27.05.2011 01:07, Josh Durgin wrote:
> This patchset moves the complexity of the rbd format into librbd and
> adds truncation support.
>
> Changes since v5:
>  * compare full string, not prefix, with "conf" in 2/4
>  * when truncate fails, just return librbd's error
>
> Changes since v4:
>  * fixed cosmetic issues pointed out by Christian Brunner
>
> Changes since v3:
>  * trivially rebased
>  * updated copyright header
>
> Changes since v2:
>  * return values are checked in rbd_aio_rw_vector
>  * bdrv_truncate added
>
> Josh Durgin (4):
>   rbd: use the higher level librbd instead of just librados
>   rbd: allow configuration of rados from the rbd filename
>   rbd: check return values when scheduling aio
>   rbd: Add bdrv_truncate implementation
>
>  block/rbd.c       | 896 +++--
>  block/rbd_types.h |  71 -
>  configure         |  33 +--
>  3 files changed, 334 insertions(+), 666 deletions(-)
>  delete mode 100644 block/rbd_types.h

Thanks, applied to the block branch.

Kevin
[PATCH v6 0/4] rbd improvements
This patchset moves the complexity of the rbd format into librbd and
adds truncation support.

Changes since v5:
 * compare full string, not prefix, with "conf" in 2/4
 * when truncate fails, just return librbd's error

Changes since v4:
 * fixed cosmetic issues pointed out by Christian Brunner

Changes since v3:
 * trivially rebased
 * updated copyright header

Changes since v2:
 * return values are checked in rbd_aio_rw_vector
 * bdrv_truncate added

Josh Durgin (4):
  rbd: use the higher level librbd instead of just librados
  rbd: allow configuration of rados from the rbd filename
  rbd: check return values when scheduling aio
  rbd: Add bdrv_truncate implementation

 block/rbd.c       | 896 +++--
 block/rbd_types.h |  71 -
 configure         |  33 +--
 3 files changed, 334 insertions(+), 666 deletions(-)
 delete mode 100644 block/rbd_types.h

--
1.7.2.3
Re: [PATCH V6 0/4 net-next] macvtap/vhost TX zero-copy support
On Thu, 2011-05-26 at 23:28 +0300, Michael S. Tsirkin wrote:
> On Thu, May 26, 2011 at 01:00:20PM -0700, Shirley Ma wrote:
> > 3. Add sleep in vhost shutting down instead of busy-wait for
> >    outstanding DMAs.
>
> I still think this is not much better. We need to use a
> completion structure and wait on it instead.
> If this gets blocked, conceivably a tx watchdog can fire and save us
> from blocking forever :)

Ok, I can add a completion structure here.

Thanks
Shirley
Re: [PATCH V6 0/4 net-next] macvtap/vhost TX zero-copy support
On Thu, May 26, 2011 at 01:00:20PM -0700, Shirley Ma wrote:
> 3. Add sleep in vhost shutting down instead of busy-wait for outstanding
>    DMAs.

I still think this is not much better. We need to use a
completion structure and wait on it instead.
If this gets blocked, conceivably a tx watchdog can fire and save us
from blocking forever :)
[PATCH V6 0/4 net-next] macvtap/vhost TX zero-copy support
This patchset adds support for TX zero-copy between guest and host
kernel through vhost. It significantly reduces CPU utilization on the
local host on which the guest is located (it reduced CPU usage by about
50% for the single stream test on the host, while 4K message size BW
increased by about 50%). The patchset is based on the previous submission
and on comments from the community regarding when/how to handle guest
kernel buffers to be released. This is the simplest approach I can think
of after comparing with several other solutions.

This patchset has integrated V3 review comments from the community:

1. Add more comments on how to use device ZEROCOPY flag;
2. Change device ZEROCOPY to available bit 31
3. Fix skb header linear allocation when virtio_net GSO is not enabled

It has integrated V4 review comments from MST and Sridhar:

1. In vhost, using socket poll wake up for outstanding DMAs
2. Add detailed comments for vhost_zerocopy_signal_used call
3. Add sleep in vhost shutting down instead of busy-wait for outstanding
   DMAs.
4. Copy small packets, don't do zero-copy callback in macvtap, mark its
   DMA done in vhost
5. Change zerocopy to bool in macvtap.

It integrates V5 review comments from MST and Michał Mirosław:

1. Prevent userspace apps from holding skb userspace buffers by copying
   userspace buffers to kernel in skb_clone, skb_copy, pskb_copy,
   pskb_expand_head.
2. It also uses the HIGHDMA and SG feature bits to enable ZEROCOPY, which
   removes the dependency on a new feature bit; we can add one later when
   a new feature bit is available.

This patchset includes:

1/4: Add a new sock zero-copy flag, SOCK_ZEROCOPY;

2/4: Add a new struct skb_ubuf_info in skb_shared_info for the userspace
     buffers release callback, invoked when the lower device DMA is done
     for that skb, i.e. when the last reference is gone; or whenever
     skb_clone, skb_copy, pskb_copy, pskb_expand_head get called from
     tcpdump, filtering, these userspace buffers will be copied into
     kernel ... we don't want userspace apps to hold userspace buffers
     too long.

3/4: Add vhost zero-copy callback in vhost when skb last refcnt is gone;
     add vhost_zerocopy_signal_used to notify guest to release TX skb
     buffers.

4/4: Add macvtap zero-copy in lower device when the packet being sent is
     greater than 256 bytes.

The patchset is built against the most recent net-next linux 2.6.39-rc7.
It has passed the netperf/netserver multiple streams stress test, the
tcpdump suspended test and the dynamic SG change test.

Single TCP_STREAM 120 secs test results over ixgbe 10Gb NIC:

Message    BW (Mb/s)  qemu-kvm (NumCPU)  vhost-net (NumCPU)  PerfTop irq/s
4K         7408.57    92.1%              22.6%               1229
4K(Orig)   4913.17    118.1%             84.1%               2086

8K         9129.90    89.3%              23.3%               1141
8K(Orig)   7094.55    115.9%             84.7%               2157

16K        9178.81    89.1%              23.3%               1139
16K(Orig)  8927.1     118.7%             83.4%               2262

64K        9171.43    88.4%              24.9%               1253
64K(Orig)  9085.85    115.9%             82.4%               2229

For message sizes less than or equal to 2K, there is a known KVM guest TX
overrun issue. With this zero-copy patch, the issue becomes more severe:
guest io_exits have tripled compared to before, so the performance is not
good. Once the TX overrun problem has been addressed, I will retest the
small message size performance.

 drivers/net/macvtap.c     | 132 ---
 drivers/vhost/net.c       |  44 +-
 drivers/vhost/vhost.c     |  49 +++
 drivers/vhost/vhost.h     |  13
 include/linux/netdevice.h |  10 +++
 include/linux/skbuff.h    |  26
 include/net/sock.h        |   1 +
 net/core/skbuff.c         |  81 -
 8 files changed, 345 insertions(+), 17 deletions(-)
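The buffer lifetime rule described above (the release callback runs once the last reference to the buffer is dropped, and only packets larger than 256 bytes take the zero-copy path) can be modelled in a few lines. This is a toy sketch, not the skbuff or vhost code: struct ubuf_model and its fields are hypothetical and do not claim to match the layout of struct skb_ubuf_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Packets at or below this size are copied instead (per patch 4/4). */
#define ZEROCOPY_MIN_LEN 256

/* Toy stand-in for a userspace buffer descriptor attached to a packet. */
struct ubuf_model {
    int refcnt;               /* holders of the userspace pages */
    void (*done)(void *arg);  /* release callback, e.g. notify the guest */
    void *arg;
};

/* Zero-copy only when the socket asked for it and the packet is large. */
static bool want_zerocopy(bool sock_zerocopy, size_t len)
{
    return sock_zerocopy && len > ZEROCOPY_MIN_LEN;
}

static void ubuf_get(struct ubuf_model *u)
{
    u->refcnt++;
}

/* Drop one reference; the last one triggers the release callback. */
static void ubuf_put(struct ubuf_model *u)
{
    if (--u->refcnt == 0 && u->done)
        u->done(u->arg);
}

/* Example callback: count completed buffers. */
static void count_completion(void *arg)
{
    ++*(int *)arg;
}
```

In this model a clone would either take a reference with ubuf_get or, as the V5 changes describe, copy the data into kernel memory so that the userspace pages are not held for long.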
Re: [KVM PATCH v6 0/4] irqfd fixes and enhancements
Gregory Haskins wrote:
> (Applies to kvm.git/master:4631e094)
>
> The following is the latest attempt to fix the races in irqfd/eventfd, as
> well as restore DEASSIGN support.  For more details, please read the patch
> headers.
>
> You can also find this applied as a git tree:
>
> git pull
> git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/linux-2.6-hacks.git
> kvm/irqfd
>
> For reviewing convenience, here is a link to the entire virt/kvm/eventfd.c
> file after the patches are applied:
>
> http://git.kernel.org/?p=linux/kernel/git/ghaskins/linux-2.6-hacks.git;a=blob;f=virt/kvm/eventfd.c;h=409d9e160f1f85618a5e3772937b2721a249399a;hb=85cfd57e33dcaea29971513334ca003764653b21
>
> As always, this series has been tested against the kvm-eventfd unit test, and
> appears to be functioning properly.  You can download this test here:
>
> ftp://ftp.novell.com/dev/ghaskins/kvm-eventfd.tar.bz2
>
> I've included version 4 of Davide's eventfd patch (ported to kvm.git) so
> that it's a complete reviewable series.  Note, however, that there may be
> later versions of his patch to consider for merging, so we should
> coordinate with him.
>
> -Greg
>
>
> [Changelog:
>
> v6:
>    *) Removed slow-work in favor of using a dedicated single-thread
>       workqueue.
>    *) Condensed cleanup path to always use deferred shutdown
>    *) Saved about 56 lines over v5, with the following diffstat:
>
>       include/linux/kvm_host.h |    2
>       virt/kvm/eventfd.c       |  248
>       2 files changed, 97 insertions(+), 153 deletions(-)
>

Forgot another change:

   *) Fixed race in ASSIGN for the proper acquisition order of the
      irqfd->eventfd

> v5:
>    Untracked..
> ]
>
> ---
>
> Davide Libenzi (1):
>       eventfd - revised interface and cleanups (4th rev)
>
> Gregory Haskins (3):
>       KVM: add irqfd DEASSIGN feature
>       KVM: Fix races in irqfd using new eventfd_kref_get interface
>       kvm: prepare irqfd for having interrupts disabled during
>            eventfd->release
>
>
>  drivers/lguest/lg.h          |    2
>  drivers/lguest/lguest_user.c |    4
>  fs/aio.c                     |   24
>  fs/eventfd.c                 |  126
>  include/linux/aio.h          |    4
>  include/linux/eventfd.h      |   35
>  include/linux/kvm.h          |    2
>  include/linux/kvm_host.h     |    5
>  virt/kvm/Kconfig             |    1
>  virt/kvm/eventfd.c           |  229
>  10 files changed, 321 insertions(+), 111 deletions(-)
>
[KVM PATCH v6 0/4] irqfd fixes and enhancements
(Applies to kvm.git/master:4631e094)

The following is the latest attempt to fix the races in irqfd/eventfd, as
well as restore DEASSIGN support.  For more details, please read the patch
headers.

You can also find this applied as a git tree:

git pull git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/linux-2.6-hacks.git kvm/irqfd

For reviewing convenience, here is a link to the entire virt/kvm/eventfd.c
file after the patches are applied:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/linux-2.6-hacks.git;a=blob;f=virt/kvm/eventfd.c;h=409d9e160f1f85618a5e3772937b2721a249399a;hb=85cfd57e33dcaea29971513334ca003764653b21

As always, this series has been tested against the kvm-eventfd unit test, and
appears to be functioning properly.  You can download this test here:

ftp://ftp.novell.com/dev/ghaskins/kvm-eventfd.tar.bz2

I've included version 4 of Davide's eventfd patch (ported to kvm.git) so
that it's a complete reviewable series.  Note, however, that there may be
later versions of his patch to consider for merging, so we should
coordinate with him.

-Greg


[Changelog:

v6:
   *) Removed slow-work in favor of using a dedicated single-thread
      workqueue.
   *) Condensed cleanup path to always use deferred shutdown
   *) Saved about 56 lines over v5, with the following diffstat:

      include/linux/kvm_host.h |    2
      virt/kvm/eventfd.c       |  248
      2 files changed, 97 insertions(+), 153 deletions(-)

v5:
   Untracked..
]

---

Davide Libenzi (1):
      eventfd - revised interface and cleanups (4th rev)

Gregory Haskins (3):
      KVM: add irqfd DEASSIGN feature
      KVM: Fix races in irqfd using new eventfd_kref_get interface
      kvm: prepare irqfd for having interrupts disabled during eventfd->release


 drivers/lguest/lg.h          |    2
 drivers/lguest/lguest_user.c |    4
 fs/aio.c                     |   24
 fs/eventfd.c                 |  126
 include/linux/aio.h          |    4
 include/linux/eventfd.h      |   35
 include/linux/kvm.h          |    2
 include/linux/kvm_host.h     |    5
 virt/kvm/Kconfig             |    1
 virt/kvm/eventfd.c           |  229
 10 files changed, 321 insertions(+), 111 deletions(-)
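For readers unfamiliar with the userspace side of an eventfd, the following is a minimal sketch of how a signaller and a consumer interact through one. It uses only the eventfd(2), read(2) and write(2) calls; the helper names signal_eventfd() and drain_eventfd() are hypothetical and are not part of this series. The step that would hand the descriptor to KVM (the KVM_IRQFD ioctl) is deliberately left out, since it needs a VM file descriptor and none of the in-kernel changes above can be exercised from a standalone program.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal the eventfd once: the kernel adds the written value (1) to the
 * descriptor's 64-bit counter. Returns 0 on success, -1 on error.
 */
static int signal_eventfd(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consume pending signals: a successful read returns the accumulated
 * counter and resets it to zero. Returns the count, or -1 if the read
 * failed (for a non-blocking eventfd this includes "nothing pending").
 */
static int64_t drain_eventfd(int efd)
{
    uint64_t count;

    assert(efd >= 0);
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

A descriptor for these helpers could be created with eventfd(0, EFD_NONBLOCK). With irqfd the consumer is the kernel rather than a read() in userspace, so only the signalling half would apply to a process driving a guest interrupt.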