Re: [PATCH] kvm: arm/arm64: fix VM hang at startup time
> Hi,
>
> On Wed, Nov 21, 2018 at 04:56:54PM +0800, peng.h...@zte.com.cn wrote:
>> > On 19/11/2018 09:10, Mark Rutland wrote:
>> >> On Sat, Nov 17, 2018 at 10:58:37AM +0800, peng.h...@zte.com.cn wrote:
>> >>> On 16/11/18 00:23, peng.h...@zte.com.cn wrote:
>> >> Hi,
>> >>
>> >>> When virtual machine starts, hang up.
>> >>
>> >> I take it you mean the *guest* hangs? Because it doesn't get a timer
>> >> interrupt?
>> >>
>> >>> The kernel version of guest is 4.16. Host support vgic_v3.
>> >>
>> >> Your host kernel is something recent, I guess?
>> >>
>> >>> It was mainly due to the incorrect vgic_irq's (intid=27) group value
>> >>> during injection interruption. when kvm_vgic_vcpu_init is called,
>> >>> dist is not initialized at this time. Unable to get vgic V3 or V2
>> >>> correctly, so group is not set.
>> >>
>> >> Mmh, that shouldn't happen with (v)GICv3. Do you use QEMU (which
>> >> version?) or some other userland tool?
>> >
>> > QEMU emulator version 3.0.50.
>> >
>> >>> group is setted to 1 when vgic_mmio_write_group is invoked at some
>> >>> time. when irq->group=0 (intid=27), no ICH_LR_GROUP flag was set and
>> >>> interrupt injection failed.
>> >>>
>> >>> Signed-off-by: Peng Hao
>> >>> ---
>> >>>  virt/kvm/arm/vgic/vgic-v3.c | 2 +-
>> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
>> >>> index 9c0dd23..d101000 100644
>> >>> --- a/virt/kvm/arm/vgic/vgic-v3.c
>> >>> +++ b/virt/kvm/arm/vgic/vgic-v3.c
>> >>> @@ -198,7 +198,7 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
>> >>>  	if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT))
>> >>>  		irq->line_level = false;
>> >>>
>> >>> -	if (irq->group)
>> >>> +	if (model == KVM_DEV_TYPE_ARM_VGIC_V3)
>> >>
>> >> This is not the right fix, not only because it basically reverts the
>> >> GICv3 part of 87322099052 (KVM: arm/arm64: vgic: Signal IRQs using
>> >> their configured group).
>> >> Can you try to work out why kvm_vgic_vcpu_init() is apparently called
>> >> before dist->vgic_model is set, and also what value it has?
>> >> If I understand the code correctly, that shouldn't happen for a GICv3.
>> >
>> > Even if the value of group is correctly assigned in kvm_vgic_vcpu_init,
>> > the group is then written 0 through vgic_mmio_write_group.
>> > If the interrupt comes at this time, the interrupt injection fails.
>>
>> Does that mean that the guest is configuring its interrupts as Group0?
>> That sounds wrong, Linux should configure all its interrupts as
>> non-secure Group1.
>>
>>> no, I think that uefi does this, not linux.
>>> 1. kvm_vgic_vcpu_init
>>> 2. vgic_create
>>> 3. kvm_vgic_dist_init
>>> 4. vgic_mmio_write_group: uefi as guest, write group=0
>>> 5. vgic_mmio_write_group: linux as guest, write group=1
>>
>> Is this the same issue fixed by EDK2 commit:
>>
>>   66127011a544b90e ("ArmPkg/ArmGicDxe ARM: fix encoding for GICv3
>>   interrupt acknowledge")
>>
>> ... where EDK2 would try to use IAR0 rather than IAR1?
>>
>> The commit message notes this led to a boot-time hang.
>
> I managed to trigger an issue with a really old EFI implementation that
> doesn't configure its interrupts as Group1, and yet tries to ACK its
> interrupts using the Group1 accessor. Guess what? It is not going to work.
>
> Commit c7fefb690661f2e38afcb8200bd318ecf38ab961 in the edk2 tree seems
> to be the fix (I only assume it does, I haven't actually checked). A
> recent build, as found in Debian Buster, works perfectly (tested with
> both QEMU v2.12 and tip of tree).
>
> Now, I really don't get what you're saying about Linux not getting
> interrupts. How do you get to booting Linux if EFI is not making any
> forward progress? Are you trying them independently?
>
I start Linux bypassing UEFI, and the printed info is the same.
>> [507107.748908] vgic_mmio_write_group:## intid/27 group=0
>> [507107.752185] vgic_mmio_write_group:## intid/27 group=0
>> [507107.899566] vgic_mmio_write_group:## intid/27 group=1
>> [507107.907370] vgic_mmio_write_group:## intid/27 group=1
>>
>> The command line is like this:
>>
>> /home/qemu-patch/linshi/qemu/aarch64-softmmu/qemu-system-aarch64
>>   -machine virt-3.1,accel=kvm,usb=off,dump-guest-core=off,gic-version=3
>>   -kernel /home/kernelboot/vmlinuz-4.16.0+
>>   -initrd /home/kernelboot/initramfs-4.16.0+.img
>>   -append root=/dev/mapper/cla-root ro crashkernel=auto rd.lvm.lv=cla/root rd.lvm.lv=cla/swap.UTF-8
>>   -drive file=/home/centos74-ph/boot.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0
>>   -device
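To make the failure mode discussed above concrete, here is a tiny model (not the kernel code) of the behaviour in question: when KVM populates a GICv3 list register, the Group bit comes from the per-IRQ group configuration, so an interrupt left in Group0 by early firmware is signalled differently from a Group1 interrupt. The bit position matches the GICv3 ICH_LR_EL2 layout; everything else is simplified for illustration.

```python
# Minimal sketch of ICH_LR population: the Group bit (bit 60 in
# ICH_LR_EL2) is set only when the virtual IRQ is configured as Group1.

ICH_LR_GROUP = 1 << 60  # Group bit position per the GICv3 spec

def populate_lr(intid: int, group: int) -> int:
    """Build a simplified list-register value for a virtual interrupt."""
    val = intid  # vINTID in the low bits (heavily simplified)
    if group:
        val |= ICH_LR_GROUP
    return val
```

With intid 27 (the virtual timer PPI discussed in the thread), a guest that never writes group=1 gets a list register without the Group bit, which is the condition the reporter observed.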
[PATCH v5 6/7] iommu/virtio: Add probe request
When the device offers the probe feature, send a probe request for each
device managed by the IOMMU. Extract RESV_MEM information. When we
encounter an MSI doorbell region, set it up as an IOMMU_RESV_MSI region.
This will tell other subsystems that there is no need to map the MSI
doorbell in the virtio-iommu, because MSIs bypass it.

Reviewed-by: Eric Auger
Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/virtio-iommu.c      | 156 ++++++++++++++++++++++++++++--
 include/uapi/linux/virtio_iommu.h |  38 ++++++++
 2 files changed, 188 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 7540dab9c8dc..0c7a7fa2628d 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -46,6 +46,7 @@ struct viommu_dev {
 	struct iommu_domain_geometry	geometry;
 	u64				pgsize_bitmap;
 	u8				domain_bits;
+	u32				probe_size;
 };
 
 struct viommu_mapping {
@@ -67,8 +68,10 @@ struct viommu_domain {
 };
 
 struct viommu_endpoint {
+	struct device			*dev;
 	struct viommu_dev		*viommu;
 	struct viommu_domain		*vdomain;
+	struct list_head		resv_regions;
 };
 
 struct viommu_request {
@@ -119,6 +122,9 @@ static off_t viommu_get_write_desc_offset(struct viommu_dev *viommu,
 {
 	size_t tail_size = sizeof(struct virtio_iommu_req_tail);
 
+	if (req->type == VIRTIO_IOMMU_T_PROBE)
+		return len - viommu->probe_size - tail_size;
+
 	return len - tail_size;
 }
 
@@ -393,6 +399,110 @@ static int viommu_replay_mappings(struct viommu_domain *vdomain)
 	return ret;
 }
 
+static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
+			       struct virtio_iommu_probe_resv_mem *mem,
+			       size_t len)
+{
+	size_t size;
+	u64 start64, end64;
+	phys_addr_t start, end;
+	struct iommu_resv_region *region = NULL;
+	unsigned long prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+	start = start64 = le64_to_cpu(mem->start);
+	end = end64 = le64_to_cpu(mem->end);
+	size = end64 - start64 + 1;
+
+	/* Catch any overflow, including the unlikely end64 - start64 + 1 = 0 */
+	if (start != start64 || end != end64 || size < end64 - start64)
+		return -EOVERFLOW;
+
+	if (len < sizeof(*mem))
+		return -EINVAL;
+
+	switch (mem->subtype) {
+	default:
+		dev_warn(vdev->dev, "unknown resv mem subtype 0x%x\n",
+			 mem->subtype);
+		/* Fall-through */
+	case VIRTIO_IOMMU_RESV_MEM_T_RESERVED:
+		region = iommu_alloc_resv_region(start, size, 0,
+						 IOMMU_RESV_RESERVED);
+		break;
+	case VIRTIO_IOMMU_RESV_MEM_T_MSI:
+		region = iommu_alloc_resv_region(start, size, prot,
+						 IOMMU_RESV_MSI);
+		break;
+	}
+	if (!region)
+		return -ENOMEM;
+
+	list_add(&region->list, &vdev->resv_regions);
+	return 0;
+}
+
+static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
+{
+	int ret;
+	u16 type, len;
+	size_t cur = 0;
+	size_t probe_len;
+	struct virtio_iommu_req_probe *probe;
+	struct virtio_iommu_probe_property *prop;
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct viommu_endpoint *vdev = fwspec->iommu_priv;
+
+	if (!fwspec->num_ids)
+		return -EINVAL;
+
+	probe_len = sizeof(*probe) + viommu->probe_size +
+		    sizeof(struct virtio_iommu_req_tail);
+	probe = kzalloc(probe_len, GFP_KERNEL);
+	if (!probe)
+		return -ENOMEM;
+
+	probe->head.type = VIRTIO_IOMMU_T_PROBE;
+	/*
+	 * For now, assume that properties of an endpoint that outputs multiple
+	 * IDs are consistent. Only probe the first one.
+	 */
+	probe->endpoint = cpu_to_le32(fwspec->ids[0]);
+
+	ret = viommu_send_req_sync(viommu, probe, probe_len);
+	if (ret)
+		goto out_free;
+
+	prop = (void *)probe->properties;
+	type = le16_to_cpu(prop->type) & VIRTIO_IOMMU_PROBE_T_MASK;
+
+	while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
+	       cur < viommu->probe_size) {
+		len = le16_to_cpu(prop->length) + sizeof(*prop);
+
+		switch (type) {
+		case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
+			ret = viommu_add_resv_mem(vdev, (void *)prop, len);
+			break;
+		default:
+			dev_err(dev, "unknown viommu prop 0x%x\n", type);
+		}
+
+		if (ret)
+			dev_err(dev, "failed to parse viommu prop 0x%x\n",
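The overflow check in viommu_add_resv_mem() is subtle: the reserved region is an inclusive range [start, end], so its size is end - start + 1, which wraps to zero when the region covers the entire 64-bit space. A minimal sketch of the same check in Python, modeling u64 wrap-around explicitly (the function name is illustrative, not from the driver):

```python
# Sketch of the inclusive-range size computation with overflow detection.
# On wrap-around (end - start + 1 == 0), the size would compare smaller
# than end - start, which is the condition the driver tests.

U64_MAX = (1 << 64) - 1

def resv_region_size(start: int, end: int):
    """Return the region size, or None on overflow (mimics -EOVERFLOW)."""
    size = (end - start + 1) & U64_MAX      # u64 arithmetic wraps
    if size < ((end - start) & U64_MAX):    # wrapped: size became 0
        return None
    return size
```

For example, a 1 MiB MSI doorbell window [0x08000000, 0x080fffff] yields 0x100000, while [0, 2^64 - 1] is rejected.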
[PATCH v5 7/7] iommu/virtio: Add event queue
The event queue offers a way for the device to report access faults from
endpoints. It is implemented on virtqueue #1. Whenever the host needs to
signal a fault, it fills one of the buffers offered by the guest and
interrupts it.

Reviewed-by: Eric Auger
Signed-off-by: Jean-Philippe Brucker
---
 drivers/iommu/virtio-iommu.c      | 115 ++++++++++++++++++++---
 include/uapi/linux/virtio_iommu.h |  19 ++++
 2 files changed, 125 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 0c7a7fa2628d..e6ff515d41c0 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -29,7 +29,8 @@
 #define MSI_IOVA_LENGTH			0x100000
 
 #define VIOMMU_REQUEST_VQ		0
-#define VIOMMU_NR_VQS			1
+#define VIOMMU_EVENT_VQ			1
+#define VIOMMU_NR_VQS			2
 
 struct viommu_dev {
 	struct iommu_device		iommu;
@@ -41,6 +42,7 @@ struct viommu_dev {
 	struct virtqueue		*vqs[VIOMMU_NR_VQS];
 	spinlock_t			request_lock;
 	struct list_head		requests;
+	void				*evts;
 
 	/* Device configuration */
 	struct iommu_domain_geometry	geometry;
@@ -82,6 +84,15 @@ struct viommu_request {
 	char				buf[];
 };
 
+#define VIOMMU_FAULT_RESV_MASK		0xffffff00
+
+struct viommu_event {
+	union {
+		u32			head;
+		struct virtio_iommu_fault fault;
+	};
+};
+
 #define to_viommu_domain(domain) \
 	container_of(domain, struct viommu_domain, domain)
 
@@ -503,6 +514,68 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
 	return ret;
 }
 
+static int viommu_fault_handler(struct viommu_dev *viommu,
+				struct virtio_iommu_fault *fault)
+{
+	char *reason_str;
+
+	u8 reason	= fault->reason;
+	u32 flags	= le32_to_cpu(fault->flags);
+	u32 endpoint	= le32_to_cpu(fault->endpoint);
+	u64 address	= le64_to_cpu(fault->address);
+
+	switch (reason) {
+	case VIRTIO_IOMMU_FAULT_R_DOMAIN:
+		reason_str = "domain";
+		break;
+	case VIRTIO_IOMMU_FAULT_R_MAPPING:
+		reason_str = "page";
+		break;
+	case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
+	default:
+		reason_str = "unknown";
+		break;
+	}
+
+	/* TODO: find EP by ID and report_iommu_fault */
+	if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
+		dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
+				    reason_str, endpoint, address,
+				    flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
+				    flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
+				    flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
+	else
+		dev_err_ratelimited(viommu->dev, "%s fault from EP %u\n",
+				    reason_str, endpoint);
+
+	return 0;
+}
+
+static void viommu_event_handler(struct virtqueue *vq)
+{
+	int ret;
+	unsigned int len;
+	struct scatterlist sg[1];
+	struct viommu_event *evt;
+	struct viommu_dev *viommu = vq->vdev->priv;
+
+	while ((evt = virtqueue_get_buf(vq, &len)) != NULL) {
+		if (len > sizeof(*evt)) {
+			dev_err(viommu->dev,
+				"invalid event buffer (len %u != %zu)\n",
+				len, sizeof(*evt));
+		} else if (!(evt->head & VIOMMU_FAULT_RESV_MASK)) {
+			viommu_fault_handler(viommu, &evt->fault);
+		}
+
+		sg_init_one(sg, evt, sizeof(*evt));
+		ret = virtqueue_add_inbuf(vq, sg, 1, evt, GFP_ATOMIC);
+		if (ret)
+			dev_err(viommu->dev, "could not add event buffer\n");
+	}
+
+	virtqueue_kick(vq);
+}
+
 /* IOMMU API */
 
 static struct iommu_domain *viommu_domain_alloc(unsigned type)
@@ -885,16 +958,35 @@ static struct iommu_ops viommu_ops = {
 static int viommu_init_vqs(struct viommu_dev *viommu)
 {
 	struct virtio_device *vdev = dev_to_virtio(viommu->dev);
-	const char *name = "request";
-	void *ret;
+	const char *names[] = { "request", "event" };
+	vq_callback_t *callbacks[] = {
+		NULL, /* No async requests */
+		viommu_event_handler,
+	};
 
-	ret = virtio_find_single_vq(vdev, NULL, name);
-	if (IS_ERR(ret)) {
-		dev_err(viommu->dev, "cannot find VQ\n");
-		return PTR_ERR(ret);
-	}
+	return virtio_find_vqs(vdev, VIOMMU_NR_VQS, viommu->vqs, callbacks,
+			       names,
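The fault handler above turns the READ/WRITE/EXEC flag bits into the "R"/"W"/"X" string printed in the log line. A small Python sketch of that decoding follows; the flag bit positions are my reading of the virtio-iommu fault flags and should be treated as an assumption, not the normative encoding:

```python
# Illustrative decode of fault flags into the permission string used in
# the ratelimited log message ("%s fault from EP %u at %#llx [RWX]").
# Bit positions below are assumed, not quoted from the spec.

F_READ, F_WRITE, F_EXEC, F_ADDRESS = 1 << 0, 1 << 1, 1 << 2, 1 << 8

def fault_perms(flags: int) -> str:
    return (("R" if flags & F_READ else "") +
            ("W" if flags & F_WRITE else "") +
            ("X" if flags & F_EXEC else ""))
```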
[PATCH v5 1/7] dt-bindings: virtio-mmio: Add IOMMU description
The nature of a virtio-mmio node is discovered by the virtio driver at
probe time. However the DMA relation between devices must be described
statically. When a virtio-mmio node is a virtio-iommu device, it needs
an "#iommu-cells" property as specified by bindings/iommu/iommu.txt.
Otherwise, the virtio-mmio device may perform DMA through an IOMMU,
which requires an "iommus" property. Describe these requirements in the
device-tree bindings documentation.

Reviewed-by: Rob Herring
Reviewed-by: Eric Auger
Signed-off-by: Jean-Philippe Brucker
---
 .../devicetree/bindings/virtio/mmio.txt | 30 +++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..21af30fbb81f 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -8,10 +8,40 @@ Required properties:
 - reg:		control registers base address and size including configuration space
 - interrupts:	interrupt generated by the device
 
+Required properties for virtio-iommu:
+
+- #iommu-cells:	When the node corresponds to a virtio-iommu device, it is
+		linked to DMA masters using the "iommus" or "iommu-map"
+		properties [1][2]. #iommu-cells specifies the size of the
+		"iommus" property. For virtio-iommu #iommu-cells must be
+		1, each cell describing a single endpoint ID.
+
+Optional properties:
+
+- iommus:	If the device accesses memory through an IOMMU, it should
+		have an "iommus" property [1]. Since virtio-iommu itself
+		does not access memory through an IOMMU, the "virtio,mmio"
+		node cannot have both an "#iommu-cells" and an "iommus"
+		property.
+
 Example:
 
 	virtio_block@3000 {
 		compatible = "virtio,mmio";
 		reg = <0x3000 0x100>;
 		interrupts = <41>;
+
+		/* Device has endpoint ID 23 */
+		iommus = <&viommu 23>
 	}
+
+	viommu: iommu@3100 {
+		compatible = "virtio,mmio";
+		reg = <0x3100 0x100>;
+		interrupts = <42>;
+
+		#iommu-cells = <1>
+	}
+
+[1] Documentation/devicetree/bindings/iommu/iommu.txt
+[2] Documentation/devicetree/bindings/pci/pci-iommu.txt
--
2.19.1
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 2/7] dt-bindings: virtio: Add virtio-pci-iommu node
Some systems implement virtio-iommu as a PCI endpoint. The operating
system needs to discover the relationship between IOMMU and masters long
before the PCI endpoint gets probed. Add a PCI child node to describe
the virtio-iommu device.

The virtio-pci-iommu is conceptually split between a PCI programming
interface and a translation component on the parent bus. The latter
doesn't have a node in the device tree. The virtio-pci-iommu node
describes both, by linking the PCI endpoint to "iommus" properties of
DMA master nodes and to "iommu-map" properties of bus nodes.

Reviewed-by: Rob Herring
Reviewed-by: Eric Auger
Signed-off-by: Jean-Philippe Brucker
---
 .../devicetree/bindings/virtio/iommu.txt | 66 +++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt

diff --git a/Documentation/devicetree/bindings/virtio/iommu.txt b/Documentation/devicetree/bindings/virtio/iommu.txt
new file mode 100644
index 000000000000..2407fea0651c
--- /dev/null
+++ b/Documentation/devicetree/bindings/virtio/iommu.txt
@@ -0,0 +1,66 @@
+* virtio IOMMU PCI device
+
+When virtio-iommu uses the PCI transport, its programming interface is
+discovered dynamically by the PCI probing infrastructure. However the
+device tree statically describes the relation between IOMMU and DMA
+masters. Therefore, the PCI root complex that hosts the virtio-iommu
+contains a child node representing the IOMMU device explicitly.
+
+Required properties:
+
+- compatible:	Should be "virtio,pci-iommu"
+- reg:		PCI address of the IOMMU. As defined in the PCI Bus
+		Binding reference [1], the reg property is a five-cell
+		address encoded as (phys.hi phys.mid phys.lo size.hi
+		size.lo). phys.hi should contain the device's BDF as
+		0b00000000 bbbbbbbb dddddfff 00000000. The other cells
+		should be zero.
+- #iommu-cells:	Each platform DMA master managed by the IOMMU is assigned
+		an endpoint ID, described by the "iommus" property [2].
+		For virtio-iommu, #iommu-cells must be 1.
+
+Notes:
+
+- DMA from the IOMMU device isn't managed by another IOMMU. Therefore the
+  virtio-iommu node doesn't have an "iommus" property, and is omitted from
+  the iommu-map property of the root complex.
+
+Example:
+
+pcie@10000000 {
+	compatible = "pci-host-ecam-generic";
+	...
+
+	/* The IOMMU programming interface uses slot 00:01.0 */
+	iommu0: iommu@0008 {
+		compatible = "virtio,pci-iommu";
+		reg = <0x00000800 0 0 0 0>;
+		#iommu-cells = <1>;
+	};
+
+	/*
+	 * The IOMMU manages all functions in this PCI domain except
+	 * itself. Omit BDF 00:01.0.
+	 */
+	iommu-map = <0x0 &iommu0 0x0 0x8>
+		    <0x9 &iommu0 0x9 0xfff7>;
+};
+
+pcie@20000000 {
+	compatible = "pci-host-ecam-generic";
+	...
+	/*
+	 * The IOMMU also manages all functions from this domain,
+	 * with endpoint IDs 0x10000 - 0x1ffff
+	 */
+	iommu-map = <0x0 &iommu0 0x10000 0x10000>;
+};
+
+ethernet@fe001000 {
+	...
+	/* The IOMMU manages this platform device with endpoint ID 0x20000 */
+	iommus = <&iommu0 0x20000>;
+};
+
+[1] Documentation/devicetree/bindings/pci/pci.txt
+[2] Documentation/devicetree/bindings/iommu/iommu.txt
--
2.19.1
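The "BDF" values used in the binding's iommu-map entries follow the standard PCI requester-ID encoding, which is why slot 00:01.0 corresponds to ID 0x8 and is omitted from the map. A quick sketch of that encoding (helper name is mine, not from the binding):

```python
# PCI requester ID as used in "iommu-map": bus<<8 | device<<3 | function.

def pci_rid(bus: int, dev: int, fn: int) -> int:
    """Encode a PCI bus/device/function triple as a 16-bit requester ID."""
    return (bus << 8) | (dev << 3) | fn
```

So 00:01.0 encodes to 0x8, and the two map ranges <0x0 ... 0x8> and <0x9 ... 0xfff7> together cover every ID on the bus except that one.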
[PATCH v5 5/7] iommu: Add virtio-iommu driver
The virtio IOMMU is a para-virtualized device that allows sending IOMMU
requests such as map/unmap over the virtio transport, without emulating
page tables. This implementation handles ATTACH, DETACH, MAP and UNMAP
requests.

The bulk of the code transforms calls coming from the IOMMU API into
corresponding virtio requests. Mappings are kept in an interval tree
instead of page tables.

Signed-off-by: Jean-Philippe Brucker
---
 MAINTAINERS                       |    7 +
 drivers/iommu/Kconfig             |   11 +
 drivers/iommu/Makefile            |    1 +
 drivers/iommu/virtio-iommu.c      |  916 ++++++++++++++++++++++++++++++
 include/uapi/linux/virtio_ids.h   |    1 +
 include/uapi/linux/virtio_iommu.h |  104 ++++
 6 files changed, 1040 insertions(+)
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 1689dcfec800..3d8550c76f4a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15946,6 +15946,13 @@
 S:	Maintained
 F:	drivers/virtio/virtio_input.c
 F:	include/uapi/linux/virtio_input.h
 
+VIRTIO IOMMU DRIVER
+M:	Jean-Philippe Brucker
+L:	virtualizat...@lists.linux-foundation.org
+S:	Maintained
+F:	drivers/iommu/virtio-iommu.c
+F:	include/uapi/linux/virtio_iommu.h
+
 VIRTUAL BOX GUEST DEVICE DRIVER
 M:	Hans de Goede
 M:	Arnd Bergmann
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bf2bbfa2a399..db5f2b8c23f5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -464,4 +464,15 @@ config QCOM_IOMMU
 	help
 	  Support for IOMMU on certain Qualcomm SoCs.
 
+config VIRTIO_IOMMU
+	bool "Virtio IOMMU driver"
+	depends on VIRTIO=y
+	select IOMMU_API
+	select INTERVAL_TREE
+	select ARM_DMA_USE_IOMMU if ARM
+	help
+	  Para-virtualised IOMMU driver with virtio.
+
+	  Say Y here if you intend to run this kernel as a guest.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 5481e5fe1f95..bd7e55751d09 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -36,3 +36,4 @@
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
new file mode 100644
index 000000000000..7540dab9c8dc
--- /dev/null
+++ b/drivers/iommu/virtio-iommu.c
@@ -0,0 +1,916 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Virtio driver for the paravirtualized IOMMU
+ *
+ * Copyright (C) 2018 Arm Limited
+ */
+
+#define pr_fmt(fmt)		KBUILD_MODNAME ": " fmt
+
+#include <linux/amba/bus.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/freezer.h>
+#include <linux/interval_tree.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/of_iommu.h>
+#include <linux/of_platform.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_ids.h>
+#include <linux/wait.h>
+
+#include <uapi/linux/virtio_iommu.h>
+
+#define MSI_IOVA_BASE			0x8000000
+#define MSI_IOVA_LENGTH			0x100000
+
+#define VIOMMU_REQUEST_VQ		0
+#define VIOMMU_NR_VQS			1
+
+struct viommu_dev {
+	struct iommu_device		iommu;
+	struct device			*dev;
+	struct virtio_device		*vdev;
+
+	struct ida			domain_ids;
+
+	struct virtqueue		*vqs[VIOMMU_NR_VQS];
+	spinlock_t			request_lock;
+	struct list_head		requests;
+
+	/* Device configuration */
+	struct iommu_domain_geometry	geometry;
+	u64				pgsize_bitmap;
+	u8				domain_bits;
+};
+
+struct viommu_mapping {
+	phys_addr_t			paddr;
+	struct interval_tree_node	iova;
+	u32				flags;
+};
+
+struct viommu_domain {
+	struct iommu_domain		domain;
+	struct viommu_dev		*viommu;
+	struct mutex			mutex; /* protects viommu pointer */
+	unsigned int			id;
+
+	spinlock_t			mappings_lock;
+	struct rb_root_cached		mappings;
+
+	unsigned long			nr_endpoints;
+};
+
+struct viommu_endpoint {
+	struct viommu_dev		*viommu;
+	struct viommu_domain		*vdomain;
+};
+
+struct viommu_request {
+	struct list_head		list;
+	void				*writeback;
+	unsigned int			write_offset;
+	unsigned int			len;
+	char				buf[];
+};
+
+#define to_viommu_domain(domain) \
+	container_of(domain, struct viommu_domain, domain)
+
+static int viommu_get_req_errno(void *buf, size_t len)
+{
+	struct virtio_iommu_req_tail *tail = buf + len - sizeof(*tail);
+
+	switch (tail->status) {
+	case
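The quoted patch is cut off inside viommu_get_req_errno(), which converts the status byte in the request tail into a Linux errno. The mapping below is a sketch of that conversion following the virtio-iommu spec's status codes as I understand them; treat the numeric values and the exact errno choices as assumptions, not the driver's verbatim table:

```python
# Sketch of request-tail status -> errno conversion (assumed values).

import errno

# Assumed status code ordering from the virtio-iommu specification
(VIRTIO_IOMMU_S_OK, S_IOERR, S_UNSUPP, S_DEVERR,
 S_INVAL, S_RANGE, S_NOENT, S_FAULT) = range(8)

STATUS_TO_ERRNO = {
    VIRTIO_IOMMU_S_OK: 0,
    S_IOERR:  -errno.EIO,
    S_UNSUPP: -errno.ENOSYS,
    S_DEVERR: -errno.EIO,
    S_INVAL:  -errno.EINVAL,
    S_RANGE:  -errno.ERANGE,
    S_NOENT:  -errno.ENOENT,
    S_FAULT:  -errno.EFAULT,
}

def req_errno(status: int) -> int:
    """Map a device status byte to a negative errno; unknown -> -EIO."""
    return STATUS_TO_ERRNO.get(status, -errno.EIO)
```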
[PATCH v5 0/7] Add virtio-iommu driver
Implement the virtio-iommu driver, following specification v0.9 [1].

Since v4 [2] I fixed the issues reported by Eric, and added Reviewed-by
from Eric and Rob. Thanks!

I changed the specification to fix one inconsistency discussed in v4:
that the device fills the probe buffer with zeroes is now a "SHOULD"
instead of a "MAY", since it's the only way for the driver to know if
the device wrote the status. Existing devices already do this. In
addition the device now needs to fill the three padding bytes at the
tail with zeroes.

You can find the Linux driver and kvmtool device on branches
virtio-iommu/v0.9 [3]. I also lightly tested with Eric's latest QEMU
device [4].

[1] Virtio-iommu specification v0.9, sources, pdf and diff from v0.8
    git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
    http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
    http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.8-v0.9.pdf

[2] [PATCH v4 0/7] Add virtio-iommu driver
    https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031074.html

[3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9
    git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9

[4] [RFC v9 00/17] VIRTIO-IOMMU device
    https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html

Jean-Philippe Brucker (7):
  dt-bindings: virtio-mmio: Add IOMMU description
  dt-bindings: virtio: Add virtio-pci-iommu node
  of: Allow the iommu-map property to omit untranslated devices
  PCI: OF: Initialize dev->fwnode appropriately
  iommu: Add virtio-iommu driver
  iommu/virtio: Add probe request
  iommu/virtio: Add event queue

 .../devicetree/bindings/virtio/iommu.txt |   66 +
 .../devicetree/bindings/virtio/mmio.txt  |   30 +
 MAINTAINERS                              |    7 +
 drivers/iommu/Kconfig                    |   11 +
 drivers/iommu/Makefile                   |    1 +
 drivers/iommu/virtio-iommu.c             | 1157 ++++++++++++++++++++++
 drivers/of/base.c                        |   10 +-
 drivers/pci/of.c                         |    7 +
 include/uapi/linux/virtio_ids.h          |    1 +
 include/uapi/linux/virtio_iommu.h        |  161 +++
 10 files changed, 1448 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/virtio/iommu.txt
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

--
2.19.1
[PATCH v5 4/7] PCI: OF: Initialize dev->fwnode appropriately
For PCI devices that have an OF node, set the fwnode as well. This way
drivers that rely on fwnode don't need the special case described by
commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately").

Acked-by: Bjorn Helgaas
Signed-off-by: Jean-Philippe Brucker
---
 drivers/pci/of.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 4c4217d0c3f1..c272ecfcd038 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -21,12 +21,15 @@ void pci_set_of_node(struct pci_dev *dev)
 		return;
 	dev->dev.of_node = of_pci_find_child_device(dev->bus->dev.of_node,
 						    dev->devfn);
+	if (dev->dev.of_node)
+		dev->dev.fwnode = &dev->dev.of_node->fwnode;
 }
 
 void pci_release_of_node(struct pci_dev *dev)
 {
 	of_node_put(dev->dev.of_node);
 	dev->dev.of_node = NULL;
+	dev->dev.fwnode = NULL;
 }
 
 void pci_set_bus_of_node(struct pci_bus *bus)
@@ -35,12 +38,16 @@ void pci_set_bus_of_node(struct pci_bus *bus)
 		bus->dev.of_node = pcibios_get_phb_of_node(bus);
 	else
 		bus->dev.of_node = of_node_get(bus->self->dev.of_node);
+
+	if (bus->dev.of_node)
+		bus->dev.fwnode = &bus->dev.of_node->fwnode;
 }
 
 void pci_release_bus_of_node(struct pci_bus *bus)
 {
 	of_node_put(bus->dev.of_node);
 	bus->dev.of_node = NULL;
+	bus->dev.fwnode = NULL;
 }
 
 struct device_node * __weak pcibios_get_phb_of_node(struct pci_bus *bus)
--
2.19.1
[PATCH v5 3/7] of: Allow the iommu-map property to omit untranslated devices
In PCI root complex nodes, the iommu-map property describes the IOMMU
that translates each endpoint. On some platforms, the IOMMU itself is
presented as a PCI endpoint (e.g. AMD IOMMU and virtio-iommu). This
isn't supported by the current OF driver, which expects all endpoints
to have an IOMMU. Allow the iommu-map property to have gaps.

Relaxing of_map_rid() also allows the msi-map property to have gaps,
which is invalid since MSIs always reach an MSI controller. In that
case pci_msi_setup_msi_irqs() will return an error when attempting to
find the device's MSI domain.

Reviewed-by: Rob Herring
Signed-off-by: Jean-Philippe Brucker
---
 drivers/of/base.c | 10 +++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 09692c9b32a7..99f6bfa9b898 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -2237,8 +2237,12 @@ int of_map_rid(struct device_node *np, u32 rid,
 		return 0;
 	}
 
-	pr_err("%pOF: Invalid %s translation - no match for rid 0x%x on %pOF\n",
-	       np, map_name, rid, target && *target ? *target : NULL);
-	return -EFAULT;
+	pr_info("%pOF: no %s translation for rid 0x%x on %pOF\n", np, map_name,
+		rid, target && *target ? *target : NULL);
+
+	/* Bypasses translation */
+	if (id_out)
+		*id_out = rid;
+
+	return 0;
 }
 EXPORT_SYMBOL_GPL(of_map_rid);
--
2.19.1
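The behavioural change is easiest to see as a toy model of of_map_rid(): before, a requester ID that matched no map entry was an error; after, it falls through untranslated. The sketch below simplifies the real iommu-map entries (which are (rid-base, phandle, output-base, length) tuples) down to three-element tuples, so the names and shapes here are illustrative only:

```python
# Toy model of the relaxed of_map_rid(): entries translate a requester
# ID into an output ID; a miss now keeps the ID as-is instead of failing.

def map_rid(rid, iommu_map):
    """iommu_map: list of (rid_base, out_base, length) tuples."""
    for rid_base, out_base, length in iommu_map:
        if rid_base <= rid < rid_base + length:
            return out_base + (rid - rid_base)
    # No match: the endpoint bypasses translation (the old code
    # returned -EFAULT here)
    return rid
```

With the two-range map from the virtio-pci-iommu binding example, the IOMMU's own BDF 0x8 sits in the gap and now simply comes back unchanged.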
Re: [PATCH 4/4] arm64: KVM: Implement workaround for Cortex-A76 erratum 1165522
On 08/11/2018 20:10, Christoffer Dall wrote:
> On Thu, Nov 08, 2018 at 06:05:55PM +0000, Marc Zyngier wrote:
>> On 06/11/18 08:15, Christoffer Dall wrote:
>>> On Mon, Nov 05, 2018 at 02:36:16PM +0000, Marc Zyngier wrote:
>>>> Early versions of Cortex-A76 can end up with corrupt TLBs if they
>>>> speculate an AT instruction during a guest switch while the S1/S2
>>>> system registers are in an inconsistent state.
>>>>
>>>> Work around it by:
>>>> - Mandating VHE
>>>> - Make sure that S1 and S2 system registers are consistent before
>>>>   clearing HCR_EL2.TGE, which allows AT to target the EL1
>>>>   translation regime
>>>>
>>>> These two things together ensure that we cannot hit this erratum.
>>>>
>>>> Signed-off-by: Marc Zyngier
>>>> ---
>>>>  Documentation/arm64/silicon-errata.txt |  1 +
>>>>  arch/arm64/Kconfig                     | 12 ++++++++++++
>>>>  arch/arm64/include/asm/cpucaps.h       |  3 ++-
>>>>  arch/arm64/include/asm/kvm_host.h      |  3 +++
>>>>  arch/arm64/include/asm/kvm_hyp.h       |  6 ++++++
>>>>  arch/arm64/kernel/cpu_errata.c         |  8 ++++++++
>>>>  arch/arm64/kvm/hyp/switch.c            | 14 ++++++++++++++
>>>>  7 files changed, 46 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
>>>> index 76ccded8b74c..04f0bc4690c6 100644
>>>> --- a/Documentation/arm64/silicon-errata.txt
>>>> +++ b/Documentation/arm64/silicon-errata.txt
>>>> @@ -57,6 +57,7 @@ stable kernels.
>>>>  | ARM    | Cortex-A73   | #858921         | ARM64_ERRATUM_858921  |
>>>>  | ARM    | Cortex-A55   | #1024718        | ARM64_ERRATUM_1024718 |
>>>>  | ARM    | Cortex-A76   | #1188873        | ARM64_ERRATUM_1188873 |
>>>> +| ARM    | Cortex-A76   | #1165522        | ARM64_ERRATUM_1165522 |
>>>>  | ARM    | MMU-500      | #841119,#826419 | N/A                   |
>>>>  |        |              |                 |                       |
>>>>  | Cavium | ThunderX ITS | #22375, #24313  | CAVIUM_ERRATUM_22375  |
>>>>
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index 787d7850e064..a68bc6cc2167 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -497,6 +497,18 @@ config ARM64_ERRATUM_1188873
>>>>
>>>>  	  If unsure, say Y.
>>>>
>>>> +config ARM64_ERRATUM_1165522
>>>> +	bool "Cortex-A76: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
>>>> +	default y
>>>> +	help
>>>> +	  This option adds a workaround for ARM Cortex-A76 erratum 1165522.
>>>> +
>>>> +	  Affected Cortex-A76 cores (r0p0, r1p0, r2p0) could end up with
>>>> +	  corrupted TLBs by speculating an AT instruction during a guest
>>>> +	  context switch.
>>>> +
>>>> +	  If unsure, say Y.
>>>> +
>>>>  config CAVIUM_ERRATUM_22375
>>>>  	bool "Cavium erratum 22375, 24313"
>>>>  	default y
>>>>
>>>> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
>>>> index 6e2d254c09eb..62d8cd15fdf2 100644
>>>> --- a/arch/arm64/include/asm/cpucaps.h
>>>> +++ b/arch/arm64/include/asm/cpucaps.h
>>>> @@ -54,7 +54,8 @@
>>>>  #define ARM64_HAS_CRC32			33
>>>>  #define ARM64_SSBS			34
>>>>  #define ARM64_WORKAROUND_1188873	35
>>>> +#define ARM64_WORKAROUND_1165522	36
>>>>
>>>> -#define ARM64_NCAPS			36
>>>> +#define ARM64_NCAPS			37
>>>>
>>>>  #endif /* __ASM_CPUCAPS_H */
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>> index 7d6e974d024a..8f486026ac87 100644
>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>> @@ -435,6 +435,9 @@ static inline bool kvm_arch_sve_requires_vhe(void)
>>>>  static inline bool kvm_arch_impl_requires_vhe(void)
>>>>  {
>>>>  	/* Some implementations have defects that confine them to VHE */
>>>> +	if (cpus_have_cap(ARM64_WORKAROUND_1165522))
>>>> +		return true;
>>>> +
>>>>  	return false;
>>>>  }
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
>>>> index 23aca66767f9..9681360d0959 100644
>>>> --- a/arch/arm64/include/asm/kvm_hyp.h
>>>> +++ b/arch/arm64/include/asm/kvm_hyp.h
>>>> @@ -163,6 +163,12 @@
>>>>  static __always_inline void __hyp_text __load_guest_stage2(struct kvm *kvm)
>>>>  {
>>>>  	write_sysreg(kvm->arch.vtcr, vtcr_el2);
>>>>  	write_sysreg(kvm->arch.vttbr, vttbr_el2);
>>>> +
>>>> +	/*
>>>> +	 * ARM erratum 1165522 requires the actual execution of the
>>>> +	 * above before we can switch to the guest translation regime.
>>>> +	 */
>>>
>>> Is it about a guest translation 'regime' or should this just say before
>>> we can enable stage 2
Re: [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests
Dave Martin writes:

> This series implements basic support for allowing KVM guests to use the
> Arm Scalable Vector Extension (SVE).
>
> The patches are based on v4.19-rc5.
>
> The patches are also available on a branch for reviewer convenience. [1]
>
> This is a significant overhaul of the previous preliminary series [2],
> with the major changes outlined below, and additional minor updates in
> response to review feedback (see the individual patches for those).
>
> In the interest of getting this series out for review,
> this series is **completely untested**.

Richard is currently working on VHE support for QEMU, and we already
have SVE system emulation support as of 3.1, so hopefully QEMU will be
able to test this soon (and probably shake out a few of our own bugs ;-).

> Reviewers should focus on the proposed API (but any other comments are
> of course welcome!)

I've finished my pass for this revision. Sorry it took so long to get
to it.

--
Alex Bennée
Re: [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE
Dave Martin writes: > This patch adds sections to the KVM API documentation describing > the extensions for supporting the Scalable Vector Extension (SVE) > in guests. > > Signed-off-by: Dave Martin > --- > Documentation/virtual/kvm/api.txt | 142 > +- > 1 file changed, 139 insertions(+), 3 deletions(-) > > diff --git a/Documentation/virtual/kvm/api.txt > b/Documentation/virtual/kvm/api.txt > index a58067b..b8257d4 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -2054,13 +2054,21 @@ Specifically: >0x6030 0010 004c SPSR_UND64 spsr[KVM_SPSR_UND] >0x6030 0010 004e SPSR_IRQ64 spsr[KVM_SPSR_IRQ] >0x6060 0010 0050 SPSR_FIQ64 spsr[KVM_SPSR_FIQ] > - 0x6040 0010 0054 V0 128 fp_regs.vregs[0] > - 0x6040 0010 0058 V1 128 fp_regs.vregs[1] > + 0x6040 0010 0054 V0 128 fp_regs.vregs[0](*) > + 0x6040 0010 0058 V1 128 fp_regs.vregs[1](*) > ... > - 0x6040 0010 00d0 V31128 fp_regs.vregs[31] > + 0x6040 0010 00d0 V31128 fp_regs.vregs[31] (*) >0x6020 0010 00d4 FPSR32 fp_regs.fpsr >0x6020 0010 00d5 FPCR32 fp_regs.fpcr > > +(*) These encodings are not accepted for SVE-enabled vcpus. See > +KVM_ARM_SVE_CONFIG for details of how SVE support is configured for > +a vcpu. > + > +The equivalent register content can be accessed via bits [2047:0] > of You mean [127:0] I think. > +the corresponding SVE Zn registers instead for vcpus that have SVE > +enabled (see below). > + > arm64 CCSIDR registers are demultiplexed by CSSELR value: >0x6020 0011 00 > > @@ -2070,6 +2078,14 @@ arm64 system registers have the following id bit > patterns: > arm64 firmware pseudo-registers have the following bit pattern: >0x6030 0014 > > +arm64 SVE registers have the following bit patterns: > + 0x6080 0015 00 Zn bits[2048*slice + 2047 : > 2048*slice] > + 0x6050 0015 04 Pn bits[256*slice + 255 : 256*slice] > + 0x6050 0015 060 FFR bits[256*slice + 255 : 256*slice] > + > + These registers are only accessible on SVE-enabled vcpus. See > + KVM_ARM_SVE_CONFIG for details. 
> + > > MIPS registers are mapped using the lower 32 bits. The upper 16 of that is > the register group type: > @@ -3700,6 +3716,126 @@ Returns: 0 on success, -1 on error > This copies the vcpu's kvm_nested_state struct from userspace to the kernel. > For > the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE. > > +4.116 KVM_ARM_SVE_CONFIG > + > +Capability: KVM_CAP_ARM_SVE > +Architectures: arm64 > +Type: vm and vcpu ioctl > +Parameters: struct kvm_sve_vls (in/out) > +Returns: 0 on success > +Errors: > + EINVAL:Unrecognised subcommand or bad arguments > + EBADFD:vcpu in wrong state for request > + (KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_SET) > + ENOMEM:Out of memory > + EFAULT:Bad user address > + > +struct kvm_sve_vls { > + __u16 cmd; > + __u16 max_vq; > + __u16 _reserved[2]; > + __u64 required_vqs[8]; > +}; > + > +General: > + > +cmd: This ioctl supports a few different subcommands, selected by the > +value of cmd (described in detail in the following sections). > + > +_reserved[]: these fields may be meaningful to later kernels. For > +forward compatibility, they must be zeroed before invoking this ioctl > +for the first time on a given struct kvm_sve_vls object. (So, memset() > +it to zero before first use, or allocate with calloc() for example.) > + > +max_vq, required_vqs[]: encode a set of SVE vector lengths. The set is > +encoded as follows: > + > +If (a * 64 + b + 1) <= max_vq, then the bit represented by > + > +required_vqs[a] & ((__u64)1 << b) > + > +(where a is in the range 0..7 and b is in the range 0..63) > +indicates that the vector length (a * 64 + b + 1) * 128 bits is > +supported (KVM_ARM_SVE_CONFIG_QUERY, KVM_ARM_SVE_CONFIG_GET) or required > +(KVM_ARM_SVE_CONFIG_SET). > + > +If (a * 64 + b + 1) > max_vq, then the vector length > +(a * 64 + b + 1) * 128 bits is unsupported or prohibited respectively. 
> +In other words, only the first max_vq bits in required_vqs[] are > +significant; remaining bits are implicitly treated as if they were zero. > + > +max_vq must be in the range SVE_VQ_MIN (1) to SVE_VQ_MAX (512). > + > +See Documentation/arm64/sve.txt for an explanation of vector lengths and > +the meaning associated with "VQ". > + > +Subcommands: > + > +/* values for cmd: */ > +#define KVM_ARM_SVE_CONFIG_QUERY 0 /* query what the host can support */ > +#define KVM_ARM_SVE_CONFIG_SET 1 /* enable SVE for vcpu and > set VLs */ > +#define KVM_ARM_SVE_CONFIG_GET 2 /* read the set of VLs for a > vcpu */ > + > +Subcommand details: > + > +4.116.1 KVM_ARM_SVE_CONFIG_QUERY > +Type: vm and vcpu > + > +Retrieve the full set of
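As an aside for anyone prototyping against this: the VQ-set encoding described above is easy to get wrong, so here is a small C sketch of a decoder. The struct layout is copied from the quoted RFC patch (this API was a proposal at the time, not a merged kernel ABI), and sve_vq_present() is a hypothetical helper name used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Layout as proposed in the quoted RFC patch (not a merged ABI). */
struct kvm_sve_vls {
	uint16_t cmd;
	uint16_t max_vq;
	uint16_t _reserved[2];
	uint64_t required_vqs[8];
};

/*
 * Nonzero if the vector length (vq * 128) bits is in the set:
 * vq maps to bit b = (vq - 1) % 64 of word a = (vq - 1) / 64,
 * and only the first max_vq bits are significant.
 */
static int sve_vq_present(const struct kvm_sve_vls *vls, unsigned int vq)
{
	unsigned int a = (vq - 1) / 64, b = (vq - 1) % 64;

	if (vq < 1 || vq > vls->max_vq)
		return 0;	/* beyond max_vq: unsupported/prohibited */
	return (int)((vls->required_vqs[a] >> b) & 1);
}
```

For example, a host supporting 128-, 256- and 512-bit vectors would report max_vq = 4 with bits 0, 1 and 3 set in required_vqs[0].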
Re: [RFC PATCH v2 21/23] KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd
Dave Martin writes: > Since userspace may need to decide on the set of vector lengths for > the guest before setting up a vm, it is onerous to require a vcpu > fd to be available first. KVM_ARM_SVE_CONFIG_QUERY is not > vcpu-dependent anyway, so this patch wires up KVM_ARM_SVE_CONFIG to > be usable on a vm fd where appropriate. > > Subcommands that are vcpu-dependent (currently > KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET) will return -EINVAL > if invoked on a vm fd. > > Signed-off-by: Dave Martin Apropos comments on last patch, this could go away if we went with just reporting the max VL via the SVE capability probe which works on the vmfd. > --- > arch/arm64/kvm/guest.c | 17 - > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > index f066b17..2313c22 100644 > --- a/arch/arm64/kvm/guest.c > +++ b/arch/arm64/kvm/guest.c > @@ -574,6 +574,9 @@ static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, > struct kvm_sve_vls *vls, > unsigned int vq, max_vq; > int ret; > > + if (!vcpu) > + return -EINVAL; /* per-vcpu operation on vm fd */ > + > if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu)) > return -EBADFD; /* too late, or already configured */ > > @@ -659,6 +662,9 @@ static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, > struct kvm_sve_vls *vls > static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls > *vls, > struct kvm_sve_vls __user *userp) > { > + if (!vcpu) > + return -EINVAL; /* per-vcpu operation on vm fd */ > + > if (!vcpu_has_sve(vcpu)) > return -EBADFD; /* not configured yet */ > > @@ -668,6 +674,7 @@ static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, > struct kvm_sve_vls *vls, > sve_vq_from_vl(vcpu->arch.sve_max_vl), userp); > } > > +/* vcpu may be NULL if this is called via a vm fd */ > static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu, > struct kvm_sve_vls __user *userp) > { > @@ -717,7 +724,15 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu, > int 
kvm_arm_arch_vm_ioctl(struct kvm *kvm,
>			  unsigned int ioctl, unsigned long arg)
> {
> -	return -EINVAL;
> +	void __user *userp = (void __user *)arg;
> +
> +	switch (ioctl) {
> +	case KVM_ARM_SVE_CONFIG:
> +		return kvm_vcpu_sve_config(NULL, userp);
> +
> +	default:
> +		return -EINVAL;
> +	}
> }
>
> int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

--
Alex Bennée
Re: [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace
Dave Martin writes: > This patch adds the necessary API extensions to allow userspace to > detect SVE support for guests and enable it. > > A new capability KVM_CAP_ARM_SVE is defined to allow userspace to > detect the availability of the KVM SVE API extensions in the usual > way. > > Userspace needs to enable SVE explicitly per vcpu and configure the > set of SVE vector lengths available to the guest before the vcpu is > allowed to run. For these purposes, a new arm64-specific vcpu > ioctl KVM_ARM_SVE_CONFIG is added, with the following subcommands > (in rough order of expected use): > > KVM_ARM_SVE_CONFIG_QUERY: report the set of vector lengths > supported by this host. > > The resulting set can be supplied directly to > KVM_ARM_SVE_CONFIG_SET in order to obtain the maximal possible > set, or used to inform userspace's decision on the appropriate > set of vector lengths (possibly taking into account the > configuration of other nodes in the cluster so that the VM can > migrate freely). > > KVM_ARM_SVE_CONFIG_SET: enable SVE for this vcpu and configure the > set of vector lengths it offers to the guest. > > This can only be done once, before the vcpu is run. > > KVM_ARM_SVE_CONFIG_GET: report the set of vector lengths available > to the guest on this vcpu (for use when snapshotting or > migrating a VM). > > Signed-off-by: Dave Martin > --- > > Changes since RFCv1: > > * The new feature bit for PREFERRED_TARGET / VCPU_INIT is gone in >favour of a capability and a new ioctl to enable/configure SVE. > >Perhaps the SVE configuration could be done via device attributes, >but it still has to be done early, so crowbarring support for this >behind a generic API may cause more trouble than it solves. > >This is still up for discussion if anybody feels strongly about it. > > * An ioctl KVM_ARM_SVE_CONFIG has been added to report the set of >vector lengths available and configure SVE for a vcpu. 
> >To reduce ioctl namespace pollution the new operations are grouped >as subcommands under a single ioctl, since they use the same >argument format anyway. > --- > arch/arm64/include/asm/kvm_host.h | 8 +- > arch/arm64/include/uapi/asm/kvm.h | 14 > arch/arm64/kvm/guest.c| 164 > +- > arch/arm64/kvm/reset.c| 50 > include/uapi/linux/kvm.h | 4 + > 5 files changed, 238 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_host.h > b/arch/arm64/include/asm/kvm_host.h > index bbde597..5225485 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -52,6 +52,12 @@ > > DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); > > +#ifdef CONFIG_ARM64_SVE > +bool kvm_sve_supported(void); > +#else > +static inline bool kvm_sve_supported(void) { return false; } > +#endif > + > int __attribute_const__ kvm_target_cpu(void); > int kvm_reset_vcpu(struct kvm_vcpu *vcpu); > int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext); > @@ -441,7 +447,7 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) > {} > static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} > static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {} > > -static inline void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu); > > void kvm_arm_init_debug(void); > void kvm_arm_setup_debug(struct kvm_vcpu *vcpu); > diff --git a/arch/arm64/include/uapi/asm/kvm.h > b/arch/arm64/include/uapi/asm/kvm.h > index 1ff68fa..94f6932 100644 > --- a/arch/arm64/include/uapi/asm/kvm.h > +++ b/arch/arm64/include/uapi/asm/kvm.h > @@ -32,6 +32,7 @@ > #define KVM_NR_SPSR 5 > > #ifndef __ASSEMBLY__ > +#include > #include > #include > #include > @@ -108,6 +109,19 @@ struct kvm_vcpu_init { > __u32 features[7]; > }; > > +/* Vector length set for KVM_ARM_SVE_CONFIG */ > +struct kvm_sve_vls { > + __u16 cmd; > + __u16 max_vq; > + __u16 _reserved[2]; > + __u64 
required_vqs[__KERNEL_DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, > 64)]; > +}; > + > +/* values for cmd: */ > +#define KVM_ARM_SVE_CONFIG_QUERY 0 /* query what the host can support */ > +#define KVM_ARM_SVE_CONFIG_SET 1 /* enable SVE for vcpu and > set VLs */ > +#define KVM_ARM_SVE_CONFIG_GET 2 /* read the set of VLs for a > vcpu */ > + > struct kvm_sregs { > }; > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > index 331b85e..d96145a 100644 > --- a/arch/arm64/kvm/guest.c > +++ b/arch/arm64/kvm/guest.c > @@ -26,6 +26,9 @@ > #include > #include > #include > +#include > +#include > +#include > #include > #include > #include > @@ -56,6 +59,11 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) > return 0; > } > > +void kvm_arm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) > +{ > +
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
On Thu, Nov 22, 2018 at 01:32:37PM +0100, Dave P Martin wrote: > On Thu, Nov 22, 2018 at 11:27:53AM +, Alex Bennée wrote: > > > > Christoffer Dall writes: > > > > > [Adding Peter and Alex for their view on the QEMU side] > > > > > > On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote: > > >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote: > > >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote: > > >> > > KVM_GET_REG_LIST should only enumerate registers that are actually > > >> > > accessible, so it is necessary to filter out any register that is > > >> > > not exposed to the guest. For features that are configured at > > >> > > runtime, this will require a dynamic check. > > >> > > > > >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden > > >> > > if SVE is not enabled for the guest. > > >> > > > >> > This implies that userspace can never access this interface for a vcpu > > >> > before having decided whether such features are enabled for the guest > > >> > or > > >> > not, since otherwise userspace will see different states for a VCPU > > >> > depending on sequencing of the API, which sounds fragile to me. > > >> > > > >> > That should probably be documented somewhere, and I hope the > > >> > enable/disable API for SVE in guests already takes that into account. > > >> > > > >> > Not sure if there's an action to take here, but it was the best place I > > >> > could raise this concern. > > >> > > >> Fair point. I struggled to come up with something better that solves > > >> all problems. > > >> > > >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of > > >> creating the vcpu, so that if issued at all for a vcpu, it is issued > > >> very soon after KVM_VCPU_INIT. > > >> > > >> I think this worked OK with the current structure of kvmtool and I > > >> seem to remember discussing this with Peter Maydell re qemu -- but > > >> it sounds like I should double-check. 
> > > > > > QEMU does some thing around enumerating all the system registers exposed > > > by KVM and saving/restoring them as part of its startup, but I don't > > > remember the exact sequence. > > > > QEMU does this for each vCPU as part of it's start-up sequence: > > > > kvm_init_vcpu > > kvm_get_cpu (-> KVM_CREATE_VCPU) > > KVM_GET_VCPU_MMAP_SIZE > > kvm_arch_init_vcpu > > kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT) > > kvm_get_one_reg(ARM_CPU_ID_MPIDR) > > kvm_arm_init_debug (chk for KVM_CAP > > SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS) > > kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR) > > kvm_arm_init_cpreg_list (KVM_GET_REG_LIST) > > > > At this point we have the register list we need for > > kvm_arch_get_registers which is what we call every time we want to > > synchronise state. We only really do this for debug events, crashes and > > at some point when migrating. > > So we would need to insert KVM_ARM_SVE_CONFIG_SET into this sequence, > meaning that the new capability is not strictly necessary. > > I sympathise with Christoffer's view though that without the capability > mechanism it may be too easy for software to make mistakes: code > refactoring might swap the KVM_GET_REG_LIST and KVM_ARM_SVE_CONFIG ioctls > over and then things would go wrong with no immediate error indication. > > In effect, the SVE regs would be missing from the list yielded by > KVM_GET_REG_LIST, possibly leading to silent migration failures. > > I'm a bit uneasy about that. Am I being too paranoid now? > No, we've made decisions in the past where we didn't enforce ordering which ended up being a huge pain (vgic lazy init, as a clear example of something really bad). Of course, it's a tradeoff. If it's a huge pain to implement, maybe things will be ok, but if it's just a read/write capability handshake, I think it's worth doing. Thanks, Christoffer ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
On 22 November 2018 at 12:34, Christoffer Dall wrote:
> So on migration, will you have the required information for
> KVM_ARM_VCPU_INIT before setting the registers from the migration
> stream?
>
> (I assume so, because presumably this comes from a command-line switch
> or from the machine definition, which must match the source.)

Yes. QEMU always sets up the VCPU completely before doing any inbound migration.

> Therefore, I don't think there's an issue with this patch, but from
> bitter experience I think we should enforce ordering if possible.

Yes, if there are semantic ordering constraints on the various calls it would be nice to have the kernel enforce them.

thanks
-- PMM
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
On Thu, Nov 22, 2018 at 11:13:51AM +, Peter Maydell wrote: > On 22 November 2018 at 10:53, Christoffer Dall > wrote: > > [Adding Peter and Alex for their view on the QEMU side] > > > > On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote: > >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of > >> creating the vcpu, so that if issued at all for a vcpu, it is issued > >> very soon after KVM_VCPU_INIT. > >> > >> I think this worked OK with the current structure of kvmtool and I > >> seem to remember discussing this with Peter Maydell re qemu -- but > >> it sounds like I should double-check. > > > > QEMU does some thing around enumerating all the system registers exposed > > by KVM and saving/restoring them as part of its startup, but I don't > > remember the exact sequence. > > This all happens in kvm_arch_init_vcpu(), which does: > * KVM_ARM_VCPU_INIT ioctl (with the appropriate kvm_init_features set) > * read the guest MPIDR with GET_ONE_REG so we know what KVM >is doing with MPIDR assignment across CPUs > * check for interesting extensions like KVM_CAP_SET_GUEST_DEBUG > * get and cache a list of what system registers the vcpu has, >using KVM_GET_REG_LIST. This is where we do the "size must >be U32 or U64" sanity check. > > So if there's something we can't do by setting kvm_init_features > for KVM_ARM_VCPU_INIT but have to do immediately afterwards, > that is straightforward. > > The major requirement for QEMU is that if we don't specifically > enable SVE in the VCPU then we must not see any registers > in the KVM_GET_REG_LIST that are not u32 or u64 -- otherwise > QEMU will refuse to start. > So on migration, will you have the required information for KVM_ARM_VCPU_INIT before setting the registers from the migration stream? (I assume so, because presumably this comes from a command-line switch or from the machine definition, which must match the source.) 
Therefore, I don't think there's an issue with this patch, but from bitter experience I think we should enforce ordering if possible.

Thanks,
Christoffer
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
On Thu, Nov 22, 2018 at 11:27:53AM +, Alex Bennée wrote: > > Christoffer Dall writes: > > > [Adding Peter and Alex for their view on the QEMU side] > > > > On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote: > >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote: > >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote: > >> > > KVM_GET_REG_LIST should only enumerate registers that are actually > >> > > accessible, so it is necessary to filter out any register that is > >> > > not exposed to the guest. For features that are configured at > >> > > runtime, this will require a dynamic check. > >> > > > >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden > >> > > if SVE is not enabled for the guest. > >> > > >> > This implies that userspace can never access this interface for a vcpu > >> > before having decided whether such features are enabled for the guest or > >> > not, since otherwise userspace will see different states for a VCPU > >> > depending on sequencing of the API, which sounds fragile to me. > >> > > >> > That should probably be documented somewhere, and I hope the > >> > enable/disable API for SVE in guests already takes that into account. > >> > > >> > Not sure if there's an action to take here, but it was the best place I > >> > could raise this concern. > >> > >> Fair point. I struggled to come up with something better that solves > >> all problems. > >> > >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of > >> creating the vcpu, so that if issued at all for a vcpu, it is issued > >> very soon after KVM_VCPU_INIT. > >> > >> I think this worked OK with the current structure of kvmtool and I > >> seem to remember discussing this with Peter Maydell re qemu -- but > >> it sounds like I should double-check. 
> > QEMU does some thing around enumerating all the system registers exposed
> > by KVM and saving/restoring them as part of its startup, but I don't
> > remember the exact sequence.
>
> QEMU does this for each vCPU as part of it's start-up sequence:
>
> kvm_init_vcpu
> kvm_get_cpu (-> KVM_CREATE_VCPU)
> KVM_GET_VCPU_MMAP_SIZE
> kvm_arch_init_vcpu
> kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
> kvm_get_one_reg(ARM_CPU_ID_MPIDR)
> kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
> kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
> kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)
>
> At this point we have the register list we need for
> kvm_arch_get_registers which is what we call every time we want to
> synchronise state. We only really do this for debug events, crashes and
> at some point when migrating.

So we would need to insert KVM_ARM_SVE_CONFIG_SET into this sequence, meaning that the new capability is not strictly necessary.

I sympathise with Christoffer's view though that without the capability mechanism it may be too easy for software to make mistakes: code refactoring might swap the KVM_GET_REG_LIST and KVM_ARM_SVE_CONFIG ioctls over and then things would go wrong with no immediate error indication.

In effect, the SVE regs would be missing from the list yielded by KVM_GET_REG_LIST, possibly leading to silent migration failures.

I'm a bit uneasy about that. Am I being too paranoid now?

Cheers
---Dave
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
Christoffer Dall writes: > [Adding Peter and Alex for their view on the QEMU side] > > On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote: >> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote: >> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote: >> > > KVM_GET_REG_LIST should only enumerate registers that are actually >> > > accessible, so it is necessary to filter out any register that is >> > > not exposed to the guest. For features that are configured at >> > > runtime, this will require a dynamic check. >> > > >> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden >> > > if SVE is not enabled for the guest. >> > >> > This implies that userspace can never access this interface for a vcpu >> > before having decided whether such features are enabled for the guest or >> > not, since otherwise userspace will see different states for a VCPU >> > depending on sequencing of the API, which sounds fragile to me. >> > >> > That should probably be documented somewhere, and I hope the >> > enable/disable API for SVE in guests already takes that into account. >> > >> > Not sure if there's an action to take here, but it was the best place I >> > could raise this concern. >> >> Fair point. I struggled to come up with something better that solves >> all problems. >> >> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of >> creating the vcpu, so that if issued at all for a vcpu, it is issued >> very soon after KVM_VCPU_INIT. >> >> I think this worked OK with the current structure of kvmtool and I >> seem to remember discussing this with Peter Maydell re qemu -- but >> it sounds like I should double-check. > > QEMU does some thing around enumerating all the system registers exposed > by KVM and saving/restoring them as part of its startup, but I don't > remember the exact sequence. 
QEMU does this for each vCPU as part of it's start-up sequence:

kvm_init_vcpu
kvm_get_cpu (-> KVM_CREATE_VCPU)
KVM_GET_VCPU_MMAP_SIZE
kvm_arch_init_vcpu
kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
kvm_get_one_reg(ARM_CPU_ID_MPIDR)
kvm_arm_init_debug (chk for KVM_CAP SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)

At this point we have the register list we need for kvm_arch_get_registers which is what we call every time we want to synchronise state. We only really do this for debug events, crashes and at some point when migrating.

>
>> Either way, you're right, this needs to be clearly documented.
>>
>> If we want to be more robust, maybe we should add a capability too,
>> so that userspace that enables this capability promises to call
>> KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
>> KVM_GET_REG_LIST etc.) are forbidden until that is done?
>>
>> That should help avoid accidents.
>>
>> I could add a special meaning for an empty kvm_sve_vls, such that
>> it doesn't enable SVE on the affected vcpu. That retains the ability
>> to create heterogeneous guests while still following the above flow.
>>
> I think making sure that userspace can ever only see the same list of
> available system regiters is going to cause us less pain going forward.
>
> If the separate ioctl and capability check is the easiest way of doing
> that, then I think that sounds good. (I had wished we could have just
> added some data to KVM_CREATE_VCPU, but that doesn't seem to be the
> case.)
>
> Thanks,
>
> Christoffer

--
Alex Bennée
[PATCH v3 0/4] arm64: Support perf event modifiers :G and :H
This patchset provides support for the perf event modifiers :G and :H, which allow filtering of PMU events between host and guests when used with KVM.

As the underlying hardware cannot distinguish between guest and host context, the performance counters must be stopped and started upon entry/exit to the guest. This is performed at EL2 in a way that minimizes overhead and improves accuracy of recording events that only occur in the requested context.

This has been tested with VHE and non-VHE kernels with a KVM guest.

Changes from v2:
 - Ensured that exclude_kernel works for guest
 - Removed unnecessary exclusion of EL2 with exclude_host on !VHE
 - Renamed kvm_clr_set_host_pmu_events to reflect args order
 - Added additional information to isb patch

Changes from v1:
 - Removed unnecessary exclusion of EL1 with exclude_guest on VHE
 - Removed unnecessary isb from existing perf_event.c driver
 - Folded perf_event.c patches together
 - Added additional information to last patch commit message

Andrew Murray (4):
  arm64: arm_pmu: remove unnecessary isb instruction
  arm64: KVM: add accessors to track guest/host only counters
  arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
  arm64: KVM: Enable support for :G/:H perf event modifiers

 arch/arm64/include/asm/kvm_host.h | 20 +++
 arch/arm64/kernel/perf_event.c    | 53 +++
 arch/arm64/kvm/hyp/switch.c       | 38
 3 files changed, 103 insertions(+), 8 deletions(-)

--
2.7.4
[PATCH v3 2/4] arm64: KVM: add accessors to track guest/host only counters
In order to efficiently enable/disable guest/host only perf counters at guest entry/exit we add bitfields to kvm_cpu_context for guest and host only events as well as accessors for updating them.

Signed-off-by: Andrew Murray
---
 arch/arm64/include/asm/kvm_host.h | 20
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 1550192..4a828eb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -203,6 +203,8 @@ struct kvm_cpu_context {
 	};

 	struct kvm_vcpu *__hyp_running_vcpu;
+	u32 events_host_only;
+	u32 events_guest_only;
 };

 typedef struct kvm_cpu_context kvm_cpu_context_t;
@@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
 	return kvm_arch_vcpu_run_map_fp(vcpu);
 }
+
+static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set)
+{
+	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
+
+	ctx->events_host_only &= ~clr;
+	ctx->events_host_only |= set;
+}
+
+static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set)
+{
+	kvm_cpu_context_t *ctx = this_cpu_ptr(&kvm_host_cpu_state);
+
+	ctx->events_guest_only &= ~clr;
+	ctx->events_guest_only |= set;
+}
+#else
+static inline void kvm_clr_set_host_pmu_events(u32 clr, u32 set) {}
+static inline void kvm_clr_set_guest_pmu_events(u32 clr, u32 set) {}
 #endif

 static inline void kvm_arm_vhe_guest_enter(void)
--
2.7.4
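The accessors follow a clear-then-set discipline on the per-cpu bitfields. Stripped of the per-cpu plumbing, the update they perform reduces to the following standalone sketch (not the kernel code itself; clr_set_pmu_events is a made-up name for illustration):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirrors the body of the kvm_clr_set_{host,guest}_pmu_events
 * accessors: first clear the bits in 'clr', then set the bits in 'set'.
 */
static uint32_t clr_set_pmu_events(uint32_t events, uint32_t clr, uint32_t set)
{
	events &= ~clr;
	events |= set;
	return events;
}
```

Passing (0, bits) enables tracking for an event's counters and (bits, 0) disables it, which is how the perf driver calls these accessors in the later patches of the series.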
[PATCH v3 4/4] arm64: KVM: Enable support for :G/:H perf event modifiers
Enable/disable event counters as appropriate when entering and exiting
the guest to enable support for guest or host only event counting.

For both VHE and non-VHE we switch the counters between host/guest at
EL2. EL2 is filtered out by the PMU when we are using the :G modifier.

The PMU may be on when we change which counters are enabled; however,
we avoid adding an isb as we instead rely on existing context
synchronisation events: the isb in kvm_arm_vhe_guest_exit for VHE and
the eret from the hvc in kvm_call_hyp.

Signed-off-by: Andrew Murray
---
 arch/arm64/kvm/hyp/switch.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d496ef5..ebf0aac 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -373,6 +373,32 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
+{
+	u32 host_only = host_ctxt->events_host_only;
+	u32 guest_only = host_ctxt->events_guest_only;
+
+	if (host_only)
+		write_sysreg(host_only, pmcntenclr_el0);
+
+	if (guest_only)
+		write_sysreg(guest_only, pmcntenset_el0);
+
+	return (host_only || guest_only);
+}
+
+static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
+{
+	u32 host_only = host_ctxt->events_host_only;
+	u32 guest_only = host_ctxt->events_guest_only;
+
+	if (guest_only)
+		write_sysreg(guest_only, pmcntenclr_el0);
+
+	if (host_only)
+		write_sysreg(host_only, pmcntenset_el0);
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -488,12 +514,15 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	host_ctxt = vcpu->arch.host_cpu_context;
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	sysreg_save_host_state_vhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -524,6 +553,9 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
 
@@ -532,6 +564,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *host_ctxt;
 	struct kvm_cpu_context *guest_ctxt;
+	bool pmu_switch_needed;
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
@@ -540,6 +573,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
+	pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
 	__sysreg_save_state_nvhe(host_ctxt);
 
 	__activate_traps(vcpu);
@@ -586,6 +621,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 	 */
 	__debug_switch_to_host(vcpu);
 
+	if (pmu_switch_needed)
+		__pmu_switch_to_host(host_ctxt);
+
 	return exit_code;
 }
-- 
2.7.4
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
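The entry/exit sequence above leans on the write-one-to-set / write-one-to-clear semantics of PMCNTENSET_EL0 and PMCNTENCLR_EL0. A minimal host-side C model of that behaviour (a sketch only - the register is simulated as a plain variable, and the helper names merely mirror those in the patch):

```c
#include <stdint.h>

/* Simulated PMCNTENSET_EL0 state: one bit per event counter. */
static uint32_t pmcnten;

static void pmcntenset(uint32_t bits) { pmcnten |= bits;  } /* 1-bits enable  */
static void pmcntenclr(uint32_t bits) { pmcnten &= ~bits; } /* 1-bits disable */

/* Models __pmu_switch_to_guest: disable host-only counters,
 * enable guest-only counters; report whether any switch happened. */
static int pmu_switch_to_guest(uint32_t host_only, uint32_t guest_only)
{
    if (host_only)
        pmcntenclr(host_only);
    if (guest_only)
        pmcntenset(guest_only);
    return host_only || guest_only;
}

/* Models __pmu_switch_to_host: the mirror image on guest exit. */
static void pmu_switch_to_host(uint32_t host_only, uint32_t guest_only)
{
    if (guest_only)
        pmcntenclr(guest_only);
    if (host_only)
        pmcntenset(host_only);
}
```

Because set and clear are separate write-one registers, each switch touches only the counters KVM cares about and leaves counters owned by normal (non-:G/:H) events untouched.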
[PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.

We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping :G events
as per the events exclude_host attribute.

With both VHE and non-VHE we switch the counters between host/guest
at EL2. We are able to eliminate counters counting host events on the
boundaries of guest entry/exit when using :G by filtering out EL2 for
exclude_host. However when using :H, unless exclude_hv is set on
non-VHE, there is a small blackout window at the guest entry/exit
where host events are not captured.

Signed-off-by: Andrew Murray
---
 arch/arm64/kernel/perf_event.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index de564ae..e0619ba 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -26,6 +26,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx)
 
 static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 {
+	struct perf_event_attr *attr = &event->attr;
 	int idx = event->hw.idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
-	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_enable_counter(idx - 1);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (attr->exclude_host)
+		kvm_clr_set_guest_pmu_events(0, counter_bits);
+	if (attr->exclude_guest)
+		kvm_clr_set_host_pmu_events(0, counter_bits);
+
+	if (!attr->exclude_host) {
+		armv8pmu_enable_counter(idx);
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_enable_counter(idx - 1);
+	}
 }
 
 static inline int armv8pmu_disable_counter(int idx)
@@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
 
 static inline void armv8pmu_disable_event_counter(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_event_attr *attr = &event->attr;
 	int idx = hwc->idx;
+	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
 	if (armv8pmu_event_is_chained(event))
-		armv8pmu_disable_counter(idx - 1);
-	armv8pmu_disable_counter(idx);
+		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+	if (attr->exclude_host)
+		kvm_clr_set_guest_pmu_events(counter_bits, 0);
+	if (attr->exclude_guest)
+		kvm_clr_set_host_pmu_events(counter_bits, 0);
+
+	if (!attr->exclude_host) {
+		if (armv8pmu_event_is_chained(event))
+			armv8pmu_disable_counter(idx - 1);
+		armv8pmu_disable_counter(idx);
+	}
 }
 
 static inline int armv8pmu_enable_intens(int idx)
@@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	 * Therefore we ignore exclude_hv in this configuration, since
 	 * there's no hypervisor to sample anyway. This is consistent
 	 * with other architectures (x86 and Power).
+	 *
+	 * To eliminate counting host events on the boundaries of
+	 * guest entry/exit we ensure EL2 is not included in hyp mode
+	 * with !exclude_host.
 	 */
 	if (is_kernel_in_hyp_mode()) {
-		if (!attr->exclude_kernel)
+		if (!attr->exclude_kernel && !attr->exclude_host)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	} else {
-		if (attr->exclude_kernel)
-			config_base |= ARMV8_PMU_EXCLUDE_EL1;
 		if (!attr->exclude_hv)
 			config_base |= ARMV8_PMU_INCLUDE_EL2;
 	}
+
+	/*
+	 * Filter out !VHE kernels and guest kernels
+	 */
+	if (attr->exclude_kernel)
+		config_base |= ARMV8_PMU_EXCLUDE_EL1;
+
 	if (attr->exclude_user)
 		config_base |= ARMV8_PMU_EXCLUDE_EL0;
@@ -976,6 +1010,10 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
+	/* Clear the counters we flip at guest entry/exit */
+	kvm_clr_set_host_pmu_events(U32_MAX, 0);
+	kvm_clr_set_guest_pmu_events(U32_MAX, 0);
+
 	/*
 	 * Initialize & Reset PMNC. Request overflow interrupt for
 	 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
-- 
2.7.4
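For reference, the counter_bits computation in the hunks above can be modelled in plain C. The ARMV8_IDX_TO_COUNTER definition here is an assumption based on the kernel's convention that perf idx 0 is the cycle counter and event counters start at idx 1, so treat the constants as illustrative:

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Assumed kernel convention: perf idx 0 is the cycle counter,
 * event counters begin at idx 1; the hardware counter number is
 * therefore idx - 1 (ARMV8_IDX_COUNTER0 == 1). */
#define ARMV8_IDX_COUNTER0      1
#define ARMV8_PMU_COUNTER_MASK  0x1f
#define ARMV8_IDX_TO_COUNTER(x) (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)

/* Bitmask handed to kvm_clr_set_*_pmu_events for a (possibly
 * chained) event at perf index idx. A chained event also owns the
 * counter immediately below it, so both bits go into the mask. */
static uint32_t event_counter_bits(int idx, int chained)
{
    uint32_t bits = BIT(ARMV8_IDX_TO_COUNTER(idx));

    if (chained)
        bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
    return bits;
}
```

This is why the patch builds one u32 per event: KVM can then flip the whole (possibly two-counter) event with a single PMCNTENSET/PMCNTENCLR write at guest entry/exit.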
[PATCH v3 1/4] arm64: arm_pmu: remove unnecessary isb instruction
The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.

In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.

Let's remove the unnecessary isb instruction.

Signed-off-by: Andrew Murray
Reviewed-by: Suzuki K Poulose
Acked-by: Mark Rutland
---
 arch/arm64/kernel/perf_event.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 8e38d52..de564ae 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -652,7 +652,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 	armv8pmu_enable_counter(idx);
 	if (armv8pmu_event_is_chained(event))
 		armv8pmu_enable_counter(idx - 1);
-	isb();
 }
 
 static inline int armv8pmu_disable_counter(int idx)
-- 
2.7.4
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
On 22 November 2018 at 10:53, Christoffer Dall wrote:
> [Adding Peter and Alex for their view on the QEMU side]
>
> On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote:
>> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
>> creating the vcpu, so that if issued at all for a vcpu, it is issued
>> very soon after KVM_VCPU_INIT.
>>
>> I think this worked OK with the current structure of kvmtool and I
>> seem to remember discussing this with Peter Maydell re qemu -- but
>> it sounds like I should double-check.
>
> QEMU does something around enumerating all the system registers exposed
> by KVM and saving/restoring them as part of its startup, but I don't
> remember the exact sequence.

This all happens in kvm_arch_init_vcpu(), which does:
 * KVM_ARM_VCPU_INIT ioctl (with the appropriate kvm_init_features set)
 * read the guest MPIDR with GET_ONE_REG so we know what KVM is doing
   with MPIDR assignment across CPUs
 * check for interesting extensions like KVM_CAP_SET_GUEST_DEBUG
 * get and cache a list of what system registers the vcpu has, using
   KVM_GET_REG_LIST. This is where we do the "size must be U32 or U64"
   sanity check.

So if there's something we can't do by setting kvm_init_features for
KVM_ARM_VCPU_INIT but have to do immediately afterwards, that is
straightforward.

The major requirement for QEMU is that if we don't specifically enable
SVE in the VCPU then we must not see any registers in the
KVM_GET_REG_LIST that are not u32 or u64 -- otherwise QEMU will refuse
to start.

thanks
-- PMM
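The "size must be U32 or U64" sanity check Peter describes can be sketched as follows. The mask and shift values are assumptions taken from the register-id encoding in the KVM uapi headers (bits [55:52] of an id hold log2 of the register size in bytes), so treat the constants as illustrative rather than authoritative:

```c
#include <stdint.h>

/* Assumed from <linux/kvm.h>: bits [55:52] of a KVM register id
 * encode log2 of the register size in bytes. */
#define KVM_REG_SIZE_MASK  0x00f0000000000000ULL
#define KVM_REG_SIZE_SHIFT 52
#define KVM_REG_SIZE(id)   (1ULL << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))

/* QEMU-style sanity check: only u32/u64 registers can be blindly
 * saved/restored by the generic migration code. A 2048-bit SVE Z
 * register id would fail this check, hence the requirement that
 * such ids stay hidden unless SVE was explicitly enabled. */
static int reg_size_ok(uint64_t id)
{
    uint64_t size = KVM_REG_SIZE(id);

    return size == 4 || size == 8;
}
```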
Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST
[Adding Peter and Alex for their view on the QEMU side]

On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote:
> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
> > > KVM_GET_REG_LIST should only enumerate registers that are actually
> > > accessible, so it is necessary to filter out any register that is
> > > not exposed to the guest. For features that are configured at
> > > runtime, this will require a dynamic check.
> > >
> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
> > > if SVE is not enabled for the guest.
> >
> > This implies that userspace can never access this interface for a vcpu
> > before having decided whether such features are enabled for the guest or
> > not, since otherwise userspace will see different states for a VCPU
> > depending on sequencing of the API, which sounds fragile to me.
> >
> > That should probably be documented somewhere, and I hope the
> > enable/disable API for SVE in guests already takes that into account.
> >
> > Not sure if there's an action to take here, but it was the best place I
> > could raise this concern.
>
> Fair point. I struggled to come up with something better that solves
> all problems.
>
> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
> creating the vcpu, so that if issued at all for a vcpu, it is issued
> very soon after KVM_VCPU_INIT.
>
> I think this worked OK with the current structure of kvmtool and I
> seem to remember discussing this with Peter Maydell re qemu -- but
> it sounds like I should double-check.

QEMU does something around enumerating all the system registers exposed
by KVM and saving/restoring them as part of its startup, but I don't
remember the exact sequence.

> Either way, you're right, this needs to be clearly documented.
>
> If we want to be more robust, maybe we should add a capability too,
> so that userspace that enables this capability promises to call
> KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
> KVM_GET_REG_LIST etc.) are forbidden until that is done?
>
> That should help avoid accidents.
>
> I could add a special meaning for an empty kvm_sve_vls, such that
> it doesn't enable SVE on the affected vcpu. That retains the ability
> to create heterogeneous guests while still following the above flow.

I think making sure that userspace can only ever see the same list of
available system registers is going to cause us less pain going
forward. If the separate ioctl and capability check is the easiest way
of doing that, then I think that sounds good. (I had wished we could
have just added some data to KVM_CREATE_VCPU, but that doesn't seem to
be the case.)

Thanks,

Christoffer
Re: [PATCH] kvm: arm/arm64 : fix vm's hanging at startup time
On Wed, Nov 21, 2018 at 03:53:03PM +, Julien Thierry wrote:
>
>
> On 21/11/18 15:24, Christoffer Dall wrote:
> > On Wed, Nov 21, 2018 at 12:17:45PM +, Julien Thierry wrote:
> >>
> >>
> >> On 21/11/18 11:06, Christoffer Dall wrote:
> >>> Hi,
> >>>
> >>> On Wed, Nov 21, 2018 at 04:56:54PM +0800, peng.h...@zte.com.cn wrote:
> >>>>>> On 19/11/2018 09:10, Mark Rutland wrote:
> >>>>>>> On Sat, Nov 17, 2018 at 10:58:37AM +0800, peng.h...@zte.com.cn wrote:
> >>>>>>>>> On 16/11/18 00:23, peng.h...@zte.com.cn wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>> When virtual machine starts, hang up.
> >>>>>>>>>
> >>>>>>>>> I take it you mean the *guest* hangs? Because it doesn't get a
> >>>>>>>>> timer interrupt?
> >>>>>>>>>
> >>>>>>>>>> The kernel version of guest is 4.16. Host supports vgic_v3.
> >>>>>>>>>
> >>>>>>>>> Your host kernel is something recent, I guess?
> >>>>>>>>>
> >>>>>>>>>> It was mainly due to the incorrect vgic_irq's (intid=27) group
> >>>>>>>>>> value during interrupt injection. When kvm_vgic_vcpu_init is
> >>>>>>>>>> called, dist is not initialized at this time. Unable to get
> >>>>>>>>>> vgic V3 or V2 correctly, so group is not set.
> >>>>>>>>>
> >>>>>>>>> Mmh, that shouldn't happen with (v)GICv3. Do you use QEMU (which
> >>>>>>>>> version?) or some other userland tool?
> >>>>>>>>
> >>>>>>>> QEMU emulator version 3.0.50 .
> >>>>>>>>
> >>>>>>>>>> group is set to 1 when vgic_mmio_write_group is invoked at some
> >>>>>>>>>> time. When irq->group=0 (intid=27), no ICH_LR_GROUP flag was
> >>>>>>>>>> set and interrupt injection failed.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Peng Hao
> >>>>>>>>>> ---
> >>>>>>>>>>  virt/kvm/arm/vgic/vgic-v3.c | 2 +-
> >>>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> >>>>>>>>>> index 9c0dd23..d101000 100644
> >>>>>>>>>> --- a/virt/kvm/arm/vgic/vgic-v3.c
> >>>>>>>>>> +++ b/virt/kvm/arm/vgic/vgic-v3.c
> >>>>>>>>>> @@ -198,7 +198,7 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr)
> >>>>>>>>>>  	if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT))
> >>>>>>>>>>  		irq->line_level = false;
> >>>>>>>>>>
> >>>>>>>>>> -	if (irq->group)
> >>>>>>>>>> +	if (model == KVM_DEV_TYPE_ARM_VGIC_V3)
> >>>>>>>>>
> >>>>>>>>> This is not the right fix, not only because it basically reverts
> >>>>>>>>> the GICv3 part of 87322099052 (KVM: arm/arm64: vgic: Signal IRQs
> >>>>>>>>> using their configured group).
> >>>>>>>>>
> >>>>>>>>> Can you try to work out why kvm_vgic_vcpu_init() is apparently
> >>>>>>>>> called before dist->vgic_model is set, also what value it has?
> >>>>>>>>> If I understand the code correctly, that shouldn't happen for a
> >>>>>>>>> GICv3.
> >>>>>>>>
> >>>>>>>> Even if the value of group is correctly assigned in
> >>>>>>>> kvm_vgic_vcpu_init, the group is then written 0 through
> >>>>>>>> vgic_mmio_write_group. If the interrupt comes at this time, the
> >>>>>>>> interrupt injection fails.
> >>>>>>>
> >>>>>>> Does that mean that the guest is configuring its interrupts as
> >>>>>>> Group0? That sounds wrong, Linux should configure all its
> >>>>>>> interrupts as non-secure group1.
> >>>>>>
> >>>>>> no, I think that uefi does this, not linux.
> >>>>>> 1. kvm_vgic_vcpu_init
> >>>>>> 2. vgic_create
> >>>>>> 3. kvm_vgic_dist_init
> >>>>>> 4. vgic_mmio_write_group: uefi as guest, write group=0
> >>>>>> 5. vgic_mmio_write_group: linux as guest, write group=1
> >>>>>
> >>>>> Is this the same issue fixed by EDK2 commit:
> >>>>>
> >>>>> 66127011a544b90e ("ArmPkg/ArmGicDxe ARM: fix encoding for GICv3
> >>>>> interrupt acknowledge")
> >>>>>
> >>>>> ... where EDK2 would try to use IAR0 rather than IAR1?
> >>>>>
> >>>>> The commit message notes this led to a boot-time hang.
> >>
> > I managed to trigger an issue with a really old EFI implementation that
> > doesn't configure its interrupts as Group1, and yet tries to ACK its
> > interrupts using the Group1 accessor. Guess what? It is not going to
> > work.
> >
> > Commit c7fefb690661f2e38afcb8200bd318ecf38ab961 in the edk2 tree seems
> > to be the fix (I only assume it does, I haven't actually checked). A
> > recent build, as found in Debian Buster, works perfectly (tested with
> > both QEMU v2.12 and tip of tree).
> >
> > Now, I really don't get what you're saying about Linux not getting
> > interrupts. How do you get to booting Linux if EFI is not making any
> > forward progress? Are you trying them independently?
>
> I start linux with bypassing uefi, the print info is the same.
> [507107.748908] vgic_mmio_write_group:## intid/27 group=0
> [507107.752185] vgic_mmio_write_group:## intid/27 group=0
> [507107.899566] vgic_mmio_write_group:## intid/27 group=1
> [507107.907370] vgic_mmio_write_group:## intid/27
Re: [PATCH v2 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
On Wed, Nov 21, 2018 at 05:01:47PM +, Suzuki K Poulose wrote:
>
>
> On 21/11/2018 13:23, Andrew Murray wrote:
> > On Wed, Nov 21, 2018 at 10:56:02AM +, Andrew Murray wrote:
> > > On Tue, Nov 20, 2018 at 02:49:05PM +, Suzuki K Poulose wrote:
> > > > On 11/20/2018 02:15 PM, Andrew Murray wrote:
> > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > exclude_host/exclude_guest event attributes.
> > > > >
> > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > as per the events exclude_host attribute.
> > > > >
> > > > > When using VHE, EL2 is unused by the guest - therefore we can filter
> > > > > out these events with the PMU as per the 'exclude_host' attribute.
> > > > >
> > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > at EL2. With non-VHE when using 'exclude_host' we filter out EL2.
> > > > >
> > > > > These changes eliminate counters counting host events on the
> > > > > boundaries of guest entry/exit when using :G. However when using :H
> > > > > unless exclude_hv is set on non-VHE then there is a small blackout
> > > > > window at the guest entry/exit where host events are not captured.
> > > > >
> > > > > Signed-off-by: Andrew Murray
> > > > > ---
> > > > >  arch/arm64/kernel/perf_event.c | 41 +++--
> > > > >
> > > > > +	}
> > > > >  }
> > > > >
> > > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > > @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
> > > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > > >  {
> > > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > > +	struct perf_event_attr *attr = &event->attr;
> > > > >  	int idx = hwc->idx;
> > > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > > >
> > > > >  	if (armv8pmu_event_is_chained(event))
> > > > > -		armv8pmu_disable_counter(idx - 1);
> > > > > -	armv8pmu_disable_counter(idx);
> > > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > > +
> > > > > +	if (attr->exclude_host)
> > > > > +		kvm_set_clr_guest_pmu_events(counter_bits, 0);
> > > > > +	if (attr->exclude_guest)
> > > > > +		kvm_set_clr_host_pmu_events(counter_bits, 0);
> > > > > +
> > > > > +	if (!attr->exclude_host) {
> > > > > +		if (armv8pmu_event_is_chained(event))
> > > > > +			armv8pmu_disable_counter(idx - 1);
> > > > > +		armv8pmu_disable_counter(idx);
> > > > > +	}
> > > > >  }
> > > > >
> > > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > > @@ -945,12 +970,12 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > > >  	 * with other architectures (x86 and Power).
> > > > >  	 */
> > > > >  	if (is_kernel_in_hyp_mode()) {
> > > > > -		if (!attr->exclude_kernel)
> > > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >
> > > > Shouldn't we handle "exclude_kernel" for a "Guest" event ?
> > > > i.e, what if we have exclude_kernel + exclude_host set ? Doesn't
> > > > the "exclude_kernel" apply to the event filtering after we enter
> > > > guest and thus, we need to set the EXCLUDE_EL1 ?
> > >
> > > Yes you're right. This is a problem not just for a VHE host's guest but
> > > for the host as well - for example perf -e instructions:h and
> > > -e instructions:G will currently give the same count.
> > >
> > > > Also I am wondering what is the situation with Nested virtualisation
> > > > coming in. i.e, if the host hyp wanted to profile the guest hyp, should
> > > > we set EL2 events ? I understand this is something which should be
> > > > solved with the nested virt changes. But it would be good to see
> > > > if we could filter "exclude_host" and "exclude_guest" at the hypervisor
> > > > level (i.e, in software, without PMU filtering) to allow the normal
> > > > controls to make use of the hardware filtering ?
> > >
> > > It took me a while to think this through and I think you're right in
> > > that we can do more to future proof this for nested virt. In fact we
> > > don't depend on the hunks in this function (i.e. addition of extra
> > > '!attr->exclude_host' conditions) - we don't need them due to the
> > > enable/disable of counters at guest entry/exit.
> > >
> > > With the assumption of the hypervisor switch enabling/disabling
> > > host/guest counters I think we can now do the following:
> > >
> > > /*
> > >  * If we're running in hyp mode, then we *are* the hypervisor.
> > >  * Therefore we ignore exclude_hv in this configuration, since
> > >  * there's no hypervisor
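The filtering rule that fell out of this review - always apply EXCLUDE_EL1 for exclude_kernel, and on VHE only include EL2 when neither exclude_kernel nor exclude_host is set - can be captured as a small truth table in C. The filter bit positions follow the PMEVTYPERn_EL0 layout (P=31, U=30, NSH=27); the helper itself is hypothetical, a sketch of the VHE branch only:

```c
#include <stdint.h>

/* PMEVTYPERn_EL0 filter bits (P, U, NSH). */
#define ARMV8_PMU_EXCLUDE_EL1 (1u << 31)
#define ARMV8_PMU_EXCLUDE_EL0 (1u << 30)
#define ARMV8_PMU_INCLUDE_EL2 (1u << 27)

/* Sketch of the VHE branch of armv8pmu_set_event_filter after the
 * review feedback was folded in. */
static uint32_t set_event_filter_vhe(int excl_user, int excl_kernel, int excl_host)
{
    uint32_t config_base = 0;

    /* A VHE host kernel runs at EL2: count it only when neither
     * exclude_kernel nor exclude_host asks us not to. */
    if (!excl_kernel && !excl_host)
        config_base |= ARMV8_PMU_INCLUDE_EL2;

    /* Under VHE, EL1 is only ever a guest kernel, so exclude_kernel
     * must always translate into EXCLUDE_EL1 for guest filtering. */
    if (excl_kernel)
        config_base |= ARMV8_PMU_EXCLUDE_EL1;

    if (excl_user)
        config_base |= ARMV8_PMU_EXCLUDE_EL0;

    return config_base;
}
```

With this shape, `instructions:h` (excl_host alone) yields no INCLUDE_EL2 and no EXCLUDE_EL1, relying on the entry/exit counter switch, while `instructions:hk` additionally sets EXCLUDE_EL1 so guest kernel cycles stay filtered - addressing Suzuki's exclude_kernel + exclude_host case.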
Re: [PATCH v2 2/4] arm64: KVM: add accessors to track guest/host only counters
On Wed, Nov 21, 2018 at 04:04:03PM +, Marc Zyngier wrote:
> On 20/11/2018 14:15, Andrew Murray wrote:
> > In order to efficiently enable/disable guest/host only perf counters
> > at guest entry/exit we add bitfields to kvm_cpu_context for guest and
> > host only events as well as accessors for updating them.
> >
> > Signed-off-by: Andrew Murray
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 1550192..b6f998b 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -203,6 +203,8 @@ struct kvm_cpu_context {
> >  	};
> >
> >  	struct kvm_vcpu *__hyp_running_vcpu;
> > +	u32 events_host_only;
> > +	u32 events_guest_only;
> >  };
> >
> >  typedef struct kvm_cpu_context kvm_cpu_context_t;
> > @@ -472,6 +474,24 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> >  {
> >  	return kvm_arch_vcpu_run_map_fp(vcpu);
> >  }
> > +static inline void kvm_set_clr_host_pmu_events(u32 clr, u32 set)
>
> nit: keep the arguments in an order that matches the function name
> (saves a lot of brain power...)

Thanks, every bit helps.

Andrew Murray

>
> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny...
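The accessor under discussion reduces to a clear-then-set update of a per-cpu bitmask. A sketch with the argument order Marc asked for (clr first, then set, matching a `clr_set` name); the standalone variable stands in for the kvm_cpu_context field:

```c
#include <stdint.h>

/* Stand-in for host_ctxt->events_host_only. */
static uint32_t events_host_only;

/* Clear the bits in clr, then set the bits in set - argument order
 * matches the "clr_set" in the name, as requested in review. */
static void kvm_clr_set_host_pmu_events(uint32_t clr, uint32_t set)
{
    events_host_only &= ~clr;
    events_host_only |= set;
}
```

One accessor covers all callers: enabling a host-only event is `kvm_clr_set_host_pmu_events(0, bits)`, disabling it is `kvm_clr_set_host_pmu_events(bits, 0)`, and a full reset is `kvm_clr_set_host_pmu_events(UINT32_MAX, 0)`.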
Re: [PATCH v2 1/4] arm64: arm_pmu: remove unnecessary isb instruction
On Wed, Nov 21, 2018 at 03:58:14PM +, Mark Rutland wrote:
> On Tue, Nov 20, 2018 at 02:15:19PM +, Andrew Murray wrote:
> > The armv8pmu_enable_event_counter function issues an isb instruction
> > after enabling a pair of counters - this doesn't provide any value
> > and is inconsistent with the armv8pmu_disable_event_counter.
> >
> > In any case armv8pmu_enable_event_counter is always called with the
> > PMU stopped. Starting the PMU with armv8pmu_start results in an isb
> > instruction being issued.
> >
> > Let's remove the unnecessary isb instruction.
> >
> > Signed-off-by: Andrew Murray
>
> Acked-by: Mark Rutland
>
> ... though it would be nice to explicitly point out that the ISB in
> armv8pmu_start() is before the write to PMCR_EL1, if you happen to
> respin this patch.

Thanks, I'm respinning so I'll update the log and preserve your ack.

Andrew Murray

>
> Mark.
>
> > ---
> >  arch/arm64/kernel/perf_event.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 8e38d52..de564ae 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -652,7 +652,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> >  		armv8pmu_enable_counter(idx - 1);
> > -	isb();
> >  }
> >
> >  static inline int armv8pmu_disable_counter(int idx)
> > --
> > 2.7.4