RE: [RFT - PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

2015-10-05 Thread Mario Smarduch
Will do, I'll get them over to you.

-----Original Message-----
From: Christoffer Dall [mailto:christoffer.d...@linaro.org] 
Sent: Monday, October 05, 2015 10:26 AM
To: Mario Smarduch
Cc: kvmarm@lists.cs.columbia.edu; marc.zyng...@arm.com; k...@vger.kernel.org; 
linux-arm-ker...@lists.infradead.org
Subject: Re: [RFT - PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

On Mon, Oct 05, 2015 at 09:14:57AM -0700, Mario Smarduch wrote:
> Hi Christoffer,
> I just managed to boot qemu arm32 on arm64 (last Fri - thanks for the
> tip; there were a few other issues to clean up), so let me retest it.
> Also I noticed some refactoring would help both 32 and 64 bit patches.
> 
> Yes, I could provide the user space tests as well.
> 
I'd like those regardless as I generally test my queue before pushing it to 
next.

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


RE: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi interrupt

2015-10-05 Thread Bhushan Bharat


> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Saturday, October 03, 2015 4:17 AM
> To: Bhushan Bharat-R65777 
> Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> marc.zyng...@arm.com; will.dea...@arm.com
> Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> interrupt
> 
> On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > An MSI-address is allocated and programmed in pcie device during
> > interrupt configuration. Now for a pass-through device, try to create
> > the iommu mapping for this allocated/programmed msi-address.  If the
> > iommu mapping is created and the msi address programmed in the pcie
> > device is different from msi-iova as per iommu programming then
> > reconfigure the pci device to use msi-iova as msi address.
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  drivers/vfio/pci/vfio_pci_intrs.c | 36
> > ++--
> >  1 file changed, 34 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c
> > b/drivers/vfio/pci/vfio_pci_intrs.c
> > index 1f577b4..c9690af 100644
> > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct
> vfio_pci_device *vdev,
> > int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > char *name = msix ? "vfio-msix" : "vfio-msi";
> > struct eventfd_ctx *trigger;
> > +   struct msi_msg msg;
> > +   struct vfio_device *device;
> > +   uint64_t msi_addr, msi_iova;
> > int ret;
> >
> > if (vector >= vdev->num_ctx)
> > return -EINVAL;
> >
> > +   device = vfio_device_get_from_dev(&pdev->dev);
> 
> Have you looked at this function?  I don't think we want to be doing that
> every time we want to poke the interrupt configuration.

Let me describe my understanding: a device can have many interrupts, and we 
should set up the iommu mapping only once, when called for the first time to 
enable/set up an interrupt.
Similarly, when disabling interrupts, we should iommu-unmap when called for 
the last enabled interrupt of that device. With this understanding, should 
I move the map/unmap into separate functions and call them from 
vfio_msi_set_block() rather than from vfio_msi_set_vector_signal()?

>  Also note that
> IOMMU mappings don't operate on devices, but groups, so maybe we want
> to pass the group.

Yes, it operates on the group. I hesitated to add an API to get the group. Are 
you suggesting that it is OK to add an API to get the group from the device?

> 
> > +   if (device == NULL)
> > +   return -EINVAL;
> 
> This would be a legitimate BUG_ON(!device)
> 
> > +
> > if (vdev->ctx[vector].trigger) {
> > free_irq(irq, vdev->ctx[vector].trigger);
> > +   get_cached_msi_msg(irq, &msg);
> > +   msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > +   vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > kfree(vdev->ctx[vector].name);
> > eventfd_ctx_put(vdev->ctx[vector].trigger);
> > vdev->ctx[vector].trigger = NULL;
> > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct
> vfio_pci_device *vdev,
> >  * cached value of the message prior to enabling.
> >  */
> > if (msix) {
> > -   struct msi_msg msg;
> > -
> > get_cached_msi_msg(irq, &msg);
> > pci_write_msi_msg(irq, &msg);
> > }
> >
> > +
> 
> gratuitous newline
> 
> > ret = request_irq(irq, vfio_msihandler, 0,
> >   vdev->ctx[vector].name, trigger);
> > if (ret) {
> > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct
> vfio_pci_device *vdev,
> > return ret;
> > }
> >
> > +   /* Re-program the new-iova in pci-device in case there is
> > +* different iommu-mapping created for programmed msi-address.
> > +*/
> > +   get_cached_msi_msg(irq, &msg);
> > +   msi_iova = 0;
> > +   msi_addr = (u64)(msg.address_hi) << 32 | (u64)(msg.address_lo);
> > +   ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE,
> &msi_iova);
> > +   if (ret) {
> > +   free_irq(irq, vdev->ctx[vector].trigger);
> > +   kfree(vdev->ctx[vector].name);
> > +   eventfd_ctx_put(trigger);
> > +   return ret;
> > +   }
> > +
> > +   /* Reprogram only if iommu-mapped iova is different from msi-
> address */
> > +   if (msi_iova && (msi_iova != msi_addr)) {
> > +   msg.address_hi = (u32)(msi_iova >> 32);
> > +   /* Keep Lower bits from original msi message address */
> > +   msg.address_lo &= PAGE_MASK;
> > +   msg.address_lo |= (u32)(msi_iova & 0x);
> 
> Seems like you're making some assumptions here that are dependent on the
> architecture and maybe the platform.
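
The address arithmetic in the hunk above can be written out as a small
standalone sketch. It assumes 4 KiB pages and a two-word message address
purely for illustration; the type and helper names are hypothetical. Note
that with the usual kernel definition of PAGE_MASK as ~(PAGE_SIZE - 1),
"address_lo &= PAGE_MASK" keeps the page-aligned bits rather than the
in-page offset that the quoted comment mentions; the sketch keeps the offset.

```c
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))

/* Mirrors the two address words of an MSI message (hypothetical type). */
struct sketch_msi_addr {
	uint32_t address_lo;
	uint32_t address_hi;
};

/* Combine the two 32-bit words into one 64-bit address. */
static uint64_t sketch_msi_addr_get(const struct sketch_msi_addr *m)
{
	return ((uint64_t)m->address_hi << 32) | m->address_lo;
}

/*
 * Point the message at the page-aligned IOVA while keeping the offset
 * within the page from the address that was originally programmed.
 */
static void sketch_msi_addr_retarget(struct sketch_msi_addr *m, uint64_t iova)
{
	uint64_t offset = sketch_msi_addr_get(m) & ~SKETCH_PAGE_MASK;
	uint64_t addr = (iova & SKETCH_PAGE_MASK) | offset;

	m->address_hi = (uint32_t)(addr >> 32);
	m->address_lo = (uint32_t)(addr & 0xffffffffULL);
}
```

Whether a page-granular remap with a preserved offset is valid at all is
exactly the architecture/platform assumption raised in the review.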

What I tried is to map the msi 

RE: [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state

2015-10-05 Thread Bhushan Bharat


> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Saturday, October 03, 2015 4:16 AM
> To: Bhushan Bharat-R65777 
> Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> marc.zyng...@arm.com; will.dea...@arm.com
> Subject: Re: [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs
> automap state
> 
> On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > This patch allows the user-space to know whether msi-pages are
> > automatically mapped with some magic iova or not.
> >
> > Even if the msi-pages are automatically mapped, still user-space wants
> > to over-ride the automatic iova selection for msi-mapping.
> > For this user-space need to know whether it is allowed to change the
> > automatic mapping or not and this API provides this mechanism.
> > Follow up patches will provide how to over-ride this.
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 32
> 
> >  include/uapi/linux/vfio.h   |  3 +++
> >  2 files changed, 35 insertions(+)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c
> > b/drivers/vfio/vfio_iommu_type1.c index fa5d3e4..3315fb6 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -59,6 +59,7 @@ struct vfio_iommu {
> > struct rb_root  dma_list;
> > boolv2;
> > boolnesting;
> > +   boolallow_msi_reconfig;
> > struct list_headreserved_iova_list;
> >  };
> >
> > @@ -1117,6 +1118,23 @@ static int
> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> > return ret;
> >  }
> >
> > +static
> > +int vfio_domains_get_msi_maps(struct vfio_iommu *iommu,
> > + struct iommu_domain_msi_maps *msi_maps) {
> > +   struct vfio_domain *d;
> > +   int ret;
> > +
> > +   mutex_lock(&iommu->lock);
> > +   /* All domains have same msi-automap property, pick first */
> > +   d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
> > +   ret = iommu_domain_get_attr(d->domain,
> DOMAIN_ATTR_MSI_MAPPING,
> > +   msi_maps);
> > +   mutex_unlock(&iommu->lock);
> > +
> > +   return ret;
> > +}
> > +
> >  static long vfio_iommu_type1_ioctl(void *iommu_data,
> >unsigned int cmd, unsigned long arg)  { @@
> -1138,6 +1156,8 @@
> > static long vfio_iommu_type1_ioctl(void *iommu_data,
> > }
> > } else if (cmd == VFIO_IOMMU_GET_INFO) {
> > struct vfio_iommu_type1_info info;
> > +   struct iommu_domain_msi_maps msi_maps;
> > +   int ret;
> >
> > minsz = offsetofend(struct vfio_iommu_type1_info,
> iova_pgsizes);
> >
> > @@ -1149,6 +1169,18 @@ static long vfio_iommu_type1_ioctl(void
> > *iommu_data,
> >
> > info.flags = 0;
> >
> > +   ret = vfio_domains_get_msi_maps(iommu, &msi_maps);
> > +   if (ret)
> > +   return ret;
> 
> And now ioctl(VFIO_IOMMU_GET_INFO) no longer works for any IOMMU
> implementing domain_get_attr but not supporting
> DOMAIN_ATTR_MSI_MAPPING.

With this current patch version this will get the default assumed behavior as 
you commented on previous patch. 

> 
> > +
> > +   if (msi_maps.override_automap) {
> > +   info.flags |=
> VFIO_IOMMU_INFO_MSI_ALLOW_RECONFIG;
> > +   iommu->allow_msi_reconfig = true;
> > +   }
> > +
> > +   if (msi_maps.automap)
> > +   info.flags |= VFIO_IOMMU_INFO_MSI_AUTOMAP;
> > +
> > info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
> >
> > return copy_to_user((void __user *)arg, &info, minsz); diff --
> git
> > a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index
> > 1abd1a9..9998f6e 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -391,6 +391,9 @@ struct vfio_iommu_type1_info {
> > __u32   argsz;
> > __u32   flags;
> >  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)   /* supported page
> sizes info */
> > +#define VFIO_IOMMU_INFO_MSI_AUTOMAP (1 << 1)   /* MSI pages
> are auto-mapped
> > +  in iommu */
> > +#define VFIO_IOMMU_INFO_MSI_ALLOW_RECONFIG (1 << 2) /* Allows
> > +reconfig automap*/
> > __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
> >  };
> >
> 
> Once again, exposing interfaces to the user before they actually do anything
> is backwards.

Will change the order.

Thanks
-Bharat



RE: [RFC PATCH 6/6] arm-smmu: Allow to set iommu mapping for MSI

2015-10-05 Thread Bhushan Bharat


> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Saturday, October 03, 2015 4:17 AM
> To: Bhushan Bharat-R65777 
> Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> marc.zyng...@arm.com; will.dea...@arm.com
> Subject: Re: [RFC PATCH 6/6] arm-smmu: Allow to set iommu mapping for
> MSI
> 
> On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > Finally, the ARM SMMU declares that the iommu-mapping for MSI pages is
> > not set automatically and that it should be set explicitly.
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  drivers/iommu/arm-smmu.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index
> > a3956fb..9d37e72 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1401,13 +1401,18 @@ static int arm_smmu_domain_get_attr(struct
> iommu_domain *domain,
> > enum iommu_attr attr, void *data)  {
> > struct arm_smmu_domain *smmu_domain =
> to_smmu_domain(domain);
> > +   struct iommu_domain_msi_maps *msi_maps;
> >
> > switch (attr) {
> > case DOMAIN_ATTR_NESTING:
> > *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> > return 0;
> > case DOMAIN_ATTR_MSI_MAPPING:
> > -   /* Dummy handling added */
> > +   msi_maps = data;
> > +
> > +   msi_maps->automap = false;
> > +   msi_maps->override_automap = true;
> > +
> > return 0;
> > default:
> > return -ENODEV;
> 
> In previous discussions I understood one of the problems you were trying to
> solve was having a limited number of MSI banks and while you may be able
> to get isolated MSI banks for some number of users, it wasn't unlimited and
> sharing may be required.  I don't see any of that addressed in this series.

That problem was on PowerPC. In fact there were two problems: first, which MSI 
bank to use, and second, how to create the iommu-mapping for a device assigned 
to userspace.
The first problem was PowerPC specific and will be solved separately.
For the second problem, I earlier tried to add a couple of MSI-specific ioctls, 
and you suggested (IIUC) that we should have a generic reserved-iova type of 
API; then we can map the MSI bank using a reserved iova, and this will not 
require involvement of user-space.

> 
> Also, the management of reserved IOVAs vs MSI addresses looks really
> dubious to me.  How does your platform pick an MSI address and what are
> we breaking by covertly changing it?  We seem to be masking over at the
> VFIO level, where there should be lower level interfaces doing the right thing
> when we configure MSI on the device.

Yes. In my understanding the right solution should be:
 1) The VFIO driver should know what physical MSI address will be used for 
devices in an iommu-group.
I did not find a generic API; on PowerPC I added some functions in the 
freescale msi-driver and called them from vfio-iommu-fsl-pamu.c (not yet 
upstreamed).
 2) The VFIO driver should know what IOVA is to be used for creating the 
iommu-mapping (the VFIO APIs patch of this patch series).
 3) The VFIO driver will create the iommu-mapping using (1) and (2).
 4) The VFIO driver should be able to tell the msi-driver that for a given 
device it should use a different IOVA. So when composing the msi message (for 
the devices in the given iommu-group) it should use that programmed iova as 
the MSI-address. This interface also needs to be developed.

I was not sure which approach we should take. The current approach in the 
patch is simple to develop, so I went ahead with it to get input, but I agree 
it does not look very good.
What do you think: should I drop this approach and work out the approach 
described above?

Thanks
-Bharat
> 
> The problem of reporting "automap" base address isn't addressed more than
> leaving some unused field in iommu_domain_msi_maps.



Re: [RFC PATCH v5 0/3] vfio: platform: return device properties for a platform device

2015-10-05 Thread Christoffer Dall
On Fri, Oct 02, 2015 at 11:53:07PM +0100, Peter Maydell wrote:
> On 2 October 2015 at 21:28, Christoffer Dall
>  wrote:
> > We discussed this for the purposes of ARM during SFO15 last week, and
> > basically arrived at the conclusion that the reasonable thing to do is to
> > rely on sysfs device tree parsing in userspace.  We don't have a great
> > solution for ACPI yet, but we also don't know of any ACPI-only devices
> > that want platform device passthrough yet.
> 
> I wasn't hugely happy with that approach though:
>  * it's DT specific and just won't work on ACPI platforms; implementing
>features with a "needs DT" dependency seems like it will come back to
>bite us later

I tend to agree with that point of view, but I don't have hugely better
ideas.

>  * I don't really want to build in a lot of infrastructure into
>QEMU to either build the DTC compiler into it or else introduce
>a runtime dependency on the dtc binary

what level of complexity are we really talking about here?  Doesn't QEMU
already link against libfdt and doesn't it support exactly what we're
trying to do here?

> , if this is just going
>to be a stopgap solution until somebody says "has to work on
>ACPI" and we need to do it some other way
> 
> On the other hand I don't exactly have a better approach to suggest

I also don't think this is the job of VFIO, assuming there is some
better place to do this.  I initially thought exposing device properties
in some canonical format from sysfs independently from whether the
system was booted with ACPI or DT was the right thing to do, but the
counter argument to this was essentially that the guest kernel needs the
same description as the host kernel and therefore we really have to find
a way to pass the HW description bits on to the guest.

> (except "don't do device passthrough for platform devices, insist
> on a real bus like PCI"...)
> 
While I appreciate the simplicity of this solution from our
(maintainers') point of view, I still see the latter point as relatively
moot.  We have hardware with 10G platform ethernet devices that people
want to do device assignment on already, and I think we should try to
find a reasonable set of boundaries for setups that we can support
upstream instead of this becoming a black hole of derivative code bases
to do this sort of thing.

Thanks,
-Christoffer


Re: [RFC PATCH v5 0/3] vfio: platform: return device properties for a platform device

2015-10-05 Thread Baptiste Reynal
In this patch series we want to wrap an already available kernel
interface to expose a device property to userspace, in order to keep
the userspace code lighter. We need those properties in VFIO because
VFIO makes it possible to develop userspace drivers.

Sysfs doesn't seem to be ready for this kind of usage. We can
only find raw data that requires heavy parsing. Here we retrieve
directly usable data, and it can be extended later according to new
needs (as is already done with ACPI).
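
To give an idea of what the raw-data parsing involves: device tree property
values are stored as big-endian 32-bit cells, so a userspace consumer reading
a property file has to decode them by hand. The following is a minimal
sketch of that decoding step only; the function names are hypothetical and
it does not use or represent any existing library API.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell from raw property bytes. */
static uint32_t dt_cell_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode up to max cells from a raw property buffer of len bytes.
 * Returns the number of cells decoded, or -1 if len is not a
 * multiple of the 4-byte cell size.
 */
static int dt_prop_read_u32_array(const uint8_t *buf, size_t len,
				  uint32_t *out, size_t max)
{
	size_t i, n;

	if (len % 4)
		return -1;

	n = len / 4;
	if (n > max)
		n = max;

	for (i = 0; i < n; i++)
		out[i] = dt_cell_be32(buf + 4 * i);

	return (int)n;
}
```

A real consumer would additionally need to know how many address and size
cells each entry uses, which is where most of the parsing effort goes.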

This interface has been developed for VFIO and is currently bound to
it, though it has no special dependency on it. We could make it
more generic, but I can only think of VFIO using it.

On Sat, Oct 3, 2015 at 12:53 AM, Peter Maydell  wrote:
> On 2 October 2015 at 21:28, Christoffer Dall
>  wrote:
>> We discussed this for the purposes of ARM during SFO15 last week, and
>> basically arrived at the conclusion that the reasonable thing to do is to
>> rely on sysfs device tree parsing in userspace.  We don't have a great
>> solution for ACPI yet, but we also don't know of any ACPI-only devices
>> that want platform device passthrough yet.
>
> I wasn't hugely happy with that approach though:
>  * it's DT specific and just won't work on ACPI platforms; implementing
>features with a "needs DT" dependency seems like it will come back to
>bite us later
>  * I don't really want to build in a lot of infrastructure into
>QEMU to either build the DTC compiler into it or else introduce
>a runtime dependency on the dtc binary, if this is just going
>to be a stopgap solution until somebody says "has to work on
>ACPI" and we need to do it some other way
>
> On the other hand I don't exactly have a better approach to suggest
> (except "don't do device passthrough for platform devices, insist
> on a real bus like PCI"...)
>
> thanks
> -- PMM


Re: [kvm-unit-tests PATCHv2] arm: Add PMU test

2015-10-05 Thread Wei Huang


On 10/02/2015 10:48 AM, Christopher Covington wrote:
> Add a test for the ARM Performance Monitors Unit (PMU). The informational
> fields from the control register are printed, but not checked, and
> the number of cycles it takes to run a known-instruction-count loop
> is printed, but not checked. Once QEMU is fixed, we can at least
> begin to check for IPC == 1 when using -icount.

Thanks for submitting it. I think this is a good starting point to add
PMU unit testing support for ARM. Some comments below.

> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c   | 89 
> +
>  arm/unittests.cfg   | 11 ++
>  config/config-arm64.mak |  4 ++-
>  3 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 arm/pmu.c
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> new file mode 100644
> index 000..f724c2c
> --- /dev/null
> +++ b/arm/pmu.c
> @@ -0,0 +1,89 @@
> +/*
> + * Test the ARM Performance Monitors Unit (PMU).
> + *
> + * Copyright 2015 The Linux Foundation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License version 2.1 and
> + * only version 2.1 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but 
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public 
> License
> + * for more details.
> + */
> +#include "libcflat.h"
> +
> +struct pmu_data {
> + union {
> + uint32_t pmcr_el0;
> + struct {
> + unsigned int enable:1;
> + unsigned int event_counter_reset:1;
> + unsigned int cycle_counter_reset:1;
> + unsigned int cycle_counter_clock_divider:1;
> + unsigned int event_counter_export:1;
> + unsigned int cycle_counter_disable_when_prohibited:1;
> + unsigned int cycle_counter_long:1;
> + unsigned int zeros:4;
> + unsigned int num_counters:5;
> + unsigned int identification_code:8;
> + unsigned int implementor:8;

This isn't a problem, since "unsigned int" is 32 bits on 64-bit
machines. But for consistency with pmcr_el0, I would prefer
"unsigned int" to be replaced by "uint32_t".
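
As an aside, the bit-field layout in C is implementation-defined, so an
alternative that avoids the question entirely is to keep a plain uint32_t
and extract fields with shifts and masks. Below is a minimal sketch; the
macro and function names are hypothetical, and the bit positions simply
follow the field order of the struct in the patch (enable at bit 0, number
of counters at bits 15:11, identification code at 23:16, implementer at
31:24).

```c
#include <stdint.h>

/* Bit positions follow the field order of the quoted struct. */
#define PMCR_E			(UINT32_C(1) << 0)	/* enable */
#define PMCR_P			(UINT32_C(1) << 1)	/* event counter reset */
#define PMCR_C			(UINT32_C(1) << 2)	/* cycle counter reset */
#define PMCR_N_SHIFT		11
#define PMCR_N_MASK		UINT32_C(0x1f)
#define PMCR_IDCODE_SHIFT	16
#define PMCR_IDCODE_MASK	UINT32_C(0xff)
#define PMCR_IMP_SHIFT		24
#define PMCR_IMP_MASK		UINT32_C(0xff)

/* Number of event counters. */
static inline uint32_t pmcr_num_counters(uint32_t pmcr)
{
	return (pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK;
}

/* Identification code. */
static inline uint32_t pmcr_idcode(uint32_t pmcr)
{
	return (pmcr >> PMCR_IDCODE_SHIFT) & PMCR_IDCODE_MASK;
}

/* Implementer code. */
static inline uint32_t pmcr_implementer(uint32_t pmcr)
{
	return (pmcr >> PMCR_IMP_SHIFT) & PMCR_IMP_MASK;
}
```

Setting the control bits then becomes "pmcr |= PMCR_E | PMCR_C;" on the raw
register value.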

> + };
> + };
> +};
> +
> +/* Execute a known number of guest instructions. Only odd instruction counts
> + * greater than or equal to 3 are supported. The control register (PMCR) is
> + * initialized with the provided value (allowing for example for the cycle
> + * counter or event counters to be reset if needed). After the known 
> instruction
> + * count loop, zero is written to the PMCR to disable counting, allowing the
> + * cycle counter or event counters to be read as needed at a later time.
> + */
> +static void measure_instrs(int len, struct pmu_data pmcr)
> +{
> + int i = (len - 1) / 2;
> +
> + if (len < 3 || ((len - 1) % 2))
> + abort();
> +
> + asm volatile(
> + "msr pmcr_el0, %[pmcr]\n"
> + "1: subs %[i], %[i], #1\n"
> + "b.gt 1b\n"
> + "msr pmcr_el0, xzr"
> + : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc");
> +}
> +
> +int main()
> +{
> + struct pmu_data pmcr;
> + const int samples = 10;
> +
> + asm volatile("mrs %0, pmcr_el0" : "=r" (pmcr));
> +
> + printf("PMU implementor: %c\n", pmcr.implementor);
> + printf("Identification code: 0x%x\n", pmcr.identification_code);
> + printf("Event counters:  %d\n", pmcr.num_counters);
> +
> + pmcr.cycle_counter_reset = 1;
> + pmcr.enable = 1;
> +
> + printf("\ninstructions : cycles0 cycles1 ...\n");
> +
> + for (int i = 3; i < 300; i += 32) {
> + int avg, sum = 0;
> + printf("%d :", i);
> + for (int j = 0; j < samples; j++) {
> + int val;
> + measure_instrs(i, pmcr);
> + asm volatile("mrs %0, pmccntr_el0" : "=r" (val));
> + sum += val;
> + printf(" %d", val);
> + }
> + avg = sum / samples;
> + printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n", sum, avg, i / 
> avg, avg / i);
> + }

I understand that, as stated in commit message, it currently doesn't
check the correctness of PMU counter values. But it would be better if
testing code can be abstracted into an independent function (e.g.
instr_cycle_check) for report("Instruction Cycles",
instr_cycle_check()). You can return TRUE in the checking code for now.
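
The structure I have in mind looks roughly like the sketch below. To keep
it self-contained it uses a simplified two-argument stand-in named
sketch_report() instead of the framework's report(); the stand-in and
run_pmu_tests() are hypothetical, and only instr_cycle_check() is the name
suggested above.

```c
#include <stdbool.h>
#include <stdio.h>

static int sketch_failures;

/* Simplified stand-in for the test framework's report(). */
static void sketch_report(const char *name, bool pass)
{
	printf("%s: %s\n", pass ? "PASS" : "FAIL", name);
	if (!pass)
		sketch_failures++;
}

/*
 * Placeholder check: the measurement loop and printing would move here.
 * It returns true unconditionally until the counter values can be
 * validated.
 */
static bool instr_cycle_check(void)
{
	return true;
}

/* Runs every check and returns the number of failures. */
static int run_pmu_tests(void)
{
	sketch_report("Instruction Cycles", instr_cycle_check());
	return sketch_failures;
}
```

Once the cycle counts can be trusted, only the body of instr_cycle_check()
needs to change.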


> +
> + return report_summary();
> +}
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index 

Re: [RFC PATCH 1/6] vfio: Add interface for add/del reserved iova region

2015-10-05 Thread Alex Williamson
On Mon, 2015-10-05 at 04:55 +, Bhushan Bharat wrote:
> Hi Alex,
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, October 03, 2015 4:16 AM
> > To: Bhushan Bharat-R65777 
> > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> > christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> > marc.zyng...@arm.com; will.dea...@arm.com
> > Subject: Re: [RFC PATCH 1/6] vfio: Add interface for add/del reserved iova
> > region
> > 
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > This Patch adds the VFIO APIs to add and remove reserved iova regions.
> > > The reserved iova region can be used for mapping some specific
> > > physical address in iommu.
> > >
> > > Currently we are planning to use this interface for adding iova
> > > regions for creating iommu of msi-pages. But the API are designed for
> > > future extension where some other physical address can be mapped.
> > >
> > > Signed-off-by: Bharat Bhushan 
> > > ---
> > >  drivers/vfio/vfio_iommu_type1.c | 201
> > +++-
> > >  include/uapi/linux/vfio.h   |  43 +
> > >  2 files changed, 243 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vfio/vfio_iommu_type1.c
> > > b/drivers/vfio/vfio_iommu_type1.c index 57d8c37..fa5d3e4 100644
> > > --- a/drivers/vfio/vfio_iommu_type1.c
> > > +++ b/drivers/vfio/vfio_iommu_type1.c
> > > @@ -59,6 +59,7 @@ struct vfio_iommu {
> > >   struct rb_root  dma_list;
> > >   boolv2;
> > >   boolnesting;
> > > + struct list_headreserved_iova_list;
> > 
> > This alignment leads to poor packing in the structure, put it above the 
> > bools.
> 
> ok
> 
> > 
> > >  };
> > >
> > >  struct vfio_domain {
> > > @@ -77,6 +78,15 @@ struct vfio_dma {
> > >   int prot;   /* IOMMU_READ/WRITE */
> > >  };
> > >
> > > +struct vfio_resvd_region {
> > > + dma_addr_t  iova;
> > > + size_t  size;
> > > + int prot;   /* IOMMU_READ/WRITE */
> > > + int refcount;   /* ref count of mappings */
> > > + uint64_tmap_paddr;  /* Mapped Physical Address
> > */
> > 
> > phys_addr_t
> 
> Ok,
> 
> > 
> > > + struct list_head next;
> > > +};
> > > +
> > >  struct vfio_group {
> > >   struct iommu_group  *iommu_group;
> > >   struct list_headnext;
> > > @@ -106,6 +116,38 @@ static struct vfio_dma *vfio_find_dma(struct
> > vfio_iommu *iommu,
> > >   return NULL;
> > >  }
> > >
> > > +/* This function must be called with iommu->lock held */ static bool
> > > +vfio_overlap_with_resvd_region(struct vfio_iommu *iommu,
> > > +dma_addr_t start, size_t size) {
> > > + struct vfio_resvd_region *region;
> > > +
> > > > + list_for_each_entry(region, &iommu->reserved_iova_list, next) {
> > > + if (region->iova < start)
> > > + return (start - region->iova < region->size);
> > > + else if (start < region->iova)
> > > + return (region->iova - start < size);
> > 
> > <= on both of the return lines?
> 
> I think it should be "<" and not "<=", no?

Yep, looks like you're right.  Maybe there's a more straightforward way
to do this.
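
One candidate for a more straightforward form, as a sketch only: treat both
ranges as half-open intervals and compare with subtraction so that
start + size never has to be computed (and so cannot overflow). The names
below are hypothetical. Note also that in the quoted loop every path through
the body returns, so as written it appears to examine only the first list
entry; the sketch walks all regions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Do the half-open ranges [a, a + a_size) and [b, b + b_size) overlap?
 * Empty ranges never overlap anything.
 */
static bool sketch_ranges_overlap(uint64_t a, uint64_t a_size,
				  uint64_t b, uint64_t b_size)
{
	if (!a_size || !b_size)
		return false;
	if (a <= b)
		return b - a < a_size;	/* b starts inside a's range */
	return a - b < b_size;		/* a starts inside b's range */
}

/* Hypothetical stand-in for a reserved region list entry. */
struct sketch_region {
	uint64_t iova;
	uint64_t size;
};

/* Check a candidate range against every region, not just the first. */
static bool sketch_overlaps_any(const struct sketch_region *regions, size_t n,
				uint64_t start, uint64_t size)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sketch_ranges_overlap(regions[i].iova, regions[i].size,
					  start, size))
			return true;
	return false;
}
```

This keeps the strict "<" comparison discussed above: two ranges that merely
touch end to start are not treated as overlapping.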

> > 
> > > +
> > > + return (region->size > 0 && size > 0);
> > > + }
> > > +
> > > + return false;
> > > +}
> > > +
> > > +/* This function must be called with iommu->lock held */ static
> > > +struct vfio_resvd_region *vfio_find_resvd_region(struct vfio_iommu
> > *iommu,
> > > +  dma_addr_t start, size_t
> > size) {
> > > + struct vfio_resvd_region *region;
> > > +
> > > > + list_for_each_entry(region, &iommu->reserved_iova_list, next)
> > > + if (region->iova == start && region->size == size)
> > > + return region;
> > > +
> > > + return NULL;
> > > +}
> > > +
> > >  static void vfio_link_dma(struct vfio_iommu *iommu, struct vfio_dma
> > > *new)  {
> > > >   struct rb_node **link = &iommu->dma_list.rb_node, *parent =
> > NULL; @@
> > > -580,7 +622,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> > >
> > > >   mutex_lock(&iommu->lock);
> > >
> > > - if (vfio_find_dma(iommu, iova, size)) {
> > > + if (vfio_find_dma(iommu, iova, size) ||
> > > + vfio_overlap_with_resvd_region(iommu, iova, size)) {
> > > >   mutex_unlock(&iommu->lock);
> > >   return -EEXIST;
> > >   }
> > > @@ -626,6 +669,127 @@ static int vfio_dma_do_map(struct vfio_iommu
> > *iommu,
> > >   return ret;
> > >  }
> > >
> > > +/* This function must be called with iommu->lock held */ static int
> > > +vfio_iommu_resvd_region_del(struct vfio_iommu *iommu,
> > > + dma_addr_t iova, size_t size, int prot) {
> > > + struct vfio_resvd_region *res_region;
> > 
> > Have some consistency in naming, just use "region".
> 
> Ok,
> 
> > 

Re: [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state

2015-10-05 Thread Alex Williamson
On Mon, 2015-10-05 at 06:00 +, Bhushan Bharat wrote:
> > -1138,6 +1156,8 @@
> > > static long vfio_iommu_type1_ioctl(void *iommu_data,
> > >   }
> > >   } else if (cmd == VFIO_IOMMU_GET_INFO) {
> > >   struct vfio_iommu_type1_info info;
> > > + struct iommu_domain_msi_maps msi_maps;
> > > + int ret;
> > >
> > >   minsz = offsetofend(struct vfio_iommu_type1_info,
> > iova_pgsizes);
> > >
> > > @@ -1149,6 +1169,18 @@ static long vfio_iommu_type1_ioctl(void
> > > *iommu_data,
> > >
> > >   info.flags = 0;
> > >
> > > > + ret = vfio_domains_get_msi_maps(iommu, &msi_maps);
> > > + if (ret)
> > > + return ret;
> > 
> > And now ioctl(VFIO_IOMMU_GET_INFO) no longer works for any IOMMU
> > implementing domain_get_attr but not supporting
> > DOMAIN_ATTR_MSI_MAPPING.
> 
> With this current patch version this will get the default assumed behavior as 
> you commented on previous patch. 

How so?

+   msi_maps->automap = true;
+   msi_maps->override_automap = false;
+
+   if (domain->ops->domain_get_attr)
+   ret = domain->ops->domain_get_attr(domain, attr, data);

If domain_get_attr is implemented, but DOMAIN_ATTR_MSI_MAPPING is not,
ret should be an error code.
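
To make the failure mode concrete, here is a self-contained sketch. The
types and callbacks are hypothetical stand-ins, and treating -ENODEV as
"attribute not supported, keep the defaults" is just one possible way to
handle it, not something settled in this thread. The -ENODEV value matches
what the default case of arm_smmu_domain_get_attr() returns in patch 6/6.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_domain_msi_maps. */
struct sketch_msi_maps {
	bool automap;
	bool override_automap;
};

/* Hypothetical driver callback: returns 0 or a negative errno. */
typedef int (*sketch_get_attr_fn)(struct sketch_msi_maps *maps);

static int sketch_get_msi_maps(sketch_get_attr_fn get_attr,
			       struct sketch_msi_maps *maps)
{
	int ret;

	/* Defaults, as in the quoted hunk. */
	maps->automap = true;
	maps->override_automap = false;

	if (!get_attr)
		return 0;	/* no callback at all: defaults apply */

	ret = get_attr(maps);
	if (ret == -ENODEV)
		return 0;	/* attribute unknown to the driver: keep defaults */

	return ret;
}

/* A driver that implements the callback but not this attribute. */
static int sketch_attr_unsupported(struct sketch_msi_maps *maps)
{
	(void)maps;
	return -ENODEV;
}

/* A driver that reports "no automap, override allowed". */
static int sketch_attr_smmu_like(struct sketch_msi_maps *maps)
{
	maps->automap = false;
	maps->override_automap = true;
	return 0;
}
```

Without the -ENODEV check, the sketch_attr_unsupported() case propagates an
error to the caller, which is how VFIO_IOMMU_GET_INFO ends up failing.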



Re: [RFC PATCH 4/6] vfio: Add interface to iommu-map/unmap MSI pages

2015-10-05 Thread Alex Williamson
On Mon, 2015-10-05 at 06:27 +, Bhushan Bharat wrote:
> 
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, October 03, 2015 4:16 AM
> > To: Bhushan Bharat-R65777 
> > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> > christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> > marc.zyng...@arm.com; will.dea...@arm.com
> > Subject: Re: [RFC PATCH 4/6] vfio: Add interface to iommu-map/unmap MSI
> > pages
> > 
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > For MSI interrupts to work for a pass-through device we need to have a
> > > mapping of the msi-pages in the iommu. On some platforms (like x86)
> > > this msi-pages mapping happens magically, and in these cases they
> > > choose an iova which they somehow know will never overlap
> > > with guest memory. But this magic iova selection may not always
> > > hold for all platforms (like PowerPC and ARM64).
> > >
> > > Also, on the x86 platform there is no problem as long as we run an
> > > x86 guest on an x86 host, but there can be issues when running a
> > > non-x86 guest on an x86 host, or other userspace applications
> > > (I think ODP/DPDK), as in these cases the iova may overlap with the
> > > guest memory mapping.
> > 
> > Wow, it's amazing anything works... smoke and mirrors.
> > 
> > > This patch adds an interface to iommu-map and iommu-unmap msi-pages at
> > > reserved iova chosen by userspace.
> > >
> > > Signed-off-by: Bharat Bhushan 
> > > ---
> > >  drivers/vfio/vfio.c |  52 +++
> > >  drivers/vfio/vfio_iommu_type1.c | 111
> > 
> > >  include/linux/vfio.h|   9 +++-
> > >  3 files changed, 171 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index
> > > 2fb29df..a817d2d 100644
> > > --- a/drivers/vfio/vfio.c
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -605,6 +605,58 @@ static int vfio_iommu_group_notifier(struct
> > notifier_block *nb,
> > >   return NOTIFY_OK;
> > >  }
> > >
> > > +int vfio_device_map_msi(struct vfio_device *device, uint64_t msi_addr,
> > > + uint32_t size, uint64_t *msi_iova) {
> > > + struct vfio_container *container = device->group->container;
> > > + struct vfio_iommu_driver *driver;
> > > + int ret;
> > > +
> > > + /* Validate address and size */
> > > + if (!msi_addr || !size || !msi_iova)
> > > + return -EINVAL;
> > > +
> > > > + down_read(&container->group_lock);
> > > +
> > > + driver = container->iommu_driver;
> > > + if (!driver || !driver->ops || !driver->ops->msi_map) {
> > > > + up_read(&container->group_lock);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + ret = driver->ops->msi_map(container->iommu_data,
> > > +msi_addr, size, msi_iova);
> > > +
> > > + up_read(>group_lock);
> > > + return ret;
> > > +}
> > > +
> > > +int vfio_device_unmap_msi(struct vfio_device *device, uint64_t
> > msi_iova,
> > > +   uint64_t size)
> > > +{
> > > + struct vfio_container *container = device->group->container;
> > > + struct vfio_iommu_driver *driver;
> > > + int ret;
> > > +
> > > + /* Validate address and size */
> > > + if (!msi_iova || !size)
> > > + return -EINVAL;
> > > +
> > > > + down_read(&container->group_lock);
> > > > +
> > > > + driver = container->iommu_driver;
> > > > + if (!driver || !driver->ops || !driver->ops->msi_unmap) {
> > > > + up_read(&container->group_lock);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + ret = driver->ops->msi_unmap(container->iommu_data,
> > > > +  msi_iova, size);
> > > > +
> > > > + up_read(&container->group_lock);
> > > + return ret;
> > > +}
> > > +
> > >  /**
> > >   * VFIO driver API
> > >   */
> > > diff --git a/drivers/vfio/vfio_iommu_type1.c
> > > b/drivers/vfio/vfio_iommu_type1.c index 3315fb6..ab376c2 100644
> > > --- a/drivers/vfio/vfio_iommu_type1.c
> > > +++ b/drivers/vfio/vfio_iommu_type1.c
> > > @@ -1003,12 +1003,34 @@ out_free:
> > >   return ret;
> > >  }
> > >
> > > +static void vfio_iommu_unmap_all_reserved_regions(struct vfio_iommu
> > > +*iommu) {
> > > + struct vfio_resvd_region *region;
> > > + struct vfio_domain *d;
> > > +
> > > > + list_for_each_entry(region, &iommu->reserved_iova_list, next) {
> > > > + list_for_each_entry(d, &iommu->domain_list, next) {
> > > + if (!region->map_paddr)
> > > + continue;
> > > +
> > > + if (!iommu_iova_to_phys(d->domain, region->iova))
> > > + continue;
> > > +
> > > + iommu_unmap(d->domain, region->iova,
> > PAGE_SIZE);
> > 
> > PAGE_SIZE?  Why not region->size?
> 
> Yes, this should be region->size.
> 
> > 
> > > + region->map_paddr = 0;
> > > + cond_resched();
> > > + }
> > > + }
> > > +}
> > > +
> > >  static void 

Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi interrupt

2015-10-05 Thread Alex Williamson
On Mon, 2015-10-05 at 07:20 +, Bhushan Bharat wrote:
> 
> 
> > -Original Message-
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, October 03, 2015 4:17 AM
> > To: Bhushan Bharat-R65777 
> > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> > christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> > marc.zyng...@arm.com; will.dea...@arm.com
> > Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> > interrupt
> > 
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > An MSI-address is allocated and programmed in pcie device during
> > > interrupt configuration. Now for a pass-through device, try to create
> > > the iommu mapping for this allocted/programmed msi-address.  If the
> > > iommu mapping is created and the msi address programmed in the pcie
> > > device is different from msi-iova as per iommu programming then
> > > reconfigure the pci device to use msi-iova as msi address.
> > >
> > > Signed-off-by: Bharat Bhushan 
> > > ---
> > >  drivers/vfio/pci/vfio_pci_intrs.c | 36
> > > ++--
> > >  1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c
> > > b/drivers/vfio/pci/vfio_pci_intrs.c
> > > index 1f577b4..c9690af 100644
> > > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct
> > vfio_pci_device *vdev,
> > >   int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > >   char *name = msix ? "vfio-msix" : "vfio-msi";
> > >   struct eventfd_ctx *trigger;
> > > + struct msi_msg msg;
> > > + struct vfio_device *device;
> > > + uint64_t msi_addr, msi_iova;
> > >   int ret;
> > >
> > >   if (vector >= vdev->num_ctx)
> > >   return -EINVAL;
> > >
> > > > + device = vfio_device_get_from_dev(&pdev->dev);
> > 
> > Have you looked at this function?  I don't think we want to be doing that
> > every time we want to poke the interrupt configuration.
> 
> I am trying to describe what I understood, a device can have many interrupts 
> and we should setup iommu only once, when called for the first time to 
> enable/setup interrupt.
> Similarly when disabling the interrupt we should iommu-unmap when called for 
> the last enabled interrupt for that device. Now with this understanding, 
> should I move this map-unmap to separate functions and call them from 
> vfio_msi_set_block() rather than in vfio_msi_set_vector_signal()

Interrupts can be setup and torn down at any time and I don't see how
one function or the other makes much difference.
vfio_device_get_from_dev() is enough overhead that the data we need
should be cached if we're going to call it with some regularity.  Maybe
vfio_iommu_driver_ops.open() should be called with a pointer to the
vfio_device... or the vfio_group.

> >  Also note that
> > IOMMU mappings don't operate on devices, but groups, so maybe we want
> > to pass the group.
> 
> Yes, it operates on group. I hesitated to add an API to get group. Do you 
> suggest that it is OK to add an API to get the group from a device?

No, the above suggestion is probably better.

> > 
> > > + if (device == NULL)
> > > + return -EINVAL;
> > 
> > This would be a legitimate BUG_ON(!device)
> > 
> > > +
> > >   if (vdev->ctx[vector].trigger) {
> > >   free_irq(irq, vdev->ctx[vector].trigger);
> > > > + get_cached_msi_msg(irq, &msg);
> > > + msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > > + vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > >   kfree(vdev->ctx[vector].name);
> > >   eventfd_ctx_put(vdev->ctx[vector].trigger);
> > >   vdev->ctx[vector].trigger = NULL;
> > > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct
> > vfio_pci_device *vdev,
> > >* cached value of the message prior to enabling.
> > >*/
> > >   if (msix) {
> > > - struct msi_msg msg;
> > > -
> > > >   get_cached_msi_msg(irq, &msg);
> > > >   pci_write_msi_msg(irq, &msg);
> > >   }
> > >
> > > +
> > 
> > gratuitous newline
> > 
> > >   ret = request_irq(irq, vfio_msihandler, 0,
> > > vdev->ctx[vector].name, trigger);
> > >   if (ret) {
> > > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct
> > vfio_pci_device *vdev,
> > >   return ret;
> > >   }
> > >
> > > + /* Re-program the new-iova in pci-device in case there is
> > > +  * different iommu-mapping created for programmed msi-address.
> > > +  */
> > > > + get_cached_msi_msg(irq, &msg);
> > > > + msi_iova = 0;
> > > > + msi_addr = (u64)(msg.address_hi) << 32 | (u64)(msg.address_lo);
> > > > + ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE,
> > > &msi_iova);
> > > + if (ret) {
> > > + free_irq(irq, vdev->ctx[vector].trigger);
> > > + kfree(vdev->ctx[vector].name);
> > > +

Re: [RFC PATCH 6/6] arm-smmu: Allow to set iommu mapping for MSI

2015-10-05 Thread Alex Williamson
On Mon, 2015-10-05 at 08:33 +, Bhushan Bharat wrote:
> 
> 
> > -Original Message-
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, October 03, 2015 4:17 AM
> > To: Bhushan Bharat-R65777 
> > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> > christoffer.d...@linaro.org; eric.au...@linaro.org; pranavku...@linaro.org;
> > marc.zyng...@arm.com; will.dea...@arm.com
> > Subject: Re: [RFC PATCH 6/6] arm-smmu: Allow to set iommu mapping for
> > MSI
> > 
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > Finally ARM SMMU declare that iommu-mapping for MSI-pages are not set
> > > automatically and it should be set explicitly.
> > >
> > > Signed-off-by: Bharat Bhushan 
> > > ---
> > >  drivers/iommu/arm-smmu.c | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index
> > > a3956fb..9d37e72 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -1401,13 +1401,18 @@ static int arm_smmu_domain_get_attr(struct
> > iommu_domain *domain,
> > >   enum iommu_attr attr, void *data)  {
> > >   struct arm_smmu_domain *smmu_domain =
> > to_smmu_domain(domain);
> > > + struct iommu_domain_msi_maps *msi_maps;
> > >
> > >   switch (attr) {
> > >   case DOMAIN_ATTR_NESTING:
> > >   *(int *)data = (smmu_domain->stage ==
> > ARM_SMMU_DOMAIN_NESTED);
> > >   return 0;
> > >   case DOMAIN_ATTR_MSI_MAPPING:
> > > - /* Dummy handling added */
> > > + msi_maps = data;
> > > +
> > > + msi_maps->automap = false;
> > > + msi_maps->override_automap = true;
> > > +
> > >   return 0;
> > >   default:
> > >   return -ENODEV;
> > 
> > In previous discussions I understood one of the problems you were trying to
> > solve was having a limited number of MSI banks and while you may be able
> > to get isolated MSI banks for some number of users, it wasn't unlimited and
> > sharing may be required.  I don't see any of that addressed in this series.
> 
> That problem was on PowerPC. In fact there were two problems: first, which MSI 
> bank to use, and second, how to create the iommu-mapping for a device assigned to 
> userspace.
> The first problem was PowerPC specific and will be solved separately.
> For the second problem, I earlier tried to add a couple of MSI specific ioctls 
> and you suggested (IIUC) that we should have a generic reserved-iova type of 
> API, and then we can map the MSI bank using reserved-iova, and this will not 
> require involvement of user-space.
> 
> > 
> > Also, the management of reserved IOVAs vs MSI addresses looks really
> > dubious to me.  How does your platform pick an MSI address and what are
> > we breaking by covertly changing it?  We seem to be masking over at the
> > VFIO level, where there should be lower level interfaces doing the right 
> > thing
> > when we configure MSI on the device.
> 
> Yes, in my understanding the right solution should be:
>  1) The VFIO driver should know what physical-msi-address will be used for 
> devices in an iommu-group.
> I did not find a generic API; on PowerPC I added some functions in the 
> freescale msi-driver and called them from vfio-iommu-fsl-pamu.c (not yet 
> upstreamed).
>  2) The VFIO driver should know what IOVA to use for creating the iommu-mapping 
> (the VFIO API patches of this patch series)
>  3) The VFIO driver will create the iommu-mapping using (1) and (2)
>  4) The VFIO driver should be able to tell the msi-driver that for a given device 
> it should use a different IOVA. So when composing the msi message (for the 
> devices in the given iommu-group) it should use that programmed iova as the 
> MSI-address. This interface also needs to be developed.
> 
> I was not sure which approach we should take. The current approach in the 
> patch is simple to develop so I went ahead to get input, but I agree this 
> does not look very good.
> What do you think, should I drop this approach and work out the approach 
> described above?

I'm certainly not interested in applying and maintaining an interim
solution that isn't the right one.  It seems like VFIO is too involved
in this process in your example.  On x86 we have per vector isolation
and the only thing we're missing is reporting back of the region used by
MSI vectors as reserved IOVA space (but it's standard on x86, so an x86
VM user will never use it for IOVA).  In your model, the MSI IOVA space
is programmable, but it has page granularity (I assume).  Therefore we
shouldn't be sharing that page with anyone.  That seems to suggest we
need to allocate a page of vector space from the host kernel, setup the
IOVA mapping, and then the host kernel should know to only allocate MSI
vectors for these devices from that pre-allocated page.  Otherwise we
need to call the interrupts unsafe, like we do on x86 without 

[PATCH v2 1/2] add hooks for armv8 fp/simd lazy switch

2015-10-05 Thread Mario Smarduch
This patch adds hooks to support fp/simd lazy switch: a vcpu flag to track
fp/simd state, and the flag's offset in the vcpu structure.

Signed-off-by: Mario Smarduch 
---
 arch/arm64/include/asm/kvm_host.h | 3 +++
 arch/arm64/kernel/asm-offsets.c   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4562459..03f25d0 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -157,6 +157,9 @@ struct kvm_vcpu_arch {
/* Interrupt related fields */
u64 irq_lines;  /* IRQ and FIQ levels */
 
+   /* Track fp/simd lazy switch */
+   u32 vfp_lazy;
+
/* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache;
 
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 8d89cf8..8311da4 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -124,6 +124,7 @@ int main(void)
   DEFINE(VCPU_HCR_EL2, offsetof(struct kvm_vcpu, arch.hcr_el2));
   DEFINE(VCPU_MDCR_EL2,offsetof(struct kvm_vcpu, arch.mdcr_el2));
   DEFINE(VCPU_IRQ_LINES,   offsetof(struct kvm_vcpu, arch.irq_lines));
+  DEFINE(VCPU_VFP_LAZY, offsetof(struct kvm_vcpu, arch.vfp_lazy));
   DEFINE(VCPU_HOST_CONTEXT,offsetof(struct kvm_vcpu, arch.host_cpu_context));
   DEFINE(VCPU_HOST_DEBUG_STATE, offsetof(struct kvm_vcpu, arch.host_debug_state));
   DEFINE(VCPU_TIMER_CNTV_CTL,  offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
-- 
1.9.1



[PATCH v2 2/2] enable armv8 fp/simd lazy switch

2015-10-05 Thread Mario Smarduch
This patch enables arm64 lazy fp/simd switch. Removes the ARM constraint,
and follows the same approach as armv7 version - found here.

https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016567.html

To summarize - provided the guest accesses fp/simd unit we limit number
of fp/simd context switches to two per vCPU execution schedule.

Signed-off-by: Mario Smarduch 
---
 arch/arm/kvm/arm.c   |  2 --
 arch/arm64/include/asm/kvm_asm.h |  1 +
 arch/arm64/kvm/hyp.S | 59 +++-
 3 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 1b1f9e9..fe609f1 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -112,12 +112,10 @@ void kvm_arch_check_processor_compat(void *rtn)
  */
 static void kvm_switch_fp_regs(struct kvm_vcpu *vcpu)
 {
-#ifdef CONFIG_ARM
if (vcpu->arch.vfp_lazy == 1) {
kvm_call_hyp(__kvm_restore_host_vfp_state, vcpu);
vcpu->arch.vfp_lazy = 0;
}
-#endif
 }
 
 /**
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 5e37710..83dcac5 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -117,6 +117,7 @@ extern char __kvm_hyp_vector[];
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
+extern void __kvm_restore_host_vfp_state(struct kvm_vcpu *vcpu);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index e583613..ea99f66 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -385,14 +385,6 @@
tbz \tmp, #KVM_ARM64_DEBUG_DIRTY_SHIFT, \target
 .endm
 
-/*
- * Branch to target if CPTR_EL2.TFP bit is set (VFP/SIMD trapping enabled)
- */
-.macro skip_fpsimd_state tmp, target
-   mrs \tmp, cptr_el2
-   tbnz \tmp, #CPTR_EL2_TFP_SHIFT, \target
-.endm
-
 .macro compute_debug_state target
// Compute debug state: If any of KDE, MDE or KVM_ARM64_DEBUG_DIRTY
// is set, we do a full save/restore cycle and disable trapping.
@@ -433,10 +425,6 @@
mrs x5, ifsr32_el2
stp x4, x5, [x3]
 
-   skip_fpsimd_state x8, 2f
-   mrs x6, fpexc32_el2
-   str x6, [x3, #16]
-2:
skip_debug_state x8, 1f
mrs x7, dbgvcr32_el2
str x7, [x3, #24]
@@ -481,8 +469,15 @@
isb
 99:
msr hcr_el2, x2
-   mov x2, #CPTR_EL2_TTA
+
+   mov x2, #0
+   ldr w3, [x0, #VCPU_VFP_LAZY]
+   tbnz w3, #0, 98f
+
orr x2, x2, #CPTR_EL2_TFP
+98:
+   orr x2, x2, #CPTR_EL2_TTA
+
msr cptr_el2, x2
 
mov x2, #(1 << 15)  // Trap CP15 Cr=15
@@ -669,14 +664,12 @@ __restore_debug:
ret
 
 __save_fpsimd:
-   skip_fpsimd_state x3, 1f
save_fpsimd
-1: ret
+   ret
 
 __restore_fpsimd:
-   skip_fpsimd_state x3, 1f
restore_fpsimd
-1: ret
+   ret
 
 switch_to_guest_fpsimd:
push x4, lr
@@ -688,6 +681,9 @@ switch_to_guest_fpsimd:
 
mrs x0, tpidr_el2
 
+   mov w2, #1
+   str w2, [x0, #VCPU_VFP_LAZY]
+
ldr x2, [x0, #VCPU_HOST_CONTEXT]
kern_hyp_va x2
bl __save_fpsimd
@@ -763,7 +759,6 @@ __kvm_vcpu_return:
add x2, x0, #VCPU_CONTEXT
 
save_guest_regs
-   bl __save_fpsimd
bl __save_sysregs
 
skip_debug_state x3, 1f
@@ -784,7 +779,6 @@ __kvm_vcpu_return:
kern_hyp_va x2
 
bl __restore_sysregs
-   bl __restore_fpsimd
/* Clear FPSIMD and Trace trapping */
msr cptr_el2, xzr
 
@@ -863,6 +857,33 @@ ENTRY(__kvm_flush_vm_context)
ret
 ENDPROC(__kvm_flush_vm_context)
 
+/**
+ * kvm_switch_fp_regs() - switch guest/host VFP/SIMD registers
+ * @vcpu:  pointer to vcpu structure.
+ *
+ */
+ENTRY(__kvm_restore_host_vfp_state)
+   push x4, lr
+
+   kern_hyp_va x0
+   add x2, x0, #VCPU_CONTEXT
+
+   // Load Guest HCR, determine if guest is 32 or 64 bit
+   ldr x3, [x0, #VCPU_HCR_EL2]
+   tbnz x3, #HCR_RW_SHIFT, 1f
+   mrs x4, fpexc32_el2
+   str x4, [x2, #CPU_SYSREG_OFFSET(FPEXC32_EL2)]
+1:
+   bl __save_fpsimd
+
+   ldr x2, [x0, #VCPU_HOST_CONTEXT]
+   kern_hyp_va x2
+   bl __restore_fpsimd
+
+   pop x4, lr
+   ret
+ENDPROC(__kvm_restore_host_vfp_state)
+
 __kvm_hyp_panic:
// Guess the context by looking at VTTBR:
// If zero, then we're already a host.
-- 
1.9.1
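
The policy summarized in the commit message above (at most two fp/simd switches per vCPU scheduling period, provided the guest uses fp/simd at all) can be modelled in plain user-space C. This is only a sketch under stated assumptions, not the kernel code: `struct vcpu_model`, `guest_fp_access()`, `vcpu_put()` and `run_schedule()` are hypothetical names made up for illustration. Only the `vfp_lazy` flag and the idea of "trap the first access, restore the host state when the vCPU is put" are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the lazy switch; not kernel code. */
struct vcpu_model {
    unsigned vfp_lazy;    /* 1 while guest fp/simd state is live on the CPU */
    unsigned fp_switches; /* full fp/simd save+restore operations performed */
};

/* Guest touches fp/simd: only the first access after a put traps. */
static void guest_fp_access(struct vcpu_model *v)
{
    if (v->vfp_lazy == 0) {
        /* trap: save host registers, load guest registers */
        v->fp_switches++;
        v->vfp_lazy = 1;
    }
}

/* vCPU is scheduled out: hand the registers back to the host if needed. */
static void vcpu_put(struct vcpu_model *v)
{
    if (v->vfp_lazy == 1) {
        /* save guest registers, restore host registers */
        v->fp_switches++;
        v->vfp_lazy = 0;
    }
}

/*
 * Simulate one scheduling period made of `entries` guest entries and exits.
 * Returns the number of fp/simd switches that were needed.
 */
unsigned run_schedule(unsigned entries, bool guest_uses_fp)
{
    struct vcpu_model v = { 0, 0 };
    unsigned i;

    for (i = 0; i < entries; i++) {
        if (guest_uses_fp)
            guest_fp_access(&v);
        /* guest exit: nothing is saved or restored here */
    }
    vcpu_put(&v);

    assert(v.vfp_lazy == 0);
    return v.fp_switches;
}
```

In this model the switch count does not grow with the number of guest entries and exits, which is the point of deferring the host restore until the vCPU is put.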



[PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

2015-10-05 Thread Mario Smarduch
This patch series is a followup to the armv7 fp/simd lazy switch
implementation, uses similar approach and depends on the series -
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016567.html
Patches are based on 4.3-rc2 commit 1f93e4a96c91093

Patches are based on earlier arm64 fp/simd optimization work -
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-July/015748.html

And subsequent fixes by Marc and Christoffer at KVM Forum hackathon to handle
32-bit guest on 64 bit host - 
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-August/016128.html

The patch series has been tested on Foundation Model arm64/arm64 and
arm32/arm64. The test program used can be found here:

https://github.com/mjsmar/arm-arm64-fpsimd-test

Launched up to 16 instances on a 4-way guest and another 16 on the host (both 
cases with 1 ms sleep), and ran them overnight. 

Changes v1->v2:
- Tested arm32/arm64
- rebased to 4.3-rc2
- changed a couple register accesses from 64 to 32 bit 

Mario Smarduch (2):
  add hooks for armv8 fp/simd lazy switch
  enable armv8 fp/simd lazy switch

 arch/arm/kvm/arm.c|  2 --
 arch/arm64/include/asm/kvm_asm.h  |  1 +
 arch/arm64/include/asm/kvm_host.h |  3 ++
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kvm/hyp.S  | 59 ++-
 5 files changed, 45 insertions(+), 21 deletions(-)

-- 
1.9.1



Re: [RFC PATCH v5 0/3] vfio: platform: return device properties for a platform device

2015-10-05 Thread Christoffer Dall
[cc'ing Mark R. and Shannon for their input on FDT and ACPI].

On Mon, Oct 05, 2015 at 11:07:35AM +0100, Peter Maydell wrote:
> On 5 October 2015 at 10:37, Christoffer Dall
>  wrote:
> > On Fri, Oct 02, 2015 at 11:53:07PM +0100, Peter Maydell wrote:
> >>  * I don't really want to build in a lot of infrastructure into
> >>QEMU to either build the DTC compiler into it or else introduce
> >>a runtime dependency on the dtc binary
> >
> > what level of complexity are we really talking about here?  Doesn't QEMU
> > already link against libfdt and doesn't it support exactly what we're
> > trying to do here?
> 
> We link against libfdt, but libfdt is for manipulating and creating
> dt binary blobs. As I understand it what sysfs exposes to userspace
> is not a dt binary blob (or fragment thereof) but a bit of dt source.
> libfdt doesn't do parsing of source into blobs; that is done by the
> dtc compiler, which QEMU doesn't currently build and which is set
> up to build only a separate command line binary anyway, not to be
> a utility library for compiling dt sources.
> 
> Do correct me if I'm wrong about the sysfs interface -- if it is
> just exposing dt blobs that would be easier to deal with.
> 

I thought there was also /sys/firmware/fdt being the unflattened one
accessible by libfdt, but dtc -I dtb seems to be unhappy parsing this on
my Mustang.

Mark, can you refresh my memory about this?

> > I also don't think this is the job of VFIO, assuming there is some
> > better place to do this.  I initially thought exposing device properties
> > in some canonical format from sysfs independently from whether the
> > system was booted with ACPI or DT was the right thing to do, but the
> > counter argument to this was essentially that the guest kernel needs the
> > same description as the host kernel and therefore we really have to find
> > a way to pass the HW description bits on to the guest.
> 
> So we end up with even more code to pass through ACPI table
> fragments as well :-(
> 

Probably, yes.  But I think it's even worse, because there's no guarantee
you can get at the original firmware data, because it's all from the
DSDT so invoking the initial 'probe' method of the device entry could
theoretically overwrite itself...  (If I remember the details
correctly).

At least saying ACPI guest on ACPI, and DT guest on DT host, would be a
reasonable limitation for anything like this.

But perhaps this really is a place where you need the transparency of DT
vs. ACPI to do this.

This is all outside my area of expertise, so take whatever I said above
with that disclaimer.

> >> (except "don't do device passthrough for platform devices, insist
> >> on a real bus like PCI"...)
> >>
> > While I appreciate the simplicity of this solution from our
> > (maintainers') point of view, I still see the latter point as relatively
> > moot.  We have hardware with 10G platform ethernet devices that people
> > want to do device assignment on already, and I think we should try to
> > find a reasonable set of boundaries for setups that we can support
> > upstream instead of this becoming a black hole of derivative code bases
> > to do this sort of thing.
> 
> Yeah, speaking more seriously I agree we do need to do something
> here. It's just all irritatingly awkward...
> 

Agreed.

-Christoffer


Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-05 Thread Suzuki K. Poulose

On 02/10/15 16:49, Catalin Marinas wrote:
> On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:
>> From: Ard Biesheuvel 
>>
>> This patch adds the page size to the arm64 kernel image header
>> so that one can infer the PAGESIZE used by the kernel. This will
>> be helpful to diagnose failures to boot the kernel with page size
>> not supported by the CPU.
>>
>> Signed-off-by: Ard Biesheuvel 
>
> This patch needs your signed-off-by as well since you are posting it. And
> IIRC I acked it as well, I'll check.

Yes, you did mention that you were OK with the patch. But I thought there was
no 'Acked-by' tag added. Hence didn't pick that up.

> If you are fine with adding your signed-off-by, I can add it to the patch
> when applying (unless I see other issues with the series).

Yes, please go ahead, if I don't have to send another version depending on
the review of KVM bits. If not, I will add the S-o-b and your Acked-by.


Thanks
Suzuki



Re: [RFC PATCH v5 0/3] vfio: platform: return device properties for a platform device

2015-10-05 Thread Mark Rutland
On Mon, Oct 05, 2015 at 12:32:00PM +0200, Christoffer Dall wrote:
> [cc'ing Mark R. and Shannon for their input on FDT and ACPI].
> 
> On Mon, Oct 05, 2015 at 11:07:35AM +0100, Peter Maydell wrote:
> > On 5 October 2015 at 10:37, Christoffer Dall
> >  wrote:
> > > On Fri, Oct 02, 2015 at 11:53:07PM +0100, Peter Maydell wrote:
> > >>  * I don't really want to build in a lot of infrastructure into
> > >>QEMU to either build the DTC compiler into it or else introduce
> > >>a runtime dependency on the dtc binary
> > >
> > > what level of complexity are we really talking about here?  Doesn't QEMU
> > > already link against libfdt and doesn't it support exactly what we're
> > > trying to do here?
> > 
> > We link against libfdt, but libfdt is for manipulating and creating
> > dt binary blobs. As I understand it what sysfs exposes to userspace
> > is not a dt binary blob (or fragment thereof) but a bit of dt source.
> > libfdt doesn't do parsing of source into blobs; that is done by the
> > dtc compiler, which QEMU doesn't currently build and which is set
> > up to build only a separate command line binary anyway, not to be
> > a utility library for compiling dt sources.
> > 
> > Do correct me if I'm wrong about the sysfs interface -- if it is
> > just exposing dt blobs that would be easier to deal with.
> > 
> 
> I thought there was also /sys/firmware/fdt being the unflattened one
> accessible by libfdt, but dtc -I dtb seems to be unhappy parsing this on
> my Mustang.
> 
> Mark, can you refresh my memory about this?

/sys/firmware/fdt is the original, unadulterated DTB passed to the
kernel, while /sys/firmware/device-tree is the unflattened form which
may have had overlays (or perhaps other fixups) applied.

For runtime stuff I'd generally expect /sys/firmware/device-tree to be
preferable (e.g. if we want to expose some device added by an overlay).
We added /sys/firmware/fdt for kexec, as memory reservations are in the
DTB header, which doesn't show up in the unflattened tree.

> > > I also don't think this is the job of VFIO, assuming there is some
> > > better place to do this.  I initially thought exposing device properties
> > > in some canonical format from sysfs independently from whether the
> > > system was booted with ACPI or DT was the right thing to do, but the
> > > counter argument to this was essentially that the guest kernel needs the
> > > same description as the host kernel and therefore we really have to find
> > > a way to pass the HW description bits on to the guest.
> > 
> > So we end up with even more code to pass through ACPI table
> > fragments as well :-(
> > 
> 
> Probably, yes.  But I think it's even worse, because there's no guarantee
> you can get at the original firmware data, because it's all from the
> DSDT so invoking the initial 'probe' method of the device entry could
> theoretically overwrite itself...  (If I remember the details
> correctly).

I don't think you can realistically expose host AML to the guest. Even
if it doesn't overwrite itself, it could perform oneshot configuration
that breaks if executed repeatedly, try to access other bits of ACPI
which may or may not be present, call to secure firmware, etc.

Mark.


Re: [RFC PATCH v5 0/3] vfio: platform: return device properties for a platform device

2015-10-05 Thread Christoffer Dall
[why are you top-posting?]

On Mon, Oct 05, 2015 at 11:42:38AM +0200, Baptiste Reynal wrote:
> In this patch series we want to wrap an already available kernel
> interface to expose a device property to userspace, 

which 'already available kernel interface' is that exactly?

> in order to keep
> the code lighter on the userspace. We need those properties in VFIO as

I'm not sure I agree with your 'need those properties in VFIO' statement
here, can you elaborate?

> VFIO grants the possibility to develop userspace drivers.
> 
> The sysfs doesn't seem to be ready for this kind of usage. We can
> only find raw data that require heavy parsing. Here we retrieve
> directly usable data and it can be extended later according to new
> needs (as it is already done with ACPI).

Why couldn't you expose this kind of data through sysfs instead of VFIO
and independently of VFIO?  Would that be more wrong/difficult/whatever?

> 
> This interface has been developed for VFIO and is currently bound to
> it, though there are no special dependencies on it. We could make it
> more generic, but I can only think of VFIO to use it.

Thanks,
-Christoffer


Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-05 Thread Ard Biesheuvel
On 5 October 2015 at 14:02, Suzuki K. Poulose  wrote:
> On 02/10/15 16:49, Catalin Marinas wrote:
>>
>> On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:
>>>
>>> From: Ard Biesheuvel 
>>>
>>> This patch adds the page size to the arm64 kernel image header
>>> so that one can infer the PAGESIZE used by the kernel. This will
>>> be helpful to diagnose failures to boot the kernel with page size
>>> not supported by the CPU.
>>>
>>> Signed-off-by: Ard Biesheuvel 
>>
>>
>> This patch needs your signed-off-by as well since you are posting it. And
>> IIRC I acked it as well, I'll check.
>>
>
> Yes, you did mention that you were OK with the patch. But I thought there
> was
> no  'Acked-by' tag added. Hence didn't pick that up.
>

In my version of this patch (which I sent separately before noticing
that Suzuki had already folded it into his series), I took the comment
from Catalin to the email in which I suggested this change as an
implicit Acked-by.

-- 
Ard.


[PATCH v2 17/22] arm64/kvm: Make use of the system wide safe values

2015-10-05 Thread Suzuki K. Poulose
Use the system wide safe value from the new API for safer
decisions

Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Suzuki K. Poulose 
---
 arch/arm64/kvm/reset.c|2 +-
 arch/arm64/kvm/sys_regs.c |   12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 91cf535..f34745c 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -53,7 +53,7 @@ static bool cpu_has_32bit_el1(void)
 {
u64 pfr0;
 
-   pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+   pfr0 = read_system_reg(SYS_ID_AA64PFR0_EL1);
return !!(pfr0 & 0x20);
 }
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index d03d3af..87a64e8 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -693,13 +693,13 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
-   u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
-   u32 el3 = !!((pfr >> 12) & 0xf);
+   u64 dfr = read_system_reg(SYS_ID_AA64DFR0_EL1);
+   u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
+   u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
- (((dfr >> 12) & 0xf) << 24) |
- (((dfr >> 28) & 0xf) << 20) |
+   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
+ (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
+ (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
  (6 << 16) | (el3 << 14) | (el3 << 12));
return true;
}
-- 
1.7.9.5
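
The trap_dbgidr() hunk above replaces magic shift values with named ones. The field shuffling it performs can be tried out in user space with the sketch below. The shift values (20, 12, 28 for the debug register fields and 12 for EL3) are taken from the removed lines of the hunk; `field4()` and `compose_dbgidr()` are hypothetical helpers written for this illustration, and the kernel's cpuid_feature_extract_field() may treat fields differently (for example as signed values), so this is not a drop-in equivalent.

```c
#include <stdint.h>

/* Shift values as implied by the removed lines of the hunk above. */
#define ID_AA64PFR0_EL3_SHIFT      12
#define ID_AA64DFR0_BRPS_SHIFT     12
#define ID_AA64DFR0_WRPS_SHIFT     20
#define ID_AA64DFR0_CTX_CMPS_SHIFT 28

/* Extract one unsigned 4-bit ID register field (hypothetical helper). */
static inline uint32_t field4(uint64_t reg, unsigned shift)
{
    return (uint32_t)((reg >> shift) & 0xf);
}

/*
 * Build the 32-bit debug ID value from the 64-bit ID registers,
 * mirroring the expression in the patch.
 */
uint32_t compose_dbgidr(uint64_t dfr, uint64_t pfr)
{
    uint32_t el3 = !!field4(pfr, ID_AA64PFR0_EL3_SHIFT);

    return (field4(dfr, ID_AA64DFR0_WRPS_SHIFT) << 28) |
           (field4(dfr, ID_AA64DFR0_BRPS_SHIFT) << 24) |
           (field4(dfr, ID_AA64DFR0_CTX_CMPS_SHIFT) << 20) |
           (6u << 16) | (el3 << 14) | (el3 << 12);
}
```

Naming the shifts makes it visible that the field order in the composed value (WRPs, BRPs, CTX_CMPs) differs from their order in the source register.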



Re: [RFT - PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

2015-10-05 Thread Christoffer Dall
On Mon, Oct 05, 2015 at 09:14:57AM -0700, Mario Smarduch wrote:
> Hi Christoffer,
>I just managed to boot qemu arm32 up on arm64 (last Fri - thanks for the 
> tip
> - there were a few other issues to clean up), so let me retest it again. Also I
> noticed some refactoring would help both 32 and 64 bit patches.
> 
> Yes I could provide the user space tests as well.
> 
I'd like those regardless as I generally test my queue before pushing it
to next.

Thanks,
-Christoffer


Re: [RFT - PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

2015-10-05 Thread Christoffer Dall
On Tue, Sep 22, 2015 at 04:34:01PM -0700, Mario Smarduch wrote:
> This is a 2nd iteration for arm64; the v1 patches were posted by mistake from
> an older branch which included several bugs. Hopefully that didn't waste too
> much of anyone's time.
> 
> This patch series is a followup to the armv7 fp/simd lazy switch
> implementation, uses similar approach and depends on the series - see
> https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016516.html
> 
> It's based on earlier arm64 fp/simd optimization work - see
> https://lists.cs.columbia.edu/pipermail/kvmarm/2015-July/015748.html
> 
> And subsequent fixes by Marc and Christoffer at KVM Forum hackathon to handle
> 32-bit guest on 64 bit host (and may require more here) - see
> https://lists.cs.columbia.edu/pipermail/kvmarm/2015-August/016128.html
> 
> This series has been tested with arm64 on arm64 with several FP applications
> running on host and guest, with a substantial decrease in the number of
> fp/simd context switches: from about 30% down to 2% with one guest running.
> 
> At this time I don't have arm32/arm64 working and am hoping Christoffer and/or
> Marc (or anyone) can test 32-bit guest/64-bit host.
> 
Do you already have some test infrastructure/applications that I can
reuse for this purpose, or do I have to write userspace software?

-Christoffer
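
As background for the fp/simd numbers quoted above, the general idea of a
lazy switch can be sketched as a toy user-space model. This is only an
illustration under stated assumptions, not the code from the series: every
name below (toy_vcpu, toy_fp_trap, hw_fp, ...) is hypothetical, and a real
implementation arms a hardware trap rather than calling a function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NREGS 32

struct fp_state { unsigned long long regs[NREGS]; };

/* Toy vcpu; all names are hypothetical, not KVM's. */
struct toy_vcpu {
	struct fp_state host_fp;	/* host regs saved while guest owns the FPU */
	struct fp_state guest_fp;	/* guest regs saved while host owns the FPU */
	bool guest_owns_fp;		/* set once the guest has touched FP */
	unsigned int fp_switches;	/* count of full save/restore operations */
};

static struct fp_state hw_fp;		/* stands in for the physical registers */

/* Guest entry: do no FP work, just (conceptually) arm the access trap. */
static void toy_enter_guest(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->guest_owns_fp = false;
}

/* First guest FP access lands here: only now pay for the switch. */
static void toy_fp_trap(struct toy_vcpu *v)
{
	if (v->guest_owns_fp)
		return;			/* already switched, nothing to do */
	v->host_fp = hw_fp;
	hw_fp = v->guest_fp;
	v->guest_owns_fp = true;
	v->fp_switches++;
}

/* Guest exit: restore the host state only if the guest used FP. */
static void toy_exit_guest(struct toy_vcpu *v)
{
	if (!v->guest_owns_fp)
		return;
	v->guest_fp = hw_fp;
	hw_fp = v->host_fp;
	v->guest_owns_fp = false;
	v->fp_switches++;
}
```

In this model a guest run that never touches FP costs zero switches, which is
the kind of saving the cover letter reports.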


Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-05 Thread Christoffer Dall
On Fri, Oct 02, 2015 at 05:50:38PM +0100, Marc Zyngier wrote:
> On 02/10/15 17:31, Catalin Marinas wrote:
> > On Fri, Oct 02, 2015 at 04:49:01PM +0100, Catalin Marinas wrote:
> >> On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:
> >>> From: Ard Biesheuvel 
> >>>
> >>> This patch adds the page size to the arm64 kernel image header
> >>> so that one can infer the PAGESIZE used by the kernel. This will
> >>> be helpful to diagnose failures to boot the kernel with a page size
> >>> not supported by the CPU.
> >>>
> >>> Signed-off-by: Ard Biesheuvel 
> >>
> >> This patch needs your signed-off-by as well since you are posting it. And
> >> IIRC I acked it as well, I'll check.
> >>
> >> If you are fine with adding your signed-off-by, I can add it to the patch
> >> when applying (unless I see other issues with the series).
> > 
> > Actually, I just realised that the kvm patches don't have any acks, so
> > I'm not taking this series yet. You may want to repost once you have all
> > the acks in place.
> > 
> 
> I'm in the middle of it. I should be done next week (I may have said the
> same thing two weeks ago...).
> 
Very near the top of my to-review list as well, will try to get it done
this week.

-Christoffer
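
To illustrate how a page size in the image header could be used for
diagnosis, here is a small sketch that decodes it from the first 64 bytes of
an arm64 Image. It assumes the layout described in mainline
Documentation/arm64/booting.txt (little-endian flags field at offset 24 with
the page size in bits 1-2, magic "ARM\x64" at offset 56); the version of the
patch discussed in this thread may differ, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARM64_IMAGE_MAGIC   0x644d5241u	/* "ARM\x64" read as little-endian */
#define ARM64_FLAGS_OFFSET  24
#define ARM64_MAGIC_OFFSET  56
#define ARM64_HEADER_SIZE   64

/* Read an n-byte little-endian integer (n <= 8) regardless of host order. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	assert(n <= 8);
	while (n--)
		v = (v << 8) | p[n];
	return v;
}

/*
 * Returns the kernel page size in bytes, 0 if the header leaves it
 * unspecified, or -1 if the buffer does not look like an arm64 Image.
 */
static long arm64_image_page_size(const uint8_t *hdr, size_t len)
{
	static const long sizes[] = { 0, 4096, 16384, 65536 };

	if (len < ARM64_HEADER_SIZE ||
	    read_le(hdr + ARM64_MAGIC_OFFSET, 4) != ARM64_IMAGE_MAGIC)
		return -1;
	return sizes[(read_le(hdr + ARM64_FLAGS_OFFSET, 8) >> 1) & 0x3];
}
```

A boot loader or debug tool could compare the result against the granule
sizes the CPU reports before jumping to the kernel.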