Re: [PATCH v3 17/20] vgic: Add support for 52bit guest physical address

2018-07-04 Thread Auger Eric
Hi Suzuki,

On 06/29/2018 01:15 PM, Suzuki K Poulose wrote:
> From: Kristina Martsenko 
> 
> Add support for handling 52bit guest physical address to the
> VGIC layer. So far we have limited the guest physical address
> to 48bits, by explicitly masking the upper bits. This patch
> removes the restriction. We do not have to check if the host
> supports 52bit as the gpa is always validated during an access.
> (e.g, kvm_{read/write}_guest, kvm_is_visible_gfn()).
> Also, the ITS table save-restore is also not affected with
> the enhancement. The DTE entries already store the bits[51:8]
> of the ITT_addr (with a 256byte alignment).
> 
> Cc: Marc Zyngier 
> Cc: Christoffer Dall 
> Signed-off-by: Kristina Martsenko 
> [ Macro clean ups, fix PROPBASER and PENDBASER accesses ]
> Signed-off-by: Suzuki K Poulose 
> ---
>  include/linux/irqchip/arm-gic-v3.h |  5 +
>  virt/kvm/arm/vgic/vgic-its.c   | 36 ++--
>  virt/kvm/arm/vgic/vgic-mmio-v3.c   |  2 --
>  3 files changed, 15 insertions(+), 28 deletions(-)
> 
> diff --git a/include/linux/irqchip/arm-gic-v3.h 
> b/include/linux/irqchip/arm-gic-v3.h
> index cbb872c..bc4b95b 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -346,6 +346,8 @@
>  #define GITS_CBASER_RaWaWt   GIC_BASER_CACHEABILITY(GITS_CBASER, INNER, 
> RaWaWt)
>  #define GITS_CBASER_RaWaWb   GIC_BASER_CACHEABILITY(GITS_CBASER, INNER, 
> RaWaWb)
>  
> +#define GITS_CBASER_ADDRESS(cbaser)  ((cbaser) & GENMASK_ULL(52, 12))
> +
>  #define GITS_BASER_NR_REGS   8
>  
>  #define GITS_BASER_VALID (1ULL << 63)
> @@ -377,6 +379,9 @@
>  #define GITS_BASER_ENTRY_SIZE_MASK   GENMASK_ULL(52, 48)
>  #define GITS_BASER_PHYS_52_to_48(phys)   
> \
>   (((phys) & GENMASK_ULL(47, 16)) | (((phys) >> 48) & 0xf) << 12)
> +#define GITS_BASER_ADDR_48_to_52(baser)  
> \
> + (((baser) & GENMASK_ULL(47, 16)) | (((baser) >> 12) & 0xf) << 48)
only works if page_size = 64kB which is the case in vITS but as it is in
irqchip header, may be worth a comment?
> +
>  #define GITS_BASER_SHAREABILITY_SHIFT(10)
>  #define GITS_BASER_InnerShareable\
>   GIC_BASER_SHAREABILITY(GITS_BASER, InnerShareable)
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 4ed79c9..c6eb390 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -234,13 +234,6 @@ static struct its_ite *find_ite(struct vgic_its *its, 
> u32 device_id,
>   list_for_each_entry(dev, &(its)->device_list, dev_list) \
>   list_for_each_entry(ite, &(dev)->itt_head, ite_list)
>  
> -/*
> - * We only implement 48 bits of PA at the moment, although the ITS
> - * supports more. Let's be restrictive here.
> - */
> -#define BASER_ADDRESS(x) ((x) & GENMASK_ULL(47, 16))
> -#define CBASER_ADDRESS(x)((x) & GENMASK_ULL(47, 12))
> -
>  #define GIC_LPI_OFFSET 8192
>  
>  #define VITS_TYPER_IDBITS 16
> @@ -752,6 +745,7 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 
> baser, u32 id,
>  {
>   int l1_tbl_size = GITS_BASER_NR_PAGES(baser) * SZ_64K;
>   u64 indirect_ptr, type = GITS_BASER_TYPE(baser);
> + phys_addr_t base = GITS_BASER_ADDR_48_to_52(baser);
>   int esz = GITS_BASER_ENTRY_SIZE(baser);
>   int index;
>   gfn_t gfn;
> @@ -776,7 +770,7 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 
> baser, u32 id,
>   if (id >= (l1_tbl_size / esz))
>   return false;
>  
> - addr = BASER_ADDRESS(baser) + id * esz;
> + addr = base + id * esz;
>   gfn = addr >> PAGE_SHIFT;
>  
>   if (eaddr)
> @@ -791,7 +785,7 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 
> baser, u32 id,
>  
>   /* Each 1st level entry is represented by a 64-bit value. */
>   if (kvm_read_guest_lock(its->dev->kvm,
> -BASER_ADDRESS(baser) + index * sizeof(indirect_ptr),
> +base + index * sizeof(indirect_ptr),
>  &indirect_ptr, sizeof(indirect_ptr)))
>   return false;
>  
> @@ -801,11 +795,7 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 
> baser, u32 id,
>   if (!(indirect_ptr & BIT_ULL(63)))
>   return false;
>  
> - /*
> -  * Mask the guest physical address and calculate the frame number.
> -  * Any address beyond our supported 48 bits of PA will be caught
> -  * by the actual check in the final step.
> -  */
> + /* Mask the guest physical address and calculate the frame number. */
>   indirect_ptr &= GENMASK_ULL(51, 16);
>  
>   /* Find the address of the actual entry */
> @@ -1297,9 +1287,6 @@ static u64 vgic_sanitise_its_baser(u64 reg)
> GITS_BASER_OUTER_CACHEABILITY_SHIFT,
>  

Re: [PATCH v3 10/20] kvm: arm64: Dynamic configuration of VTTBR mask

2018-07-04 Thread Auger Eric
Hi Suzuki,

On 07/03/2018 01:54 PM, Suzuki K Poulose wrote:
> Hi Eric,
> 
> On 02/07/18 15:41, Auger Eric wrote:
>> Hi Suzuki,
>>
>> On 06/29/2018 01:15 PM, Suzuki K Poulose wrote:
>>> On arm64 VTTBR_EL2:BADDR holds the base address for the stage2
>>> translation table. The Arm ARM mandates that the bits BADDR[x-1:0]
>>> should be 0, where 'x' is defined for a given IPA Size and the
>>> number of levels for a translation granule size. It is defined
>>> using some magical constants. This patch is a reverse engineered
>>> implementation to calculate the 'x' at runtime for a given ipa and
>>> number of page table levels. See patch for more details.
>>>
>>> Cc: Marc Zyngier 
>>> Cc: Christoffer Dall 
>>> Signed-off-by: Suzuki K Poulose 
>>> ---
>>> Changes since V2:
>>>   - Part 1 of spilt from VTCR & VTTBR dynamic configuration
>>> ---
>>>   arch/arm64/include/asm/kvm_arm.h | 60
>>> +---
>>>   arch/arm64/include/asm/kvm_mmu.h | 25 -
>>>   2 files changed, 80 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_arm.h
>>> b/arch/arm64/include/asm/kvm_arm.h
>>> index 3dffd38..c557f45 100644
>>> --- a/arch/arm64/include/asm/kvm_arm.h
>>> +++ b/arch/arm64/include/asm/kvm_arm.h
>>> @@ -140,8 +140,6 @@
>>>* Note that when using 4K pages, we concatenate two first level
>>> page tables
>>>* together. With 16K pages, we concatenate 16 first level page
>>> tables.
>>>*
>>> - * The magic numbers used for VTTBR_X in this patch can be found in
>>> Tables
>>> - * D4-23 and D4-25 in ARM DDI 0487A.b.
>> Isn't it a pretty old reference? Could you refer to C.a?
> 
> Sure, I will update the references everywhere.
> 
>>> + *
>>> + * The algorithm defines the expectations on the BaseAddress (for
>>> the page
>>> + * table) bits resolved at each level based on the page size, entry
>>> level
>>> + * and T0SZ. The variable "x" in the algorithm also affects the
>>> VTTBR:BADDR
>>> + * for stage2 page table.
>>> + *
>>> + * The value of "x" is calculated as :
>>> + *x = Magic_N - T0SZ
>>> + *
>>> + * where Magic_N is an integer depending on the page size and the entry
>>> + * level of the page table as below:
>>> + *
>>> + *
>>> + *| Entry level|  4K16K   64K |
>>> + *
>>> + *| Level: 0 (4 levels)| 28   |  -  |  -  |
>>> + *
>>> + *| Level: 1 (3 levels)| 37   | 31  | 25  |
>>> + *
>>> + *| Level: 2 (2 levels)| 46   | 42  | 38  |
>>> + *
>>> + *| Level: 3 (1 level)| -| 53  | 51  |
>>> + *
>> I understand entry level = Lookup level in the table.
> 
> Entry level => The level at which we start the page table walk for
> a given address (This is in line with the ARM ARM). So,
> 
> Entry_level = (4 - Number_of_Page_table_levels)
> 
>> But you may want to compute x for BaseAddress matching lookup level 2
>> with number of levels = 4.
> 
> No, the BaseAddress is only calcualted for the "Entry_level". So the
> above case doesn't exist at all.
> 
>> So shouldn't you s/Number of levels/4 - entry_level?
> 
> Ok, I now understood what you are referring to [0]
>> for BADDR we want the BaseAddr of the initial lookup level so
>> effectively the entry level we are interested in is 4 - number of levels
>> and we don't care or d) condition. At least this is my understanding ;-)
>> If correct you may slightly reword the explanation?
> 
> 
>>> + *
>>> + * We have a magic formula for the Magic_N below.
>>> + *
>>> + *  Magic_N(PAGE_SIZE, Entry_Level) = 64 - ((PAGE_SHIFT - 3) *
>>> Number of levels)
> 
> [0] ^^^
> 
> 
> 
>>> + *
>>> + * where number of levels = (4 - Entry_Level).
> 
> ^^^ Doesn't this help make it clear ? Using the expansion makes it a bit
> more
> unreadable below.

I just wanted to mention the tables you refer (D4-23 and D4-25) give
Magic_N for a larger scope as they deal with any lookup level while we
only care about the entry level for BADDR. So I was a little bit
confused when reading the explanation but that's not a big deal.

> 
>>>   +/*
>>> + * Get the magic number 'x' for VTTBR:BADDR of this KVM instance.
>>> + * With v8.2 LVA extensions, 'x' should be a minimum of 6 with
>>> + * 52bit IPS.
>> Link to the spec?
> 
> Sure, will add it.
> 
> Thanks for the patience to review :-)
you're welcome ;-)

Eric
> 
> Cheers
> Suzuki
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 10/20] kvm: arm64: Dynamic configuration of VTTBR mask

2018-07-04 Thread Suzuki K Poulose

On 07/04/2018 09:24 AM, Auger Eric wrote:

+ *
+ * We have a magic formula for the Magic_N below.
+ *
+ *  Magic_N(PAGE_SIZE, Entry_Level) = 64 - ((PAGE_SHIFT - 3) *
Number of levels)


[0] ^^^




+ *
+ * where number of levels = (4 - Entry_Level).


^^^ Doesn't this help make it clear ? Using the expansion makes it a bit
more
unreadable below.


I just wanted to mention the tables you refer (D4-23 and D4-25) give
Magic_N for a larger scope as they deal with any lookup level while we
only care about the entry level for BADDR. So I was a little bit
confused when reading the explanation but that's not a big deal.


Ah, ok. I will try to clarify it.

Cheers
Suzuki
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 0/6] KVM: arm/arm64: vgic: Virtual interrupt grouping support

2018-07-04 Thread Christoffer Dall
This small series addresses a peculiarity of the current VGIC
implementation, namely that we don't support interrupt grouping.

KVM either implements a GICv2 without support for the security
extensions, or a GICv3 with DS=1.  For GICv2, on systems without the
security extensions, group 0 interrupts can be configured to be either
signalled as FIQs or as IRQs by the VM, whereas group 1 interrupts are
always IRQs.  For GICv3, with DS=1, group 1 interrupts are always IRQs
and group 0 interrupts are always FIQs, and there is no concept of
secure vs. non-secure group 1 interrupts when DS=1.

We were treating all interrupts on GICv2 as group 0, but yet telling the
guest that they were group 1.  The first patch changes this behavior,
which seems to have no effect on no known guests, but still.

The remaining patches introduce proper interrupt grouping support, along
with MMIO accessors for the VM and userspace to retrieve and set the
which group SGIs, PPIs, and SPIs belong to.  LPIs are always group 1
interrupts as per the architecture, and there is no way to modify this
configuration (no IGROUPR registers for LPIs or equivalent ITS
commands).

Lightly tested on Seattle, TX1, and the foundation model.  I've run a
GICv2 guest on a GICv3 host on the foundation model.  I've tested
migration on Seattle, and as expected, migrating a payload from an older
kernel fails, but migration on same kernel works fine both before and
after this series.  See the last patch for more info.

Applies to v4.18-rc3.

Changes since v1:
 - Bumped implementation revision field in GICD_IIDR along with changes
 - Changed logic in init code to correctly detect vgic emulation type
 - Rebased to v4.18-rc3

Christoffer Dall (6):
  KVM: arm/arm64: vgic: Define GICD_IIDR fields for GICv2 and GIv3
  KVM: arm/arm64: vgic: Keep track of implementation revision
  KVM: arm/arm64: vgic: GICv2 IGROUPR should read as zero
  KVM: arm/arm64: vgic: Add group field to struct irq
  KVM: arm/arm64: vgic: Signal IRQs using their configured group
  KVM: arm/arm64: vgic: Allow configuration of interrupt groups

 include/kvm/arm_vgic.h |  4 
 include/linux/irqchip/arm-gic-v3.h | 10 ++
 include/linux/irqchip/arm-gic.h| 11 +++
 virt/kvm/arm/vgic/vgic-debug.c |  8 +---
 virt/kvm/arm/vgic/vgic-init.c  | 20 ++--
 virt/kvm/arm/vgic/vgic-its.c   |  1 +
 virt/kvm/arm/vgic/vgic-mmio-v2.c   | 19 +++
 virt/kvm/arm/vgic/vgic-mmio-v3.c   | 20 +++-
 virt/kvm/arm/vgic/vgic-mmio.c  | 38 ++
 virt/kvm/arm/vgic/vgic-mmio.h  |  6 ++
 virt/kvm/arm/vgic/vgic-v2.c|  3 +++
 virt/kvm/arm/vgic/vgic-v3.c|  6 +-
 12 files changed, 127 insertions(+), 19 deletions(-)

--
2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 1/6] KVM: arm/arm64: vgic: Define GICD_IIDR fields for GICv2 and GIv3

2018-07-04 Thread Christoffer Dall
Instead of hardcoding the shifts and masks in the GICD_IIDR register
emulation, let's add the definition of these fields to the GIC header
files and use them.

This will make things more obivous when we're going to bump the revision
in the IIDR when we'll make guest-visible changes to the implementation.

Signed-off-by: Christoffer Dall 
---
 include/linux/irqchip/arm-gic-v3.h | 10 ++
 include/linux/irqchip/arm-gic.h| 10 ++
 virt/kvm/arm/vgic/vgic-mmio-v2.c   |  3 ++-
 virt/kvm/arm/vgic/vgic-mmio-v3.c   |  3 ++-
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index cbb872c..b22f9df 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -61,6 +61,16 @@
 #define GICD_CTLR_ENABLE_G1A   (1U << 1)
 #define GICD_CTLR_ENABLE_G1(1U << 0)

+#define GICD_IIDR_IMPLEMENTER_SHIFT0
+#define GICD_IIDR_IMPLEMENTER_MASK (0xfff << GICD_IIDR_IMPLEMENTER_SHIFT)
+#define GICD_IIDR_REVISION_SHIFT   12
+#define GICD_IIDR_REVISION_MASK(0xf << 
GICD_IIDR_REVISION_SHIFT)
+#define GICD_IIDR_VARIANT_SHIFT16
+#define GICD_IIDR_VARIANT_MASK (0xf << GICD_IIDR_VARIANT_SHIFT)
+#define GICD_IIDR_PRODUCT_ID_SHIFT 24
+#define GICD_IIDR_PRODUCT_ID_MASK  (0xff << GICD_IIDR_PRODUCT_ID_SHIFT)
+
+
 /*
  * In systems with a single security state (what we emulate in KVM)
  * the meaning of the interrupt group enable bits is slightly different
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 68d8b1f..484f5bf 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -71,6 +71,16 @@
(GICD_INT_DEF_PRI << 8) |\
GICD_INT_DEF_PRI)

+#define GICD_IIDR_IMPLEMENTER_SHIFT0
+#define GICD_IIDR_IMPLEMENTER_MASK (0xfff << GICD_IIDR_IMPLEMENTER_SHIFT)
+#define GICD_IIDR_REVISION_SHIFT   12
+#define GICD_IIDR_REVISION_MASK(0xf << 
GICD_IIDR_REVISION_SHIFT)
+#define GICD_IIDR_VARIANT_SHIFT16
+#define GICD_IIDR_VARIANT_MASK (0xf << GICD_IIDR_VARIANT_SHIFT)
+#define GICD_IIDR_PRODUCT_ID_SHIFT 24
+#define GICD_IIDR_PRODUCT_ID_MASK  (0xff << GICD_IIDR_PRODUCT_ID_SHIFT)
+
+
 #define GICH_HCR   0x0
 #define GICH_VTR   0x4
 #define GICH_VMCR  0x8
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index ffc587b..af44e569 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -37,7 +37,8 @@ static unsigned long vgic_mmio_read_v2_misc(struct kvm_vcpu 
*vcpu,
value |= (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
break;
case GIC_DIST_IIDR:
-   value = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+   value = (PRODUCT_ID_KVM << GICD_IIDR_PRODUCT_ID_SHIFT) |
+   (IMPLEMENTER_ARM << GICD_IIDR_IMPLEMENTER_SHIFT);
break;
default:
return 0;
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 2877840..c03f424 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -81,7 +81,8 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu 
*vcpu,
}
break;
case GICD_IIDR:
-   value = (PRODUCT_ID_KVM << 24) | (IMPLEMENTER_ARM << 0);
+   value = (PRODUCT_ID_KVM << GICD_IIDR_PRODUCT_ID_SHIFT) |
+   (IMPLEMENTER_ARM << GICD_IIDR_IMPLEMENTER_SHIFT);
break;
default:
return 0;
--
2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 2/6] KVM: arm/arm64: vgic: Keep track of implementation revision

2018-07-04 Thread Christoffer Dall
As we are about to tweak implementation aspects of the VGIC emulation,
while still preserving some level of backwards compatibility support,
add a field to keep track of the implementation revision field which is
reported to the VM and to userspace.

Signed-off-by: Christoffer Dall 
---
 include/kvm/arm_vgic.h   | 3 +++
 virt/kvm/arm/vgic/vgic-init.c| 1 +
 virt/kvm/arm/vgic/vgic-mmio-v2.c | 6 --
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 6 --
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index cfdd248..7e64c46 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -217,6 +217,9 @@ struct vgic_dist {
/* vGIC model the kernel emulates for the guest (GICv2 or GICv3) */
u32 vgic_model;

+   /* Implementation revision as reported in the GICD_IIDR */
+   u32 implementation_rev;
+
/* Do injected MSIs require an additional device ID? */
boolmsis_require_devid;

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index b714179..8b6fc45 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -298,6 +298,7 @@ int vgic_init(struct kvm *kvm)

vgic_debug_init(kvm);

+   dist->implementation_rev = 0;
dist->initialized = true;

 out:
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index af44e569..f0c5351 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -25,19 +25,21 @@
 static unsigned long vgic_mmio_read_v2_misc(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
 {
+   struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
u32 value;

switch (addr & 0x0c) {
case GIC_DIST_CTRL:
-   value = vcpu->kvm->arch.vgic.enabled ? GICD_ENABLE : 0;
+   value = vgic->enabled ? GICD_ENABLE : 0;
break;
case GIC_DIST_CTR:
-   value = vcpu->kvm->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS;
+   value = vgic->nr_spis + VGIC_NR_PRIVATE_IRQS;
value = (value >> 5) - 1;
value |= (atomic_read(&vcpu->kvm->online_vcpus) - 1) << 5;
break;
case GIC_DIST_IIDR:
value = (PRODUCT_ID_KVM << GICD_IIDR_PRODUCT_ID_SHIFT) |
+   (vgic->implementation_rev << GICD_IIDR_REVISION_SHIFT) |
(IMPLEMENTER_ARM << GICD_IIDR_IMPLEMENTER_SHIFT);
break;
default:
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index c03f424..ebe10a0 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -62,16 +62,17 @@ bool vgic_supports_direct_msis(struct kvm *kvm)
 static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
 {
+   struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
u32 value = 0;

switch (addr & 0x0c) {
case GICD_CTLR:
-   if (vcpu->kvm->arch.vgic.enabled)
+   if (vgic->enabled)
value |= GICD_CTLR_ENABLE_SS_G1;
value |= GICD_CTLR_ARE_NS | GICD_CTLR_DS;
break;
case GICD_TYPER:
-   value = vcpu->kvm->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS;
+   value = vgic->nr_spis + VGIC_NR_PRIVATE_IRQS;
value = (value >> 5) - 1;
if (vgic_has_its(vcpu->kvm)) {
value |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
@@ -82,6 +83,7 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu 
*vcpu,
break;
case GICD_IIDR:
value = (PRODUCT_ID_KVM << GICD_IIDR_PRODUCT_ID_SHIFT) |
+   (vgic->implementation_rev << GICD_IIDR_REVISION_SHIFT) |
(IMPLEMENTER_ARM << GICD_IIDR_IMPLEMENTER_SHIFT);
break;
default:
--
2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 3/6] KVM: arm/arm64: vgic: GICv2 IGROUPR should read as zero

2018-07-04 Thread Christoffer Dall
We currently don't support grouping in the emulated VGIC, which is a
known defect on KVM (not hurting any currently used guests as far as
we're aware). This is currently handled by treating all interrupts as
group 0 interrupts for an emulated GICv2 and always signaling interrupts
as group 0 to the virtual CPU interface.

However, when reading which group interrupts belongs to in the guest
from the emulated VGIC, the VGIC currently reports group 1 instead of
group 0, which is misleading.  Fix this temporarily before introducing
full group support by changing the hander to _raz instead of _rao.

Fixes: fb848db39661a "KVM: arm/arm64: vgic-new: Add GICv2 MMIO handling 
framework"
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic/vgic-init.c| 2 +-
 virt/kvm/arm/vgic/vgic-mmio-v2.c | 8 +++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 8b6fc45..230c922 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -298,7 +298,7 @@ int vgic_init(struct kvm *kvm)

vgic_debug_init(kvm);

-   dist->implementation_rev = 0;
+   dist->implementation_rev = 1;
dist->initialized = true;

 out:
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index f0c5351..db646f1 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -22,6 +22,12 @@
 #include "vgic.h"
 #include "vgic-mmio.h"

+/*
+ * The Revision field in the IIDR have the following meanings:
+ *
+ * Revision 1: Report GICv2 interrupts as group 0 instead of group 1
+ */
+
 static unsigned long vgic_mmio_read_v2_misc(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
 {
@@ -365,7 +371,7 @@ static const struct vgic_register_region 
vgic_v2_dist_registers[] = {
vgic_mmio_read_v2_misc, vgic_mmio_write_v2_misc, 12,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_IGROUP,
-   vgic_mmio_read_rao, vgic_mmio_write_wi, NULL, NULL, 1,
+   vgic_mmio_read_raz, vgic_mmio_write_wi, NULL, NULL, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_SET,
vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
--
2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 4/6] KVM: arm/arm64: vgic: Add group field to struct irq

2018-07-04 Thread Christoffer Dall
In preparation for proper group 0 and group 1 support in the vgic, we
add a field in the struct irq to store the group of all interrupts.

We initialize the group to group 0 when emulating GICv2 and to group 1
when emulating GICv3, just like we treat them today.  LPIs are always
group 1.  We also continue to ignore writes from the guest, preserving
existing functionality, for now.

Finally, we also add this field to the vgic debug logic to show the
group for all interrupts.

Signed-off-by: Christoffer Dall 
---
 include/kvm/arm_vgic.h |  1 +
 virt/kvm/arm/vgic/vgic-debug.c |  8 +---
 virt/kvm/arm/vgic/vgic-init.c  | 19 +--
 virt/kvm/arm/vgic/vgic-its.c   |  1 +
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7e64c46..c661d0e 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -133,6 +133,7 @@ struct vgic_irq {
u8 source;  /* GICv2 SGIs only */
u8 active_source;   /* GICv2 SGIs only */
u8 priority;
+   u8 group;   /* 0 == group 0, 1 == group 1 */
enum vgic_irq_config config;/* Level or edge */

/*
diff --git a/virt/kvm/arm/vgic/vgic-debug.c b/virt/kvm/arm/vgic/vgic-debug.c
index c589d4c..d3a403f 100644
--- a/virt/kvm/arm/vgic/vgic-debug.c
+++ b/virt/kvm/arm/vgic/vgic-debug.c
@@ -148,6 +148,7 @@ static void print_dist_state(struct seq_file *s, struct 
vgic_dist *dist)

seq_printf(s, "P=pending_latch, L=line_level, A=active\n");
seq_printf(s, "E=enabled, H=hw, C=config (level=1, edge=0)\n");
+   seq_printf(s, "G=group\n");
 }

 static void print_header(struct seq_file *s, struct vgic_irq *irq,
@@ -162,8 +163,8 @@ static void print_header(struct seq_file *s, struct 
vgic_irq *irq,
}

seq_printf(s, "\n");
-   seq_printf(s, "%s%2d TYP   ID TGT_ID PLAEHC HWID   TARGET SRC PRI 
VCPU_ID\n", hdr, id);
-   seq_printf(s, 
"---\n");
+   seq_printf(s, "%s%2d TYP   ID TGT_ID PLAEHCG HWID   TARGET SRC PRI 
VCPU_ID\n", hdr, id);
+   seq_printf(s, 
"\n");
 }

 static void print_irq_state(struct seq_file *s, struct vgic_irq *irq,
@@ -182,7 +183,7 @@ static void print_irq_state(struct seq_file *s, struct 
vgic_irq *irq,

seq_printf(s, "   %s %4d "
  "%2d "
- "%d%d%d%d%d%d "
+ "%d%d%d%d%d%d%d "
  "%8d "
  "%8x "
  " %2x "
@@ -197,6 +198,7 @@ static void print_irq_state(struct seq_file *s, struct 
vgic_irq *irq,
irq->enabled,
irq->hw,
irq->config == VGIC_CONFIG_LEVEL,
+   irq->group,
irq->hwintid,
irq->mpidr,
irq->source,
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 230c922..a7c19cd 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -175,10 +175,13 @@ static int kvm_vgic_dist_init(struct kvm *kvm, unsigned 
int nr_spis)
irq->vcpu = NULL;
irq->target_vcpu = vcpu0;
kref_init(&irq->refcount);
-   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V2)
+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V2) {
irq->targets = 0;
-   else
+   irq->group = 0;
+   } else {
irq->mpidr = 0;
+   irq->group = 1;
+   }
}
return 0;
 }
@@ -227,6 +230,18 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
/* PPIs */
irq->config = VGIC_CONFIG_LEVEL;
}
+
+   /*
+* GICv3 can only be created via the KVM_DEVICE_CREATE API and
+* so we always know the emulation type at this point as it's
+* either explicitly configured as GICv3, or explicitly
+* configured as GICv2, or not configured yet which also
+* implies GICv2.
+*/
+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   irq->group = 1;
+   else
+   irq->group = 0;
}

if (!irqchip_in_kernel(vcpu->kvm))
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 4ed79c9..92840c0 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -71,6 +71,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 
intid,
kref_init(&irq->refcount);
irq->intid = intid;
irq->target_vcpu = vcpu;
+   irq->group = 1;

spin_lock_irqsave(

[PATCH v2 5/6] KVM: arm/arm64: vgic: Signal IRQs using their configured group

2018-07-04 Thread Christoffer Dall
Now when we have a group configuration on the struct IRQ, use this state
when populating the LR and signaling interrupts as either group 0 or
group 1 to the VM.  Depending on the model of the emulated GIC, and the
guest's configuration of the VMCR, interrupts may be signaled as IRQs or
FIQs to the VM.

Signed-off-by: Christoffer Dall 
---
 include/linux/irqchip/arm-gic.h | 1 +
 virt/kvm/arm/vgic/vgic-v2.c | 3 +++
 virt/kvm/arm/vgic/vgic-v3.c | 6 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 484f5bf..6c4aaf0 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -104,6 +104,7 @@
 #define GICH_LR_PENDING_BIT(1 << 28)
 #define GICH_LR_ACTIVE_BIT (1 << 29)
 #define GICH_LR_EOI(1 << 19)
+#define GICH_LR_GROUP1 (1 << 30)
 #define GICH_LR_HW (1 << 31)

 #define GICH_VMCR_ENABLE_GRP0_SHIFT0
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index a5f2e44..df5e6a6 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -159,6 +159,9 @@ void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct 
vgic_irq *irq, int lr)
}
}

+   if (irq->group)
+   val |= GICH_LR_GROUP1;
+
if (irq->hw) {
val |= GICH_LR_HW;
val |= irq->hwintid << GICH_LR_PHYSID_CPUID_SHIFT;
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index cdce653..530b849 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -197,11 +197,7 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct 
vgic_irq *irq, int lr)
if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT))
irq->line_level = false;

-   /*
-* We currently only support Group1 interrupts, which is a
-* known defect. This needs to be addressed at some point.
-*/
-   if (model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   if (irq->group)
val |= ICH_LR_GROUP;

val |= (u64)irq->priority << ICH_LR_PRIORITY_SHIFT;
--
2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 6/6] KVM: arm/arm64: vgic: Allow configuration of interrupt groups

2018-07-04 Thread Christoffer Dall
Implement the required MMIO accessors for GICv2 and GICv3 for the
IGROUPR distributor and redistributor registers.

This can allow guests to change behavior compared to running on previous
versions of KVM, but only to align with the architecture and hardware
implementations.

This also allows userspace to configure the groups for interrupts.  Note
that this potentially results in GICv2 guests not receiving interrupts
after migration if migrating from an older kernel that exposes GICv2
interrupts as group 1.

Cc: Andrew Jones 
Signed-off-by: Christoffer Dall 
---
I implemented (but stashed) a version of this which predicated the
behavior based on the value of GICD_IIDR revision field, falling back to
ignoring writes and resetting GICv2 groups to 0 if the guest wrote a
revision less than 2.  However, current QEMU implementations simply
don't write the GICD_IIDR, so this doesn't help at all without changing
QEMU anyhow.

The only actual fix I can see here to work around the problem in the
kernel is to require an opt-in to allow restoring groups from userspace,
but that's a lot of logic to support cross-kernel version migration.

Question: Do we expect that cross-kernel version migration is a critical
feature that people really expect to work, and do we actually have
examples of catering to this in the kernel elsewhere?  (Also, how would
then that relate to the whole 'adding a new sysreg breaks migration'
situation?)

 virt/kvm/arm/vgic/vgic-init.c|  2 +-
 virt/kvm/arm/vgic/vgic-mmio-v2.c |  4 +++-
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 11 +--
 virt/kvm/arm/vgic/vgic-mmio.c| 38 ++
 virt/kvm/arm/vgic/vgic-mmio.h|  6 ++
 5 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index a7c19cd..c0c0b88 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -313,7 +313,7 @@ int vgic_init(struct kvm *kvm)

vgic_debug_init(kvm);

-   dist->implementation_rev = 1;
+   dist->implementation_rev = 2;
dist->initialized = true;

 out:
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index db646f1..a7f09b5 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -26,6 +26,8 @@
  * The Revision field in the IIDR have the following meanings:
  *
  * Revision 1: Report GICv2 interrupts as group 0 instead of group 1
+ * Revision 2: Interrupt groups are guest-configurable and signaled using
+ *their configured groups.
  */

 static unsigned long vgic_mmio_read_v2_misc(struct kvm_vcpu *vcpu,
@@ -371,7 +373,7 @@ static const struct vgic_register_region 
vgic_v2_dist_registers[] = {
vgic_mmio_read_v2_misc, vgic_mmio_write_v2_misc, 12,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_IGROUP,
-   vgic_mmio_read_raz, vgic_mmio_write_wi, NULL, NULL, 1,
+   vgic_mmio_read_group, vgic_mmio_write_group, NULL, NULL, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_SET,
vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index ebe10a0..49df2a1 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -59,6 +59,13 @@ bool vgic_supports_direct_msis(struct kvm *kvm)
return kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm);
 }

+/*
+ * The Revision field in the IIDR have the following meanings:
+ *
+ * Revision 2: Interrupt groups are guest-configurable and signaled using
+ *their configured groups.
+ */
+
 static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
 {
@@ -454,7 +461,7 @@ static const struct vgic_register_region 
vgic_v3_dist_registers[] = {
vgic_mmio_read_rao, vgic_mmio_write_wi, 4,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_IGROUPR,
-   vgic_mmio_read_rao, vgic_mmio_write_wi, NULL, NULL, 1,
+   vgic_mmio_read_group, vgic_mmio_write_group, NULL, NULL, 1,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISENABLER,
vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
@@ -527,7 +534,7 @@ static const struct vgic_register_region 
vgic_v3_rdbase_registers[] = {

 static const struct vgic_register_region vgic_v3_sgibase_registers[] = {
REGISTER_DESC_WITH_LENGTH(GICR_IGROUPR0,
-   vgic_mmio_read_rao, vgic_mmio_write_wi, 4,
+   vgic_mmio_read_group, vgic_mmio_write_group, 4,
VGIC_ACCESS_32bit),
REGISTER_DESC_WITH_LENGTH(GICR_ISENABLER0,
vgic_mmio_read_enable, vgic_mmio_write_senable, 4,
diff --git a/virt/kvm/arm/v

Re: [PATCH v2 0/5] Add virtio-iommu driver

2018-07-04 Thread Jean-Philippe Brucker
On 27/06/18 20:59, Michael S. Tsirkin wrote:
>> Another reason to keep the MMIO transport option is that one
>> virtio-iommu can manage DMA from endpoints on multiple PCI domains at
>> the same time, as well as platform devices. Some VMMs might want that,
>> in which case the IOMMU would be a separate platform device.
> 
> Which buses are managed by the IOMMU is separate from the bus
> on which it's programming interface resides.

Sorry I didn't get this. We probably don't want to instantiate a PCI
root complex just for the IOMMU, so it needs to be in the same PCI
segment as managed endpoints. For example in my VM the AMD IOMMU is
presented as 00:02.0, between other devices on PCI bus 00.

In any case, I have a solution for virtio-pci that works with DT and
ACPI, and isn't excessively awful. I'll probably send it as part of the
next version.

Thanks,
Jean

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 1/5] dt-bindings: virtio: Specify #iommu-cells value for a virtio-iommu

2018-07-04 Thread Robin Murphy

On 27/06/18 18:46, Rob Herring wrote:

On Tue, Jun 26, 2018 at 11:59 AM Jean-Philippe Brucker
 wrote:


On 25/06/18 20:27, Rob Herring wrote:

On Thu, Jun 21, 2018 at 08:06:51PM +0100, Jean-Philippe Brucker wrote:

A virtio-mmio node may represent a virtio-iommu device. This is discovered
by the virtio driver at probe time, but the DMA topology isn't
discoverable and must be described by firmware. For DT the standard IOMMU
description is used, as specified in bindings/iommu/iommu.txt and
bindings/pci/pci-iommu.txt. Like many other IOMMUs, virtio-iommu
distinguishes masters by their endpoint IDs, which requires one IOMMU cell
in the "iommus" property.

Signed-off-by: Jean-Philippe Brucker 
---
  Documentation/devicetree/bindings/virtio/mmio.txt | 8 
  1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt 
b/Documentation/devicetree/bindings/virtio/mmio.txt
index 5069c1b8e193..337da0e3a87f 100644
--- a/Documentation/devicetree/bindings/virtio/mmio.txt
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -8,6 +8,14 @@ Required properties:
  - reg:  control registers base address and size including 
configuration space
  - interrupts:   interrupt generated by the device

+Required properties for virtio-iommu:
+
+- #iommu-cells: When the node describes a virtio-iommu device, it is
+linked to DMA masters using the "iommus" property as
+described in devicetree/bindings/iommu/iommu.txt. For
+virtio-iommu #iommu-cells must be 1, each cell describing
+a single endpoint ID.


The iommus property should also be documented for the client side.


Isn't section "IOMMU master node" of iommu.txt sufficient? Since the
iommus property applies to any DMA master, not only virtio-mmio devices,
the canonical description in iommu.txt seems the best place for it, and
I'm not sure what to add in this file. Maybe a short example below the
virtio_block one?


No, because somewhere we have to capture if 'iommus' is valid for
'virtio-mmio' or not. Hopefully soon we'll actually be able to
validate that.


Indeed, it's rather unusual to have a single compatible which may either 
be an IOMMU or an IOMMU client (but not both at once, I hope!), so 
nailing down the exact semantics as clearly as possible would definitely 
be desirable.


Robin.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

2018-07-04 Thread Will Deacon
On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:
> Add an option to specify the physical address size used by this
> VM.
> 
> Signed-off-by: Suzuki K Poulose 
> ---
>  arm/aarch64/include/kvm/kvm-config-arch.h | 5 -
>  arm/include/arm-common/kvm-config-arch.h  | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h 
> b/arm/aarch64/include/kvm/kvm-config-arch.h
> index 04be43d..dabd22c 100644
> --- a/arm/aarch64/include/kvm/kvm-config-arch.h
> +++ b/arm/aarch64/include/kvm/kvm-config-arch.h
> @@ -8,7 +8,10 @@
>   "Create PMUv3 device"), \
>   OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed, \
>   "Specify random seed for Kernel Address Space " \
> - "Layout Randomization (KASLR)"),
> + "Layout Randomization (KASLR)"),\
> + OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift, \
> + "Specify maximum physical address size (not "   \
> + "the amount of memory)"),

Given that this is a shift value, I think the help message could be more
informative. Something like:

"Specify maximum number of bits in a guest physical address"

I think I'd actually leave out any mention of memory, because this does
actually have an effect on the amount of addressable memory in a way that I
don't think we want to describe in half of a usage message line :)

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 24/24] kvmtool: arm: Add support for creating VM with PA size

2018-07-04 Thread Will Deacon
On Fri, Jun 29, 2018 at 12:15:44PM +0100, Suzuki K Poulose wrote:
> diff --git a/arm/kvm.c b/arm/kvm.c
> index 5701d41..b1969be 100644
> --- a/arm/kvm.c
> +++ b/arm/kvm.c
> @@ -11,6 +11,8 @@
>  #include 
>  #include 
>  
> +unsigned long kvm_arm_type;
> +
>  struct kvm_ext kvm_req_ext[] = {
>   { DEFINE_KVM_EXT(KVM_CAP_IRQCHIP) },
>   { DEFINE_KVM_EXT(KVM_CAP_ONE_REG) },
> @@ -18,6 +20,26 @@ struct kvm_ext kvm_req_ext[] = {
>   { 0, 0 },
>  };
>  
> +#ifndef KVM_ARM_GET_MAX_VM_PHYS_SHIFT
> +#define KVM_ARM_GET_MAX_VM_PHYS_SHIFT_IO(KVMIO, 0x0b)
> +#endif
> +
> +void kvm__arch_init_hyp(struct kvm *kvm)
> +{
> + int max_ipa;
> +
> + max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
> + if (max_ipa < 0)
> + max_ipa = 40;
> + if (!kvm->cfg.arch.phys_shift)
> + kvm->cfg.arch.phys_shift = 40;
> + if (kvm->cfg.arch.phys_shift > max_ipa)
> + die("Requested PA size (%u) is not supported by the host 
> (%ubits)\n",
> + kvm->cfg.arch.phys_shift, max_ipa);
> + if (kvm->cfg.arch.phys_shift != 40)
> + kvm_arm_type = kvm->cfg.arch.phys_shift;
> +}

Seems a bit weird that the "machine type identifier" to KVM_CREATE_VM is
dedicated entirely to holding the physical address shift verbatim. Is this
really the ABI?

Also, couldn't KVM figure it out automatically if you add memslots at high
addresses, making this a niche tunable outside of testing?

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up

2018-07-04 Thread Will Deacon
Hi James,

On Tue, Jun 26, 2018 at 06:00:56PM +0100, James Morse wrote:
> The aim of this series is to wire arm64's SDEI into APEI.
> 
> On arm64 we have three APEI notifications that are NMI-like, and
> in the unlikely event that all three are supported by a platform,
> they can interrupt each other.
> The GHES driver shouldn't have to deal with this, so this series aims
> to make it re-entrant.
> 
> To do that, we refactor the estatus queue to allow multiple notifications
> to use it, then convert NOTIFY_SEA to always be described as NMI-like,
> and to use the estatus queue.
> 
> From here we push the locking and fixmap choices out to the notification
> functions, and remove the use of per-ghes estatus and flags. This removes
> the in_nmi() 'timebomb' in ghes_copy_tofrom_phys().
> 
> Things get sticky when an NMI notification needs to know how big the
> CPER records might be, before reading it. This series splits
> ghes_estatus_read() to let us peek at the buffer. A side effect of this
> is the 20byte header will get read twice. (how does it work today? it
> reads the records into a per-ghes worst-case sized buffer, allocates
> the correct size and copies the records. in_nmi() use of this per-ghes
> buffer needs eliminating).
> 
> One alternative was to trust firmware's 'max raw data length' and use
> that to allocate 'enough' memory. We don't use this value today, so its
> probably wrong on some sytem somewhere.
> 
> Since v4 patches 5,8-15 are new, otherwise changes are noted in the patch.

The little bits touching arch/arm64/ all look fine to me here, but it looks
like other patches need review separately and ultimately I suspect you're
going to route it via some other tree.

Let me know if you need me to help with anything.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 24/24] kvmtool: arm: Add support for creating VM with PA size

2018-07-04 Thread Marc Zyngier
On Wed, 04 Jul 2018 15:22:42 +0100,
Will Deacon  wrote:
> 
> On Fri, Jun 29, 2018 at 12:15:44PM +0100, Suzuki K Poulose wrote:
> > diff --git a/arm/kvm.c b/arm/kvm.c
> > index 5701d41..b1969be 100644
> > --- a/arm/kvm.c
> > +++ b/arm/kvm.c
> > @@ -11,6 +11,8 @@
> >  #include 
> >  #include 
> >  
> > +unsigned long kvm_arm_type;
> > +
> >  struct kvm_ext kvm_req_ext[] = {
> > { DEFINE_KVM_EXT(KVM_CAP_IRQCHIP) },
> > { DEFINE_KVM_EXT(KVM_CAP_ONE_REG) },
> > @@ -18,6 +20,26 @@ struct kvm_ext kvm_req_ext[] = {
> > { 0, 0 },
> >  };
> >  
> > +#ifndef KVM_ARM_GET_MAX_VM_PHYS_SHIFT
> > +#define KVM_ARM_GET_MAX_VM_PHYS_SHIFT  _IO(KVMIO, 0x0b)
> > +#endif
> > +
> > +void kvm__arch_init_hyp(struct kvm *kvm)
> > +{
> > +   int max_ipa;
> > +
> > +   max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
> > +   if (max_ipa < 0)
> > +   max_ipa = 40;
> > +   if (!kvm->cfg.arch.phys_shift)
> > +   kvm->cfg.arch.phys_shift = 40;
> > +   if (kvm->cfg.arch.phys_shift > max_ipa)
> > +   die("Requested PA size (%u) is not supported by the host 
> > (%ubits)\n",
> > +   kvm->cfg.arch.phys_shift, max_ipa);
> > +   if (kvm->cfg.arch.phys_shift != 40)
> > +   kvm_arm_type = kvm->cfg.arch.phys_shift;
> > +}
> 
> Seems a bit weird that the "machine type identifier" to KVM_CREATE_VM is
> dedicated entirely to holding the physical address shift verbatim. Is this
> really the ABI?
> 
> Also, couldn't KVM figure it out automatically if you add memslots at high
> addresses, making this a niche tunable outside of testing?

Not really. Let's say I want my IPA space split in two: memory covers
the low 47 bit, and I want MMIO spanning the top 47 bit. With your
scheme, you'd end-up with a 47bit IPA space, while you really want 48
bits (MMIO space implemented by userspace isn't registered to the
kernel).

M.

-- 
Jazz is not dead, it just smell funny.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

2018-07-04 Thread Julien Grall

Hi,

On 04/07/18 15:09, Will Deacon wrote:

On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:

Add an option to specify the physical address size used by this
VM.

Signed-off-by: Suzuki K Poulose 
---
  arm/aarch64/include/kvm/kvm-config-arch.h | 5 -
  arm/include/arm-common/kvm-config-arch.h  | 1 +
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h 
b/arm/aarch64/include/kvm/kvm-config-arch.h
index 04be43d..dabd22c 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -8,7 +8,10 @@
"Create PMUv3 device"),   \
OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed,\
"Specify random seed for Kernel Address Space "   \
-   "Layout Randomization (KASLR)"),
+   "Layout Randomization (KASLR)"),  \
+   OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift,\
+   "Specify maximum physical address size (not " \
+   "the amount of memory)"),


Given that this is a shift value, I think the help message could be more
informative. Something like:

"Specify maximum number of bits in a guest physical address"

I think I'd actually leave out any mention of memory, because this does
actually have an effect on the amount of addressable memory in a way that I
don't think we want to describe in half of a usage message line :)

Is there any particular reasons to expose this option to the user?

I have recently sent a series to allow the user to specify the position
of the RAM [1]. With that series in mind, I think the user would not 
really need to specify the maximum physical shift. Instead we could 
automatically find it.


Cheers,

[1] 
http://archive.armlinux.org.uk/lurker/message/20180510.140428.1c295b5b.en.html




Will



--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 24/24] kvmtool: arm: Add support for creating VM with PA size

2018-07-04 Thread Will Deacon
On Wed, Jul 04, 2018 at 03:41:18PM +0100, Marc Zyngier wrote:
> On Wed, 04 Jul 2018 15:22:42 +0100,
> Will Deacon  wrote:
> > 
> > On Fri, Jun 29, 2018 at 12:15:44PM +0100, Suzuki K Poulose wrote:
> > > diff --git a/arm/kvm.c b/arm/kvm.c
> > > index 5701d41..b1969be 100644
> > > --- a/arm/kvm.c
> > > +++ b/arm/kvm.c
> > > @@ -11,6 +11,8 @@
> > >  #include 
> > >  #include 
> > >  
> > > +unsigned long kvm_arm_type;
> > > +
> > >  struct kvm_ext kvm_req_ext[] = {
> > >   { DEFINE_KVM_EXT(KVM_CAP_IRQCHIP) },
> > >   { DEFINE_KVM_EXT(KVM_CAP_ONE_REG) },
> > > @@ -18,6 +20,26 @@ struct kvm_ext kvm_req_ext[] = {
> > >   { 0, 0 },
> > >  };
> > >  
> > > +#ifndef KVM_ARM_GET_MAX_VM_PHYS_SHIFT
> > > +#define KVM_ARM_GET_MAX_VM_PHYS_SHIFT_IO(KVMIO, 0x0b)
> > > +#endif
> > > +
> > > +void kvm__arch_init_hyp(struct kvm *kvm)
> > > +{
> > > + int max_ipa;
> > > +
> > > + max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
> > > + if (max_ipa < 0)
> > > + max_ipa = 40;
> > > + if (!kvm->cfg.arch.phys_shift)
> > > + kvm->cfg.arch.phys_shift = 40;
> > > + if (kvm->cfg.arch.phys_shift > max_ipa)
> > > + die("Requested PA size (%u) is not supported by the host 
> > > (%ubits)\n",
> > > + kvm->cfg.arch.phys_shift, max_ipa);
> > > + if (kvm->cfg.arch.phys_shift != 40)
> > > + kvm_arm_type = kvm->cfg.arch.phys_shift;
> > > +}
> > 
> > Seems a bit weird that the "machine type identifier" to KVM_CREATE_VM is
> > dedicated entirely to holding the physical address shift verbatim. Is this
> > really the ABI?
> > 
> > Also, couldn't KVM figure it out automatically if you add memslots at high
> > addresses, making this a niche tunable outside of testing?
> 
> Not really. Let's say I want my IPA space split in two: memory covers
> the low 47 bit, and I want MMIO spanning the top 47 bit. With your
> scheme, you'd end-up with a 47bit IPA space, while you really want 48
> bits (MMIO space implemented by userspace isn't registered to the
> kernel).

That still sounds quite niche for a VM. Does QEMU do that? In any case,
having KVM automatically increase the IPA bits to cover the memslots it
knows about would make sense to me, and also be sufficient for kvmtool
without us having to add an extra command-line argument.

The MMIO case might be better dealt with by having a way to register MMIO
regions rather than having the PA bits exposed directly.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM

2018-07-04 Thread Will Deacon
Hi Suzuki,

On Fri, Jun 29, 2018 at 12:15:35PM +0100, Suzuki K Poulose wrote:
> Allow specifying the physical address size for a new VM via
> the kvm_type argument for KVM_CREATE_VM ioctl. This allows
> us to finalise the stage2 page table format as early as possible
> and hence perform the right checks on the memory slots without
> complication. The size is encoded as Log2(PA_Size) in the bits[7:0]
> of the type field and can encode more information in the future if
> required. The IPA size is still capped at 40bits.
> 
> Cc: Marc Zyngier 
> Cc: Christoffer Dall 
> Cc: Peter Maydel 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Signed-off-by: Suzuki K Poulose 
> ---
>  arch/arm/include/asm/kvm_mmu.h   |  2 ++
>  arch/arm64/include/asm/kvm_arm.h | 10 +++---
>  arch/arm64/include/asm/kvm_mmu.h |  2 ++
>  include/uapi/linux/kvm.h | 10 ++
>  virt/kvm/arm/arm.c   | 24 ++--
>  5 files changed, 39 insertions(+), 9 deletions(-)

[...]

> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 4df9bb6..fa4cab0 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -751,6 +751,16 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_S390_SIE_PAGE_OFFSET 1
>  
>  /*
> + * On arm/arm64, machine type can be used to request the physical
> + * address size for the VM. Bits [7-0] have been reserved for the
> + * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
> + * value 0 implies the default IPA size, which is 40bits.
> + */
> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK  0xff
> +#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)\
> + ((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)

This seems like you're allocating quite a lot of bits in a non-extensible
interface to a fairly esoteric parameter. Would it be better to add another
ioctl, or condense the number of sizes you support instead?

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

2018-07-04 Thread Will Deacon
On Wed, Jul 04, 2018 at 04:00:11PM +0100, Julien Grall wrote:
> On 04/07/18 15:09, Will Deacon wrote:
> >On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:
> >>Add an option to specify the physical address size used by this
> >>VM.
> >>
> >>Signed-off-by: Suzuki K Poulose 
> >>---
> >>  arm/aarch64/include/kvm/kvm-config-arch.h | 5 -
> >>  arm/include/arm-common/kvm-config-arch.h  | 1 +
> >>  2 files changed, 5 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h 
> >>b/arm/aarch64/include/kvm/kvm-config-arch.h
> >>index 04be43d..dabd22c 100644
> >>--- a/arm/aarch64/include/kvm/kvm-config-arch.h
> >>+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
> >>@@ -8,7 +8,10 @@
> >>"Create PMUv3 device"), \
> >>OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed, \
> >>"Specify random seed for Kernel Address Space " \
> >>-   "Layout Randomization (KASLR)"),
> >>+   "Layout Randomization (KASLR)"),\
> >>+   OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift, \
> >>+   "Specify maximum physical address size (not "   \
> >>+   "the amount of memory)"),
> >
> >Given that this is a shift value, I think the help message could be more
> >informative. Something like:
> >
> > "Specify maximum number of bits in a guest physical address"
> >
> >I think I'd actually leave out any mention of memory, because this does
> >actually have an effect on the amount of addressable memory in a way that I
> >don't think we want to describe in half of a usage message line :)
> Is there any particular reasons to expose this option to the user?
> 
> I have recently sent a series to allow the user to specify the position
> of the RAM [1]. With that series in mind, I think the user would not really
> need to specify the maximum physical shift. Instead we could automatically
> find it.

Marc makes a good point that it doesn't help for MMIO regions, so I'm trying
to understand whether we can do something differently there and avoid
sacrificing the type parameter.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 24/24] kvmtool: arm: Add support for creating VM with PA size

2018-07-04 Thread Suzuki K Poulose

Hi Will,

On 07/04/2018 03:22 PM, Will Deacon wrote:

On Fri, Jun 29, 2018 at 12:15:44PM +0100, Suzuki K Poulose wrote:

diff --git a/arm/kvm.c b/arm/kvm.c
index 5701d41..b1969be 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -11,6 +11,8 @@
  #include 
  #include 
  
+unsigned long kvm_arm_type;

+
  struct kvm_ext kvm_req_ext[] = {
{ DEFINE_KVM_EXT(KVM_CAP_IRQCHIP) },
{ DEFINE_KVM_EXT(KVM_CAP_ONE_REG) },
@@ -18,6 +20,26 @@ struct kvm_ext kvm_req_ext[] = {
{ 0, 0 },
  };
  
+#ifndef KVM_ARM_GET_MAX_VM_PHYS_SHIFT

+#define KVM_ARM_GET_MAX_VM_PHYS_SHIFT  _IO(KVMIO, 0x0b)
+#endif
+
+void kvm__arch_init_hyp(struct kvm *kvm)
+{
+   int max_ipa;
+
+   max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
+   if (max_ipa < 0)
+   max_ipa = 40;
+   if (!kvm->cfg.arch.phys_shift)
+   kvm->cfg.arch.phys_shift = 40;
+   if (kvm->cfg.arch.phys_shift > max_ipa)
+   die("Requested PA size (%u) is not supported by the host 
(%ubits)\n",
+   kvm->cfg.arch.phys_shift, max_ipa);
+   if (kvm->cfg.arch.phys_shift != 40)
+   kvm_arm_type = kvm->cfg.arch.phys_shift;
+}


Seems a bit weird that the "machine type identifier" to KVM_CREATE_VM is
dedicated entirely to holding the physical address shift verbatim. Is this
really the ABI?


The bits[7:0] of the machine type has been reserved for the IPA shift.
This version is missing the updates to the ABI documentation, I have it
for the next version.



Also, couldn't KVM figure it out automatically if you add memslots at high
addresses, making this a niche tunable outside of testing?


The stage2 pgd size is really dependent on the max IPA. Also, unlike the 
stage1 (where the maximum size will be 1 page), the size can go upto 16

pages (and different number of levels due to concatenation), so we need
to finalize this at least before the first memory gets mapped (RAM or 
Device). That implies, we cannot wait until all the memory slots are

created.

The first version of the series added a separate ioctl for specifying
the limit, which had its own complexities. So, this ABI was suggested
to keep things simpler.


Suzuki
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCHv4 05/10] arm64/cpufeature: detect pointer authentication

2018-07-04 Thread Will Deacon
On Fri, May 25, 2018 at 11:01:37AM +0100, Mark Rutland wrote:
> On Wed, May 23, 2018 at 09:48:28AM +0100, Suzuki K Poulose wrote:
> > On 03/05/18 14:20, Mark Rutland wrote:
> > > diff --git a/arch/arm64/include/asm/cpucaps.h 
> > > b/arch/arm64/include/asm/cpucaps.h
> > > index bc51b72fafd4..9dcb4d1b14f5 100644
> > > --- a/arch/arm64/include/asm/cpucaps.h
> > > +++ b/arch/arm64/include/asm/cpucaps.h
> > > @@ -48,7 +48,10 @@
> > >   #define ARM64_HAS_CACHE_IDC 27
> > >   #define ARM64_HAS_CACHE_DIC 28
> > >   #define ARM64_HW_DBM29
> > > +#define ARM64_HAS_ADDRESS_AUTH_ARCH  30
> > > +#define ARM64_HAS_ADDRESS_AUTH_IMP_DEF   31
> > 
> > Where are these caps used ? I couldn't find anything in the series
> > that uses them. Otherwise looks good to me.
> 
> Those were consumed by KVM support, which needed to detect if CPUs had
> mismatched support. Currently they're just placeholders as I need a
> cpucap value for the separate IMP-DEF / architected probing cases.
> 
> I *could* get rid of those and just have the ARM64_HAS_ADDRESS_AUTH case
> log "Address authentication", but I wanted to have separate messages for
> IMP-DEF vs architected.

Why? Surely it only matters if we find a mixture, and then you'll shout
loudly. I'd certainly be in favour of reducing the number of caps you're
adding here, particularly if they're just there for a line in dmesg.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCHv4 00/10] ARMv8.3 pointer authentication userspace support

2018-07-04 Thread Will Deacon
HI Mark,

On Thu, May 03, 2018 at 02:20:21PM +0100, Mark Rutland wrote:
> This series adds support for the ARMv8.3 pointer authentication extension,
> enabling userspace return address protection with recent versions of GCC.

As discussed off-list, I'd really like to get a feel for how we might use
this to protect the kernel before we commit to a userspace ABI, especially
given that there is some level of resource sharing with the keys.

So ideally, we'd merge this along with the support for the kernel side.
In the meantime, is it worth taking any of the ID register sanity checks
now?

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM

2018-07-04 Thread Suzuki K Poulose

On 07/04/2018 04:51 PM, Will Deacon wrote:

Hi Suzuki,

On Fri, Jun 29, 2018 at 12:15:35PM +0100, Suzuki K Poulose wrote:

Allow specifying the physical address size for a new VM via
the kvm_type argument for KVM_CREATE_VM ioctl. This allows
us to finalise the stage2 page table format as early as possible
and hence perform the right checks on the memory slots without
complication. The size is encoded as Log2(PA_Size) in the bits[7:0]
of the type field and can encode more information in the future if
required. The IPA size is still capped at 40bits.

Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: Peter Maydel 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Suzuki K Poulose 
---
  arch/arm/include/asm/kvm_mmu.h   |  2 ++
  arch/arm64/include/asm/kvm_arm.h | 10 +++---
  arch/arm64/include/asm/kvm_mmu.h |  2 ++
  include/uapi/linux/kvm.h | 10 ++
  virt/kvm/arm/arm.c   | 24 ++--
  5 files changed, 39 insertions(+), 9 deletions(-)


[...]


diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4df9bb6..fa4cab0 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -751,6 +751,16 @@ struct kvm_ppc_resize_hpt {
  #define KVM_S390_SIE_PAGE_OFFSET 1
  
  /*

+ * On arm/arm64, machine type can be used to request the physical
+ * address size for the VM. Bits [7-0] have been reserved for the
+ * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
+ * value 0 implies the default IPA size, which is 40bits.
+ */
+#define KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK0xff
+#define KVM_VM_TYPE_ARM_PHYS_SHIFT(x)  \
+   ((x) & KVM_VM_TYPE_ARM_PHYS_SHIFT_MASK)


This seems like you're allocating quite a lot of bits in a non-extensible
interface to a fairly esoteric parameter. Would it be better to add another
ioctl, or condense the number of sizes you support instead?


As I explained in the other thread, we need the size as soon as the VM
is created. The major challenge is keeping the backward compatibility by
mapping 0 to 40bits. I will give it a thought.

Suzuki
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm