Re: [PATCH 07/14] KVM: arm64: Enable forcing page-level stage-2 mappings

2021-07-19 Thread Quentin Perret
On Monday 19 Jul 2021 at 15:24:32 (+0100), Marc Zyngier wrote:
> On Mon, 19 Jul 2021 11:47:28 +0100,
> Quentin Perret  wrote:
> > 
> > Much of the stage-2 manipulation logic relies on being able to destroy
> > block mappings if e.g. installing a smaller mapping in the range. The
> > rationale for this behaviour is that stage-2 mappings can always be
> > re-created lazily. However, this gets more complicated when the stage-2
> > page-table is used to store metadata about the underlying pages. In such
> > a case, destroying a block mapping may lead to losing part of the
> > state, and confuse the user of those metadata (such as the hypervisor in
> > nVHE protected mode).
> > 
> > To fix this, introduce a callback function in the pgtable struct which
> > is called during all map operations to determine whether the mappings
> > can us blocks, or should be forced to page-granularity level. This is
> 
> nit: use?

Ack.

> > used by the hypervisor when creating the host stage-2 to force
> > page-level mappings when using non-default protection attributes.
> > 
> > Signed-off-by: Quentin Perret 
> > ---
> >  arch/arm64/include/asm/kvm_pgtable.h  | 63 +--
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c | 16 +--
> >  arch/arm64/kvm/hyp/pgtable.c  | 20 +++--
> >  3 files changed, 69 insertions(+), 30 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
> > b/arch/arm64/include/asm/kvm_pgtable.h
> > index af62203d2f7a..dd72653314c7 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -75,25 +75,6 @@ enum kvm_pgtable_stage2_flags {
> > KVM_PGTABLE_S2_IDMAP= BIT(1),
> >  };
> >  
> > -/**
> > - * struct kvm_pgtable - KVM page-table.
> > - * @ia_bits:   Maximum input address size, in bits.
> > - * @start_level:   Level at which the page-table walk starts.
> > - * @pgd:   Pointer to the first top-level entry of the page-table.
> > - * @mm_ops:Memory management callbacks.
> > - * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
> > - */
> > -struct kvm_pgtable {
> > -   u32 ia_bits;
> > -   u32 start_level;
> > -   kvm_pte_t   *pgd;
> > -   struct kvm_pgtable_mm_ops   *mm_ops;
> > -
> > -   /* Stage-2 only */
> > -   struct kvm_s2_mmu   *mmu;
> > -   enum kvm_pgtable_stage2_flags   flags;
> > -};
> > -
> >  /**
> >   * enum kvm_pgtable_prot - Page-table permissions and attributes.
> >   * @KVM_PGTABLE_PROT_X:Execute permission.
> > @@ -109,11 +90,41 @@ enum kvm_pgtable_prot {
> > KVM_PGTABLE_PROT_DEVICE = BIT(3),
> >  };
> >  
> > -#define PAGE_HYP   (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
> > +#define KVM_PGTABLE_PROT_RW(KVM_PGTABLE_PROT_R | 
> > KVM_PGTABLE_PROT_W)
> > +#define KVM_PGTABLE_PROT_RWX   (KVM_PGTABLE_PROT_RW | 
> > KVM_PGTABLE_PROT_X)
> > +
> > +#define PAGE_HYP   KVM_PGTABLE_PROT_RW
> >  #define PAGE_HYP_EXEC  (KVM_PGTABLE_PROT_R | 
> > KVM_PGTABLE_PROT_X)
> >  #define PAGE_HYP_RO(KVM_PGTABLE_PROT_R)
> >  #define PAGE_HYP_DEVICE(PAGE_HYP | KVM_PGTABLE_PROT_DEVICE)
> >  
> > +typedef bool (*kvm_pgtable_want_pte_cb_t)(u64 addr, u64 end,
> > + enum kvm_pgtable_prot prot);
> > +
> > +/**
> > + * struct kvm_pgtable - KVM page-table.
> > + * @ia_bits:   Maximum input address size, in bits.
> > + * @start_level:   Level at which the page-table walk starts.
> > + * @pgd:   Pointer to the first top-level entry of the page-table.
> > + * @mm_ops:Memory management callbacks.
> > + * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
> > + * @flags: Stage-2 page-table flags.
> > + * @want_pte_cb:   Callback function used during map operations to decide
> > + * whether block mappings can be used to map the given IPA
> > + * range.
> > + */
> > +struct kvm_pgtable {
> > +   u32 ia_bits;
> > +   u32 start_level;
> > +   kvm_pte_t   *pgd;
> > +   struct kvm_pgtable_mm_ops   *mm_ops;
> > +
> > +   /* Stage-2 only */
> > +   struct kvm_s2_mmu   *mmu;
> > +   enum kvm_pgtable_stage2_flags   flags;
> > +   kvm_pgtable_want_pte_cb_t   want_pte_cb;
> > +};
> 
> nit: does this whole definition really need to move around?

The alternative is to move (or forward declare) enum kvm_pgtable_prot
higher up in the file, but I have no strong opinion, so whatever you
prefer will work for me.

> > +
> >  /**
> >   * struct kvm_mem_range - Range of Intermediate Physical Addresses
> >   * @start: Start of the range.
> > @@ -216,21 

[PATCH 07/14] KVM: arm64: Enable forcing page-level stage-2 mappings

2021-07-19 Thread Quentin Perret
Much of the stage-2 manipulation logic relies on being able to destroy
block mappings if e.g. installing a smaller mapping in the range. The
rationale for this behaviour is that stage-2 mappings can always be
re-created lazily. However, this gets more complicated when the stage-2
page-table is used to store metadata about the underlying pages. In such
a case, destroying a block mapping may lead to losing part of the
state, and confuse the user of those metadata (such as the hypervisor in
nVHE protected mode).

To fix this, introduce a callback function in the pgtable struct which
is called during all map operations to determine whether the mappings
can us blocks, or should be forced to page-granularity level. This is
used by the hypervisor when creating the host stage-2 to force
page-level mappings when using non-default protection attributes.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h  | 63 +--
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 16 +--
 arch/arm64/kvm/hyp/pgtable.c  | 20 +++--
 3 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index af62203d2f7a..dd72653314c7 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -75,25 +75,6 @@ enum kvm_pgtable_stage2_flags {
KVM_PGTABLE_S2_IDMAP= BIT(1),
 };
 
-/**
- * struct kvm_pgtable - KVM page-table.
- * @ia_bits:   Maximum input address size, in bits.
- * @start_level:   Level at which the page-table walk starts.
- * @pgd:   Pointer to the first top-level entry of the page-table.
- * @mm_ops:Memory management callbacks.
- * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
- */
-struct kvm_pgtable {
-   u32 ia_bits;
-   u32 start_level;
-   kvm_pte_t   *pgd;
-   struct kvm_pgtable_mm_ops   *mm_ops;
-
-   /* Stage-2 only */
-   struct kvm_s2_mmu   *mmu;
-   enum kvm_pgtable_stage2_flags   flags;
-};
-
 /**
  * enum kvm_pgtable_prot - Page-table permissions and attributes.
  * @KVM_PGTABLE_PROT_X:Execute permission.
@@ -109,11 +90,41 @@ enum kvm_pgtable_prot {
KVM_PGTABLE_PROT_DEVICE = BIT(3),
 };
 
-#define PAGE_HYP   (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
+#define KVM_PGTABLE_PROT_RW(KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
+#define KVM_PGTABLE_PROT_RWX   (KVM_PGTABLE_PROT_RW | KVM_PGTABLE_PROT_X)
+
+#define PAGE_HYP   KVM_PGTABLE_PROT_RW
 #define PAGE_HYP_EXEC  (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_X)
 #define PAGE_HYP_RO(KVM_PGTABLE_PROT_R)
 #define PAGE_HYP_DEVICE(PAGE_HYP | KVM_PGTABLE_PROT_DEVICE)
 
+typedef bool (*kvm_pgtable_want_pte_cb_t)(u64 addr, u64 end,
+ enum kvm_pgtable_prot prot);
+
+/**
+ * struct kvm_pgtable - KVM page-table.
+ * @ia_bits:   Maximum input address size, in bits.
+ * @start_level:   Level at which the page-table walk starts.
+ * @pgd:   Pointer to the first top-level entry of the page-table.
+ * @mm_ops:Memory management callbacks.
+ * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
+ * @flags: Stage-2 page-table flags.
+ * @want_pte_cb:   Callback function used during map operations to decide
+ * whether block mappings can be used to map the given IPA
+ * range.
+ */
+struct kvm_pgtable {
+   u32 ia_bits;
+   u32 start_level;
+   kvm_pte_t   *pgd;
+   struct kvm_pgtable_mm_ops   *mm_ops;
+
+   /* Stage-2 only */
+   struct kvm_s2_mmu   *mmu;
+   enum kvm_pgtable_stage2_flags   flags;
+   kvm_pgtable_want_pte_cb_t   want_pte_cb;
+};
+
 /**
  * struct kvm_mem_range - Range of Intermediate Physical Addresses
  * @start: Start of the range.
@@ -216,21 +227,25 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
 u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
 
 /**
- * kvm_pgtable_stage2_init_flags() - Initialise a guest stage-2 page-table.
+ * kvm_pgtable_stage2_init_full() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @arch:  Arch-specific KVM structure representing the guest virtual
  * machine.
  * @mm_ops:Memory management callbacks.
  * @flags: Stage-2 configuration flags.
+ * @want_pte_cb: Callback function used during map operations to decide
+ * whether block mappings can be used to 

Re: [PATCH 07/14] KVM: arm64: Enable forcing page-level stage-2 mappings

2021-07-19 Thread Marc Zyngier
On Mon, 19 Jul 2021 11:47:28 +0100,
Quentin Perret  wrote:
> 
> Much of the stage-2 manipulation logic relies on being able to destroy
> block mappings if e.g. installing a smaller mapping in the range. The
> rationale for this behaviour is that stage-2 mappings can always be
> re-created lazily. However, this gets more complicated when the stage-2
> page-table is used to store metadata about the underlying pages. In such
> a case, destroying a block mapping may lead to losing part of the
> state, and confuse the user of those metadata (such as the hypervisor in
> nVHE protected mode).
> 
> To fix this, introduce a callback function in the pgtable struct which
> is called during all map operations to determine whether the mappings
> can us blocks, or should be forced to page-granularity level. This is

nit: use?

> used by the hypervisor when creating the host stage-2 to force
> page-level mappings when using non-default protection attributes.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_pgtable.h  | 63 +--
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c | 16 +--
>  arch/arm64/kvm/hyp/pgtable.c  | 20 +++--
>  3 files changed, 69 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
> b/arch/arm64/include/asm/kvm_pgtable.h
> index af62203d2f7a..dd72653314c7 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -75,25 +75,6 @@ enum kvm_pgtable_stage2_flags {
>   KVM_PGTABLE_S2_IDMAP= BIT(1),
>  };
>  
> -/**
> - * struct kvm_pgtable - KVM page-table.
> - * @ia_bits: Maximum input address size, in bits.
> - * @start_level: Level at which the page-table walk starts.
> - * @pgd: Pointer to the first top-level entry of the page-table.
> - * @mm_ops:  Memory management callbacks.
> - * @mmu: Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
> - */
> -struct kvm_pgtable {
> - u32 ia_bits;
> - u32 start_level;
> - kvm_pte_t   *pgd;
> - struct kvm_pgtable_mm_ops   *mm_ops;
> -
> - /* Stage-2 only */
> - struct kvm_s2_mmu   *mmu;
> - enum kvm_pgtable_stage2_flags   flags;
> -};
> -
>  /**
>   * enum kvm_pgtable_prot - Page-table permissions and attributes.
>   * @KVM_PGTABLE_PROT_X:  Execute permission.
> @@ -109,11 +90,41 @@ enum kvm_pgtable_prot {
>   KVM_PGTABLE_PROT_DEVICE = BIT(3),
>  };
>  
> -#define PAGE_HYP (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
> +#define KVM_PGTABLE_PROT_RW  (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
> +#define KVM_PGTABLE_PROT_RWX (KVM_PGTABLE_PROT_RW | KVM_PGTABLE_PROT_X)
> +
> +#define PAGE_HYP KVM_PGTABLE_PROT_RW
>  #define PAGE_HYP_EXEC(KVM_PGTABLE_PROT_R | 
> KVM_PGTABLE_PROT_X)
>  #define PAGE_HYP_RO  (KVM_PGTABLE_PROT_R)
>  #define PAGE_HYP_DEVICE  (PAGE_HYP | KVM_PGTABLE_PROT_DEVICE)
>  
> +typedef bool (*kvm_pgtable_want_pte_cb_t)(u64 addr, u64 end,
> +   enum kvm_pgtable_prot prot);
> +
> +/**
> + * struct kvm_pgtable - KVM page-table.
> + * @ia_bits: Maximum input address size, in bits.
> + * @start_level: Level at which the page-table walk starts.
> + * @pgd: Pointer to the first top-level entry of the page-table.
> + * @mm_ops:  Memory management callbacks.
> + * @mmu: Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
> + * @flags:   Stage-2 page-table flags.
> + * @want_pte_cb: Callback function used during map operations to decide
> + *   whether block mappings can be used to map the given IPA
> + *   range.
> + */
> +struct kvm_pgtable {
> + u32 ia_bits;
> + u32 start_level;
> + kvm_pte_t   *pgd;
> + struct kvm_pgtable_mm_ops   *mm_ops;
> +
> + /* Stage-2 only */
> + struct kvm_s2_mmu   *mmu;
> + enum kvm_pgtable_stage2_flags   flags;
> + kvm_pgtable_want_pte_cb_t   want_pte_cb;
> +};

nit: does this whole definition really need to move around?

> +
>  /**
>   * struct kvm_mem_range - Range of Intermediate Physical Addresses
>   * @start:   Start of the range.
> @@ -216,21 +227,25 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
> addr, u64 size, u64 phys,
>  u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
>  
>  /**
> - * kvm_pgtable_stage2_init_flags() - Initialise a guest stage-2 page-table.
> + * kvm_pgtable_stage2_init_full() - Initialise a guest stage-2 page-table.
>   * @pgt: Uninitialised page-table structure to initialise.
>   * @arch:Arch-specific KVM structure