On Tuesday 20 Jul 2021 at 11:13:31 (+0100), Marc Zyngier wrote:
> On Mon, 19 Jul 2021 16:49:13 +0100,
> Quentin Perret <qper...@google.com> wrote:
> > 
> > On Monday 19 Jul 2021 at 15:43:34 (+0100), Marc Zyngier wrote:
> > > On Mon, 19 Jul 2021 11:47:29 +0100,
> > > Quentin Perret <qper...@google.com> wrote:
> > > > 
> > > > The hypervisor will soon be in charge of tracking ownership of all
> > > > memory pages in the system. The current page-tracking infrastructure at
> > > > EL2 only allows binary states: a page is either owned or not by an
> > > > entity. But a number of use-cases will require more complex states for
> > > > pages that are shared between two entities (host, hypervisor, or guests).
> > > > 
> > > > In preparation for supporting these use-cases, introduce in the KVM
> > > > page-table library some infrastructure allowing to tag shared pages
> > > > using ignored bits (a.k.a. software bits) in PTEs.
> > > > 
> > > > Signed-off-by: Quentin Perret <qper...@google.com>
> > > > ---
> > > >  arch/arm64/include/asm/kvm_pgtable.h |  5 +++++
> > > >  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++++
> > > >  2 files changed, 30 insertions(+)
> > > > 
> > > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > > > index dd72653314c7..f6d3d5c8910d 100644
> > > > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > > > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > > > @@ -81,6 +81,8 @@ enum kvm_pgtable_stage2_flags {
> > > >   * @KVM_PGTABLE_PROT_W:                Write permission.
> > > >   * @KVM_PGTABLE_PROT_R:                Read permission.
> > > >   * @KVM_PGTABLE_PROT_DEVICE:   Device attributes.
> > > > + * @KVM_PGTABLE_STATE_SHARED:  Page shared with another entity.
> > > > + * @KVM_PGTABLE_STATE_BORROWED:        Page borrowed from another entity.
> > > >   */
> > > >  enum kvm_pgtable_prot {
> > > >         KVM_PGTABLE_PROT_X                      = BIT(0),
> > > > @@ -88,6 +90,9 @@ enum kvm_pgtable_prot {
> > > >         KVM_PGTABLE_PROT_R                      = BIT(2),
> > > >  
> > > >         KVM_PGTABLE_PROT_DEVICE                 = BIT(3),
> > > > +
> > > > +       KVM_PGTABLE_STATE_SHARED                = BIT(4),
> > > > +       KVM_PGTABLE_STATE_BORROWED              = BIT(5),
> > > 
> > > I'd rather have some indirection here, as we have other potential
> > > users for the SW bits outside of pKVM (see the NV series, which uses
> > > some of these SW bits as the backend for TTL-based TLB invalidation).
> > > 
> > > Can we instead only describe the SW bit states in this enum, and let
> > > the users map the semantic they require onto that state? See [1] for
> > > what I carry in the NV branch.
> > 
> > Works for me -- I just wanted to make sure we don't have users in
> > different places that use the same bits without knowing, but no strong
> > opinions, so happy to change.
> > 
> > > >  };
> > > >  
> > > >  #define KVM_PGTABLE_PROT_RW    (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
> > > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > > index 5bdbe7a31551..51598b79dafc 100644
> > > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > > @@ -211,6 +211,29 @@ static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id)
> > > >         return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id);
> > > >  }
> > > >  
> > > > +static kvm_pte_t pte_ignored_bit_prot(enum kvm_pgtable_prot prot)
> > > 
> > > Can we call these sw rather than ignored?
> > 
> > Sure.
> > 
> > > > +{
> > > > +       kvm_pte_t ignored_bits = 0;
> > > > +
> > > > +       /*
> > > > +        * Ignored bits 0 and 1 are reserved to track the memory ownership
> > > > +        * state of each page:
> > > > +        *   00: The page is owned solely by the page-table owner.
> > > > +        *   01: The page is owned by the page-table owner, but is shared
> > > > +        *       with another entity.
> > > > +        *   10: The page is shared with, but not owned by the page-table owner.
> > > > +        *   11: Reserved for future use (lending).
> > > > +        */
> > > > +       if (prot & KVM_PGTABLE_STATE_SHARED) {
> > > > +               if (prot & KVM_PGTABLE_STATE_BORROWED)
> > > > +                       ignored_bits |= BIT(1);
> > > > +               else
> > > > +                       ignored_bits |= BIT(0);
> > > > +       }
> > > > +
> > > > +       return FIELD_PREP(KVM_PTE_LEAF_ATTR_IGNORED, ignored_bits);
> > > > +}
> > > > +
> > > >  static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr,
> > > >                                   u32 level, kvm_pte_t *ptep,
> > > >                                   enum kvm_pgtable_walk_flags flag)
> > > > @@ -357,6 +380,7 @@ static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
> > > >         attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
> > > >         attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
> > > >         attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
> > > > +       attr |= pte_ignored_bit_prot(prot);
> > > >         *ptep = attr;
> > > >  
> > > >         return 0;
> > > > @@ -558,6 +582,7 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
> > > >  
> > > >         attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> > > >         attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
> > > > +       attr |= pte_ignored_bit_prot(prot);
> > > >         *ptep = attr;
> > > >  
> > > >         return 0;
> > > 
> > > How about kvm_pgtable_stage2_relax_perms()?
> > 
> > It should leave SW bits untouched, and it really felt like a path where
> > we want to change permissions and nothing else. What did you have in
> > mind?
> 
> It isn't clear to me that it would not (cannot?) be used to change
> other bits, given that it takes an arbitrary 'prot' set.

Sure, though it already ignores KVM_PGTABLE_PROT_DEVICE.

I guess the thing I find hard to reason about is that
kvm_pgtable_stage2_relax_perms() is 'additive'. E.g. it can make a
mapping RW if it was RO, but not the other way around. With the current
patch-set it wasn't really clear how that should translate to
KVM_PGTABLE_STATE_SHARED and such.
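
To make the 'additive' behaviour concrete, here is a minimal standalone
sketch. It is a model, not the kernel implementation: model_relax_perms()
and the MODEL_* names are hypothetical, the bit positions mirror the enum
quoted above, and the position of the W bit is an assumption since it is
not visible in the quoted hunk.

```c
#include <assert.h>

#define MODEL_BIT(n)	(1u << (n))

/* Hypothetical stand-ins for enum kvm_pgtable_prot; not the kernel header. */
enum model_prot {
	MODEL_PROT_X		= MODEL_BIT(0),
	MODEL_PROT_W		= MODEL_BIT(1),	/* assumed position */
	MODEL_PROT_R		= MODEL_BIT(2),
	MODEL_PROT_DEVICE	= MODEL_BIT(3),
	MODEL_STATE_SHARED	= MODEL_BIT(4),
	MODEL_STATE_BORROWED	= MODEL_BIT(5),
};

#define MODEL_PROT_RWX	 (MODEL_PROT_R | MODEL_PROT_W | MODEL_PROT_X)
#define MODEL_STATE_MASK (MODEL_STATE_SHARED | MODEL_STATE_BORROWED)

/* Permission bits and ownership-state bits must not overlap. */
static_assert((MODEL_PROT_RWX & MODEL_STATE_MASK) == 0,
	      "permission and state bits overlap");

/*
 * Additive relax: permissions can only be added to the current set,
 * never removed. Anything outside R/W/X in the request (DEVICE, or the
 * ownership states) is silently dropped in this model.
 */
unsigned int model_relax_perms(unsigned int cur, unsigned int req)
{
	return cur | (req & MODEL_PROT_RWX);
}
```

In this model, relaxing an RO mapping with W yields RW, relaxing an RW
mapping with R leaves it RW, and a request carrying MODEL_STATE_SHARED
has that bit dropped without any error, which is the part that is hard
to reason about.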

> If there is
> such an intended restriction, we definitely should document it.

Ack, that's definitely missing. And in fact I should probably make
kvm_pgtable_stage2_relax_perms() return -EINVAL if we're passing prot
values it can't handle.
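
A rough sketch of what such a check could look like, as a standalone
model rather than a patch: model_check_relax_prot() and the MODEL_* names
are hypothetical, and the W bit position is assumed as it is not visible
in the quoted hunk.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_BIT(n)	(1u << (n))

/* Hypothetical stand-ins for enum kvm_pgtable_prot; not the kernel header. */
enum model_prot {
	MODEL_PROT_X		= MODEL_BIT(0),
	MODEL_PROT_W		= MODEL_BIT(1),	/* assumed position */
	MODEL_PROT_R		= MODEL_BIT(2),
	MODEL_PROT_DEVICE	= MODEL_BIT(3),
	MODEL_STATE_SHARED	= MODEL_BIT(4),
	MODEL_STATE_BORROWED	= MODEL_BIT(5),
};

#define MODEL_PROT_RWX	(MODEL_PROT_R | MODEL_PROT_W | MODEL_PROT_X)

static_assert(MODEL_PROT_RWX == 0x7, "R/W/X are expected in bits 0-2");

/*
 * Reject any prot value carrying bits that a permissions-only path
 * cannot handle, instead of silently ignoring them.
 */
int model_check_relax_prot(unsigned int prot)
{
	if (prot & ~(unsigned int)MODEL_PROT_RWX)
		return -EINVAL;

	return 0;
}
```

Note that this model also rejects MODEL_PROT_DEVICE, whereas the existing
code ignores the device attribute; which of the two is preferable is an
open question.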

Cheers,
Quentin
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
