Re: [PATCH v5 2/3] KVM: x86: Introduce allow list for MSR emulation

2020-08-10 Thread Aaron Lewis
On Fri, Aug 7, 2020 at 8:57 AM Alexander Graf wrote:
>
> It's not desirable to have all MSRs always handled by KVM kernel space. Some
> MSRs would be useful to handle in user space to either emulate behavior (like
> uCode updates) or differentiate whether they are valid based on the CPU model.
>
> To allow user space to specify which MSRs it wants to see handled by KVM,
> this patch introduces a new ioctl to push allow lists of bitmaps into
> KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access.
> With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the
> denied MSR events to user space to operate on.
>
> If no allowlist is populated, MSR handling stays identical to before.
>
> Signed-off-by: KarimAllah Ahmed 
> Signed-off-by: Alexander Graf 
>
> ---
>
> v2 -> v3:
>
>   - document flags for KVM_X86_ADD_MSR_ALLOWLIST
>   - generalize exit path, always unlock when returning
>   - s/KVM_CAP_ADD_MSR_ALLOWLIST/KVM_CAP_X86_MSR_ALLOWLIST/g
>   - Add KVM_X86_CLEAR_MSR_ALLOWLIST
>
> v3 -> v4:
>   - lock allow check and clearing
>   - free bitmaps on clear
>
> v4 -> v5:
>
>   - use srcu
> ---
>  Documentation/virt/kvm/api.rst  |  91 ++
>  arch/x86/include/asm/kvm_host.h |  10 ++
>  arch/x86/include/uapi/asm/kvm.h |  15 +++
>  arch/x86/kvm/x86.c  | 160 
>  include/uapi/linux/kvm.h|   5 +
>  5 files changed, 281 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 2ca38649b3d4..9cb36060f61c 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -4697,6 +4697,82 @@ KVM_PV_VM_VERIFY
>Verify the integrity of the unpacked image. Only if this succeeds,
>KVM is allowed to start protected VCPUs.
>
> +4.126 KVM_X86_ADD_MSR_ALLOWLIST
> +-------------------------------
> +
> +:Capability: KVM_CAP_X86_MSR_ALLOWLIST
> +:Architectures: x86
> +:Type: vm ioctl
> +:Parameters: struct kvm_msr_allowlist
> +:Returns: 0 on success, < 0 on error
> +
> +::
> +
> +  struct kvm_msr_allowlist {
> + __u32 flags;
> + __u32 nmsrs; /* number of msrs in bitmap */
> + __u32 base;  /* base address for the MSRs bitmap */
> + __u32 pad;
> +
> + __u8 bitmap[0]; /* a set bit allows that the operation set in flags */
> +  };

Couldn't the struct look like this, given that there are 3 flags and nmsrs
can't be greater than KVM_MSR_ALLOWLIST_MAX_LEN (0x600)?

struct kvm_msr_allowlist {
        __u16 flags;
        __u16 nmsrs; /* number of msrs in bitmap */
        __u32 base;  /* base address for the MSRs bitmap */
        __u8 bitmap[0]; /* a set bit allows that the operation set in flags */
};
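
To make the size difference concrete, here is a minimal sketch comparing the
two header layouts. The struct names allowlist_v5 and allowlist_compact and
the two helper functions are hypothetical, fixed-width stdint types stand in
for the kernel's __u16/__u32, and a C99 flexible array member replaces
bitmap[0]. The 0x600 limit is taken from the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit quoted in the review comment above. */
#define KVM_MSR_ALLOWLIST_MAX_LEN 0x600

/* Layout as posted in v5 (hypothetical name, stdint stand-ins for __u32). */
struct allowlist_v5 {
	uint32_t flags;
	uint32_t nmsrs;
	uint32_t base;
	uint32_t pad;
	uint8_t bitmap[];
};

/* Compact layout suggested in this reply (hypothetical name). */
struct allowlist_compact {
	uint16_t flags;
	uint16_t nmsrs;
	uint32_t base;
	uint8_t bitmap[];
};

/* The largest permitted nmsrs value must be representable in 16 bits. */
static_assert(KVM_MSR_ALLOWLIST_MAX_LEN <= UINT16_MAX,
	      "nmsrs fits in a 16-bit field");

/* Number of header bytes that precede the bitmap in each layout. */
size_t header_size_v5(void)
{
	return offsetof(struct allowlist_v5, bitmap);
}

size_t header_size_compact(void)
{
	return offsetof(struct allowlist_compact, bitmap);
}
```

On a typical ABI the header shrinks from 16 to 8 bytes with no explicit
padding field, and base stays naturally aligned.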



> +
> +flags values:
> +
> +KVM_MSR_ALLOW_READ
> +
> +  Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
> +  indicates that a read should immediately fail, while a 1 indicates that
> +  a read should be handled by the normal KVM MSR emulation logic.
> +
> +KVM_MSR_ALLOW_WRITE
> +
> +  Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
> +  indicates that a write should immediately fail, while a 1 indicates that
> +  a write should be handled by the normal KVM MSR emulation logic.
> +
> +KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE
> +
> +  Filter booth read and write accesses to MSRs using the given bitmap. A 0
> +  in the bitmap indicates that both reads and writes should immediately fail,
> +  while a 1 indicates that reads and writes should be handled by the normal
> +  KVM MSR emulation logic.

nit: Filter both

> +
> +This ioctl allows user space to define a set of bitmaps of MSR ranges to
> +specify whether a certain MSR access is allowed or not.
> +
> +If this ioctl has never been invoked, MSR accesses are not guarded and the
> +old KVM in-kernel emulation behavior is fully preserved.
> +
> +As soon as the first allow list was specified, only allowed MSR accesses
> +are permitted inside of KVM's MSR code.
> +
> +Each allowlist specifies a range of MSRs to potentially allow access on.
> +The range goes from MSR index [base .. base+nmsrs]. The flags field
> +indicates whether reads, writes or both reads and writes are permitted
> +by setting a 1 bit in the bitmap for the corresponding MSR index.
> +
> +If an MSR access is not permitted through the allow list, it generates a
> +#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
> +allows user space to deflect and potentially handle various MSR accesses
> +into user space.
> +
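
For readers following along, the per-range lookup described in the quoted
text could be sketched roughly as below. This is not the code from the patch:
struct msr_range, msr_range_allows() and msr_bitmap_set() are hypothetical
names. The sketch assumes a half-open range [base, base+nmsrs) and
least-significant-bit-first ordering within each bitmap byte, neither of
which is spelled out in the documentation text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One allow-list range (hypothetical type, not the kernel's). */
struct msr_range {
	uint32_t base;          /* first MSR index covered by the bitmap */
	uint32_t nmsrs;         /* number of MSRs covered */
	const uint8_t *bitmap;  /* one bit per MSR, 1 = allowed */
};

/* Mark the MSR at offset `off` from base as allowed (assumed LSB-first). */
void msr_bitmap_set(uint8_t *bitmap, uint32_t off)
{
	bitmap[off / 8] |= (uint8_t)(1u << (off % 8));
}

/*
 * Return true if `index` falls inside the range and its bit is set.
 * Indices outside the range are reported as not allowed by this range;
 * how several ranges are combined is outside the scope of this sketch.
 */
bool msr_range_allows(const struct msr_range *r, uint32_t index)
{
	uint32_t off;

	if (index < r->base)
		return false;

	off = index - r->base;
	if (off >= r->nmsrs)
		return false;

	return (r->bitmap[off / 8] >> (off % 8)) & 1;
}
```

A caller would fill the bitmap with msr_bitmap_set() for each MSR it wants
KVM to handle and leave all other bits clear, so that those accesses are
denied or, with KVM_CAP_X86_USER_SPACE_MSR, deflected to user space.
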
> +4.124 KVM_X86_CLEAR_MSR_ALLOWLIST
> +---------------------------------
> +
> +:Capability: KVM_CAP_X86_MSR_ALLOWLIST
> +:Architectures: x86
> +:Type: vcpu ioctl
> +:Parameters: none
> +:Returns: 0
> +
> +This ioctl resets all internal MSR allow lists. After this call, no allow
> +list is present and the guest would execute as if no allow lists were set,
> +so all MSRs are considered allowed and thus handled by the in-kernel MSR
> +emulati

[PATCH v5 2/3] KVM: x86: Introduce allow list for MSR emulation

2020-08-07 Thread Alexander Graf
It's not desirable to have all MSRs always handled by KVM kernel space. Some
MSRs would be useful to handle in user space to either emulate behavior (like
uCode updates) or differentiate whether they are valid based on the CPU model.

To allow user space to specify which MSRs it wants to see handled by KVM,
this patch introduces a new ioctl to push allow lists of bitmaps into
KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access.
With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the
denied MSR events to user space to operate on.

If no allowlist is populated, MSR handling stays identical to before.

Signed-off-by: KarimAllah Ahmed 
Signed-off-by: Alexander Graf 

---

v2 -> v3:

  - document flags for KVM_X86_ADD_MSR_ALLOWLIST
  - generalize exit path, always unlock when returning
  - s/KVM_CAP_ADD_MSR_ALLOWLIST/KVM_CAP_X86_MSR_ALLOWLIST/g
  - Add KVM_X86_CLEAR_MSR_ALLOWLIST

v3 -> v4:
  - lock allow check and clearing
  - free bitmaps on clear

v4 -> v5:

  - use srcu
---
 Documentation/virt/kvm/api.rst  |  91 ++
 arch/x86/include/asm/kvm_host.h |  10 ++
 arch/x86/include/uapi/asm/kvm.h |  15 +++
 arch/x86/kvm/x86.c  | 160 
 include/uapi/linux/kvm.h|   5 +
 5 files changed, 281 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2ca38649b3d4..9cb36060f61c 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4697,6 +4697,82 @@ KVM_PV_VM_VERIFY
   Verify the integrity of the unpacked image. Only if this succeeds,
   KVM is allowed to start protected VCPUs.
 
+4.126 KVM_X86_ADD_MSR_ALLOWLIST
+-------------------------------
+
+:Capability: KVM_CAP_X86_MSR_ALLOWLIST
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_msr_allowlist
+:Returns: 0 on success, < 0 on error
+
+::
+
+  struct kvm_msr_allowlist {
+ __u32 flags;
+ __u32 nmsrs; /* number of msrs in bitmap */
+ __u32 base;  /* base address for the MSRs bitmap */
+ __u32 pad;
+
+ __u8 bitmap[0]; /* a set bit allows that the operation set in flags */
+  };
+
+flags values:
+
+KVM_MSR_ALLOW_READ
+
+  Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
+  indicates that a read should immediately fail, while a 1 indicates that
+  a read should be handled by the normal KVM MSR emulation logic.
+
+KVM_MSR_ALLOW_WRITE
+
+  Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
+  indicates that a write should immediately fail, while a 1 indicates that
+  a write should be handled by the normal KVM MSR emulation logic.
+
+KVM_MSR_ALLOW_READ | KVM_MSR_ALLOW_WRITE
+
+  Filter booth read and write accesses to MSRs using the given bitmap. A 0
+  in the bitmap indicates that both reads and writes should immediately fail,
+  while a 1 indicates that reads and writes should be handled by the normal
+  KVM MSR emulation logic.
+
+This ioctl allows user space to define a set of bitmaps of MSR ranges to
+specify whether a certain MSR access is allowed or not.
+
+If this ioctl has never been invoked, MSR accesses are not guarded and the
+old KVM in-kernel emulation behavior is fully preserved.
+
+As soon as the first allow list was specified, only allowed MSR accesses
+are permitted inside of KVM's MSR code.
+
+Each allowlist specifies a range of MSRs to potentially allow access on.
+The range goes from MSR index [base .. base+nmsrs]. The flags field
+indicates whether reads, writes or both reads and writes are permitted
+by setting a 1 bit in the bitmap for the corresponding MSR index.
+
+If an MSR access is not permitted through the allow list, it generates a
+#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
+allows user space to deflect and potentially handle various MSR accesses
+into user space.
+
+4.124 KVM_X86_CLEAR_MSR_ALLOWLIST
+---------------------------------
+
+:Capability: KVM_CAP_X86_MSR_ALLOWLIST
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
+
+This ioctl resets all internal MSR allow lists. After this call, no allow
+list is present and the guest would execute as if no allow lists were set,
+so all MSRs are considered allowed and thus handled by the in-kernel MSR
+emulation logic.
+
+No vCPU may be in running state when calling this ioctl.
+
 
 5. The kvm_run structure
 
@@ -6213,3 +6289,18 @@ writes to user space. It can be enabled on a VM level. If enabled, MSR
 accesses that would usually trigger a #GP by KVM into the guest will
 instead get bounced to user space through the KVM_EXIT_X86_RDMSR and
 KVM_EXIT_X86_WRMSR exit notifications.
+
+8.25 KVM_CAP_X86_MSR_ALLOWLIST
+------------------------------
+
+:Architectures: x86
+
+This capability indicates that KVM supports emulation of only select MSR
+registers. With this capability exposed, KVM exports two new VM ioctls:
+KVM_X86_ADD_MSR_ALLOWLIST which user