Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-03 Thread James Hogan
On Thu, Mar 02, 2017 at 10:34:07PM +, James Hogan wrote:
> I suppose the exception is T&E. It shouldn't assume that just because VZ
> is available that T&E isn't (even if that is the case right now). It
> could always just try KVM_CREATE_VM with kvm type 0 and detect the error
> I suppose, but capabilities are nicer.
> 
> Maybe I'll redefine KVM_CAP_MIPS_VZ a bit, such that the value returned
> + 1 is a bitmask of supported kvm types:
> has T&E = !!( (v + 1) & BIT(KVM_VM_MIPS_TE) )
> has VZ  = !!( (v + 1) & BIT(KVM_VM_MIPS_VZ) )
> 
> That way old kernels which return 0 are consistent, and other
> implementations could be added if really necessary without confusing
> userland (but fingers crossed it'll never ever be necessary).

Actually I think the way I had designed KVM_CAP_MIPS_VZ is fine. I had
defined it as an enumeration rather than a mask because it isn't
expected you'd have more than one hardware virtualisation type able to
run on a particular core.

Whether T&E is still supported is I think better exposed by a new
KVM_CAP_MIPS_TE capability, indicating whether T&E is exposed when
KVM_CAP_MIPS_VZ is also set.

It would be set to 1 on new kernels whenever T&E is supported.

For compatibility with older kernels, userland would be expected to
determine whether T&E is present by:
check(KVM_CAP_MIPS_VZ) == 0 || check(KVM_CAP_MIPS_TE) != 0

Old userland that doesn't check KVM_CAP_MIPS_TE would just hit an EINVAL
from KVM_CREATE_VM if T&E isn't supported.
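A minimal sketch of that compatibility check, assuming the two capability values have already been read with KVM_CHECK_EXTENSION. The helper name mips_te_present() is hypothetical, and KVM_CAP_MIPS_TE is only a proposal at this point in the thread:

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): decide whether trap & emulate
 * is usable, given the values KVM_CHECK_EXTENSION returned for
 * KVM_CAP_MIPS_VZ and for the proposed KVM_CAP_MIPS_TE. Kernels that do
 * not know a capability report 0 for it.
 */
static bool mips_te_present(int vz_cap, int te_cap)
{
    /* No VZ reported: an older kernel, which only offers T&E. */
    if (vz_cap == 0)
        return true;
    /* VZ reported: T&E counts as present only if advertised explicitly. */
    return te_cap != 0;
}
```

In a real program the two arguments would come from ioctl(fd, KVM_CHECK_EXTENSION, cap) on the /dev/kvm file descriptor.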

Cheers
James




Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-03 Thread Paolo Bonzini


On 03/03/2017 13:37, James Hogan wrote:
> Actually I think the way I had designed KVM_CAP_MIPS_VZ is fine. I had
> defined it as an enumeration rather than a mask because it isn't
> expected you'd have more than one hardware virtualisation type able to
> run on a particular core.
> 
> Whether T&E is still supported is I think better exposed by a new
> KVM_CAP_MIPS_TE capability, indicating whether T&E is exposed when
> KVM_CAP_MIPS_VZ is also set.
> 
> It would be set to 1 on new kernels whenever T&E is supported.
> 
> For compatibility with older kernels, userland would be expected to
> determine whether T&E is present by:
> check(KVM_CAP_MIPS_VZ) == 0 || check(KVM_CAP_MIPS_TE) != 0
> 
> Old userland that doesn't check KVM_CAP_MIPS_TE would just hit an EINVAL
> from KVM_CREATE_VM if T&E isn't supported.

That's okay.

Paolo





Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-02 Thread James Hogan
On Thu, Mar 02, 2017 at 01:20:05PM +0100, Paolo Bonzini wrote:
> On 02/03/2017 12:39, James Hogan wrote:
> > It can't right now, though with relocation of the kernel now implemented
> > in MIPS Linux for KASLR, and hopes for a more generic EVA implementation
> > (which can require the kernel to be linked in a completely different
> > segment) it isn't completely infeasible.
> 
> What about the other way round, sticking a minimal T&E stub in kernel
> space and running the kernel in userspace?  Would it be feasible or
> would it be as complex as KVM itself?

You mean have a fallback in the guest kernel to keep kernel running from
userspace addresses in kernel mode so it works in VZ guests and
non-virtualized?

Interesting idea. I think it would involve a lot of complexity. It could
forgo some of the emulation of privileged instructions that KVM T&E
does since it's running in kernel mode, but memory management would be
more complex, and invasive changes would be required to the kernel.

- Memory privilege protection is on the granularity of segments, so with
  the traditional segment layout all of USeg (0x00000000..0x7FFFFFFF) is
  accessible to user mode, so you'd still need to utilise ASIDs to
  separate the address spaces of actual user programs running in
  0x00000000..0x3FFFFFFF from the kernel code running in
  0x40000000..0x7FFFFFFF.

- USeg is always TLB mapped. That means any kernel code could trigger
  TLB exceptions, which breaks existing assumptions (e.g. normally from
  unmapped kernel segments you can disable interrupts and then
  manipulate the TLB, but that isn't safe if a TLB refill exception
  could happen at any time and clobber the TLB registers). If in the
  future we manage to work around these issues and map the kernel (for
  security/protection purposes), then it would be easier, but then we'll
  likely already have the capability to fully relocate into a different
  segment.

> > 1) QEMU, which I've implemented using the kvm_type machine callback.
> > This allows the KVM type to be specified with e.g.
> >   "-machine malta,accel=kvm,kvm-type=TE"
> > Otherwise it defaults to using KVM_VM_MIPS_DEFAULT.
> > 
> > When you try and load a kernel (which happens after kvm_init() has
> > already passed the kvm type into KVM_CREATE_VM) it will check that it
> > supports the current kernel type.
> >
> > 2) My kvm test application, which uses KVM_VM_MIPS_DEFAULT by default
> > and hackily maps itself into the guest physical address space to run C
> > code test cases.
> 
> So this one would work for both TE and VZ because the guest is not a
> Linux kernel.

Yes, the test code is position independent and careful to avoid direct
references to any symbols. The GPA mappings are set up the same, but the
virtual addresses (PC, stack pointer etc) are set up slightly
differently depending on whether the VZ capability is present.

> I don't know...  Instinctively I would think that it's easy to get
> KVM_VM_MIPS_DEFAULT wrong and place the VZ-and-fall-back-to-TE policy in
> userspace, but I can be convinced otherwise if the failure mode is good
> enough.

Yeh, I think I agree. It isn't really necessary to have that decision
making in the kernel, and to use a particular KVM type userspace needs
to be aware of it, so it can always figure out from capabilities
which one to use prior to KVM_CREATE_VM.
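As a rough sketch of what such a userspace policy could look like, assuming KVM_CAP_MIPS_VZ has been queried on the /dev/kvm handle and is treated as the enumeration described in the patch (0 for trap & emulate, 1 for VZ, anything else reserved). The function name pick_vm_type() and the local VM_TYPE_* constants are hypothetical stand-ins for the KVM_VM_MIPS_* codes:

```c
#include <errno.h>

/* Local mirrors of the KVM_VM_MIPS_TE / KVM_VM_MIPS_VZ type codes. */
enum { VM_TYPE_TE = 0, VM_TYPE_VZ = 1 };

/*
 * Hypothetical policy: map the value returned for KVM_CAP_MIPS_VZ to the
 * type code to pass to KVM_CREATE_VM. Returns -EINVAL for values this
 * program does not recognise.
 */
static int pick_vm_type(int vz_cap)
{
    switch (vz_cap) {
    case 0:  /* no hardware assisted virtualization reported */
        return VM_TYPE_TE;
    case 1:  /* MIPS VZ ASE available */
        return VM_TYPE_VZ;
    default: /* reserved value, possibly an incompatible implementation */
        return -EINVAL;
    }
}
```

Refusing unknown values rather than guessing is one possible choice here; a different program might prefer to fall back to type 0 and see whether KVM_CREATE_VM fails.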

I suppose the exception is T&E. It shouldn't assume that just because VZ
is available that T&E isn't (even if that is the case right now). It
could always just try KVM_CREATE_VM with kvm type 0 and detect the error
I suppose, but capabilities are nicer.

Maybe I'll redefine KVM_CAP_MIPS_VZ a bit, such that the value returned
+ 1 is a bitmask of supported kvm types:
has T&E = !!( (v + 1) & BIT(KVM_VM_MIPS_TE) )
has VZ  = !!( (v + 1) & BIT(KVM_VM_MIPS_VZ) )

That way old kernels which return 0 are consistent, and other
implementations could be added if really necessary without confusing
userland (but fingers crossed it'll never ever be necessary).
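A small sketch of how userland might decode that encoding, purely as an illustration of the idea floated above. BIT() is defined locally, and has_te(), has_vz() and the VM_TYPE_* constants are hypothetical names mirroring the KVM_VM_MIPS_* type codes:

```c
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* Local mirrors of the KVM_VM_MIPS_TE / KVM_VM_MIPS_VZ type codes. */
enum { VM_TYPE_TE = 0, VM_TYPE_VZ = 1 };

/* v is the value KVM_CHECK_EXTENSION(KVM_CAP_MIPS_VZ) would return. */
static bool has_te(unsigned int v)
{
    return !!((v + 1) & BIT(VM_TYPE_TE));
}

static bool has_vz(unsigned int v)
{
    return !!((v + 1) & BIT(VM_TYPE_VZ));
}
```

With this encoding v = 0 (old kernels) decodes to T&E only, v = 1 to VZ only, and v = 2 to both.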

> For example, what happens if you use KVM_SET_USER_MEMORY_REGION
> for a kernel address in TE mode?

That deals with physical addresses and user/kernel memory is
distinguished by the virtual address, so the KVM mode (T&E vs VZ)
doesn't make a difference here.

Cheers
James




Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-02 Thread James Hogan
Hi Paolo,

On Thu, Mar 02, 2017 at 11:59:28AM +0100, Paolo Bonzini wrote:
> On 02/03/2017 10:36, James Hogan wrote:
> >  - KVM_VM_MIPS_DEFAULT = 2
> > 
> >    This will provide the best available KVM implementation (even on
> >    older kernels), preferring hardware assisted virtualization over trap
> >    & emulate. The KVM_CAP_MIPS_VZ capability should always be checked
> >    against known values to determine what type of implementation was
> >    chosen.
> > 
> > This is designed to allow the desired implementation (T&E vs VZ) to be
> > potentially chosen at runtime rather than being fixed in the kernel
> > configuration.
> 
> Can the same kernel run on both TE and VZ?  If not, I'm not sure that
> KVM_VM_MIPS_DEFAULT is a good idea.

It can't right now, though with relocation of the kernel now implemented
in MIPS Linux for KASLR, and hopes for a more generic EVA implementation
(which can require the kernel to be linked in a completely different
segment) it isn't completely infeasible.

Currently the two uses of this I've implemented are:

1) QEMU, which I've implemented using the kvm_type machine callback.
This allows the KVM type to be specified with e.g.
  "-machine malta,accel=kvm,kvm-type=TE"
Otherwise it defaults to using KVM_VM_MIPS_DEFAULT.

When you try and load a kernel (which happens after kvm_init() has
already passed the kvm type into KVM_CREATE_VM) it will check that it
supports the current kernel type.

2) My kvm test application, which uses KVM_VM_MIPS_DEFAULT by default
and hackily maps itself into the guest physical address space to run C
code test cases.

Does that justification sound reasonable?

Cheers
James




Re: [PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-02 Thread Paolo Bonzini


On 02/03/2017 10:36, James Hogan wrote:
>  - KVM_VM_MIPS_DEFAULT = 2
> 
>    This will provide the best available KVM implementation (even on
>    older kernels), preferring hardware assisted virtualization over trap
>    & emulate. The KVM_CAP_MIPS_VZ capability should always be checked
>    against known values to determine what type of implementation was
>    chosen.
> 
> This is designed to allow the desired implementation (T&E vs VZ) to be
> potentially chosen at runtime rather than being fixed in the kernel
> configuration.

Can the same kernel run on both TE and VZ?  If not, I'm not sure that
KVM_VM_MIPS_DEFAULT is a good idea.

Paolo


[PATCH 11/32] KVM: MIPS: Add VZ capability

2017-03-02 Thread James Hogan
Add a new KVM_CAP_MIPS_VZ capability, and in order to allow MIPS KVM to
support VZ without confusing old users (which expect the trap & emulate
implementation), define and start checking KVM_CREATE_VM type codes.

The codes available are:

 - KVM_VM_MIPS_TE = 0

   This is the current value expected from the user, and will create a
   VM using trap & emulate in user mode, confined to the user mode
   address space. This may in future become unavailable if the kernel is
   only configured to support VZ, in which case the EINVAL error will be
   returned.

 - KVM_VM_MIPS_VZ = 1

   This can be provided when the KVM_CAP_MIPS_VZ capability is available
   to create a VM using VZ, with a fully virtualized guest virtual
   address space. If VZ support is unavailable in the kernel, the EINVAL
   error will be returned (although old kernels without the
   KVM_CAP_MIPS_VZ capability may well succeed and create a trap &
   emulate VM).

 - KVM_VM_MIPS_DEFAULT = 2

   This will provide the best available KVM implementation (even on
   older kernels), preferring hardware assisted virtualization over trap
   & emulate. The KVM_CAP_MIPS_VZ capability should always be checked
   against known values to determine what type of implementation was
   chosen.

This is designed to allow the desired implementation (T&E vs VZ) to be
potentially chosen at runtime rather than being fixed in the kernel
configuration.

Signed-off-by: James Hogan 
Cc: Paolo Bonzini 
Cc: "Radim Krčmář" 
Cc: Ralf Baechle 
Cc: Jonathan Corbet 
Cc: linux-m...@linux-mips.org
Cc: k...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
---
 Documentation/virtual/kvm/api.txt | 38 +++-
 arch/mips/kvm/mips.c  |  9 -
 include/uapi/linux/kvm.h  |  6 +-
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 069450938b79..bd54d7a30e37 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -115,12 +115,21 @@ will access the virtual machine's physical address space; offset zero
 corresponds to guest physical address zero.  Use of mmap() on a VM fd
 is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
 available.
-You most certainly want to use 0 as machine type.
+You probably want to use 0 as machine type.
 
 In order to create user controlled virtual machines on S390, check
 KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
 privileged user (CAP_SYS_ADMIN).
 
+To use hardware assisted virtualization on MIPS (VZ ASE) rather than
+the default trap & emulate implementation (which changes the virtual
+memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
+flag KVM_VM_MIPS_VZ.
+
+To use the best available virtualization type on MIPS, use the flag
+KVM_VM_MIPS_DEFAULT and check KVM_CAP_MIPS_VZ on the VM after creation
+to determine exactly which type was chosen.
+
 
 4.3 KVM_GET_MSR_INDEX_LIST
 
@@ -4143,3 +4152,30 @@ This capability, if KVM_CHECK_EXTENSION indicates that it is
 available, means that that the kernel can support guests using the
 hashed page table MMU defined in Power ISA V3.00 (as implemented in
 the POWER9 processor), including in-memory segment tables.
+
+8.5 KVM_CAP_MIPS_VZ
+
+Architectures: mips
+
+This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that
+it is available, means that full hardware assisted virtualization capabilities
+of the hardware are available for use through KVM. An appropriate
+KVM_VM_MIPS_* type must be passed to KVM_CREATE_VM to create a VM which
+utilises it.
+
+If KVM_CHECK_EXTENSION on a kvm VM handle indicates that this capability is
+available, it means that the VM is using full hardware assisted virtualization
+capabilities of the hardware. This is useful to check after creating a VM with
+KVM_VM_MIPS_DEFAULT.
+
+The value returned by KVM_CHECK_EXTENSION should be compared against known
+values (see below). All other values are reserved. This is to allow for the
+possibility of other hardware assisted virtualization implementations which
+may be incompatible with the MIPS VZ ASE.
+
+ 0: The trap & emulate implementation is in use to run guest code in user
+mode. Guest virtual memory segments are rearranged to fit the guest in the
+user mode address space.
+
+ 1: The MIPS VZ ASE is in use, providing full hardware assisted
+virtualization, including standard guest virtual memory segments.
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 2a06015930eb..cd07ea27f336 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -105,6 +105,15 @@ void kvm_arch_check_processor_compat(void *rtn)
 
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
+   switch (type) {
+   case KVM_VM_MIPS_DEFAULT:
+   case KVM_VM_MIPS_TE:
+