Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Jan Beulich
>>> On 27.09.17 at 14:11,  wrote:
> On 27/09/17 12:48, Alexandru Stefan ISAILA wrote:
>> On Mi, 2017-09-27 at 09:38 +0100, Andrew Cooper wrote:
>>> On 27/09/2017 09:04, Alexandru Isaila wrote:
 From: Andrew Cooper 


 -return X86EMUL_EXCEPTION;
 -case HVMTRANS_bad_gfn_to_mfn:
 -return hvmemul_linear_mmio_write(addr, bytes, p_data,
 pfec, hvmemul_ctxt, 0);
>>> Where has the if ( !mapping ) test gone?  The HVMTRANS_bad_gfn_to_mfn
>>> case needs handling.
>> There was a comment from Jan in V2. NOTE: "v1 comment: 'Pointless "else".'"
> 
> That means that the "else" text is pointless, not the clause.  (Jan:
> Please do try to be clearer when stating "pointless else", because it
> really is ambiguous and this mistake is a valid interpretation of your
> statement.)

How would one word this in an unambiguous way? And anyway -
it should have occurred to Alexandru that deleting more than
just the "else" has an unwanted side effect. If this wasn't
enough for the ambiguity to be resolved, he should have asked back.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Andrew Cooper
On 27/09/17 12:48, Alexandru Stefan ISAILA wrote:
> On Mi, 2017-09-27 at 09:38 +0100, Andrew Cooper wrote:
>> On 27/09/2017 09:04, Alexandru Isaila wrote:
>>> From: Andrew Cooper 
>>>
>>>
>>> -return X86EMUL_EXCEPTION;
>>> -case HVMTRANS_bad_gfn_to_mfn:
>>> -return hvmemul_linear_mmio_write(addr, bytes, p_data,
>>> pfec, hvmemul_ctxt, 0);
>> Where has the if ( !mapping ) test gone?  The HVMTRANS_bad_gfn_to_mfn
>> case needs handling.
> There was a comment from Jan in V2. NOTE: "v1 comment: 'Pointless "else".'"

That means that the "else" text is pointless, not the clause.  (Jan:
Please do try to be clearer when stating "pointless else", because it
really is ambiguous and this mistake is a valid interpretation of your
statement.)

The call to hvmemul_linear_mmio_write(addr, bytes, p_data, pfec,
hvmemul_ctxt, 0); is important, and needs to stay.

~Andrew



Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Alexandru Stefan ISAILA
On Mi, 2017-09-27 at 09:38 +0100, Andrew Cooper wrote:
> On 27/09/2017 09:04, Alexandru Isaila wrote:
> >
> > From: Andrew Cooper 
> >
> >
> > -return X86EMUL_EXCEPTION;
> > -case HVMTRANS_bad_gfn_to_mfn:
> > -return hvmemul_linear_mmio_write(addr, bytes, p_data,
> > pfec, hvmemul_ctxt, 0);
> Where has the if ( !mapping ) test gone?  The HVMTRANS_bad_gfn_to_mfn
> case needs handling.

There was a comment from Jan in V2. NOTE: "v1 comment: 'Pointless "else".'"




Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Jan Beulich
>>> On 27.09.17 at 10:38,  wrote:
> On 27/09/2017 09:04, Alexandru Isaila wrote:
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -498,6 +498,156 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
>>  }
>>  
>>  /*
>> + * Map the frame(s) covering an individual linear access, for writeable
>> + * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
>> + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
>> + *
>> + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
>> + * clean before use, and poisons unused slots with INVALID_MFN.
>> + */
>> +static void *hvmemul_map_linear_addr(
>> +unsigned long linear, unsigned int bytes, uint32_t pfec,
>> +struct hvm_emulate_ctxt *hvmemul_ctxt)
>> +{
>> +struct vcpu *curr = current;
>> +void *err, *mapping;
>> +
>> +/* First and final gfns which need mapping. */
>> +unsigned long frame = linear >> PAGE_SHIFT, first = frame;
>> +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
> 
> To function correctly for an access which spans the 4G or 64bit
> boundary, I think you need:
> 
> if ( hvmemul_ctxt->ctxt.addr_size < 64 )
> {
> frame = (uint32_t)frame;
> final = (uint32_t) final;
> }
>> +
>> +/*
>> + * mfn points to the next free slot.  All used slots have a page reference
>> + * held on them.
>> + */
>> +mfn_t *mfn = &hvmemul_ctxt->mfn[0];
>> +
>> +/*
>> + * The caller has no legitimate reason for trying a zero-byte write, but
>> + * final is calculated to fail safe in release builds.
>> + *
>> + * The maximum write size depends on the number of adjacent mfns[] which
>> + * can be vmap()'d, accounting for possible misalignment within the region.
>> + * The higher level emulation callers are responsible for ensuring that
>> + * mfns[] is large enough for the requested write size.
>> + */
>> +if ( bytes == 0 ||
>> + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )

Taking wrapping into account, this also isn't right, and hence may
need restoring to its original (slightly more involved) form.

>> +{
>> +ASSERT_UNREACHABLE();
>> +goto unhandleable;
>> +}
>> +
>> +do {
>> +enum hvm_translation_result res;
>> +struct page_info *page;
>> +pagefault_info_t pfinfo;
>> +p2m_type_t p2mt;
>> +
>> +/* Error checking.  Confirm that the current slot is clean. */
>> +ASSERT(mfn_x(*mfn) == 0);
>> +
>> +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
>> + &pfinfo, &page, NULL, &p2mt);
>> +
>> +switch ( res )
>> +{
>> +case HVMTRANS_okay:
>> +break;
>> +
>> +case HVMTRANS_bad_linear_to_gfn:
>> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>> +err = ERR_PTR(~X86EMUL_EXCEPTION);
>> +goto out;
>> +
>> +case HVMTRANS_bad_gfn_to_mfn:
>> +err = NULL;
>> +goto out;
>> +
>> +case HVMTRANS_gfn_paged_out:
>> +case HVMTRANS_gfn_shared:
>> +err = ERR_PTR(~X86EMUL_RETRY);
>> +goto out;
>> +
>> +default:
>> +goto unhandleable;
>> +}
>> +
>> +*mfn++ = _mfn(page_to_mfn(page));
>> +
>> +if ( p2m_is_discard_write(p2mt) )
>> +{
>> +err = ERR_PTR(~X86EMUL_OKAY);
>> +goto out;
>> +}
>> +
>> +} while ( ++frame < final );
> 
> frame != final

This would still be wrong if final < frame, as the increment won't
wrap. I.e. clipping to 52/20 bits would be needed after the increment
(which in turn will make it better for the increment to again happen
inside the loop rather than in the while()).

>> +/* Entire access within a single frame? */
>> +if ( first == final )
>> +mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
>> +/* Multiple frames? Need to vmap(). */
>> +else if ( (mapping = vmap(hvmemul_ctxt->mfn,
>> +  final - first + 1)) == NULL )

Same issue here as mentioned above. Perhaps easier to have a
local variable counting the number of pages?

Jan



Re: [Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Andrew Cooper
On 27/09/2017 09:04, Alexandru Isaila wrote:
> From: Andrew Cooper 
>
> An access which crosses a page boundary is performed atomically by x86
> hardware, albeit with a severe performance penalty.  An important corner case
> is when a straddled access hits two pages which differ in whether a
> translation exists, or in net access rights.
>
> The use of hvm_copy*() in hvmemul_write() is problematic, because it performs
> a translation then completes the partial write, before moving onto the next
> translation.
>
> If an individual emulated write straddles two pages, the first of which is
> writable, and the second of which is not, the first half of the write will
> complete before #PF is raised from the second half.
>
> This results in guest state corruption as a side effect of emulation, which
> has been observed to cause windows to crash while under introspection.
>
> Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an
> entire contents of a linear access, and vmap() the underlying frames to
> provide a contiguous virtual mapping for the emulator to use.  This is the
> same mechanism as used by the shadow emulation code.
>
> This will catch any translation issues and abort the emulation before any
> modifications occur.
>
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Alexandru Isaila 
>
> ---
> Changes since V4:
>   - Moved the mfn increment line back
>   - Removed the new line between mfn++ and while
>
> Reviewed-by: Paul Durrant 
> ---
>  xen/arch/x86/hvm/emulate.c| 174 ++
>  xen/include/asm-x86/hvm/emulate.h |   7 ++
>  2 files changed, 164 insertions(+), 17 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index cc874ce..9d8be30 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -498,6 +498,156 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
>  }
>  
>  /*
> + * Map the frame(s) covering an individual linear access, for writeable
> + * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
> + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
> + *
> + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
> + * clean before use, and poisons unused slots with INVALID_MFN.
> + */
> +static void *hvmemul_map_linear_addr(
> +unsigned long linear, unsigned int bytes, uint32_t pfec,
> +struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{
> +struct vcpu *curr = current;
> +void *err, *mapping;
> +
> +/* First and final gfns which need mapping. */
> +unsigned long frame = linear >> PAGE_SHIFT, first = frame;
> +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;

To function correctly for an access which spans the 4G or 64bit
boundary, I think you need:

if ( hvmemul_ctxt->ctxt.addr_size < 64 )
{
    frame = (uint32_t)frame;
    final = (uint32_t) final;
}

> +
> +/*
> + * mfn points to the next free slot.  All used slots have a page reference
> + * held on them.
> + */
> +mfn_t *mfn = &hvmemul_ctxt->mfn[0];
> +
> +/*
> + * The caller has no legitimate reason for trying a zero-byte write, but
> + * final is calculated to fail safe in release builds.
> + *
> + * The maximum write size depends on the number of adjacent mfns[] which
> + * can be vmap()'d, accounting for possible misalignment within the region.
> + * The higher level emulation callers are responsible for ensuring that
> + * mfns[] is large enough for the requested write size.
> + */
> +if ( bytes == 0 ||
> + final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
> +{
> +ASSERT_UNREACHABLE();
> +goto unhandleable;
> +}
> +
> +do {
> +enum hvm_translation_result res;
> +struct page_info *page;
> +pagefault_info_t pfinfo;
> +p2m_type_t p2mt;
> +
> +/* Error checking.  Confirm that the current slot is clean. */
> +ASSERT(mfn_x(*mfn) == 0);
> +
> +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
> + &pfinfo, &page, NULL, &p2mt);
> +
> +switch ( res )
> +{
> +case HVMTRANS_okay:
> +break;
> +
> +case HVMTRANS_bad_linear_to_gfn:
> +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
> +err = ERR_PTR(~X86EMUL_EXCEPTION);
> +goto out;
> +
> +case HVMTRANS_bad_gfn_to_mfn:
> +err = NULL;
> +goto out;
> +
> +case HVMTRANS_gfn_paged_out:
> +case HVMTRANS_gfn_shared:
> +err = ERR_PTR(~X86EMUL_RETRY);
> +goto out;
> +
> +default:
> +goto unhandleable;
> +}
> +
> +*mfn++ = _mfn(page_to_mfn(page));
> +
> +if ( p2m_is_discard_write(p2mt) )
> +

[Xen-devel] [PATCH v5] x86/hvm: Implement hvmemul_write() using real mappings

2017-09-27 Thread Alexandru Isaila
From: Andrew Cooper 

An access which crosses a page boundary is performed atomically by x86
hardware, albeit with a severe performance penalty.  An important corner case
is when a straddled access hits two pages which differ in whether a
translation exists, or in net access rights.

The use of hvm_copy*() in hvmemul_write() is problematic, because it performs
a translation then completes the partial write, before moving onto the next
translation.

If an individual emulated write straddles two pages, the first of which is
writable, and the second of which is not, the first half of the write will
complete before #PF is raised from the second half.

This results in guest state corruption as a side effect of emulation, which
has been observed to cause windows to crash while under introspection.

Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an
entire contents of a linear access, and vmap() the underlying frames to
provide a contiguous virtual mapping for the emulator to use.  This is the
same mechanism as used by the shadow emulation code.

This will catch any translation issues and abort the emulation before any
modifications occur.

Signed-off-by: Andrew Cooper 
Signed-off-by: Alexandru Isaila 

---
Changes since V4:
- Moved the mfn increment line back
- Removed the new line between mfn++ and while

Reviewed-by: Paul Durrant 
---
 xen/arch/x86/hvm/emulate.c| 174 ++
 xen/include/asm-x86/hvm/emulate.h |   7 ++
 2 files changed, 164 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index cc874ce..9d8be30 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -498,6 +498,156 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
 }
 
 /*
+ * Map the frame(s) covering an individual linear access, for writeable
+ * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
+ * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
+ *
+ * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
+ * clean before use, and poisons unused slots with INVALID_MFN.
+ */
+static void *hvmemul_map_linear_addr(
+unsigned long linear, unsigned int bytes, uint32_t pfec,
+struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+struct vcpu *curr = current;
+void *err, *mapping;
+
+/* First and final gfns which need mapping. */
+unsigned long frame = linear >> PAGE_SHIFT, first = frame;
+unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
+
+/*
+ * mfn points to the next free slot.  All used slots have a page reference
+ * held on them.
+ */
+mfn_t *mfn = &hvmemul_ctxt->mfn[0];
+
+/*
+ * The caller has no legitimate reason for trying a zero-byte write, but
+ * final is calculated to fail safe in release builds.
+ *
+ * The maximum write size depends on the number of adjacent mfns[] which
+ * can be vmap()'d, accounting for possible misalignment within the region.
+ * The higher level emulation callers are responsible for ensuring that
+ * mfns[] is large enough for the requested write size.
+ */
+if ( bytes == 0 ||
+ final - first >= ARRAY_SIZE(hvmemul_ctxt->mfn) )
+{
+ASSERT_UNREACHABLE();
+goto unhandleable;
+}
+
+do {
+enum hvm_translation_result res;
+struct page_info *page;
+pagefault_info_t pfinfo;
+p2m_type_t p2mt;
+
+/* Error checking.  Confirm that the current slot is clean. */
+ASSERT(mfn_x(*mfn) == 0);
+
+res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
+ &pfinfo, &page, NULL, &p2mt);
+
+switch ( res )
+{
+case HVMTRANS_okay:
+break;
+
+case HVMTRANS_bad_linear_to_gfn:
+x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
+err = ERR_PTR(~X86EMUL_EXCEPTION);
+goto out;
+
+case HVMTRANS_bad_gfn_to_mfn:
+err = NULL;
+goto out;
+
+case HVMTRANS_gfn_paged_out:
+case HVMTRANS_gfn_shared:
+err = ERR_PTR(~X86EMUL_RETRY);
+goto out;
+
+default:
+goto unhandleable;
+}
+
+*mfn++ = _mfn(page_to_mfn(page));
+
+if ( p2m_is_discard_write(p2mt) )
+{
+err = ERR_PTR(~X86EMUL_OKAY);
+goto out;
+}
+
+} while ( ++frame < final );
+
+/* Entire access within a single frame? */
+if ( first == final )
+mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
+/* Multiple frames? Need to vmap(). */
+else if ( (mapping = vmap(hvmemul_ctxt->mfn,
+  final - first + 1)) == NULL )
+goto unhandleable;
+
+#ifndef NDEBUG /* Poison unused mfn[]s with INVALID_MFN.