Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-12 Thread Jan Beulich
>>> On 12.01.15 at 15:54,  wrote:
> On Fri, Jan 9, 2015 at 12:10 PM, Jan Beulich  wrote:
> On 09.01.15 at 12:45,  wrote:
>>> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
 >>> On 09.01.15 at 12:18,  wrote:
 >> > +default:
 >> > +xfree(buf);
 >> > +ASSERT(!buf);
 >
 > looks dodgy...

 In which way? The "default" is supposed to be unreachable, and sits
 in the else branch to an if(!buf), i.e. in a release build we'll correctly
 free the buffer, while in a debug build the ASSERT() will trigger.
>>>
>>> Oh I see.  Can you please use ASSERT(0) for that?
>>
>> I sincerely dislike ASSERT(0), but if that's the only way to get
>> the patch accepted...
> 
> Um, by what set of criteria is ASSERT(!buf) better than ASSERT(0)?  At
> least the second tells the reader that you're not using ASSERT() the
> normal way.

Because the resulting message ("0") is completely meaningless,
while for "!buf" you at least have a chance of finding the original
ASSERT() without line numbers matching up 100%.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-12 Thread George Dunlap
On Fri, Jan 9, 2015 at 12:10 PM, Jan Beulich  wrote:
 On 09.01.15 at 12:45,  wrote:
>> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
>>> >>> On 09.01.15 at 12:18,  wrote:
>>> >> > +default:
>>> >> > +xfree(buf);
>>> >> > +ASSERT(!buf);
>>> >
>>> > looks dodgy...
>>>
>>> In which way? The "default" is supposed to be unreachable, and sits
>>> in the else branch to an if(!buf), i.e. in a release build we'll correctly
>>> free the buffer, while in a debug build the ASSERT() will trigger.
>>
>> Oh I see.  Can you please use ASSERT(0) for that?
>
> I sincerely dislike ASSERT(0), but if that's the only way to get
> the patch accepted...

Um, by what set of criteria is ASSERT(!buf) better than ASSERT(0)?  At
least the second tells the reader that you're not using ASSERT() the
normal way.

I agree that ASSERT_UNREACHABLE() is probably the best option.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Tim Deegan
At 12:46 + on 09 Jan (1420804005), Andrew Cooper wrote:
> On 09/01/15 12:10, Jan Beulich wrote:
>  On 09.01.15 at 12:45,  wrote:
> >> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
> >> On 09.01.15 at 12:18,  wrote:
> >> +default:
> >> +xfree(buf);
> >> +ASSERT(!buf);
>  looks dodgy...
> >>> In which way? The "default" is supposed to be unreachable, and sits
> >>> in the else branch to an if(!buf), i.e. in a release build we'll correctly
> >>> free the buffer, while in a debug build the ASSERT() will trigger.
> >> Oh I see.  Can you please use ASSERT(0) for that?
> > I sincerely dislike ASSERT(0), but if that's the only way to get
> > the patch accepted...
> >
> > Jan
> >
> 
> Perhaps introducing ASSERT_UNREACHABLE() as an alternative which is more
> obvious in nature than both ASSERT(!buf) and ASSERT(0) ?

Fine by me!

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Andrew Cooper
On 09/01/15 12:10, Jan Beulich wrote:
 On 09.01.15 at 12:45,  wrote:
>> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
>> On 09.01.15 at 12:18,  wrote:
>> +default:
>> +xfree(buf);
>> +ASSERT(!buf);
 looks dodgy...
>>> In which way? The "default" is supposed to be unreachable, and sits
>>> in the else branch to an if(!buf), i.e. in a release build we'll correctly
>>> free the buffer, while in a debug build the ASSERT() will trigger.
>> Oh I see.  Can you please use ASSERT(0) for that?
> I sincerely dislike ASSERT(0), but if that's the only way to get
> the patch accepted...
>
> Jan
>

Perhaps introducing ASSERT_UNREACHABLE() as an alternative which is more
obvious in nature than both ASSERT(!buf) and ASSERT(0) ?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Jan Beulich
>>> On 09.01.15 at 12:45,  wrote:
> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
>> >>> On 09.01.15 at 12:18,  wrote:
>> >> > +default:
>> >> > +xfree(buf);
>> >> > +ASSERT(!buf);
>> > 
>> > looks dodgy...
>> 
>> In which way? The "default" is supposed to be unreachable, and sits
>> in the else branch to an if(!buf), i.e. in a release build we'll correctly
>> free the buffer, while in a debug build the ASSERT() will trigger.
> 
> Oh I see.  Can you please use ASSERT(0) for that?

I sincerely dislike ASSERT(0), but if that's the only way to get
the patch accepted...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Tim Deegan
At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote:
> >>> On 09.01.15 at 12:18,  wrote:
> >> > +default:
> >> > +xfree(buf);
> >> > +ASSERT(!buf);
> > 
> > looks dodgy...
> 
> In which way? The "default" is supposed to be unreachable, and sits
> in the else branch to an if(!buf), i.e. in a release build we'll correctly
> free the buffer, while in a debug build the ASSERT() will trigger.

Oh I see.  Can you please use ASSERT(0) for that?

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Jan Beulich
>>> On 09.01.15 at 12:18,  wrote:
> At 10:56 + on 09 Jan (1420797418), Andrew Cooper wrote:
>> On 09/01/15 10:27, Jan Beulich wrote:
>> > While the REP MOVS acceleration appears to have helped qemu-traditional
>> > based guests, qemu-upstream (or really the respective video BIOSes)
>> > doesn't appear to benefit from that. Instead the acceleration added
>> > here provides a visible performance improvement during very early HVM
>> > guest boot.
>> >
>> > Signed-off-by: Jan Beulich 
>> > ---
>> > v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by
>> > Andrew. Add output operand telling the compiler that "buf" is being
>> > written.
>> 
>> Is writing buf wise? it looks like you will xfree() a wild pointer, and
>> appears to interfere the "buf != p_data" logic.
> 
> I think the constraints are correct, though the 'memory' clobber makes
> the rest of it moot.  But this:

Ah, yes, the memory clobber could be dropped now that the "=m"
is there.

>> > +default:
>> > +xfree(buf);
>> > +ASSERT(!buf);
> 
> looks dodgy...

In which way? The "default" is supposed to be unreachable, and sits
in the else branch to an if(!buf), i.e. in a release build we'll correctly
free the buffer, while in a debug build the ASSERT() will trigger.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Tim Deegan
At 10:56 + on 09 Jan (1420797418), Andrew Cooper wrote:
> On 09/01/15 10:27, Jan Beulich wrote:
> > While the REP MOVS acceleration appears to have helped qemu-traditional
> > based guests, qemu-upstream (or really the respective video BIOSes)
> > doesn't appear to benefit from that. Instead the acceleration added
> > here provides a visible performance improvement during very early HVM
> > guest boot.
> >
> > Signed-off-by: Jan Beulich 
> > ---
> > v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by
> > Andrew. Add output operand telling the compiler that "buf" is being
> > written.
> 
> Is writing buf wise? it looks like you will xfree() a wild pointer, and
> appears to interfere the "buf != p_data" logic.

I think the constraints are correct, though the 'memory' clobber makes
the rest of it moot.  But this:

> > +default:
> > +xfree(buf);
> > +ASSERT(!buf);

looks dodgy...

Tim.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Jan Beulich
>>> On 09.01.15 at 11:56,  wrote:
> On 09/01/15 10:27, Jan Beulich wrote:
>> While the REP MOVS acceleration appears to have helped qemu-traditional
>> based guests, qemu-upstream (or really the respective video BIOSes)
>> doesn't appear to benefit from that. Instead the acceleration added
>> here provides a visible performance improvement during very early HVM
>> guest boot.
>>
>> Signed-off-by: Jan Beulich 
>> ---
>> v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by
>> Andrew. Add output operand telling the compiler that "buf" is being
>> written.
> 
> Is writing buf wise? it looks like you will xfree() a wild pointer, and
> appears to interfere the "buf != p_data" logic.

Perhaps the textual description was a little ambiguous, but the code
is not: It's not the variable "buf" that gets written, but buf[] in its
entirety.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Andrew Cooper
On 09/01/15 10:27, Jan Beulich wrote:
> While the REP MOVS acceleration appears to have helped qemu-traditional
> based guests, qemu-upstream (or really the respective video BIOSes)
> doesn't appear to benefit from that. Instead the acceleration added
> here provides a visible performance improvement during very early HVM
> guest boot.
>
> Signed-off-by: Jan Beulich 
> ---
> v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by
> Andrew. Add output operand telling the compiler that "buf" is being
> written.

Is writing buf wise? it looks like you will xfree() a wild pointer, and
appears to interfere the "buf != p_data" logic.

~Andrew

>
> --- unstable.orig/xen/arch/x86/hvm/emulate.c  2015-01-09 11:19:01.0 
> +0100
> +++ unstable/xen/arch/x86/hvm/emulate.c   2015-01-09 11:19:36.0 
> +0100
> @@ -731,6 +731,17 @@ static int hvmemul_rep_movs_discard(
>  return X86EMUL_OKAY;
>  }
>  
> +static int hvmemul_rep_stos_discard(
> +void *p_data,
> +enum x86_segment seg,
> +unsigned long offset,
> +unsigned int bytes_per_rep,
> +unsigned long *reps,
> +struct x86_emulate_ctxt *ctxt)
> +{
> +return X86EMUL_OKAY;
> +}
> +
>  static int hvmemul_rep_outs_discard(
>  enum x86_segment src_seg,
>  unsigned long src_offset,
> @@ -982,6 +993,114 @@ static int hvmemul_rep_movs(
>  return X86EMUL_OKAY;
>  }
>  
> +static int hvmemul_rep_stos(
> +void *p_data,
> +enum x86_segment seg,
> +unsigned long offset,
> +unsigned int bytes_per_rep,
> +unsigned long *reps,
> +struct x86_emulate_ctxt *ctxt)
> +{
> +struct hvm_emulate_ctxt *hvmemul_ctxt =
> +container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
> +unsigned long addr;
> +paddr_t gpa;
> +p2m_type_t p2mt;
> +bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
> +int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
> +   hvm_access_write, hvmemul_ctxt, 
> &addr);
> +
> +if ( rc == X86EMUL_OKAY )
> +{
> +uint32_t pfec = PFEC_page_present | PFEC_write_access;
> +
> +if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
> +pfec |= PFEC_user_mode;
> +
> +rc = hvmemul_linear_to_phys(
> +addr, &gpa, bytes_per_rep, reps, pfec, hvmemul_ctxt);
> +}
> +if ( rc != X86EMUL_OKAY )
> +return rc;
> +
> +/* Check for MMIO op */
> +(void)get_gfn_query_unlocked(current->domain, gpa >> PAGE_SHIFT, &p2mt);
> +
> +switch ( p2mt )
> +{
> +unsigned long bytes;
> +void *buf;
> +
> +default:
> +/* Allocate temporary buffer. */
> +for ( ; ; )
> +{
> +bytes = *reps * bytes_per_rep;
> +buf = xmalloc_bytes(bytes);
> +if ( buf || *reps <= 1 )
> +break;
> +*reps >>= 1;
> +}
> +
> +if ( !buf )
> +buf = p_data;
> +else
> +switch ( bytes_per_rep )
> +{
> +unsigned long dummy;
> +
> +#define CASE(bits, suffix) \
> +case (bits) / 8:   \
> +asm ( "rep stos" #suffix   \
> +  : "=m" (*(char (*)[bytes])buf),  \
> +"=D" (dummy), "=c" (dummy) \
> +  : "a" (*(const uint##bits##_t *)p_data), \
> + "1" (buf), "2" (*reps)\
> +  : "memory" );\
> +break
> +CASE(8, b);
> +CASE(16, w);
> +CASE(32, l);
> +CASE(64, q);
> +#undef CASE
> +
> +default:
> +xfree(buf);
> +ASSERT(!buf);
> +return X86EMUL_UNHANDLEABLE;
> +}
> +
> +/* Adjust address for reverse store. */
> +if ( df )
> +gpa -= bytes - bytes_per_rep;
> +
> +rc = hvm_copy_to_guest_phys(gpa, buf, bytes);
> +
> +if ( buf != p_data )
> +xfree(buf);
> +
> +switch ( rc )
> +{
> +case HVMCOPY_gfn_paged_out:
> +case HVMCOPY_gfn_shared:
> +return X86EMUL_RETRY;
> +case HVMCOPY_okay:
> +return X86EMUL_OKAY;
> +}
> +
> +gdprintk(XENLOG_WARNING,
> + "Failed REP STOS: gpa=%"PRIpaddr" reps=%lu 
> bytes_per_rep=%u\n",
> + gpa, *reps, bytes_per_rep);
> +/* fall through */
> +case p2m_mmio_direct:
> +return X86EMUL_UNHANDLEABLE;
> +
> +case p2m_mmio_dm:
> +return hvmemul_do_mmio(gpa, reps, bytes_per_rep, 0, IOREQ_WRITE, df,
> +   p_data);
> +}
> +}
> +
>  static int hvmemul_read_segment(
>  enum x86_segment seg,
>  struct segment_register *reg,
> @@ -1239,6 +1358,7 @@ static 

[Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration

2015-01-09 Thread Jan Beulich
While the REP MOVS acceleration appears to have helped qemu-traditional
based guests, qemu-upstream (or really the respective video BIOSes)
doesn't appear to benefit from that. Instead the acceleration added
here provides a visible performance improvement during very early HVM
guest boot.

Signed-off-by: Jan Beulich 
---
v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by
Andrew. Add output operand telling the compiler that "buf" is being
written.

--- unstable.orig/xen/arch/x86/hvm/emulate.c2015-01-09 11:19:01.0 
+0100
+++ unstable/xen/arch/x86/hvm/emulate.c 2015-01-09 11:19:36.0 +0100
@@ -731,6 +731,17 @@ static int hvmemul_rep_movs_discard(
 return X86EMUL_OKAY;
 }
 
+static int hvmemul_rep_stos_discard(
+void *p_data,
+enum x86_segment seg,
+unsigned long offset,
+unsigned int bytes_per_rep,
+unsigned long *reps,
+struct x86_emulate_ctxt *ctxt)
+{
+return X86EMUL_OKAY;
+}
+
 static int hvmemul_rep_outs_discard(
 enum x86_segment src_seg,
 unsigned long src_offset,
@@ -982,6 +993,114 @@ static int hvmemul_rep_movs(
 return X86EMUL_OKAY;
 }
 
+static int hvmemul_rep_stos(
+void *p_data,
+enum x86_segment seg,
+unsigned long offset,
+unsigned int bytes_per_rep,
+unsigned long *reps,
+struct x86_emulate_ctxt *ctxt)
+{
+struct hvm_emulate_ctxt *hvmemul_ctxt =
+container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
+unsigned long addr;
+paddr_t gpa;
+p2m_type_t p2mt;
+bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
+int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
+   hvm_access_write, hvmemul_ctxt, &addr);
+
+if ( rc == X86EMUL_OKAY )
+{
+uint32_t pfec = PFEC_page_present | PFEC_write_access;
+
+if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
+pfec |= PFEC_user_mode;
+
+rc = hvmemul_linear_to_phys(
+addr, &gpa, bytes_per_rep, reps, pfec, hvmemul_ctxt);
+}
+if ( rc != X86EMUL_OKAY )
+return rc;
+
+/* Check for MMIO op */
+(void)get_gfn_query_unlocked(current->domain, gpa >> PAGE_SHIFT, &p2mt);
+
+switch ( p2mt )
+{
+unsigned long bytes;
+void *buf;
+
+default:
+/* Allocate temporary buffer. */
+for ( ; ; )
+{
+bytes = *reps * bytes_per_rep;
+buf = xmalloc_bytes(bytes);
+if ( buf || *reps <= 1 )
+break;
+*reps >>= 1;
+}
+
+if ( !buf )
+buf = p_data;
+else
+switch ( bytes_per_rep )
+{
+unsigned long dummy;
+
+#define CASE(bits, suffix) \
+case (bits) / 8:   \
+asm ( "rep stos" #suffix   \
+  : "=m" (*(char (*)[bytes])buf),  \
+"=D" (dummy), "=c" (dummy) \
+  : "a" (*(const uint##bits##_t *)p_data), \
+ "1" (buf), "2" (*reps)\
+  : "memory" );\
+break
+CASE(8, b);
+CASE(16, w);
+CASE(32, l);
+CASE(64, q);
+#undef CASE
+
+default:
+xfree(buf);
+ASSERT(!buf);
+return X86EMUL_UNHANDLEABLE;
+}
+
+/* Adjust address for reverse store. */
+if ( df )
+gpa -= bytes - bytes_per_rep;
+
+rc = hvm_copy_to_guest_phys(gpa, buf, bytes);
+
+if ( buf != p_data )
+xfree(buf);
+
+switch ( rc )
+{
+case HVMCOPY_gfn_paged_out:
+case HVMCOPY_gfn_shared:
+return X86EMUL_RETRY;
+case HVMCOPY_okay:
+return X86EMUL_OKAY;
+}
+
+gdprintk(XENLOG_WARNING,
+ "Failed REP STOS: gpa=%"PRIpaddr" reps=%lu 
bytes_per_rep=%u\n",
+ gpa, *reps, bytes_per_rep);
+/* fall through */
+case p2m_mmio_direct:
+return X86EMUL_UNHANDLEABLE;
+
+case p2m_mmio_dm:
+return hvmemul_do_mmio(gpa, reps, bytes_per_rep, 0, IOREQ_WRITE, df,
+   p_data);
+}
+}
+
 static int hvmemul_read_segment(
 enum x86_segment seg,
 struct segment_register *reg,
@@ -1239,6 +1358,7 @@ static const struct x86_emulate_ops hvm_
 .rep_ins   = hvmemul_rep_ins,
 .rep_outs  = hvmemul_rep_outs,
 .rep_movs  = hvmemul_rep_movs,
+.rep_stos  = hvmemul_rep_stos,
 .read_segment  = hvmemul_read_segment,
 .write_segment = hvmemul_write_segment,
 .read_io   = hvmemul_read_io,
@@ -1264,6 +1384,7 @@ static const struct x86_emulate_ops hvm_
 .rep_ins   = hvmemul_rep_ins_discard,
 .rep_outs  = hvmemul_rep_outs_discard