Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
>>> On 12.01.15 at 15:54, wrote: > On Fri, Jan 9, 2015 at 12:10 PM, Jan Beulich wrote: > On 09.01.15 at 12:45, wrote: >>> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: >>> On 09.01.15 at 12:18, wrote: >> > +default: >> > +xfree(buf); >> > +ASSERT(!buf); > > looks dodgy... In which way? The "default" is supposed to be unreachable, and sits in the else branch to an if(!buf), i.e. in a release build we'll correctly free the buffer, while in a debug build the ASSERT() will trigger. >>> >>> Oh I see. Can you please use ASSERT(0) for that? >> >> I sincerely dislike ASSERT(0), but if that's the only way to get >> the patch accepted... > > Um, by what set of criteria is ASSERT(!buf) better than ASSERT(0)? At > least the second tells the reader that you're not using ASSERT() the > normal way. Because the resulting message ("0") is completely meaningless, while for "!buf" you at least have a chance of finding the original ASSERT() without line numbers matching up 100%. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
On Fri, Jan 9, 2015 at 12:10 PM, Jan Beulich wrote: On 09.01.15 at 12:45, wrote: >> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: >>> >>> On 09.01.15 at 12:18, wrote: >>> >> > +default: >>> >> > +xfree(buf); >>> >> > +ASSERT(!buf); >>> > >>> > looks dodgy... >>> >>> In which way? The "default" is supposed to be unreachable, and sits >>> in the else branch to an if(!buf), i.e. in a release build we'll correctly >>> free the buffer, while in a debug build the ASSERT() will trigger. >> >> Oh I see. Can you please use ASSERT(0) for that? > > I sincerely dislike ASSERT(0), but if that's the only way to get > the patch accepted... Um, by what set of criteria is ASSERT(!buf) better than ASSERT(0)? At least the second tells the reader that you're not using ASSERT() the normal way. I agree that ASSERT_UNREACHABLE() is probably the best option. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
At 12:46 + on 09 Jan (1420804005), Andrew Cooper wrote: > On 09/01/15 12:10, Jan Beulich wrote: > On 09.01.15 at 12:45, wrote: > >> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: > >> On 09.01.15 at 12:18, wrote: > >> +default: > >> +xfree(buf); > >> +ASSERT(!buf); > looks dodgy... > >>> In which way? The "default" is supposed to be unreachable, and sits > >>> in the else branch to an if(!buf), i.e. in a release build we'll correctly > >>> free the buffer, while in a debug build the ASSERT() will trigger. > >> Oh I see. Can you please use ASSERT(0) for that? > > I sincerely dislike ASSERT(0), but if that's the only way to get > > the patch accepted... > > > > Jan > > > > Perhaps introducing ASSERT_UNREACHABLE() as an alternative which is more > obvious in nature than both ASSERT(!buf) and ASSERT(0) ? Fine by me! Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
On 09/01/15 12:10, Jan Beulich wrote: On 09.01.15 at 12:45, wrote: >> At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: >> On 09.01.15 at 12:18, wrote: >> +default: >> +xfree(buf); >> +ASSERT(!buf); looks dodgy... >>> In which way? The "default" is supposed to be unreachable, and sits >>> in the else branch to an if(!buf), i.e. in a release build we'll correctly >>> free the buffer, while in a debug build the ASSERT() will trigger. >> Oh I see. Can you please use ASSERT(0) for that? > I sincerely dislike ASSERT(0), but if that's the only way to get > the patch accepted... > > Jan > Perhaps introducing ASSERT_UNREACHABLE() as an alternative which is more obvious in nature than both ASSERT(!buf) and ASSERT(0) ? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
>>> On 09.01.15 at 12:45, wrote: > At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: >> >>> On 09.01.15 at 12:18, wrote: >> >> > +default: >> >> > +xfree(buf); >> >> > +ASSERT(!buf); >> > >> > looks dodgy... >> >> In which way? The "default" is supposed to be unreachable, and sits >> in the else branch to an if(!buf), i.e. in a release build we'll correctly >> free the buffer, while in a debug build the ASSERT() will trigger. > > Oh I see. Can you please use ASSERT(0) for that? I sincerely dislike ASSERT(0), but if that's the only way to get the patch accepted... Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
At 11:24 + on 09 Jan (1420799087), Jan Beulich wrote: > >>> On 09.01.15 at 12:18, wrote: > >> > +default: > >> > +xfree(buf); > >> > +ASSERT(!buf); > > > > looks dodgy... > > In which way? The "default" is supposed to be unreachable, and sits > in the else branch to an if(!buf), i.e. in a release build we'll correctly > free the buffer, while in a debug build the ASSERT() will trigger. Oh I see. Can you please use ASSERT(0) for that? Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
>>> On 09.01.15 at 12:18, wrote: > At 10:56 + on 09 Jan (1420797418), Andrew Cooper wrote: >> On 09/01/15 10:27, Jan Beulich wrote: >> > While the REP MOVS acceleration appears to have helped qemu-traditional >> > based guests, qemu-upstream (or really the respective video BIOSes) >> > doesn't appear to benefit from that. Instead the acceleration added >> > here provides a visible performance improvement during very early HVM >> > guest boot. >> > >> > Signed-off-by: Jan Beulich >> > --- >> > v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by >> > Andrew. Add output operand telling the compiler that "buf" is being >> > written. >> >> Is writing buf wise? it looks like you will xfree() a wild pointer, and >> appears to interfere the "buf != p_data" logic. > > I think the constraints are correct, though the 'memory' clobber makes > the rest of it moot. But this: Ah, yes, the memory clobber could be dropped now that the "=m" is there. >> > +default: >> > +xfree(buf); >> > +ASSERT(!buf); > > looks dodgy... In which way? The "default" is supposed to be unreachable, and sits in the else branch to an if(!buf), i.e. in a release build we'll correctly free the buffer, while in a debug build the ASSERT() will trigger. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
At 10:56 + on 09 Jan (1420797418), Andrew Cooper wrote: > On 09/01/15 10:27, Jan Beulich wrote: > > While the REP MOVS acceleration appears to have helped qemu-traditional > > based guests, qemu-upstream (or really the respective video BIOSes) > > doesn't appear to benefit from that. Instead the acceleration added > > here provides a visible performance improvement during very early HVM > > guest boot. > > > > Signed-off-by: Jan Beulich > > --- > > v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by > > Andrew. Add output operand telling the compiler that "buf" is being > > written. > > Is writing buf wise? it looks like you will xfree() a wild pointer, and > appears to interfere the "buf != p_data" logic. I think the constraints are correct, though the 'memory' clobber makes the rest of it moot. But this: > > +default: > > +xfree(buf); > > +ASSERT(!buf); looks dodgy... Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
>>> On 09.01.15 at 11:56, wrote: > On 09/01/15 10:27, Jan Beulich wrote: >> While the REP MOVS acceleration appears to have helped qemu-traditional >> based guests, qemu-upstream (or really the respective video BIOSes) >> doesn't appear to benefit from that. Instead the acceleration added >> here provides a visible performance improvement during very early HVM >> guest boot. >> >> Signed-off-by: Jan Beulich >> --- >> v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by >> Andrew. Add output operand telling the compiler that "buf" is being >> written. > > Is writing buf wise? it looks like you will xfree() a wild pointer, and > appears to interfere the "buf != p_data" logic. Perhaps the textual description was a little ambiguous, but the code is not: It's not the variable "buf" that gets written, but buf[] in its entirety. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
On 09/01/15 10:27, Jan Beulich wrote: > While the REP MOVS acceleration appears to have helped qemu-traditional > based guests, qemu-upstream (or really the respective video BIOSes) > doesn't appear to benefit from that. Instead the acceleration added > here provides a visible performance improvement during very early HVM > guest boot. > > Signed-off-by: Jan Beulich > --- > v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by > Andrew. Add output operand telling the compiler that "buf" is being > written. Is writing buf wise? it looks like you will xfree() a wild pointer, and appears to interfere the "buf != p_data" logic. ~Andrew > > --- unstable.orig/xen/arch/x86/hvm/emulate.c 2015-01-09 11:19:01.0 > +0100 > +++ unstable/xen/arch/x86/hvm/emulate.c 2015-01-09 11:19:36.0 > +0100 > @@ -731,6 +731,17 @@ static int hvmemul_rep_movs_discard( > return X86EMUL_OKAY; > } > > +static int hvmemul_rep_stos_discard( > +void *p_data, > +enum x86_segment seg, > +unsigned long offset, > +unsigned int bytes_per_rep, > +unsigned long *reps, > +struct x86_emulate_ctxt *ctxt) > +{ > +return X86EMUL_OKAY; > +} > + > static int hvmemul_rep_outs_discard( > enum x86_segment src_seg, > unsigned long src_offset, > @@ -982,6 +993,114 @@ static int hvmemul_rep_movs( > return X86EMUL_OKAY; > } > > +static int hvmemul_rep_stos( > +void *p_data, > +enum x86_segment seg, > +unsigned long offset, > +unsigned int bytes_per_rep, > +unsigned long *reps, > +struct x86_emulate_ctxt *ctxt) > +{ > +struct hvm_emulate_ctxt *hvmemul_ctxt = > +container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > +unsigned long addr; > +paddr_t gpa; > +p2m_type_t p2mt; > +bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF); > +int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps, > + hvm_access_write, hvmemul_ctxt, > &addr); > + > +if ( rc == X86EMUL_OKAY ) > +{ > +uint32_t pfec = PFEC_page_present | PFEC_write_access; > + > +if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 ) > +pfec |= PFEC_user_mode; > + > +rc = hvmemul_linear_to_phys( > +addr, &gpa, bytes_per_rep, reps, pfec, hvmemul_ctxt); > +} > +if ( rc != X86EMUL_OKAY ) > +return rc; > + > +/* Check for MMIO op */ > +(void)get_gfn_query_unlocked(current->domain, gpa >> PAGE_SHIFT, &p2mt); > + > +switch ( p2mt ) > +{ > +unsigned long bytes; > +void *buf; > + > +default: > +/* Allocate temporary buffer. */ > +for ( ; ; ) > +{ > +bytes = *reps * bytes_per_rep; > +buf = xmalloc_bytes(bytes); > +if ( buf || *reps <= 1 ) > +break; > +*reps >>= 1; > +} > + > +if ( !buf ) > +buf = p_data; > +else > +switch ( bytes_per_rep ) > +{ > +unsigned long dummy; > + > +#define CASE(bits, suffix) \ > +case (bits) / 8: \ > +asm ( "rep stos" #suffix \ > + : "=m" (*(char (*)[bytes])buf), \ > +"=D" (dummy), "=c" (dummy) \ > + : "a" (*(const uint##bits##_t *)p_data), \ > + "1" (buf), "2" (*reps)\ > + : "memory" );\ > +break > +CASE(8, b); > +CASE(16, w); > +CASE(32, l); > +CASE(64, q); > +#undef CASE > + > +default: > +xfree(buf); > +ASSERT(!buf); > +return X86EMUL_UNHANDLEABLE; > +} > + > +/* Adjust address for reverse store. */ > +if ( df ) > +gpa -= bytes - bytes_per_rep; > + > +rc = hvm_copy_to_guest_phys(gpa, buf, bytes); > + > +if ( buf != p_data ) > +xfree(buf); > + > +switch ( rc ) > +{ > +case HVMCOPY_gfn_paged_out: > +case HVMCOPY_gfn_shared: > +return X86EMUL_RETRY; > +case HVMCOPY_okay: > +return X86EMUL_OKAY; > +} > + > +gdprintk(XENLOG_WARNING, > + "Failed REP STOS: gpa=%"PRIpaddr" reps=%lu > bytes_per_rep=%u\n", > + gpa, *reps, bytes_per_rep); > +/* fall through */ > +case p2m_mmio_direct: > +return X86EMUL_UNHANDLEABLE; > + > +case p2m_mmio_dm: > +return hvmemul_do_mmio(gpa, reps, bytes_per_rep, 0, IOREQ_WRITE, df, > + p_data); > +} > +} > + > static int hvmemul_read_segment( > enum x86_segment seg, > struct segment_register *reg, > @@ -1239,6 +1358,7 @@ static
[Xen-devel] [PATCH v2 1/3] x86: also allow REP STOS emulation acceleration
While the REP MOVS acceleration appears to have helped qemu-traditional based guests, qemu-upstream (or really the respective video BIOSes) doesn't appear to benefit from that. Instead the acceleration added here provides a visible performance improvement during very early HVM guest boot. Signed-off-by: Jan Beulich --- v2: fix asm() constraints in hvmemul_rep_stos(), as pointed out by Andrew. Add output operand telling the compiler that "buf" is being written. --- unstable.orig/xen/arch/x86/hvm/emulate.c2015-01-09 11:19:01.0 +0100 +++ unstable/xen/arch/x86/hvm/emulate.c 2015-01-09 11:19:36.0 +0100 @@ -731,6 +731,17 @@ static int hvmemul_rep_movs_discard( return X86EMUL_OKAY; } +static int hvmemul_rep_stos_discard( +void *p_data, +enum x86_segment seg, +unsigned long offset, +unsigned int bytes_per_rep, +unsigned long *reps, +struct x86_emulate_ctxt *ctxt) +{ +return X86EMUL_OKAY; +} + static int hvmemul_rep_outs_discard( enum x86_segment src_seg, unsigned long src_offset, @@ -982,6 +993,114 @@ static int hvmemul_rep_movs( return X86EMUL_OKAY; } +static int hvmemul_rep_stos( +void *p_data, +enum x86_segment seg, +unsigned long offset, +unsigned int bytes_per_rep, +unsigned long *reps, +struct x86_emulate_ctxt *ctxt) +{ +struct hvm_emulate_ctxt *hvmemul_ctxt = +container_of(ctxt, struct hvm_emulate_ctxt, ctxt); +unsigned long addr; +paddr_t gpa; +p2m_type_t p2mt; +bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF); +int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps, + hvm_access_write, hvmemul_ctxt, &addr); + +if ( rc == X86EMUL_OKAY ) +{ +uint32_t pfec = PFEC_page_present | PFEC_write_access; + +if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 ) +pfec |= PFEC_user_mode; + +rc = hvmemul_linear_to_phys( +addr, &gpa, bytes_per_rep, reps, pfec, hvmemul_ctxt); +} +if ( rc != X86EMUL_OKAY ) +return rc; + +/* Check for MMIO op */ +(void)get_gfn_query_unlocked(current->domain, gpa >> PAGE_SHIFT, &p2mt); + +switch ( p2mt ) +{ +unsigned long bytes; +void *buf; + +default: +/* Allocate temporary buffer. */ +for ( ; ; ) +{ +bytes = *reps * bytes_per_rep; +buf = xmalloc_bytes(bytes); +if ( buf || *reps <= 1 ) +break; +*reps >>= 1; +} + +if ( !buf ) +buf = p_data; +else +switch ( bytes_per_rep ) +{ +unsigned long dummy; + +#define CASE(bits, suffix) \ +case (bits) / 8: \ +asm ( "rep stos" #suffix \ + : "=m" (*(char (*)[bytes])buf), \ +"=D" (dummy), "=c" (dummy) \ + : "a" (*(const uint##bits##_t *)p_data), \ + "1" (buf), "2" (*reps)\ + : "memory" );\ +break +CASE(8, b); +CASE(16, w); +CASE(32, l); +CASE(64, q); +#undef CASE + +default: +xfree(buf); +ASSERT(!buf); +return X86EMUL_UNHANDLEABLE; +} + +/* Adjust address for reverse store. */ +if ( df ) +gpa -= bytes - bytes_per_rep; + +rc = hvm_copy_to_guest_phys(gpa, buf, bytes); + +if ( buf != p_data ) +xfree(buf); + +switch ( rc ) +{ +case HVMCOPY_gfn_paged_out: +case HVMCOPY_gfn_shared: +return X86EMUL_RETRY; +case HVMCOPY_okay: +return X86EMUL_OKAY; +} + +gdprintk(XENLOG_WARNING, + "Failed REP STOS: gpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n", + gpa, *reps, bytes_per_rep); +/* fall through */ +case p2m_mmio_direct: +return X86EMUL_UNHANDLEABLE; + +case p2m_mmio_dm: +return hvmemul_do_mmio(gpa, reps, bytes_per_rep, 0, IOREQ_WRITE, df, + p_data); +} +} + static int hvmemul_read_segment( enum x86_segment seg, struct segment_register *reg, @@ -1239,6 +1358,7 @@ static const struct x86_emulate_ops hvm_ .rep_ins = hvmemul_rep_ins, .rep_outs = hvmemul_rep_outs, .rep_movs = hvmemul_rep_movs, +.rep_stos = hvmemul_rep_stos, .read_segment = hvmemul_read_segment, .write_segment = hvmemul_write_segment, .read_io = hvmemul_read_io, @@ -1264,6 +1384,7 @@ static const struct x86_emulate_ops hvm_ .rep_ins = hvmemul_rep_ins_discard, .rep_outs = hvmemul_rep_outs_discard