On 15/03/18 13:08, Jan Beulich wrote:
> @@ -8517,6 +8690,142 @@ x86_emulate(
>  #undef vex
>  #undef ea
>  
> +int x86_emul_rmw(
> +    void *ptr,
> +    unsigned int bytes,
> +    uint32_t *eflags,
> +    struct x86_emulate_state *state,
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    unsigned long *dst = ptr;
> +
> +    ASSERT(bytes == state->op_bytes);
> +
> +#ifdef __x86_64__
> +# define JCXZ "jrcxz"
> +#else
> +# define JCXZ "jecxz"
> +#endif
> +
> +#define COND_LOCK(op) \
> +    JCXZ " .L" #op "%=\n\t" \
> +    "lock\n" \
> +    ".L" #op "%=:\n\t" \
> +    #op

I'd forgotten that this encoding of jmp existed, but various optimisation
reference manuals suggest that it is far slower to execute than other jumps.

Irrespective of the instruction latency argument, you'll get better code
generation with COND_LOCK() looking rather more like:

cmpb $0, %[lock]
je 1f
lock
1: #op

which will most likely cause the cmp to be encoded with a memory
operand, rather than forcing the value into %ecx.
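
As a rough sketch of that pattern, assuming GCC-style extended asm on x86:
cond_lock_add() and the operand names [dst], [src] and [lock] are made up
for illustration and are not taken from the patch. The "qm" constraint
leaves the compiler free to pick a memory operand for the flag.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Emit "op" with a LOCK prefix only when the byte-sized [lock] operand is
 * non-zero.  The numeric local label lets the macro be expanded more than
 * once in the same translation unit.
 */
#define COND_LOCK(op) \
    "cmpb $0, %[lock]\n\t" \
    "je 1f\n\t" \
    "lock\n" \
    "1:\t" #op

/* Hypothetical helper: add src to *dst, atomically if lock is true. */
static inline void cond_lock_add(uint32_t *dst, uint32_t src, bool lock)
{
    __asm__ __volatile__ ( COND_LOCK(addl) " %[src], %[dst]"
                           : [dst] "+m" (*dst)
                           : [src] "r" (src), [lock] "qm" (lock)
                           : "cc" );
}
```

Nothing here needs a fixed register, so the flag can stay wherever the
compiler already has it.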

Everything else looks ok, so a provisional Acked-by: Andrew Cooper
<andrew.coop...@citrix.com> if you choose to follow this pattern.

~Andrew
