From: Richard Henderson <richard.hender...@linaro.org>

We were only constructing the 64-bit element, and not replicating the
64-bit element across the rest of the vector.
Cc: qemu-sta...@nongnu.org
Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
(cherry picked from commit e20cb81d9c5a3d0f9c08f3642728a210a1c162c9)
Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 tcg/i386/tcg-target.inc.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 9d8ed974e0..77b78c941c 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2855,9 +2855,13 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         goto gen_simd;
 #if TCG_TARGET_REG_BITS == 32
     case INDEX_op_dup2_vec:
-        /* Constraints have already placed both 32-bit inputs in xmm regs. */
-        insn = OPC_PUNPCKLDQ;
-        goto gen_simd;
+        /* First merge the two 32-bit inputs to a single 64-bit element. */
+        tcg_out_vex_modrm(s, OPC_PUNPCKLDQ, a0, a1, a2);
+        /* Then replicate the 64-bit elements across the rest of the vector. */
+        if (type != TCG_TYPE_V64) {
+            tcg_out_dup_vec(s, type, MO_64, a0, a0);
+        }
+        break;
 #endif
     case INDEX_op_abs_vec:
         insn = abs_insn[vece];
--
2.17.1
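
For illustration only, here is a standalone C sketch (not QEMU code) of what the
corrected expansion computes for a 128-bit result: PUNPCKLDQ merges the two
32-bit inputs into one 64-bit element, and a second step broadcasts that element
to the upper lanes, which the buggy code skipped. The lane values are arbitrary
example data, and PSHUFD is used here only as a stand-in for whatever
tcg_out_dup_vec emits on the host.

#include <stdio.h>
#include <stdint.h>
#include <emmintrin.h>      /* SSE2 intrinsics */

int main(void)
{
    /* The two 32-bit inputs, each in the low lane of an xmm register. */
    __m128i lo = _mm_cvtsi32_si128(0x11111111);   /* low  half of the i64 */
    __m128i hi = _mm_cvtsi32_si128(0x22222222);   /* high half of the i64 */

    /* Step 1: PUNPCKLDQ builds the 64-bit element in lanes 0-1. */
    __m128i merged = _mm_unpacklo_epi32(lo, hi);

    /* The old expansion stopped here, leaving lanes 2-3 unreplicated. */

    /* Step 2: replicate the 64-bit element across the whole vector
     * (done here with PSHUFD selecting lanes 0,1,0,1). */
    __m128i dup = _mm_shuffle_epi32(merged, _MM_SHUFFLE(1, 0, 1, 0));

    uint32_t out[4];
    _mm_storeu_si128((__m128i *)out, dup);
    printf("%08x %08x %08x %08x\n", out[0], out[1], out[2], out[3]);
    /* Expected: 11111111 22222222 11111111 22222222 */
    return 0;
}

Compiled with e.g. "gcc -msse2", the program prints the 64-bit element repeated
in both halves of the vector, which is the behaviour the patch restores.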