Hello Richard,

The INDEX_op_dup_vec case in optimize.c:tcg_optimize does a break
when the input argument is not constant, instead of jumping to do_default.

The issue can be reproduced by compiling the following SVE code:

.global main
.section .text

main:
        mov     z2.d, #0
        decp    z1.d, p0
        ret

When run with the SVE vector length set to 128 bits, we get the following
ops before the TCG optimization pass:

 ---- 0000000000400524 0000000000000000 0000000000000000
 dupi_vec v128,e64,tmp2,$0x0
 st_vec v128,e8,tmp2,env,$0xb00

 ---- 0000000000400528 0000000000000000 0000000000000000
 ld_i64 tmp3,env,$0x2900
 movi_i64 tmp4,$0x101
 and_i64 tmp3,tmp3,tmp4
 ctpop_i64 tmp3,tmp3
 dup_vec v128,e64,tmp2,tmp3
 ld_vec v128,e8,tmp5,env,$0xa00
 sub_vec v128,e64,tmp6,tmp5,tmp2
 st_vec v128,e8,tmp6,env,$0xa00

After the optimization pass:

 ---- 0000000000400524 0000000000000000 0000000000000000
 dupi_vec v128,e64,tmp2,$0x0
 st_vec v128,e8,tmp2,env,$0xb00                   dead: 0

 ---- 0000000000400528 0000000000000000 0000000000000000
 ld_vec v128,e8,tmp5,env,$0xa00
 mov_vec v128,e64,tmp6,tmp5                       dead: 1
 st_vec v128,e8,tmp6,env,$0xa00                   dead: 0

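The dupi_vec TCG op correctly marks tmp2 as the constant 0. That constant
is then wrongly propagated down to the sub_vec TCG op, even though the
intervening dup_vec TCG op overwrites tmp2 with the non-constant tmp3.
As a result, the subtraction is folded into a mov_vec and the ops that
compute tmp3 are dropped as dead code.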
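
To make the failure mode concrete, here is a minimal toy model of the
constant tracking involved. It is not QEMU code: TempInfo, OptState,
fold_dup, replay and the reset_on_non_const switch are all hypothetical
names made up for this sketch, and the real optimizer tracks more state
than a single constant per temp. It only assumes that do_default ends up
discarding what is known about the output temp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name below is hypothetical, not taken from QEMU. */
enum { NB_TEMPS = 8 };

typedef struct {
    bool is_const;   /* optimizer believes this temp holds a constant */
    uint64_t val;    /* the constant, meaningful only if is_const is set */
} TempInfo;

typedef struct {
    TempInfo temps[NB_TEMPS];
} OptState;

/* Forget everything known about temp t (assumed effect of do_default). */
static void reset_temp(OptState *s, int t)
{
    assert(t >= 0 && t < NB_TEMPS);
    s->temps[t].is_const = false;
    s->temps[t].val = 0;
}

/* dupi_vec dst,$imm: dst becomes a known constant. */
static void fold_dupi(OptState *s, int dst, uint64_t imm)
{
    assert(dst >= 0 && dst < NB_TEMPS);
    s->temps[dst].is_const = true;
    s->temps[dst].val = imm;
}

/*
 * dup_vec dst,src: fold when src is a known constant. When it is not,
 * reset_on_non_const selects between the reported behaviour (false:
 * plain break, stale info about dst survives) and the expected one
 * (true: dst is reset).
 */
static void fold_dup(OptState *s, int dst, int src, bool reset_on_non_const)
{
    assert(src >= 0 && src < NB_TEMPS);
    if (s->temps[src].is_const) {
        fold_dupi(s, dst, s->temps[src].val);
    } else if (reset_on_non_const) {
        reset_temp(s, dst);
    }
}

/* sub_vec dst,a,b may become mov_vec dst,a when b is known to be 0. */
static bool sub_folds_to_mov(const OptState *s, int b)
{
    assert(b >= 0 && b < NB_TEMPS);
    return s->temps[b].is_const && s->temps[b].val == 0;
}

/* Replay the op sequence from the dump above. */
static bool replay(bool reset_on_non_const)
{
    enum { TMP2 = 2, TMP3 = 3 };
    OptState s = { 0 };

    fold_dupi(&s, TMP2, 0);                       /* dupi_vec tmp2,$0x0 */
    reset_temp(&s, TMP3);                         /* tmp3 = ctpop(...), runtime value */
    fold_dup(&s, TMP2, TMP3, reset_on_non_const); /* dup_vec tmp2,tmp3 */
    return sub_folds_to_mov(&s, TMP2);            /* sub_vec tmp6,tmp5,tmp2 */
}
```

In this model replay(false) reports that sub_vec is folded into a mov_vec,
which matches the dump after the optimization pass, while replay(true)
leaves the subtraction alone.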

Thanks,

Laurent
