On Wed, 8 Nov 2023, Kito Cheng wrote:

> OK, then LGTM, thanks for the explanation :)

 Please don't top-post on a GCC mailing list (and preferably not in 
off-list replies to such mailing list messages either, unless the 
participants have agreed to it), as it makes it difficult to reply in 
context.

 Best practice is to reply inline, quoting the relevant original paragraph 
(or enough context) directly above each reply, and discarding all the 
other parts of the message being replied to.  We may even have it written 
down somewhere (though I haven't checked; in the old days it used to be 
assumed), and I do hope any sane modern MUA can handle it.

 Otherwise the discussion thread quickly grows into an illegible mess.

 So this change does indeed fix PR 112092; however, we now have an issue 
with several other test cases and the new `-mmovcc' option.  For example, 
vsetvl-13.c fails with the "-mmovcc -mbranch-cost=8" test options, and the 
assembly produced is like:

        vsetvli a6,a6,e8,mf4,ta,ma
        snez    a5,a5
        neg     a5,a5
        and     a6,a5,a6
        not     a5,a5
        andi    a5,a5,55
        or      a5,a6,a5
        beq     a4,zero,.L10
        li      a6,0
        vsetvli zero,a5,e32,m1,tu,ma
.L4:
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        addi    a6,a6,1
        bne     a4,a6,.L4
.L10:
        ret

As far as I can tell the code produced is legitimate, and for the record 
analogous assembly is produced with `-march=rv32gcv_zicond' too:

        vsetvli a6,a6,e8,mf4,ta,ma
        czero.eqz       a6,a6,a5
        li      a7,55
        czero.nez       a5,a7,a5
        or      a5,a5,a6
        beq     a4,zero,.L10
        li      a6,0
        vsetvli zero,a5,e32,m1,tu,ma
.L4:
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        addi    a6,a6,1
        bne     a4,a6,.L4
.L10:
        ret

-- it's just that you can't see it with regression testing, because the 
test case overrides `-march='.  Presumably we do want to execute VSETVLI 
twice here, on the basis that avoiding the second one by means of branches 
would be more costly than executing it.
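
 For what it's worth, the two conditional-move sequences above can be 
modelled in C roughly as below.  This is only a sketch: the function names 
are made up for illustration, it models just the scalar selection of the 
value fed to the second VSETVLI (not VSETVLI itself), and it assumes the 
CZERO.EQZ/CZERO.NEZ semantics as I read them from the Zicond 
specification.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: what the plain branched code computes.  */
static uint32_t select_branch(uint32_t cond, uint32_t vl, uint32_t imm)
{
    return cond != 0 ? vl : imm;
}

/* Mirrors the snez/neg/and/not/andi/or sequence.  */
static uint32_t select_mask(uint32_t cond, uint32_t vl, uint32_t imm)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0); /* snez; neg */
    return (mask & vl) | (~mask & imm);                  /* and; not; andi; or */
}

/* Assumed Zicond semantics: the result is zero if the condition
   register is (not) zero, otherwise the source operand.  */
static uint32_t czero_eqz(uint32_t rs1, uint32_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

static uint32_t czero_nez(uint32_t rs1, uint32_t rs2)
{
    return rs2 != 0 ? 0 : rs1;
}

/* Mirrors the czero.eqz/li/czero.nez/or sequence.  */
static uint32_t select_zicond(uint32_t cond, uint32_t vl, uint32_t imm)
{
    return czero_nez(imm, cond) | czero_eqz(vl, cond);
}

/* Check that all three forms agree over a handful of inputs.  */
static void check_selects_agree(void)
{
    const uint32_t conds[] = { 0, 1, 2, UINT32_MAX };
    const uint32_t vls[] = { 0, 1, 16, 55, 1000 };

    for (unsigned i = 0; i < sizeof conds / sizeof conds[0]; i++)
        for (unsigned j = 0; j < sizeof vls / sizeof vls[0]; j++) {
            uint32_t want = select_branch(conds[i], vls[j], 55);
            assert(select_mask(conds[i], vls[j], 55) == want);
            assert(select_zicond(conds[i], vls[j], 55) == want);
        }
}
```

 Calling check_selects_agree() from a test harness exercises all three 
forms; the point is merely that both branchless sequences compute the same 
select as the branched code, which is why the first VSETVLI has to be 
executed unconditionally in those variants.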

 Shall we just silence false failures like this with `-mno-movcc' then, or 
shall we handle the conditional-move case somehow?

 For reference, the plain branched assembly is like:

        li      a7,55
        beq     a5,zero,.L13
        vsetvli zero,a6,e32,m1,tu,ma
.L2:
        beq     a4,zero,.L11
        li      a5,0
.L4:
        vle32.v v1,0(a0)
        vle32.v v1,0(a1)
        vle32.v v1,0(a2)
        vse32.v v1,0(a3)
        addi    a5,a5,1
        bne     a4,a5,.L4
.L11:
        ret
.L13:
        vsetvli zero,a7,e32,m1,tu,ma
        j       .L2

  Maciej
