On 7/31/24 18:06, Andre Vieira (lists) wrote:
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter
     was making an assumption about the form of what a candidate for a dec_insn
     should be, which caused an ICE.
    This dec_insn is the instruction that decreases the loop counter inside a
     decrementing loop and we expect it to have the following form:
     (set (reg CONDCOUNT)
          (plus (reg CONDCOUNT)
                (const_int)))

    Where CONDCOUNT is the loop counter, and const_int is the negative constant
     used to decrement it.
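
The expected form above can be illustrated with a small standalone model. This is only a sketch: Expr, Code, make_reg, make_const, make_binary and match_dec_insn are hypothetical stand-ins invented for illustration, not GCC's rtx API and not the code in the patch. It assumes a match should report the positive decrement amount and a mismatch should report 0.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's RTL; this is not the real rtx API.
enum class Code { Reg, ConstInt, Plus, Set };

struct Expr {
  Code code;
  int regno;        // meaningful only when code == Code::Reg
  long value;       // meaningful only when code == Code::ConstInt
  const Expr *op0;  // first operand of Plus, or destination of Set
  const Expr *op1;  // second operand of Plus, or source of Set
};

Expr make_reg(int regno) { return Expr{Code::Reg, regno, 0, nullptr, nullptr}; }
Expr make_const(long value) { return Expr{Code::ConstInt, -1, value, nullptr, nullptr}; }
Expr make_binary(Code code, const Expr *op0, const Expr *op1) {
  return Expr{code, -1, 0, op0, op1};
}

// True if E is a register expression naming register REGNO.
static bool is_reg(const Expr *e, int regno) {
  return e && e->code == Code::Reg && e->regno == regno;
}

// Returns the (positive) decrement amount if INSN has the form
//   (set (reg COUNT) (plus (reg COUNT) (const_int -N)))
// and 0 if it has any other form.
long match_dec_insn(const Expr *insn, int count_regno) {
  if (!insn || insn->code != Code::Set) return 0;
  if (!is_reg(insn->op0, count_regno)) return 0;   // destination must be COUNT
  const Expr *src = insn->op1;
  if (!src || src->code != Code::Plus) return 0;   // source must be a plus
  if (!is_reg(src->op0, count_regno)) return 0;    // first operand must be COUNT
  const Expr *step = src->op1;
  if (!step || step->code != Code::ConstInt || step->value >= 0) return 0;
  return -step->value;                             // constant must be negative
}

// Usage sketch: "r3 = r3 + (-4)" is accepted; the bare plus is rejected.
void example() {
  Expr count = make_reg(3);
  Expr step = make_const(-4);
  Expr sum = make_binary(Code::Plus, &count, &step);
  Expr dec = make_binary(Code::Set, &count, &sum);
  assert(match_dec_insn(&dec, 3) == 4);
  assert(match_dec_insn(&sum, 3) == 0);
}
```

Checking every level of the expression before reading its operands is the point of the sketch: a candidate with an unexpected shape is rejected rather than assumed to match.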

    This patch also improves our search for a valid dec_insn.  Before this patch
     we'd only look for a dec_insn inside the loop header if the loop latch was
     empty.  We now also search the loop header if the loop latch is not empty but
     the last instruction is not a valid dec_insn.  This could potentially be improved
     to search all instructions inside the loop latch.
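
The search order described above can be sketched as a standalone model. Insn, Block, search_header and find_dec_insn are hypothetical names invented for illustration and do not come from arm.cc. In particular, scanning the header backwards for the first valid candidate is an assumption about what "search the loop header" means; the real code may choose its header candidate differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: an instruction is an id plus a flag saying whether it
// passed the dec_insn validity check.
struct Insn {
  int id;
  bool is_valid_dec;
};
using Block = std::vector<Insn>;

// Assumption for illustration: walk the header from its end and return the
// first valid candidate, or nullptr if there is none.
const Insn *search_header(const Block &header) {
  for (auto it = header.rbegin(); it != header.rend(); ++it)
    if (it->is_valid_dec) return &*it;
  return nullptr;
}

// New behaviour: use the last latch instruction when it is a valid dec_insn;
// otherwise (latch empty, or its last instruction is not valid) fall back to
// the header.
const Insn *find_dec_insn(const Block &latch, const Block &header) {
  if (!latch.empty() && latch.back().is_valid_dec) return &latch.back();
  return search_header(header);
}

// Usage sketch: a non-empty latch whose last instruction is not a decrement
// no longer ends the search; the header is consulted instead.
void example() {
  Block latch = {{1, false}};
  Block header = {{2, true}, {3, false}};
  const Insn *found = find_dec_insn(latch, header);
  assert(found != nullptr && found->id == 2);
}
```

Only the last instruction of the latch is inspected in this model, which matches the remaining limitation noted above.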

     gcc/ChangeLog:

            * config/arm/arm.cc (check_dec_insn): New helper function containing
             code hoisted from...
            (arm_mve_dlstp_check_dec_counter): ... here. Use check_dec_insn to
             check the validity of the candidate dec_insn.

     gcc/testsuite/ChangeLog:

             * gcc.target/arm/mve/dlstp-loop-form.c: New test.

On 31/07/2024 15:15, Christophe Lyon wrote:
Because I tested with a toolchain configured for cortex-m85, which has mve.fp enabled by default, I didn't realize the testcase required arm_v8_1m_mve_fp_ok instead of arm_v8_1m_mve_ok.

Addressed that now.

Thanks, I thought you meant you ran the testsuite with -mcpu=cortex-m85 in RUNTESTFLAGS.

To be fair, that's not a terrible assumption. But what I did was configure the toolchain (and single multilib) for armv8.1-m.main+mve.fp+fp.dp and fpu=auto (and float-abi=hard), and run the tests in that build.


Regarding the patch, did you consider making the new check_dec_insn helper return an rtx (NULL or dec_set) instead of bool? I think it would save a call to single_set when computing decrementnum, but that's nitpicking.

Yeah I had also contemplated that, I'm OK either way, doesn't look too bad with the rtx return. See attached.


Thanks, LGTM.

Christophe


