On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Thanks for the reviews, I'm making these changes but just a heads up.
> 
> When hardcoding LR_REGNUM like this we need to change the way we compare the 
> register in doloop_condition_get. This function currently compares the rtx 
> nodes by address, which I think happens to be fine before we assign hard 
> registers, as I suspect we always share the rtx node for the same pseudo, but 
> when assigning registers it seems like we create copies, so things like:
> `XEXP (inc_src, 0) == reg` will fail for
> inc_src: (plus (reg LR) (const_int -n)'
> reg: (reg LR)
> 
> Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, 
> op2, NULL)'.

Yes, that's fine.

R.

> 
> Sound good?
> 
> Kind regards,
> Andre
> 
> ________________________________________
> From: Richard Earnshaw (lists) <richard.earns...@arm.com>
> Sent: Tuesday, January 30, 2024 11:36 AM
> To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Stam Markianos-Wright
> Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low 
> Overhead Loops
> 
> On 19/01/2024 14:40, Andre Vieira wrote:
>>
>> Respin after comments from Kyrill and rebase. I also removed an if-then-else
>> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
>> functions
>> Kyrill pointed out.
>>
>> After an earlier comment from Richard Sandiford I also added comments to the
>> two tail predication patterns added to explain the need for the unspecs.
> 
> [missing ChangeLog]
> 
> I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
> other bits in a follow-up.
> 
>       2)  (set (reg) (plus (reg) (const_int -1))
> -         (set (pc) (if_then_else (reg != 0)
> -                                (label_ref (label))
> -                                (pc))).
> +        (set (pc) (if_then_else (reg != 0)
> +                                (label_ref (label))
> +                                (pc))).
> 
>       Some targets (ARM) do the comparison before the branch, as in the
>       following form:
> 
> -     3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
> -                   (set (reg) (plus (reg) (const_int -1)))])
> -        (set (pc) (if_then_else (cc == NE)
> ...
> 
> 
> This comment is becoming confusing.  Really the text leading up to 3)... 
> should be inside 3.  Something like:
> 
>       3) Some targets (ARM) do the comparison before the branch, as in the
>       following form:
> 
>       (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
>                  (set (reg) (plus (reg) (const_int -1)))])
>       (set (pc) (if_then_else (cc == NE)
>                               (label_ref (label))
>                               (pc)))])
> 
> 
> The same issue on the comment structure also applies to the new point 4...
> 
> +      The ARM target also supports a special case of a counter that 
> decrements
> +      by `n` and terminating in a GTU condition.  In that case, the compare 
> and
> +      branch are all part of one insn, containing an UNSPEC:
> +
> +      4) (parallel [
> +           (set (pc)
> +               (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
> +                                                       (const_int -n))])
> +                                  (const_int n-1]))
> +                   (label_ref)
> +                   (pc)))
> +           (set (reg:SI 14 lr)
> +                (plus:SI (reg:SI 14 lr)
> +                         (const_int -n)))
> +     */
> 
> I think this needs a bit more clarification.  Specifically that this 
> construct supports a predicated vectorized do loop.  Also, the placement of 
> the unspec inside the comparison is ugnly and unnecessary.  It should be 
> sufficient to have the unspec inside a USE expression, which the mid-end can 
> then ignore entirely.  So
> 
>         (parallel
>          [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
>                                        (const_int n-1))
>                                   (label_ref) (pc)))
>           (set (reg) (plus (reg) (const_int -n)))
>           (additional clobbers and uses)])
> 
> For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
> this pattern to stop anything else from matching it.
> 
> Note that we don't need to mention that the register is 'LR' or the modes, 
> those are specific to a particular backend, not the generic pattern we want 
> to match.
> 
> +      || !CONST_INT_P (XEXP (inc_src, 1))
> +      || INTVAL (XEXP (inc_src, 1)) >= 0)
>      return 0;
> +  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));
> 
> We can just use '-INTVAL(...)' here, we've verified just above that the 
> constant is negative.
> 
> -  if ((XEXP (condition, 0) == reg)
> +  /* For the ARM special case of having a GTU: re-form the condition without
> +     the unspec for the benefit of the middle-end.  */
> +  if (GET_CODE (condition) == GTU)
> +    {
> +      condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src,
> +                                 GEN_INT (dec_num - 1));
> +      return condition;
> +    }
> 
> If you make the change I mentioned above, this re-forming isn't needed any 
> more, so the arm-specific comment goes away
> 
> -   {
> +    {
>       if (GET_CODE (pattern) != PARALLEL)
>       /*  For the second form we expect:
> 
> You've fixed the indentation of the brace (good), but the body of the braced 
> expression needs re-indenting as well.
> 
> R.
> 

Reply via email to