> On 23 Oct 2025, at 8:37 pm, Richard Biener <[email protected]> wrote:
>
> External email: Use caution opening links or attachments
>
>
> On Wed, Oct 22, 2025 at 12:12 PM Kugan Vivekanandarajah
> <[email protected]> wrote:
>>
>>
>>
>>> On 22 Oct 2025, at 7:51 pm, Richard Biener <[email protected]> 
>>> wrote:
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Tue, Oct 21, 2025 at 11:55 PM Kugan Vivekanandarajah
>>> <[email protected]> wrote:
>>>>
>>>> Hi Richard,
>>>> Thanks for the review.
>>>>
>>>>> On 16 Oct 2025, at 11:06 pm, Richard Biener <[email protected]> 
>>>>> wrote:
>>>>>
>>>>> External email: Use caution opening links or attachments
>>>>>
>>>>>
>>>>> On Wed, Oct 15, 2025 at 11:08 PM Kugan Vivekanandarajah
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This patch enables Loop Invariant Code Motion (LICM) to hoist loads that
>>>>>> alias with stores rom the loaded value.
>>>>>>
>>>>>> The pattern a[i] = a[0] is common in TSVC benchmarks (s293):
>>>>>>
>>>>>> for (int i = 0; i < N; i++)
>>>>>>  a[i] = a[0];
>>>>>>
>>>>>> Previously, GCC conservatively rejected hoisting a[0] due to potential
>>>>>> aliasing when i==0. However, this is a "self write" - even when aliasing
>>>>>> occurs, we're writing back the same value, making hoisting safe.
>>>>>>
>>>>>> The optimization checks that:
>>>>>> 1. One reference is a load, the other is a store
>>>>>> 2. The stored SSA value equals the loaded SSA value
>>>>>> 3. Only simple cases with single accesses per reference
>>>>>>
>>>>>> This enables vectorization of these patterns by allowing the vectorizer
>>>>>> to see the hoisted loop-invariant value.
>>>>>>
>>>>>> With the patch, the loop now vectorizes and generates:
>>>>>>
>>>>>>      .L2:
>>>>>> -       ldr     s31, [x1]
>>>>>> -       str     s31, [x0], 4
>>>>>> -       cmp     x0, x2
>>>>>> +       str     q31, [x0], 16
>>>>>> +       cmp     x0, x1
>>>>>>      bne     .L2
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>
>>>>> So refs_independent_p has no particular oder on the arguments it
>>>>> receives and both could be stores.  It seems to me the special-casing
>>>>> would be better done in ref_indep_loop_p for the kind == lim_raw
>>>>> only.  Also
>>>>>
>>>>> +      if (((ref1->loaded && !ref1->stored && ref2->stored)
>>>>> +          || (ref2->loaded && !ref2->stored && ref1->stored))
>>>>> +         && is_self_write (ref1, ref2))
>>>>> +       {
>>>>>
>>>>> you seem to imply both orders are handled, but is_self_write only
>>>>> handles ref1 being the load.  Handing this upthread makes the
>>>>> order obvious.
>>>>>
>>>>> So can you see to move this up?  Otherwise I think this should be OK.
>>>>> I'll note this does not only handle the "obvious" case but also
>>>>
>>>> In the attached patched, I have moved based on the suggestion.
>>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New 
>>>> regressions.
>>>> Is this OK?
>>>
>>> It seems you attached the old patch?
>>
>> Apologies. I sent the wrong patch. Here is the correct one.
>
> This seems to be an incremental patch of some sorts.  Applying some "guessing"
> the overall thing looks OK, I'd probably try micro-optimizing this
> timing critical
> part a bit by pre-computing load_stmt outside of the refs_to_check iteration
> and only call is_self_write if that was successful (lim_raw && single
> access in loop).
>
> Please send an updated and complete patch.  You might want to double-check it
> before posting ;)
>
Apologies for the mixup again. Here is the patch with added check for 
ref1->loaded && ref2->stored

Bootstrapped and regression tests on aarch64-linuc-gnu with no new regressions.

Thanks,
Kugan


> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>>
>>>>>
>>>>> void foo(int *p, int *q, int j)
>>>>> {
>>>>> for  (int i =0; i < N; ++i)
>>>>>  p[i] = q[j];
>>>>> }
>>>>>
>>>>> Richard.
>>>>>
>>>>>>      * tree-ssa-loop-im.cc (is_self_write): New.
>>>>>>      (refs_independent_p): Allow hoisting when aliasing references
>>>>>>      form a self write pattern.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>
>>>>>>      * gcc.dg/vect/vect-licm-hoist-1.c: New.
>>>>>>      * gcc.dg/vect/vect-licm-hoist-2.c: Likewise.
>>>>>>
>>>>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New
>>>>>> regressions.
>>>>>>
>>>>>> Is this Ok?
>>>>>>
>>>>>> Signed-off-by: Kugan Vivekanandarajah <[email protected]>
>>>>>>
>>>>>> Thanks,
>>>>>> Kugan
>>
>>

Attachment: 0001-tree-optimization-Allow-LICM-to-hoist-loads-in-self-.patch
Description: 0001-tree-optimization-Allow-LICM-to-hoist-loads-in-self-.patch

Reply via email to