Soumya AR <soum...@nvidia.com> writes: > One additional change with this patch is that I had to update ldapr-sext.c > too. > > During the combine pass, cases of UNSPECV_LDAP (with an offset) + sign_extend > transform to LDAPUR + SEXT, and later to LDAPURS (with address folding). > > The aarch64 tests run with -moverride=tune=none, which clears all tuning flags > including the AVOID_LDAPUR flag, enabling LDAPUR. (Since the generic arch > tuning > adjustments are applied before the tune override.) > > This breaks ldapr-sext.c, in all instances where an offset is used. (Even > though > there is no explicit offset in the testcase, since it uses global variables, > there is an implicit offset due to the alignment padding in bss.) > > As a result, the following code: > > add x0, x0, 2 > ldapursh x0, [x0] > > becomes: > > ldapursh x0, [x0, 2] > > I can just add -moverride=tune=avoid_ldapur to ldapr-sext.c to maintain the > original behavior but since we're extending LDAPURS to fold offsets anyway, > I think it makes sense to check that alongside as well. > Let me know if that works.
Yeah, sounds good to me. > [...] > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index abbb97768f5..4ee539e1dcd 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -18760,6 +18760,8 @@ aarch64_adjust_generic_arch_tuning (struct > tune_params ¤t_tune) > if (TARGET_SVE2) > current_tune.extra_tuning_flags > &= ~AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS; > + if (!AARCH64_HAVE_ISA(V8_8A)) > + aarch64_tune_params.extra_tuning_flags |= > AARCH64_EXTRA_TUNE_AVOID_LDAPUR; Sorry for the formatting nit, but the last line above should be indented by 2 columns fewer. > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md > index 36b0dbd1f57..57a8e410c20 100644 > --- a/gcc/config/aarch64/atomics.md > +++ b/gcc/config/aarch64/atomics.md > @@ -679,13 +679,16 @@ > ) > > (define_insn "aarch64_atomic_load<mode>_rcpc" > - [(set (match_operand:ALLI 0 "register_operand" "=r") > + [(set (match_operand:ALLI 0 "register_operand") > (unspec_volatile:ALLI > - [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q") > + [(match_operand:ALLI 1 "aarch64_rcpc_memory_operand") > (match_operand:SI 2 "const_int_operand")] ;; model > UNSPECV_LDAP))] > "TARGET_RCPC" > - "ldapr<atomic_sfx>\t%<w>0, %1" > + {@ [ cons: =0 , 1 ; attrs: enable_ldapur ] > + [ r , Q ; any ] ldapr<atomic_sfx>\t%<w>0, %1 > + [ r , Ust ; yes ] ldapur<atomic_sfx>\t%<w>0, %1 > + } > ) > > (define_insn "aarch64_atomic_load<mode>" > @@ -705,21 +708,24 @@ > ) > > (define_insn "*aarch64_atomic_load<ALLX:mode>_rcpc_zext" > - [(set (match_operand:SD_HSDI 0 "register_operand" "=r") > + [(set (match_operand:SD_HSDI 0 "register_operand") > (zero_extend:SD_HSDI > (unspec_volatile:ALLX > - [(match_operand:ALLX 1 "aarch64_sync_memory_operand" "Q") > + [(match_operand:ALLX 1 "aarch64_rcpc_memory_operand") > (match_operand:SI 2 "const_int_operand")] ;; model > UNSPECV_LDAP)))] > "TARGET_RCPC && (<SD_HSDI:sizen> > <ALLX:sizen>)" > - "ldapr<ALLX:atomic_sfx>\t%w0, %1" > + {@ [ cons: =0 , 1 ; attrs: enable_ldapur ] > + [ r , Q ; any ] ldapr<ALLX:atomic_sfx>\t%w0, %1 > + [ r , Ust ; yes ] ldapur<ALLX:atomic_sfx>\t%w0, > %1 > + } > ) In both of the patterns above, it would be good to keep the "[", ",", ";" and "]" lined up. OK with those changes, thanks. Richard