Alex Coplan <alex.cop...@arm.com> writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: 18 August 2020 09:35
>> To: Alex Coplan <alex.cop...@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <richard.earns...@arm.com>;
>> Marcus Shawcroft <marcus.shawcr...@arm.com>; Kyrylo Tkachov
>> <kyrylo.tkac...@arm.com>
>> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
>> syntax
>>
>> Alex Coplan <alex.cop...@arm.com> writes:
>> > Note that an obvious omission here is that this patch does not touch
>> > the mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>.
>> > I found that I couldn't hit these patterns with C code, since
>> > multiplications by powers of two always get turned into shifts by
>> > earlier RTL passes. If there's a way to reliably hit these patterns,
>> > then perhaps these should be updated as well.
>>
>> Hmm. Feels like we should either update them or delete them. E.g.:
>>
>>   *adds_<optab><mode>_multp2
>>   *subs_<optab><mode>_multp2
>>
>> were added alongside the adds3.c and subs3.c tests that you're updating,
>> so if the tests don't/no longer need the multp2 patterns to pass,
>> there's a good chance that the patterns are redundant.
>>
>> For reasons I never understood, the canonical representation is to use
>> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
>> So perhaps the patterns were originally for address calculations that had
>> been moved outside of a (mem …) and not updated to use shifts instead of
>> mults.
>
> Thanks for the review, and for clarifying this. I tried removing these
> together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and
> the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when
> compiled with -Os -ftree-vectorize, which appears to depend on the
> *add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this
> input.
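As an illustrative sketch of the canonicalization described above (register numbers, modes, and the scale factor here are made up for the example), the same base + index*4 address calculation would be represented roughly as:

```
;; Inside a (mem ...), the scaled index uses (mult ...):
(mem:SI (plus:DI (mult:DI (reg:DI 1) (const_int 4))
                 (reg:DI 0)))

;; The same calculation outside a (mem ...) canonically uses a shift:
(set (reg:DI 2)
     (plus:DI (ashift:DI (reg:DI 1) (const_int 2))
              (reg:DI 0)))
```

So an address expression hoisted out of a (mem …) without being rewritten ends up in the first (mult) form in a context where combine and friends expect the second (ashift) form.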
>
> In this case, GCC appears to have done exactly what you described: we
> have the (plus (mult ...) ...) nodes inside (mem)s prior to register
> allocation, and then we end up pulling these out without converting them
> to shifts.
>
> Seeing this behaviour (and in particular seeing the ICE) makes me
> hesitant to just go ahead and remove the other patterns. That said, I
> have a patch to remove the following patterns:
>
>   *adds_<optab><mode>_multp2
>   *subs_<optab><mode>_multp2
>   *add_<optab><ALLX:mode>_mult_<GPI:mode>
>   *add_<optab><SHORT:mode>_mult_si_uxtw
>   *add_<optab><mode>_multp2
>   *add_<optab>si_multp2_uxtw
>   *add_uxt<mode>_multp2
>   *add_uxtsi_multp2_uxtw
>   *sub_<optab><mode>_multp2
>   *sub_<optab>si_multp2_uxtw
>   *sub_uxt<mode>_multp2
>   *sub_uxtsi_multp2_uxtw
>
> (together with the predicate aarch64_pwr_imm3, which is only used in
> these patterns) and this bootstraps/regtests just fine.
>
> So, I have a couple of questions:
>
> (1) Should it be considered a bug if we pull (plus (mult (power of 2)
> ...) ...) out of a (mem) RTX without rewriting the (mult) as a shift?
IMO, yes. But if we have an example in which it happens, we have to fix
it before removing the patterns. That could end up being a bit of a
rabbit hole, and could affect other targets too.

If we keep the patterns, we should fix the [su]xtw problem in:

  *adds_<optab><mode>_multp2
  *subs_<optab><mode>_multp2
  *add_<optab><ALLX:mode>_mult_<GPI:mode>
  *add_uxt<mode>_multp2
  *sub_uxt<mode>_multp2

too. (Plus any others I missed, if that isn't the full list.)

Thanks,
Richard