On Wed, May 15, 2024 at 12:29 PM Tamar Christina <tamar.christ...@arm.com> wrote:
>
> Hi All,
>
> Some Neoverse Software Optimization Guides (SWoG) have a clause that states
> that for predicated operations that also produce a predicate it is preferred
> that the codegen should use a different register for the destination than that
> of the input predicate in order to avoid a performance overhead.
>
> This of course has the problem that it increases register pressure and so
> should be done with care.  Additionally, not all micro-architectures have this
> consideration and so it shouldn't be done by default.
>
> The patch series adds support for doing conditional early clobbers through a
> combination of new alternatives and attributes to control their availability.

You could have two alternatives, one with an early clobber and one with a
matching constraint, where you'd disparage the matching-constraint one?

> On high register pressure we also use LRA's costing to prefer not to use the
> alternative and instead just use the tie, as this is preferable to a reload.
>
> Concretely this patch series does:
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n2
>
> foo:
>         mov     z31.h, w0
>         ptrue   p3.b, all
>         cmplo   p0.h, p3/z, z0.h, z31.h
>         b       use
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n1+sve
>
> foo:
>         mov     z31.h, w0
>         ptrue   p0.b, all
>         cmplo   p0.h, p0/z, z0.h, z31.h
>         b       use
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n2 -ffixed-p[1-15]
>
> foo:
>         mov     z31.h, w0
>         ptrue   p0.b, all
>         cmplo   p0.h, p0/z, z0.h, z31.h
>         b       use
>
> Testcases for the changes are in the last patch of the series.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Thanks,
> Tamar