On Wed, May 15, 2024 at 12:29 PM Tamar Christina
<tamar.christ...@arm.com> wrote:
>
> Hi All,
>
> Some Neoverse Software Optimization Guides (SWoG) have a clause stating
> that, for predicated operations that also produce a predicate, codegen should
> prefer a destination register different from the input predicate in order to
> avoid a performance overhead.
>
> This of course has the problem that it increases register pressure and so
> should be done with care.  Additionally, not all micro-architectures have this
> consideration, so it shouldn't be done by default.
>
> The patch series adds support for doing conditional early clobbers through a
> combination of new alternatives and attributes to control their availability.

Couldn't you have two alternatives, one with an early clobber and one with
a matching constraint, and disparage the matching-constraint one?
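
As a rough sketch of what I mean (a hypothetical, simplified pattern, not
taken from the series; the insn name and UNSPEC_SKETCH_CMPLO are made up
for illustration, and the real SVE compare patterns are more involved):

    ;; Hypothetical sketch only.
    ;; Alternative 1: "&" marks the destination as early clobber, so it
    ;;   cannot share a register with the governing predicate.
    ;; Alternative 2: "0" ties the governing predicate to the destination,
    ;;   and "?" slightly disparages this alternative so that it is only
    ;;   chosen when the first one would be more expensive.
    (define_insn "*sketch_pred_cmplo"
      [(set (match_operand:VNx8BI 0 "register_operand" "=&Upa, ?Upl")
            (unspec:VNx8BI
              [(match_operand:VNx8BI 1 "register_operand" "Upl, 0")
               (match_operand:VNx8HI 2 "register_operand" "w, w")
               (match_operand:VNx8HI 3 "register_operand" "w, w")]
              UNSPEC_SKETCH_CMPLO))]
      "TARGET_SVE"
      "cmplo\t%0.h, %1/z, %2.h, %3.h"
    )

This on its own would apply to every CPU, so it does not cover the
per-micro-architecture part of your description.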

> Under high register pressure we also use LRA's costing to prefer not to use
> the early-clobber alternative and instead just use the tie, as this is
> preferable to a reload.
>
> Concretely this patch series does:
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n2
>
> foo:
>         mov     z31.h, w0
>         ptrue   p3.b, all
>         cmplo   p0.h, p3/z, z0.h, z31.h
>         b       use
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n1+sve
>
> foo:
>         mov     z31.h, w0
>         ptrue   p0.b, all
>         cmplo   p0.h, p0/z, z0.h, z31.h
>         b       use
>
> > aarch64-none-elf-gcc -O3 -g0 -S -o - pred-clobber.c -mcpu=neoverse-n2 -ffixed-p[1-15]
>
> foo:
>         mov     z31.h, w0
>         ptrue   p0.b, all
>         cmplo   p0.h, p0/z, z0.h, z31.h
>         b       use
>
> Testcases for the changes are in the last patch of the series.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Thanks,
> Tamar
>
> ---
>
> --
