Hi!

On Tue, May 11, 2021 at 05:37:34AM +0000, Tamar Christina via Gcc wrote:
> 2. Saturating abs:
>    char sat (char a)
>    {
>       int tmp = abs (a);
>       return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
>    }

That can be done quite a bit better, branchless at least.  Same for all
examples here probably.
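
For illustration only, here is one possible branchless formulation of the
saturating abs above. It is a sketch, not necessarily the best sequence for
any given target. The name sat_abs8 and the use of int8_t instead of plain
char are assumptions made for the example (plain char may be unsigned on
some targets).

```c
#include <stdint.h>

/* Saturating absolute value of a signed 8-bit value, written without
   conditional branches at the source level.  sat_abs8 is a hypothetical
   name used only for this sketch.  */
int8_t
sat_abs8 (int8_t a)
{
  int x = a;                  /* widen so that -(-128) is representable */
  int m = -(x < 0);           /* all ones if x is negative, else zero */
  int mag = (x ^ m) - m;      /* |x|, in the range 0..128 */
  mag -= mag >> 7;            /* only 128 has bit 7 set, so 128 -> 127 */
  return (int8_t) mag;
}
```

The only input that needs saturating is -128, whose magnitude does not fit
in 8 bits; the lower bound check in the quoted code can never trigger.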

> 2. Saturation:
>    a) Use match.pd to rewrite the various saturation expressions into min/max
>       operations which opens up the expressions to further optimizations.

You'll have to do the operation in a bigger mode for that.  (This is
also true for rounding, in many cases).

This makes new internal functions more attractive / more feasible.
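
To make the bigger-mode point concrete, here is a rough sketch of a
saturating 8-bit add expressed as min/max.  The helper names (min_int,
max_int, sat_add8) are made up for this example.  The add has to happen in
int; done in 8 bits, the sum would already have wrapped before the clamp
could see it.

```c
#include <stdint.h>

/* Hypothetical helpers, named only for this sketch.  */
static inline int min_int (int a, int b) { return a < b ? a : b; }
static inline int max_int (int a, int b) { return a > b ? a : b; }

/* Saturating signed 8-bit add written as a min/max clamp.  */
int8_t
sat_add8 (int8_t a, int8_t b)
{
  int wide = (int) a + (int) b;   /* wider mode: range -256..254, no wrap */
  return (int8_t) max_int (INT8_MIN, min_int (INT8_MAX, wide));
}
```

For example, sat_add8 (100, 100) gives 127 and sat_add8 (-100, -100)
gives -128.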

>       We could get the right instructions by using combine if we don't rewrite
>       the instructions to an internal function, however then during Vectorization
>       we would overestimate the cost of performing the saturation.  The constants
>       will then also be loaded into registers and so become a lot more difficult
>       to clean up solely in the backend.

Combine is almost never the right answer if you want to combine more
than two or three RTL insns.  It can be done, but combine will always
write the combined instruction in simplest terms, which tends to mean
that if you combine more insns there can be very many possible outcomes,
all of which you need to recognise as insns in your machine description.

> The one thing I am wondering about is whether we would need an internal function
> for all operations supported, or if it should be modelled as an internal FN which
> just "marks" the operation as rounding/saturating. After all, the only difference
> between a normal and saturating expression in RTL is the xx_truncate RTL surrounding
> the expression.  Doing so would also mean that all targets that have saturating
> instructions would automatically benefit from this.

I think you will have to experiment with both approaches to get a good
feeling for the tradeoff.


Segher
