Am 05.06.26 um 13:55 schrieb Richard Biener:
On Fri, Jun 5, 2026 at 12:53 PM Georg-Johann Lay via Gcc
<[email protected]> wrote:

Am 05.06.26 um 11:41 schrieb Andrew Pinski:
On Fri, Jun 5, 2026 at 2:25 AM Georg-Johann Lay via Gcc <[email protected]> wrote:

In match.pd there are two kinds of patterns: Canonicalizations
and optimizations.

While canonicalizations simplify the compile task by reducing
combinatorial complexity, optimization patterns try to improve
code performance.

The trouble with the optimization patterns is that they operate blindly,
not considering costs or target capabilities in any way.

Particularly bad example are patterns of the form

/* (zero_one == 0) ? y : z <op> y -> ((typeof(y))zero_one * z) <op> y */
/* (zero_one != 0) ? z <op> y : y -> ((typeof(y))zero_one * z) <op> y */

The simple answer here is expand should see if `((typeof(y))zero_one *
z)` is cheaper than `(zero_one) ? z : 0`. I have some ideas on
implementing that but I have not got around to it yet.

That's already a bad proxy because `(zero_one == 0) ? y : z <op> y`
my be cheaper (an actually is on avr) than `(zero_one) ? z : 0`.

Moreover, the multiplication in the mentioned form may no survive
until expand, e.g. may have been transformed to something different
when the code is more complex than a minimal example.

In this case gating the transform on TYPE_PRECISION (type) <=
GET_MODE_PRECISION (word_mode)
would work?  To the extent still leaving AVR with a 8 bit multiply instead of
a test-and-branch.

It would improve the situation as a side effect.  IMHO such a condition
is not the right thing to do and is unrelated to the root cause. Plus,
on targets that are not plagued by this, it might introduce a missed
optimization.

In general I agree with Andrew that match.pd, for this kind of "optimizations",
needs to apply costing.

As far as I understand, that approach works at tree -> rtl lowering when
rtx costs are available.  IIUC there's nothing like costs at match.pd time.

I'll note that this pattern attempts at if-converting
code which can expose the code to additional optimization which you might
lose when not applying if-conversion, so it's not always black-or-white and
costing just a single transform can miss a bigger picture.

Additional transformations / optimizations makes the problem even worse,
because patterns that try to roll back bad optimizations — be it at
expand time or combine time — won't match any more.

that map single-bit tests to multiplications, even on machines where
multiplication is very expensive, and even when a target doesn't support
multiplications.

All a back end can do is try to write patterns for the insn combiner and
such, that try to undo code that makes GCC look stupid and ridiculous.

In cases where MUL is expanded to a libcall, rolling back is not even
possible.

So the question is:  Is possible and wanted to give targets
a say in whether match.pd patterns should be rejected?

The next question is how this could be implemented?

Here is a proposal:

Support a file like <target>.pd in the back end, that, if present,
takes precedence over the middle end's match.pd.  That way, a backend
could reject / FAIL patterns like the ones above.

That's an interesting idea.  To be practical w/o changs the target.pd would
need to be included first and would need to "succeed" in transforming.  So
I'm not sure this is very maintainable - also considering that match.pd might
change slightly and the target.pd no longer matching.

I actually do consider a target.pd useful, but limited to pre-RTL-expansiion
and to add additional patterns, not disable others.


The advantages are:

* There's already support to generate .cc from .pd.

* No cluttering up of match.pd of any kind.

* No need for complicated C code that evaluates trees or RTXes like
     in rtx costs.

As a final note: Using the existence of a standard insn as a proxy would
be a bad choice.  Existence of say, mulsi3, doesn't mean at all that
such a pattern is cheap in any way.


Thanks,
Johann

Reply via email to