On 1/2/23 08:59, Jakub Jelinek wrote:
On Mon, Jan 02, 2023 at 08:45:15AM -0700, Jeff Law via Gcc-patches wrote:
On 1/1/23 08:55, Roger Sayle wrote:
In 2011, the rtl.texi documentation was updated to reflect that the
modes of the RTX unary operations FFS, POPCOUNT and PARITY must
match those of their operands.  Unfortunately, some of the transformations
in simplify-rtx.cc predate this tightening of RTL semantics, and have
not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
optimizations were "correct" when I originally added them back in 2007.

Segher requested that I split this piece out from a fix for PR 106594 in
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-01-01  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
        Avoid generating FFS with mismatched operand and result modes, by
        using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
        <case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
        <case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
?!?  The docs still seem to indicate to me that the modes of the input and
output operands can differ.  Let's take PARITY as an example:

See the PR50161 thread in
https://gcc.gnu.org/legacy-ml/gcc-patches/2011-08/threads.html#01847
The options are to disallow different modes, which is what my patch did
(perhaps not all documentation has been tweaked), or ensure that the operand
of those is never constant.  The latter is much harder and needs to be done
in many places.  While for SUBREG/ZERO_EXTEND/SIGN_EXTEND and to some extend
also FLOAT/UNSIGNED_FLOAT we already try hard not to fold those immediately
(and still find every now and then spots where we don't do that), for the
rarely used unary rtls we certainly don't.
Sigh. Lack of modes on constants mucking things up elsewhere. There's no good reason other than our poor representation to force the input and output modes to match for these instructions.



In fact Raphael and I were about to submit a patch which takes advantage of
that capability to improve the code slightly for risc-v.

Just use a pattern with zero_extend or sign_extend around it or subreg of
it?
If it were only that easy ;( In the bowels of the simplifications the zero_extension turns into either a pair of shifts or an AND with a mask (I forget which offhand). I'm sure we *can* work around this in the target, but it'll be ugly.

The documentation definitely needs to be updated. I looked at the whole family a few weeks ago and my recollection was they all need to be fixed (ffs, clrsb, clz, ctz, popcount & parity) if the defined semantics are that the input and output operand modes must match.

Jeff

Reply via email to