On 12/09/2016 05:48 AM, Richard Sandiford wrote:
This series includes most of the changes in group C from:
https://gcc.gnu.org/ml/gcc/2016-11/msg00033.html
The idea is to add wrapper classes around machine_mode_enum
for specific groups of modes, such as scalar integers, scalar floats,
complex values, etc. This has two main benefits: one specific to SVE
and one not.
The SVE-specific benefit is that it helps to introduce the concept
of variable-length vectors. To do that we need to change the size
of a vector mode from being a known compile-time constant to being
(possibly) a run-time invariant. We then need to do the same for
unconstrained machine_modes, which might or might not be vectors.
Introducing these new constrained types means that we can continue
to treat them as having a constant size.
The other benefit is that it uses static type checking to enforce
conditions that are easily forgotten otherwise. The most common
sources of problems seem to be:
(a) using VOIDmode or BLKmode where a scalar integer was expected
(e.g. when getting the number of bits in the value).
(b) simplifying vector operations in ways that only make sense for
scalars.
The series helps with both of these, although we don't get the full
benefit of (b) until variable-sized modes are introduced.
I know of three specific cases in which the static type checking
forced fixes for things that turned out to be real bugs (although
we didn't know that at the time, otherwise we'd have posted patches).
They were later fixed for trunk by:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01783.html
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02983.html
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02896.html
The group C patches in ARM/sve-branch did slow compile time down a little.
I've since taken steps to avoid that:
- Make the tailcall pass handle aggregate parameters and return values
(already in trunk).
- Turn some of the new wrapper functions into inline functions.
- Make all the machmode.h macros that used:
__builtin_constant_p (M) ? foo_inline (M) : foo_array[M[
forward to an ALWAYS_INLINE function, so that (a) M is only evaluated
once and (b) __builtin_constant_p is applied to a variable, and so is
deferred until later passes. This helped the optimisation to fire in
more cases and to continue firing when M is a class rather than a
raw enum.
- In a similar vein, make sure that conditions like:
SImode == DImode
are treated as builtin_constant_p by gencondmd, so that .md patterns
with those conditions are dropped.
With these changes the series is actually a very slight compile-time win.
That might seem unlikely, but there are several possible reasons:
1. The machmode.h macro change above might allow more constant folding.
2. The series has a tendency to evaluate modes once, rather than
continually fetching them from (sometimes quite deep) rtx nests.
Refetching a mode is a particular problem if call comes between
two uses, since the compiler then has to re-evaluate the whole thing.
3. The series introduces many uses of new SCALAR_*TYPE_MODE macros,
as alternatives to TYPE_MODE. The new macros avoid the usual:
(VECTOR_TYPE_P (TYPE_CHECK (NODE)) \
? vector_type_mode (NODE) : (NODE)->type_common.mode)
and become direct field accesses in release builds.
VECTOR_TYPE_P would be consistently false for these uses,
but call-clobbered registers would usually be treated as clobbered
by the condition as a whole.
Maybe (3) is the most likely reason.
I tested this by compiling the testsuite for:
aarch64-linux-gnu alpha-linux-gnu arc-elf arm-linux-gnueabi
arm-linux-gnueabihf avr-elf bfin-elf c6x-elf cr16-elf cris-elf
epiphany-elf fr30-elf frv-linux-gnu ft32-elf h8300-elf
hppa64-hp-hpux11.23 ia64-linux-gnu i686-pc-linux-gnu
i686-apple-darwin iq2000-elf lm32-elf m32c-elf m32r-elf
m68k-linux-gnu mcore-elf microblaze-elf mips-linux-gnu
mipsisa64-linux-gnu mmix mn10300-elf moxie-rtems msp430-elf
nds32le-elf nios2-linux-gnu nvptx-none pdp11 powerpc-linux-gnuspe
powerpc-eabispe powerpc64-linux-gnu powerpc-ibm-aix7.0 rl78-elf
rx-elf s390-linux-gnu s390x-linux-gnu sh-linux-gnu sparc-linux-gnu
sparc64-linux-gnu sparc-wrs-vxworks spu-elf tilegx-elf tilepro-elf
xstormy16-elf v850-elf vax-netbsdelf visium-elf x86_64-darwin
x86_64-linux-gnu xtensa-elf
and checking that there were no changes in assembly. Also tested
in the normal way on aarch64-linux-gnu and x86_64-linux-gnu.
The series depends on the already-posted:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01657.html
So can we get the discussion around the prerequisite restarted -- I like
the core ideas around building wrapper classes around machine modes, but
obviously we can't really move forward on this without the prereqs.
jeff