Hi!
We ICE on the following testcase, because lower-subreg sees only
SFmode subregs of a multi-word pseudo (V4SFmode) and decides to decompose
it. Decomposition is done through replacing the multi-word pseudo with
a concat of word-sized integer pseudos; unfortunately, we don't allow
SFmode subregs of DImode inner regs (or any other mix of floating and
integral mode where the sizes aren't the same), so we ICE later on during
the lower-subreg pass.
The following patch punts in that case, unless the size is the same
(i.e. allows SFmode on 32-bit word targets or DFmode on 64-bit word ones).
Bootstrapped/regtested on x86_64-linux and i686-linux plus tested with a
cross to s390x-linux on the testcase. Ok for trunk?
BTW, wonder if we also shouldn't punt in case the inner mode is vector mode,
I don't think it is desirable to undo vectorization this way (but would like
to try that as a separate patch and gather some statistics).
For this patch I've gathered statistics too, on x86_64/i686-linux this patch
changes behavior in a couple of 64-bit gcc.target/i386/ tests where the
modes in all cases were SFmode (outer) and V4SFmode (inner), but it seems at
least on avx5124fmadd-v4fnmaddss-1.c I've looked at we in the end didn't
decide to decompose it anyway (perhaps some other non-subreg use).
2018-05-29 Jakub Jelinek
PR target/85945
* lower-subreg.c (find_decomposable_subregs): Don't decompose float
subregs of multi-word pseudos unless the float mode has word size.
* gcc.c-torture/compile/pr85945.c: New test.
--- gcc/lower-subreg.c.jj 2018-01-04 00:43:17.209703103 +0100
+++ gcc/lower-subreg.c 2018-05-28 16:52:48.078871856 +0200
@@ -497,7 +497,16 @@ find_decomposable_subregs (rtx *loc, enu
were the same number and size of pieces. Hopefully this
doesn't happen much. */
- if (outer_words == 1 && inner_words > 1)
+ if (outer_words == 1
+ && inner_words > 1
+ /* Don't allow to decompose floating point subregs of
+multi-word pseudos if the floating point mode does
+not have word size, because otherwise we'd generate
+a subreg with that floating mode from a different
+sized integral pseudo which is not allowed by
+validate_subreg. */
+ && (!FLOAT_MODE_P (GET_MODE (x))
+ || outer_size == UNITS_PER_WORD))
{
bitmap_set_bit (decomposable_context, regno);
iter.skip_subrtxes ();
--- gcc/testsuite/gcc.c-torture/compile/pr85945.c.jj2018-05-28
18:26:28.043765766 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr85945.c 2018-05-28
18:24:23.274594221 +0200
@@ -0,0 +1,16 @@
+/* PR target/85945 */
+
+typedef float V __attribute__((vector_size(16)));
+union U { V v; float f[4]; };
+int f;
+float g[4];
+
+void
+foo (void)
+{
+ V d;
+ union U i;
+ i.v = d;
+ for (f = 0; f < 4; f++)
+g[f] = i.f[f];
+}
Jakub