Re: [PATCH] Fix lower-subreg ICE (PR target/85945)

2018-06-11 Thread Jeff Law
On 05/29/2018 02:41 AM, Jakub Jelinek wrote:
> Hi!
> 
> We ICE on the following testcase, because lower-subreg sees only
> SFmode subregs of a multi-word pseudo (V4SFmode) and decides to decompose
> it.  Decomposition is done through replacing the multi-word pseudo with
> a concat of word-sized integer pseudos; unfortunately, we don't allow
> SFmode subregs of DImode inner regs (or any other mix of floating and
> integral mode where the sizes aren't the same), so we ICE later on during
> the lower-subreg pass.
> 
> The following patch punts in that case, unless the size is the same
> (i.e. allows SFmode on 32-bit word targets or DFmode on 64-bit word ones).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux plus tested with a
> cross to s390x-linux on the testcase.  Ok for trunk?
> 
> BTW, wonder if we also shouldn't punt in case the inner mode is vector mode,
> I don't think it is desirable to undo vectorization this way (but would like
> to try that as a separate patch and gather some statistics).
> 
> For this patch I've gathered statistics too, on x86_64/i686-linux this patch
> changes behavior in a couple of 64-bit gcc.target/i386/ tests where the
> modes in all cases were SFmode (outer) and V4SFmode (inner), but it seems at
> least on avx5124fmadd-v4fnmaddss-1.c I've looked at we in the end didn't
> decide to decompose it anyway (perhaps some other non-subreg use).
> 
> 2018-05-29  Jakub Jelinek  
> 
>   PR target/85945
>   * lower-subreg.c (find_decomposable_subregs): Don't decompose float
>   subregs of multi-word pseudos unless the float mode has word size.
> 
>   * gcc.c-torture/compile/pr85945.c: New test.
I thought we had a target macro which told us when it was safe to do
this kind of thing.  But I can't seem to find it.  It probably wouldn't
be the solution here since we're trying to avoid an ICE within the
lowering pass.  But it might point to other concerns in this space.

Anyway, the patch is fine.  I'm comfortable returning to the question
WRT vector modes as a distinct issue in the future after you gather some
data.

jeff



[PATCH] Fix lower-subreg ICE (PR target/85945)

2018-05-29 Thread Jakub Jelinek
Hi!

We ICE on the following testcase, because lower-subreg sees only
SFmode subregs of a multi-word pseudo (V4SFmode) and decides to decompose
it.  Decomposition is done through replacing the multi-word pseudo with
a concat of word-sized integer pseudos; unfortunately, we don't allow
SFmode subregs of DImode inner regs (or any other mix of floating and
integral mode where the sizes aren't the same), so we ICE later on during
the lower-subreg pass.

The following patch punts in that case, unless the size is the same
(i.e. allows SFmode on 32-bit word targets or DFmode on 64-bit word ones).

Bootstrapped/regtested on x86_64-linux and i686-linux plus tested with a
cross to s390x-linux on the testcase.  Ok for trunk?

BTW, wonder if we also shouldn't punt in case the inner mode is vector mode,
I don't think it is desirable to undo vectorization this way (but would like
to try that as a separate patch and gather some statistics).

For this patch I've gathered statistics too, on x86_64/i686-linux this patch
changes behavior in a couple of 64-bit gcc.target/i386/ tests where the
modes in all cases were SFmode (outer) and V4SFmode (inner), but it seems at
least on avx5124fmadd-v4fnmaddss-1.c I've looked at we in the end didn't
decide to decompose it anyway (perhaps some other non-subreg use).

2018-05-29  Jakub Jelinek  

PR target/85945
* lower-subreg.c (find_decomposable_subregs): Don't decompose float
subregs of multi-word pseudos unless the float mode has word size.

* gcc.c-torture/compile/pr85945.c: New test.

--- gcc/lower-subreg.c.jj   2018-01-04 00:43:17.209703103 +0100
+++ gcc/lower-subreg.c  2018-05-28 16:52:48.078871856 +0200
@@ -497,7 +497,16 @@ find_decomposable_subregs (rtx *loc, enu
 were the same number and size of pieces.  Hopefully this
 doesn't happen much.  */
 
- if (outer_words == 1 && inner_words > 1)
+ if (outer_words == 1
+ && inner_words > 1
+ /* Don't allow to decompose floating point subregs of
+multi-word pseudos if the floating point mode does
+not have word size, because otherwise we'd generate
+a subreg with that floating mode from a different
+sized integral pseudo which is not allowed by
+validate_subreg.  */
+ && (!FLOAT_MODE_P (GET_MODE (x))
+ || outer_size == UNITS_PER_WORD))
{
  bitmap_set_bit (decomposable_context, regno);
  iter.skip_subrtxes ();
--- gcc/testsuite/gcc.c-torture/compile/pr85945.c.jj2018-05-28 
18:26:28.043765766 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr85945.c   2018-05-28 
18:24:23.274594221 +0200
@@ -0,0 +1,16 @@
+/* PR target/85945 */
+
+typedef float V __attribute__((vector_size(16)));
+union U { V v; float f[4]; };
+int f;
+float g[4];
+
+void
+foo (void)
+{
+  V d;
+  union U i;
+  i.v = d;
+  for (f = 0; f < 4; f++)
+g[f] = i.f[f];
+}

Jakub