On Thu, 27 Jan 2022, Richard Biener wrote:

> > Index: gcc/gcc/c/c-typeck.cc
> > ===================================================================
> > --- gcc.orig/gcc/c/c-typeck.cc
> > +++ gcc/gcc/c/c-typeck.cc
> > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> >  #include "gomp-constants.h"
> >  #include "spellcheck-tree.h"
> >  #include "gcc-rich-location.h"
> > +#include "optabs-query.h"
> >  #include "stringpool.h"
> >  #include "attribs.h"
> >  #include "asan.h"
> > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> >                bool maybe_const = true;
> >                tree sc;
> >                sc = c_fully_fold (op0, false, &maybe_const);
> > -              sc = save_expr (sc);
> > +             if (optab_handler (vec_duplicate_optab,
> > +                                TYPE_MODE (type1)) == CODE_FOR_nothing)
> > +               sc = save_expr (sc);
> 
> This doesn't make much sense - I suppose the CONSTRUCTOR retains
> TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> and thus should have been cleared during gimplification or in the end
> ignored by RTL expansion.

 This is how the expression built here eventually looks in 
`store_constructor':

(gdb) print exp
$41 = <constructor 0x7ffff5c06ba0>
(gdb) pt
 <constructor 0x7ffff5c06ba0
    type <vector_type 0x7ffff5e7ea48 v4sf
        type <real_type 0x7ffff5cf1260 float sizes-gimplified SF
            size <integer_cst 0x7ffff5c00f78 constant 32>
            unit-size <integer_cst 0x7ffff5c00f90 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7ffff5cf1260 precision:32
            pointer_to_this <pointer_type 0x7ffff5cf1848>>
        sizes-gimplified V4SF
        size <integer_cst 0x7ffff5c00d80 constant 128>
        unit-size <integer_cst 0x7ffff5c00d98 constant 16>
        align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7ffff5d19648 nunits:4 context <translation_unit_decl 0x7ffff5ec0bb8 
v4sf-dup.c>>
    side-effects length:4
    val <ssa_name 0x7ffff5cb0dc8 type <real_type 0x7ffff5cf1260 float>
        visited var <parm_decl 0x7ffff5f00080 y>
        def_stmt GIMPLE_NOP
        version:2>
    val <ssa_name 0x7ffff5cb0dc8 type <real_type 0x7ffff5cf1260 float>
        visited var <parm_decl 0x7ffff5f00080 y>
        def_stmt GIMPLE_NOP
        version:2>
    val <ssa_name 0x7ffff5cb0dc8 type <real_type 0x7ffff5cf1260 float>
        visited var <parm_decl 0x7ffff5f00080 y>
        def_stmt GIMPLE_NOP
        version:2>
    val <ssa_name 0x7ffff5cb0dc8 type <real_type 0x7ffff5cf1260 float>
        visited var <parm_decl 0x7ffff5f00080 y>
        def_stmt GIMPLE_NOP
        version:2>>
(gdb)

The `side-effects' flag prevents this conditional from executing:

        /* Try using vec_duplicate_optab for uniform vectors.  */
        if (!TREE_SIDE_EFFECTS (exp)
            && VECTOR_MODE_P (mode)
            && eltmode == GET_MODE_INNER (mode)
            && ((icode = optab_handler (vec_duplicate_optab, mode))
                != CODE_FOR_nothing)
            && (elt = uniform_vector_p (exp))
            && !VECTOR_TYPE_P (TREE_TYPE (elt)))
          {
            class expand_operand ops[2];
            create_output_operand (&ops[0], target, mode);
            create_input_operand (&ops[1], expand_normal (elt), eltmode);
            expand_insn (icode, 2, ops);
            if (!rtx_equal_p (target, ops[0].value))
              emit_move_insn (target, ops[0].value);
            break;
          }

I don't know what's supposed to clear the flag (and what the purpose of 
setting it in the first place would be then).

> You do not add a testcase so it's hard to see what goes wrong where exactly.

 This only happens with an out-of-tree microarchitecture and I was unable 
to find an in-tree target that ever uses the `vec_duplicateM' standard RTL 
pattern, so I wasn't able to produce a test case that would trigger with 
upstream code.  Code is essentially like:

typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
odd_even (v4sf x, float y)
{
  return x + f;
}

 I considered the Aarch64 one the most likely candidate as it is the one
commit be4c1d4a42c5 has been for, but as I noted in the description it 
does not appear to use it either.  It uses `vec_initMN' instead and does 
the broadcast by hand in the backend.

 Maybe I'm missing something.

  Maciej

Reply via email to