> > What's the reason you cannot defer SIMD cloning to LTRANS stage
> > as simple IPA pass next to IPA-PTA?
> 
> Ok, deferring till after IPA-PTA was easy, just small ipa-cp.c changes
> (look at the attribute rather than simd*clone* fields), passes.def and
> had to tweak ipa_add_new_function which assumed that all new functions
> must be definitions with gimple body.

Note that any small IPA pass at ltrans will increase the peak memory use
of ltrans compilation by loading all function bodies into memory (since
IPA transformations need to be applied first).

It would be nice to avoid having such passes enabled by default unless we
have a really good reason for it.

> 2013-11-25  Aldy Hernandez  <al...@redhat.com>
>           Jakub Jelinek  <ja...@redhat.com>
> 
>       * cgraph.h (enum cgraph_simd_clone_arg_type): New.
>       (struct cgraph_simd_clone_arg, struct cgraph_simd_clone): New.
>       (struct cgraph_node): Add simdclone and simd_clones fields.
>       * config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen,
>       ix86_simd_clone_adjust, ix86_simd_clone_usable): New functions.
>       (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
>       TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Define.
>       * doc/tm.texi.in (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
>       TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Add.
>       * doc/tm.texi: Regenerated.
>       * ggc.h (ggc_alloc_cleared_simd_clone_stat): New function.
>       * ipa-cp.c (determine_versionability): Fail if "omp declare simd"
>       attribute is present.
>       * omp-low.c: Include pretty-print.h, ipa-prop.h and tree-eh.h.
>       (simd_clone_vector_of_formal_parm_types): New function.
>       (simd_clone_struct_alloc, simd_clone_struct_copy,
>       simd_clone_vector_of_formal_parm_types, simd_clone_clauses_extract,
>       simd_clone_compute_base_data_type, simd_clone_mangle,
>       simd_clone_create, simd_clone_adjust_return_type,
>       create_tmp_simd_array, simd_clone_adjust_argument_types,
>       simd_clone_init_simd_arrays): New functions.
>       (struct modify_stmt_info): New type.
>       (ipa_simd_modify_stmt_ops, ipa_simd_modify_function_body,
>       simd_clone_adjust, expand_simd_clones, ipa_omp_simd_clone): New
>       functions.
>       (pass_data_omp_simd_clone): New variable.
>       (pass_omp_simd_clone): New class.
>       (make_pass_omp_simd_clone): New function.
>       * passes.def (pass_omp_simd_clone): New.
>       * target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
>       TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): New target
>       hooks.
>       * target.h (struct cgraph_node, struct cgraph_simd_node): Declare.
>       * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
>       * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
>       * tree-pass.h (make_pass_omp_simd_clone): New prototype.
>       * tree-vect-data-refs.c: Include cgraph.h.
>       (vect_analyze_data_refs): Inline by hand find_data_references_in_loop
>       and find_data_references_in_bb, if find_data_references_in_stmt
>       fails, still allow calls to #pragma omp declare simd functions
>       in #pragma omp simd loops unless they contain data references among
>       the call arguments or in lhs.
>       * tree-vect-loop.c (vect_determine_vectorization_factor): Handle
>       calls with no lhs.
>       (vect_transform_loop): Allow NULL STMT_VINFO_VECTYPE for calls without
>       lhs.
>       * tree-vectorizer.h (enum stmt_vec_info_type): Add
>       call_simd_clone_vec_info_type.
>       (struct _stmt_vec_info): Add simd_clone_fndecl field.
>       (STMT_VINFO_SIMD_CLONE_FNDECL): Define.
>       * tree-vect-stmts.c: Include tree-ssa-loop.h,
>       tree-scalar-evolution.h and cgraph.h.
>       (vectorizable_call): Handle calls without lhs.  Assert
>       !stmt_can_throw_internal instead of failing for it.  Don't update
>       EH stuff.
>       (struct simd_call_arg_info): New.
>       (vectorizable_simd_clone_call): New function.
>       (vect_transform_stmt): Call it.
>       (vect_analyze_stmt): Likewise.  Allow NULL STMT_VINFO_VECTYPE for
>       calls without lhs.
>       * ipa-prop.c (ipa_add_new_function): Only call ipa_analyze_node
>       if cgraph_function_with_gimple_body_p is true.
> c/
>       * c-decl.c (c_builtin_function_ext_scope): Avoid binding if
>       external_scope is NULL.
> cp/
>       * semantics.c (finish_omp_clauses): For #pragma omp declare simd
>       linear clause step call maybe_constant_value.
> testsuite/
>       * g++.dg/gomp/declare-simd-1.C (f38): Make sure
>       simdlen is a power of two.
>       * gcc.dg/gomp/simd-clones-2.c: Compile on all targets.
>       Remove -msse2.  Adjust regexps for name mangling changes.
>       * gcc.dg/gomp/simd-clones-3.c: Likewise.
>       * gcc.dg/vect/vect-simd-clone-1.c: New test.
>       * gcc.dg/vect/vect-simd-clone-2.c: New test.
>       * gcc.dg/vect/vect-simd-clone-3.c: New test.
>       * gcc.dg/vect/vect-simd-clone-4.c: New test.
>       * gcc.dg/vect/vect-simd-clone-5.c: New test.
>       * gcc.dg/vect/vect-simd-clone-6.c: New test.
>       * gcc.dg/vect/vect-simd-clone-7.c: New test.
>       * gcc.dg/vect/vect-simd-clone-8.c: New test.
>       * gcc.dg/vect/vect-simd-clone-9.c: New test.
>       * gcc.dg/vect/vect-simd-clone-10.c: New test.
>       * gcc.dg/vect/vect-simd-clone-10.h: New file.
>       * gcc.dg/vect/vect-simd-clone-10a.c: New file.
>       * gcc.dg/vect/vect-simd-clone-11.c: New test.

The i386 and IPA/cgraph bits seem OK to me.
> 
> --- gcc/ipa.c.jj      2013-11-22 21:08:18.958330368 +0100
> +++ gcc/ipa.c 2013-11-25 10:20:47.693785318 +0100
> @@ -426,6 +426,19 @@ symtab_remove_unreachable_nodes (bool be
>                     enqueue_node (cnode, &first, reachable);
>                   }
>               }
> +
> +         }
> +       /* If any reachable function has simd clones, mark them as
> +          reachable as well.  */
> +       if (cnode->simd_clones)
> +         {
> +           cgraph_node *next;
> +           for (next = cnode->simd_clones;
> +                next;
> +                next = next->simdclone->next_clone)
> +             if (in_boundary_p
> +                 || !pointer_set_insert (reachable, next))
> +               enqueue_node (next, &first, reachable);

Can't we represent the need for the simd clones more explicitly, i.e. by
references?
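
To illustrate the idea, here is a rough standalone sketch. It is not GCC
code: Node, refs and reachable_from are made-up names standing in for
symtab nodes and their reference lists. It assumes the original function
carries an explicit reference to each of its simd clones, in which case a
generic reachability walk keeps the clones alive without a special case.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for a symbol table node; this is NOT GCC's cgraph_node.
struct Node {
  const char *name;
  std::vector<Node *> refs;  // explicit references to other nodes
};

// Generic worklist walk: every node reachable through refs is kept.
// Nothing here knows about simd clones specifically.
std::set<const Node *> reachable_from(const Node *root) {
  assert(root != nullptr);
  std::set<const Node *> seen;
  std::vector<const Node *> worklist{root};
  while (!worklist.empty()) {
    const Node *n = worklist.back();
    worklist.pop_back();
    if (!seen.insert(n).second)
      continue;  // already visited
    for (const Node *r : n->refs)
      worklist.push_back(r);
  }
  return seen;
}
```

In this model, a function whose refs list names its clones keeps them
reachable, and a clone nobody references is dropped like any other node.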

> --- gcc/cgraph.h.jj   2013-11-22 21:03:50.782671321 +0100
> +++ gcc/cgraph.h      2013-11-25 10:20:47.695785297 +0100
> @@ -256,6 +256,99 @@ struct GTY(()) cgraph_clone_info
>    bitmap combined_args_to_skip;
>  };

Perhaps a comment here would fit.
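
Something along these lines, for instance. The wording is only guessed
from the field comments further down in the patch (arg_type, linear_step,
inbranch), so treat it as a sketch rather than the intended text; the
description of the MASK entry in particular is an assumption.

```cpp
/* How one argument of the original function is passed to a SIMD clone.
   (Suggested wording only; inferred from the rest of the patch.)  */
enum cgraph_simd_clone_arg_type
{
  /* One value per lane, passed as a vector.  */
  SIMD_CLONE_ARG_TYPE_VECTOR,
  /* The same value for all lanes.  */
  SIMD_CLONE_ARG_TYPE_UNIFORM,
  /* Linear across lanes with a constant step.  */
  SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
  /* Linear across lanes, with the step held in a uniform argument.  */
  SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP,
  /* Assumed: the mask argument of a masked (in-branch) clone.  */
  SIMD_CLONE_ARG_TYPE_MASK
};
```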
>  
> +enum cgraph_simd_clone_arg_type
> +{
> +  SIMD_CLONE_ARG_TYPE_VECTOR,
> +  SIMD_CLONE_ARG_TYPE_UNIFORM,
> +  SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
> +  SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP,
> +  SIMD_CLONE_ARG_TYPE_MASK
> +};
> +
> +/* Function arguments in the original function of a SIMD clone.
> +   Supplementary data for `struct simd_clone'.  */
> +
> +struct GTY(()) cgraph_simd_clone_arg {
> +  /* Original function argument as it originally existed in
> +     DECL_ARGUMENTS.  */
> +  tree orig_arg;
> +
> +  /* orig_arg's function (or for extern functions type from
> +     TYPE_ARG_TYPES).  */
> +  tree orig_type;
> +
> +  /* If argument is a vector, this holds the vector version of
> +     orig_arg that after adjusting the argument types will live in
> +     DECL_ARGUMENTS.  Otherwise, this is NULL.
> +
> +     This basically holds:
> +       vector(simdlen) __typeof__(orig_arg) new_arg.  */
> +  tree vector_arg;
> +
> +  /* vector_arg's type (or for extern functions new vector type.  */
> +  tree vector_type;
> +
> +  /* If argument is a vector, this holds the array where the simd
> +     argument is held while executing the simd clone function.  This
> +     is a local variable in the cloned function.  Its content is
> +     copied from vector_arg upon entry to the clone.
> +
> +     This basically holds:
> +       __typeof__(orig_arg) simd_array[simdlen].  */
> +  tree simd_array;
> +
> +  /* A SIMD clone's argument can be either linear (constant or
> +     variable), uniform, or vector.  */
> +  enum cgraph_simd_clone_arg_type arg_type;
> +
> +  /* For arg_type SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP this is
> +     the constant linear step, if arg_type is
> +     SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP, this is index of
> +     the uniform argument holding the step, otherwise 0.  */
> +  HOST_WIDE_INT linear_step;
> +
> +  /* Variable alignment if available, otherwise 0.  */
> +  unsigned int alignment;
> +};
> +
> +/* Specific data for a SIMD function clone.  */
> +
> +struct GTY(()) cgraph_simd_clone {
> +  /* Number of words in the SIMD lane associated with this clone.  */
> +  unsigned int simdlen;
> +
> +  /* Number of annotated function arguments in `args'.  This is
> +     usually the number of named arguments in FNDECL.  */
> +  unsigned int nargs;
> +
> +  /* Max hardware vector size in bits for integral vectors.  */
> +  unsigned int vecsize_int;
> +
> +  /* Max hardware vector size in bits for floating point vectors.  */
> +  unsigned int vecsize_float;
> +
> +  /* The mangling character for a given vector size.  This is is used
> +     to determine the ISA mangling bit as specified in the Intel
> +     Vector ABI.  */
> +  unsigned char vecsize_mangle;
> +
> +  /* True if this is the masked, in-branch version of the clone,
> +     otherwise false.  */
> +  unsigned int inbranch : 1;
> +
> +  /* True if this is a Cilk Plus variant.  */
> +  unsigned int cilk_elemental : 1;
> +
> +  /* Doubly linked list of SIMD clones.  */
> +  struct cgraph_node *prev_clone, *next_clone;
> +
> +  /* Original cgraph node the SIMD clones were created for.  */
> +  struct cgraph_node *origin;
> +
> +  /* Annotated function arguments for the original function.  */
> +  struct cgraph_simd_clone_arg GTY((length ("%h.nargs"))) args[1];
> +};
> +
>  
>  /* The cgraph data structure.
>    Each function decl has assigned cgraph_node listing callees and callers.  */
> @@ -284,6 +377,12 @@ public:
>    /* Declaration node used to be clone of. */
>    tree former_clone_of;
>  
> +  /* If this is a SIMD clone, this points to the SIMD specific
> +     information for it.  */
> +  struct cgraph_simd_clone *simdclone;
> +  /* If this function has SIMD clones, this points to the first clone.  */
> +  struct cgraph_node *simd_clones;
> +
>    /* Interprocedural passes scheduled to have their transform functions
>       applied next time we execute local pass on them.  We maintain it
>       per-function in order to allow IPA passes to introduce new functions.  */
