Re: [patch] cilkplus array notation for C (clean, independent patchset, take 1)

Joseph S. Myers Thu, 21 Mar 2013 08:56:56 -0700

On Wed, 20 Mar 2013, Aldy Hernandez wrote:

> Joseph, folks, et al... How does this look?


This review largely deals with coding style (interpreted broadly).  I'll 
review more of the substance separately later; reposting with fixes for 
all the accumulated issues is probably a good idea anyway, to avoid the 
same issues coming up repeatedly.

>       * c-common.c (c_define_builtins): When cilkplus is enabled, the
>       function array_notation_init_builtins() is called.

Don't use () after a function name when referring to the function.

> diff --git a/gcc/c-family/array-notation-common.c 
> b/gcc/c-family/array-notation-common.c

> +int extract_sec_implicit_index_arg (location_t, tree);
> +bool is_sec_implicit_index_fn (tree);
> +void array_notation_init_builtins (void);

Non-static function declarations like this should not be inside a .c file.  
If these functions are used outside this file, there should be an 
associated header that declares them; include it in the .c file.  If only 
used inside the .c file that defines them, make them static (and 
topologically sort static functions inside a source file so that forward 
static declarations are only needed for cases of recursion).

> +/* Mark the FNDECL as cold, meaning that the function specified by FNDECL is
> +   not run as is.  */

The cold attribute means unlikely to be executed rather than "not run as 
is".  Maybe "not run as is" is what's relevant here, but I'm not clear 
why this attribute would be useful for built-in functions at all - the 
documentation suggests it's only relevant when a user defines a function 
themselves, and affects the code generated for that function, so wouldn't 
be relevant at all for built-in functions.

> +void
> +array_notation_init_builtins (void)

Other built-in functions use various .def files (builtins.def and the 
files it includes) to avoid lots of repetitive code like this - can you 
integrate this with that mechanism?  If you do so, then you should be able 
to avoid (or massively simplify) functions like:

> +/* Returns true if the function call specified in FUNC_NAME is
> +   __sec_implicit_index.  */
> +
> +bool
> +is_sec_implicit_index_fn (tree func_name)

because code can use the BUILT_IN_* enum values to test whether a 
particular function is in use - which is certainly cleaner than using 
strcmp against the function name.

> +/* Returns the first and only argument for FN, which should be a
> +   sec_implicit_index function.  FN's location in the source file is is 
> +   indicated by LOCATION.  */
> +
> +int
> +extract_sec_implicit_index_arg (location_t location, tree fn)
> +{
> +  tree fn_arg;
> +  HOST_WIDE_INT return_int = 0;
> +  if (!fn)
> +    return -1;

Why the random check for a NULL argument?  If a NULL argument is valid 
(meaning that it makes the code cleaner to allow such arguments rather 
than making sure the function isn't called with them), this should be 
documented in the comment above the function; otherwise, if such an 
argument isn't valid, there is no need to check for it.

You declare return_int as HOST_WIDE_INT, but it only receives a value cast 
to int, and is used only to store a value returned as int.  Either use int 
consistently, or HOST_WIDE_INT consistently, but I don't see a reason to 
use both.

> +  if (TREE_CODE (fn) == CALL_EXPR)
> +    {
> +      fn_arg = CALL_EXPR_ARG (fn, 0);
> +      if (really_constant_p (fn_arg))

I don't think really_constant_p is what's wanted; 
<http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-Intel_Cilk_plus_lang_spec_2.htm>
 
says "The argument shall be an integer constant expression.", and such 
expressions always appear in the C front end as INTEGER_CST.  So you can 
just check for INTEGER_CST.

Now a subtlety here is that the function argument will have been folded by 
this point, meaning that cases that aren't integer constant expressions in 
C standard terms will be wrongly allowed (both by the original code and by 
a version checking against INTEGER_CST).  In such cases, the way to get 
things checked correctly is to use a keyword rather than a built-in 
function - as with __builtin_choose_expr or __builtin_shuffle, for 
example.  Since this operation seems special in ways that built-in 
functions generally aren't, that seems reasonable anyway.  So the code 
parsing this keyword would check that the argument is an INTEGER_CST, of 
integer type (since INTEGER_CSTs can have pointer type in GCC), like that 
for __builtin_choose_expr does.  It would then quite likely create its own 
tree code for the operation, rather than using a CALL_EXPR at all.  (It 
would need to manage converting to int, given how the specification 
defines things in terms of a prototype for type int - so e.g. a constant 
1ULL << 32 would act like 0 if int is 32 bits, under the present 
specification.)

The specification doesn't seem very clear on to what extent the __sec_* 
operations must act like functions (what happens if someone puts 
parentheses around the __sec_* name, for example - that wouldn't work with 
the keyword approach).  So the specification should be clarified there, 
but I think saying the __sec_* operations are syntactically special, like 
keywords, is more appropriate than requiring other uses to work.

> +     return_int = (int) int_cst_value (fn_arg);
> +      else
> +     {
> +       if (location == UNKNOWN_LOCATION && EXPR_HAS_LOCATION (fn))
> +         location = EXPR_LOCATION (fn);
> +       error_at (location, "__sec_implicit_index parameter must be a " 
> +                 "constant integer expression");

The term is "integer constant expression" not "constant integer 
expression".

> diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c

> +void replace_array_notations (tree *, bool, vec<tree, va_gc> *,
> +                           vec<tree, va_gc> *);
> +void find_rank (tree, bool, size_t *);
> +void extract_array_notation_exprs (tree, bool, vec<tree, va_gc> **);
> +tree fix_conditional_array_notations (tree);
> +struct c_expr fix_array_notation_expr (location_t, enum tree_code,
> +                                    struct c_expr);
> +bool is_builtin_array_notation_fn (tree func_name, an_reduce_type *type);
> +static tree fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var);
> +bool contains_array_notation_expr (tree expr);
> +tree expand_array_notation_exprs (tree t);

As before, forward declarations inside .c files should only be for static 
functions with recursive calls to themselves, not for non-static 
functions or static functions not involved in recursion.

> +struct inv_list
> +{
> +  vec<tree, va_gc> *list_values;
> +  vec<tree, va_gc> *replacement;
> +};

Comment on this type explaining what it's for.

> +/* Returns the rank of ARRAY through the *RANK.  The user can specify whether
> +   (s)he wants to step into array_notation-specific builtin functions
> +   (specified by the IGNORE_BUILTIN_FN).

The wording seems awkward; "Set *RANK to the rank of ARRAY, ignoring 
array-notation-specific built-in functions if IGNORE_BUILTIN_FN." would be 
better.

> +void
> +find_rank (tree array, bool ignore_builtin_fn, size_t *rank)
> +{
> +  tree ii_tree;
> +  size_t current_rank = 0, ii = 0;
> +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> +  if (!array)
> +    return;

As before, avoid random checks for NULL parameters unless there is an 
actual reason to allow them and the comments document that they are 
allowed and what the semantics are in that case.  In general, explain what 
ARRAY is - an expression?

Is *RANK always set by this function?  Make clear in the comment above the 
function whether it is, and whether the initial value of *RANK before the 
function is called is of any significance.  I note that

> +  else if (TREE_CODE (array) == ARRAY_NOTATION_REF)
> +    {
> +      for (ii_tree = array;
> +        ii_tree && TREE_CODE (ii_tree) == ARRAY_NOTATION_REF;
> +        ii_tree = ARRAY_NOTATION_ARRAY (ii_tree))
> +     current_rank++;
> +      if (*rank == 0)
> +     *rank = current_rank;

does appear to look at the value before the function has set it, implying 
that the original value of *RANK *does* mean something.

> +      if (TREE_CODE (array) == CALL_EXPR)
> +     {
> +       tree func_name = CALL_EXPR_FN (array);
> +       if (TREE_CODE (func_name) == ADDR_EXPR)
> +         if (!ignore_builtin_fn)
> +           if (is_builtin_array_notation_fn (func_name, &dummy_type))
> +             /* If it is a builtin function, then we know it returns a 
> +                scalar.  */
> +             return;
> +       if (TREE_CODE (TREE_OPERAND (array, 0)) == INTEGER_CST)
> +         {
> +           int length = TREE_INT_CST_LOW (TREE_OPERAND (array, 0));
> +           for (ii = 0; ii < (size_t) length; ii++)

TREE_INT_CST_LOW returns unsigned HOST_WIDE_INT.  There should be no need 
for converting twice, first to int and then to size_t.  And rather than 
depending on implementation default of CALL_EXPR, call_expr_nargs would be 
a better way to calculate the length.

> +             find_rank (TREE_OPERAND (array, ii), ignore_builtin_fn, rank);

But actually, you're dealing with a CALL_EXPR here.  So you should be able 
to use existing iterators over CALL_EXPR arguments (e.g. 
FOR_EACH_CALL_EXPR_ARG) rather than explicitly using the number of 
arguments at all.  Doing so, and separately checking CALL_EXPR_FN, and 
CALL_EXPR_STATIC_CHAIN if applicable, seems cleaner than depending on 
low-level details of the sequence of operands to a CALL_EXPR.

> +/* Extracts all the array notations specified in NODE and stores them in a
> +   dynamic tree array of ARRAY_LIST whose size is stored in *LIST_SIZE.  The
> +   user can specify if (s)he wants to ignore the array notations inside the
> +   array-notation specific builtin functions (by setting IGNORE_BUILTIN_FN to
> +   true).  */
> +
> +void
> +extract_array_notation_exprs (tree node, bool ignore_builtin_fn,
> +                           vec<tree, va_gc> **array_list)

There's no argument LIST_SIZE, so the comment needs updating.  Again, the 
wording about "The user" is awkward; the comment should directly define 
the semantics of the function in terms of its argument, without reference 
to "The user".

> +{
> +  size_t ii = 0;
> +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> +  
> +  if (!node)
> +    return;

Again, check for NULL argument without any mention in the comment that 
such arguments are valid; remove unless there is a reason to make them 
valid.

> +  else if (TREE_CODE (node) == TREE_LIST)

What's NODE?  My first guess would have been an expression, but if a 
TREE_LIST is possible that's clearly not the answer, so explain in the 
comment above the function what NODE is.  (If a TREE_LIST is being used 
within expressions to store something specific to array notation, don't do 
so - TREE_LIST is deprecated, existing uses should be phased out in favour 
of more specific and less memory-hungry datastructures and new uses should 
not be added.)

> +      if (TREE_CODE (TREE_OPERAND (node, 0)) == INTEGER_CST)
> +     {
> +       int length = TREE_INT_CST_LOW (TREE_OPERAND (node, 0));
> +
> +       for (ii = 0; ii < (size_t) length; ii++)
> +         extract_array_notation_exprs
> +           (TREE_OPERAND (node, ii), ignore_builtin_fn, array_list);

Same problems as before with an iterator over CALL_EXPR that should avoid 
depending on low-level details of how CALL_EXPR is implemented, and excess 
integer type conversions.

> +/* Replaces all occurances of array notations in tree ORIG that matches the
> +   ones in LIST with the one in ARRAY_OPERAND.  The size of list and
> +   ARRAY_OPERAND is ARRAY_SIZE.  For example, ARRAY_OPERAND[x] for some index
> +   'x' will have the equivalent ARRAY_REF for the ARRAY_NOTATION_REF 
> specified
> +   in LIST[x].   The  user can specify if (s)he wants to ignore the array
> +   notations inside the array-notation specific builtin functions (using the
> +   bool variable IGNORE_BUILTIN_FN).  */

Again, avoid "The user".

> +void
> +replace_array_notations (tree *orig, bool ignore_builtin_fn,
> +                      vec<tree, va_gc> *list,
> +                      vec<tree, va_gc> *array_operand)
> +{
> +  size_t ii = 0;
> +  tree node = NULL_TREE, node_replacement = NULL_TREE;
> +  an_reduce_type dummy_type = REDUCE_UNKNOWN;
> +  
> +  if (vec_safe_length (list) == 0 || !*orig)
> +    return;

Again, avoid checks for NULL or document that NULL arguments are valid if 
there's a good reason.  Generally, document what sort of thing ORIG is.

> +      if (TREE_CODE (TREE_OPERAND (*orig, 0)) == INTEGER_CST)
> +     {
> +       int length = TREE_INT_CST_LOW (TREE_OPERAND (*orig, 0));
> +       for (ii = 0; ii < (size_t) length; ii++)
> +         replace_array_notations
> +           (&TREE_OPERAND (*orig, ii), ignore_builtin_fn, list,
> +            array_operand);

Again, better CALL_EXPR iterators.

> +/* This function will find all the scalar expressions in *TP and push it in
> +   DATA struct, typecasted to (void *).  If *WALK_SUBTREES is set to 0 then
> +   we have do not go into the *TP's subtrees.  */

Rather than "This function will", just "Find ..." (and say "Returns 
NULL_TREE." or something like that - presumably the return type is so it 
can be passed to walk_tree).

> +/* Replaces all the scalar expressions in *NODE. */
> +
> +tree
> +replace_invariant_exprs (tree *node)

Comment needs to explain the semantics of the return value.

> +tree
> +build_array_notation_expr (location_t location, tree lhs, tree lhs_origtype,
> +                        enum tree_code modifycode, location_t rhs_loc,
> +                        tree rhs, tree rhs_origtype)

> +    }
> +
> +
> +
> +  for (ii = 0; ii < lhs_rank; ii++)

Excess blank lines in middle of function.  Generally there shouldn't be 
two or more consecutive blank lines inside a function (if you want to have 
different sizes of blanks to split up levels of structure in the function, 
that suggests the function is too big and should be split up into separate 
functions).

> +          TREE_TYPE (lhs_var[ii]));
> +       
> +     }

This location for a blank line doesn't make sense.

> +  /* The following statements will do the following:
> +   * <if_stmt_label>: (in order from outermost to innermost)
> +   *                  if (cond_expr) then go to body_label
> +   *                  else                go to exit_label
> +   * <body_label>:
> +   *                  array expression
> +   *
> +   *                  (the increment, goto and exit_label goes from 
> innermost to
> +   *                   outermost).
> +   *                  ii++ and jj++
> +   *                  go to if_stmt_label
> +   * <exit_label>:
> +   *                  <REST OF CODE>
> +   */

Comments should not have an initial "*" on each line.

> +/* Encloses the conditional statement passed in STMT with a loop around it
> +   and replaces the condition in STMT with a ARRAY_REF tree-node to the 
> array.
> +   The condition must have a ARRAY_NOTATION_REF tree.  */
> +
> +static tree
> +fix_conditional_array_notations_1 (tree stmt)

Comment should explain return value semantics.

> +                        TREE_TYPE (array_var[ii]));
> +     
> +    }

Another stray blank line.  Check the patch generally for stray blank lines 
immediately before a '}', I don't think they ever make sense, but I may 
have missed some.

> +  // XDELETEVEC (array_var);

I don't think this sort of commented-out code should be added.  If you're 
deliberately not doing something that a reader might expect to be done, 
have a comment explaining *why* you're not doing it, not just 
commented-out code to do it.

> +      error_at (location, "__sec_reduce_min_ind or __sec_reduce_max_ind 
> cannot"
> +             " have arrays with dimension greater than 1.");

Diagnostics don't end with ".".

> +    default:
> +      gcc_unreachable ();  /* You should not reach here.  */

No need for comments like this that just repeat the plain semantics of the 
C code.  There's nothing else a call to gcc_unreachable could possibly 
mean; such a comment is of no more use than "i++; /* Add 1 to i.  */".

> +/* Returns true of FUNC_NAME is a builtin array notation function.  The type 
> of
> +   function is returned in *TYPE.  */

"true if", not "true of".

> +bool
> +is_builtin_array_notation_fn (tree func_name, an_reduce_type *type)
> +{
> +  const char *function_name = NULL;
> +
> +  if (!func_name)
> +    return false;

Another unexplained test for a NULL argument.  Again, explain what sort of 
things FUNC_NAME may be.  (This is another function that should be using 
BUILT_IN_* enum values rather than strcmp, if you rework how the built-in 
functions are implemented.)

> +/* Returns true of EXPR (and its subtrees) contain ARRAY_NOTATION_EXPR node. 
>  */

"true if", again.

> +/* Replaces array notations in void function call arguments in ARG with loop 
> and
> +   tree-node ARRAY_REF and returns that value in a tree node variable called
> +   LOOP.  */

LOOP is not an argument to this function, so it doesn't make sense to 
refer to it in the comment.  I suspect " in a tree node variable called 
LOOP" should simply be removed.

> +  if (TREE_CODE (arg) == CALL_EXPR
> +      && is_builtin_array_notation_fn (CALL_EXPR_FN (arg), &an_type))
> +    {
> +      loop = fix_builtin_array_notation_fn (arg, &new_var);
> +      /* We are ignoring the new var because either the user does not want to
> +      capture it OR he is using sec_reduce_mutating function.  */

In general I think "the user" comments should be avoided though this one 
is a bit less awkward than those defining function semantics by reference 
to "the user".

> +/* Walks through tree node T and find all the call-statments that do not 
> return
> +   anything and fix up any array notations they may carry.  */
> +
> +tree
> +expand_array_notation_exprs (tree t)

Comment should document return value.

> +{
> +  if (!t || !contains_array_notation_expr (t))
> +    return t;

Another check for NULL without a comment saying NULL is a valid argument.

> +/* Returns array notation expression for the array base ARRAY of type TYPE,
> +   with start index, length and stride given by START_INDEX, LENGTH and 
> STRIDE,
> +   respectively.  */
> +
> +tree
> +build_array_notation_ref (location_t loc, tree array, tree start_index, 
> +                       tree length, tree stride, tree type)
> +{
> +  tree array_ntn_tree = NULL_TREE;
> +  size_t stride_rank = 0, length_rank = 0, start_rank = 0;
> +  
> +  if (!TREE_TYPE (start_index) || !INTEGRAL_TYPE_P (TREE_TYPE (start_index)))

I'd expect the argument would have to be an expression, so would always 
have a TYPE and the !TREE_TYPE (start_index) check should be unnecessary.  
If it is needed, explain further in the comment at the start of the 
function.  Likewise for other checks for NULL types in this function.

> +    {
> +      error_at (loc,
> +             "start-index of array notation triplet is not an integer.");

Diagnostic should not end with ".".

> +      error_at (loc, "length of array notation triplet is not an integer.");

Likewise.

> +      error_at (loc, "stride of array notation triplet is not an integer.");

Likewise.

> +      error_at (loc, "rank of an array notation triplet's start-index is not 
> "
> +             "zero.");

Likewise.

> +      error_at (loc, "rank of an array notation triplet's length is not 
> zero.");

Likewise.

> +      error_at (loc, "rank of array notation triplet's stride is not zero.");

Likewise.

That's a coding style review of the first half or so of the patch, more 
later....

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [patch] cilkplus array notation for C (clean, independent patchset, take 1)

Reply via email to