Hi Chung-Lin, Tobias!

On 2023-07-11T02:33:58+0800, Chung-Lin Tang <chunglin.t...@siemens.com> wrote:
> this patch contains support for the 'readonly' modifier in copyin clauses
> and the cache directive.

Thanks!

> As we discussed earlier, the work for actually linking this to middle-end
> points-to analysis is a somewhat non-trivial issue. This first patch allows
> the language feature to be used in OpenACC directives first (with no effect 
> for now).
> The middle-end changes are probably going to be a later patch.

ACK.

> (Also CCing Tobias because of the Fortran bits)

A few specific GCC/Fortran questions for Tobias below, and some more
review comments for Chung-Lin:

> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -14059,7 +14059,8 @@ c_parser_omp_variable_list (c_parser *parser,
>
>  static tree
>  c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
> -                           tree list, bool allow_deref = false)
> +                           tree list, bool allow_deref = false,
> +                           bool *readonly = NULL)
>  {
>    /* The clauses location.  */
>    location_t loc = c_parser_peek_token (parser)->location;
> @@ -14067,6 +14068,20 @@ c_parser_omp_var_list_parens (c_parser *parser, enum 
> omp_clause_code kind,
>    matching_parens parens;
>    if (parens.require_open (parser))
>      {
> +      if (readonly != NULL)
> +     {
> +       c_token *token = c_parser_peek_token (parser);
> +       if (token->type == CPP_NAME
> +           && !strcmp (IDENTIFIER_POINTER (token->value), "readonly")
> +           && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
> +         {
> +           c_parser_consume_token (parser);
> +           c_parser_consume_token (parser);
> +           *readonly = true;
> +         }
> +       else
> +         *readonly = false;
> +     }
>        list = c_parser_omp_variable_list (parser, loc, kind, list, 
> allow_deref);
>        parens.skip_until_found_close (parser);
>      }

Instead of doing this in 'c_parser_omp_var_list_parens', I think it's
clearer to have this special 'readonly :' parsing logic in the two places
where it's used.  For example (random), like 'ancestor :' is parsed in
'c_parser_omp_clause_device', or 'conditional :' is parsed in
'c_parser_omp_clause_lastprivate'.  (Yes, this does duplicate a bit of
code, but that's easy enough to follow along.)

The existing 'enum omp_clause_code kind', 'bool allow_deref' actually
affect the parsing process; the new 'bool readonly' only propagates a
flag.

> @@ -14084,7 +14099,11 @@ c_parser_omp_var_list_parens (c_parser *parser, enum 
> omp_clause_code kind,
>     OpenACC 2.6:
>     no_create ( variable-list )
>     attach ( variable-list )
> -   detach ( variable-list ) */
> +   detach ( variable-list )
> +
> +   OpenACC 2.7:
> +   copyin (readonly : variable-list )
> + */
>
>  static tree
>  c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
> @@ -14135,11 +14154,22 @@ c_parser_oacc_data_clause (c_parser *parser, 
> pragma_omp_clause c_kind,
>      default:
>        gcc_unreachable ();
>      }
> +
> +  /* Turn on readonly modifier parsing for copyin clause.  */
> +  bool readonly = false, *readonly_ptr = NULL;
> +  if (c_kind == PRAGMA_OACC_CLAUSE_COPYIN)
> +    readonly_ptr = &readonly;
> +
>    tree nl, c;
> -  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
> +  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true,
> +                                  readonly_ptr);

That is, similar to 'c_parser_omp_clause_device', or
'c_parser_omp_clause_lastprivate', inline 'c_parser_omp_var_list_parens'
here, and only for 'PRAGMA_OACC_CLAUSE_COPYIN' parse 'readonly :', then
(for all) use 'c_parser_omp_variable_list' etc. instead of
'c_parser_omp_var_list_parens', then set 'readonly':

>    for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
> -    OMP_CLAUSE_SET_MAP_KIND (c, kind);
> +    {
> +      OMP_CLAUSE_SET_MAP_KIND (c, kind);
> +      if (readonly)
> +     OMP_CLAUSE_MAP_READONLY (c) = 1;
> +    }
>
>    return nl;

> @@ -18212,6 +18242,9 @@ c_parser_omp_structured_block (c_parser *parser, bool 
> *if_p)
>  /* OpenACC 2.0:
>     # pragma acc cache (variable-list) new-line
>
> +   OpenACC 2.7:
> +   # pragma acc cache (readonly: variable-list) new-line
> +
>     LOC is the location of the #pragma token.
>  */
>
> @@ -18219,8 +18252,14 @@ static tree
>  c_parser_oacc_cache (location_t loc, c_parser *parser)
>  {
>    tree stmt, clauses;
> +  bool readonly;
> +
> +  clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL,
> +                                       false, &readonly);
> +  if (readonly)
> +    for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +      OMP_CLAUSE__CACHE__READONLY (c) = 1;
>
> -  clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
>    clauses = c_finish_omp_clauses (clauses, C_ORT_ACC);
>
>    c_parser_skip_to_pragma_eol (parser);

Similarly.

> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc

Similarly.

> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1360,7 +1360,11 @@ typedef struct gfc_omp_namelist
>      {
>        gfc_omp_reduction_op reduction_op;
>        gfc_omp_depend_doacross_op depend_doacross_op;
> -      gfc_omp_map_op map_op;
> +      struct
> +        {
> +       ENUM_BITFIELD (gfc_omp_map_op) map_op:8;
> +       bool readonly;
> +        };
>        gfc_expr *align;
>        struct
>       {

I did wonder whether the 'readonly' flag should live in the
'gfc_omp_namelist' (as done here -- similar to 'lastprivate_conditional',
for example), or in 'gfc_omp_clauses' (similar to 'ancestor', for
example).  Then I realized/remembered that 'gfc_omp_clauses' exists only
once per directive (which is sufficient for 'ancestor', for example, as
there may be only one OpenMP 'device' clause), whereas 'gfc_omp_namelist'
exists once per list item -- which is what we need for 'readonly'.  Thus,
the above looks good to me.

> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc
> @@ -1196,7 +1196,7 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : 
> omp_mask (m)
>
>  static bool
>  gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> -                       bool allow_common, bool allow_derived)
> +                       bool allow_common, bool allow_derived, bool readonly 
> = false)
>  {
>    gfc_omp_namelist **head = NULL;
>    if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true,
> @@ -1205,7 +1205,10 @@ gfc_match_omp_map_clause (gfc_omp_namelist **list, 
> gfc_omp_map_op map_op,
>      {
>        gfc_omp_namelist *n;
>        for (n = *head; n; n = n->next)
> -     n->u.map_op = map_op;
> +     {
> +       n->u.map_op = map_op;
> +       n->u.readonly = readonly;
> +     }
>        return true;
>      }

Similar to 'c_parser_omp_var_list_parens' above, the existing
'bool allow_common', 'bool allow_derived' actually affect the parsing
process; the new 'bool readonly' only propagates a flag.  Which I
acknowledge the existing 'gfc_omp_map_op map_op' also only does, but that
one's applicable to a lot more instances than 'readonly'.  So I again
wonder if we should keep the latter out of 'gfc_match_omp_map_clause',
and instead set the flag when parsing the 'copyin' clauses; again, for
example (random), like 'ancestor :', or 'conditional :' are parsed --
which you're mostly already doing:

> @@ -2079,11 +2082,16 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
> omp_mask mask,
>           {
>             if (openacc)
>               {
> -               if (gfc_match ("copyin ( ") == MATCH_YES
> -                   && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -                                                OMP_MAP_TO, true,
> -                                                allow_derived))
> -                 continue;
> +               if (gfc_match ("copyin ( ") == MATCH_YES)
> +                 {
> +                   bool readonly = false;
> +                   if (gfc_match ("readonly : ") == MATCH_YES)
> +                     readonly = true;
> +                   if (gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> +                                                 OMP_MAP_TO, true,
> +                                                 allow_derived, readonly))
> +                     continue;
> +                 }
>               }

..., so you'd just set 'readonly' here, instead of having
'gfc_match_omp_map_clause' do that.  Care has to be taken to only do that
for the current list items, which you'll need 'gfc_omp_namelist *head'
for, or similar.  Hmm.  Effectively inline 'gfc_match_omp_map_clause'
here, or do add the 'bool readonly' argument to the latter, or something
else?

Or, we could add a new 'gcc/fortran/gfortran.h:gfc_omp_map_op' item
'OMP_MAP_TO_READONLY', which eventually translates into 'OMP_MAP_TO' with
'readonly' set?  Then we'd just here call the (unaltered)
'gfc_match_omp_map_clause', with
'readonly ? OMP_MAP_TO_READONLY : OMP_MAP_TO'?  Per
'git grep --cached '[^G]OMP_MAP_TO[^F]' -- gcc/fortran/' not a lot of
places need adjusting for that (most of the 'gcc/fortran/openmp.cc' ones
are not applicable).

Tobias?

> @@ -4008,20 +4016,35 @@ gfc_match_oacc_wait (void)
>  match
>  gfc_match_oacc_cache (void)
>  {
> +  bool readonly = false;
>    gfc_omp_clauses *c = gfc_get_omp_clauses ();
>    /* The OpenACC cache directive explicitly only allows "array elements or
>       subarrays", which we're currently not checking here.  Either check this
>       after the call of gfc_match_omp_variable_list, or add something like a
>       only_sections variant next to its allow_sections parameter.  */
> -  match m = gfc_match_omp_variable_list (" (",
> -                                      &c->lists[OMP_LIST_CACHE], true,
> -                                      NULL, NULL, true);
> +  match m = gfc_match (" ( ");
>    if (m != MATCH_YES)
>      {
>        gfc_free_omp_clauses(c);
>        return m;
>      }
>
> +  if (gfc_match ("readonly :") == MATCH_YES)

I note this one does not have a space after ':' in 'gfc_match', but the
one above in 'gfc_match_omp_clauses' does.  I don't know off-hand if that
makes a difference in parsing -- probably not, as all of
'gcc/fortran/openmp.cc' generally doesn't seem to be very consistent
about these two variants?

> +    readonly = true;
> +
> +  gfc_omp_namelist **head = NULL;
> +  m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_CACHE], true,
> +                                NULL, &head, true);
> +  if (m != MATCH_YES)
> +    {
> +      gfc_free_omp_clauses(c);
> +      return m;
> +    }
> +
> +  if (readonly)
> +    for (gfc_omp_namelist *n = *head; n; n = n->next)
> +      n->u.readonly = true;

This already looks like how I thought it should look like.

> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -3067,6 +3067,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
> gfc_omp_clauses *clauses,
>                     || (n->expr && gfc_expr_attr (n->expr).pointer)))
>               always_modifier = true;
>
> +           if (n->u.readonly)
> +             OMP_CLAUSE_MAP_READONLY (node) = 1;
> +
>             switch (n->u.map_op)
>               {
>               case OMP_MAP_ALLOC:
> @@ -3920,6 +3923,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
> gfc_omp_clauses *clauses,
>               }
>             if (n->u.present_modifier)
>               OMP_CLAUSE_MOTION_PRESENT (node) = 1;
> +           if (list == OMP_LIST_CACHE && n->u.readonly)
> +             OMP_CLAUSE__CACHE__READONLY (node) = 1;
>             omp_clauses = gfc_trans_add_clause (node, omp_clauses);
>           }
>         break;

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fdump-tree-original" } */
> +
> +struct S
> +{
> +  int *ptr;
> +  float f;
> +};
> +
> +
> +int main (void)
> +{
> +  int x[32];
> +  struct S s = {x, 0};
> +
> +  #pragma acc parallel copyin(readonly: x[:32], s.ptr[:16])
> +  {
> +    #pragma acc cache (readonly: x[:32])
> +  }
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> map\\(readonly,to:\\*s.ptr \\\[len: 64\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] 
> \\\[len: 128\\\]\\)" 1 "original" { target { c } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: 64\\\]\\) .+ 
> map\\(readonly,to:x\\\[0\\\] \\\[len: 128\\\]\\)" 1 "original" { target { c++ 
> } } } } */
> +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache 
> \\(readonly:x\\\[0\\\] \\\[len: 128\\\]\\);$" 1 "original" } } */

Are 'len: 64' etc. also correct for targets where 'sizeof (int) != 4'?
Maybe just mask these out; they're not the important thing we're testing
here?

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> @@ -0,0 +1,28 @@
> +! { dg-additional-options "-fdump-tree-original" }
> +
> +subroutine foo (a, n)
> +  integer :: n, a(:)
> +  integer :: i, b(n)
> +  !$acc parallel copyin(readonly: a(:), b(:n))
> +  do i = 1,32
> +     !$acc cache (readonly: a(:), b(:n))
> +  enddo
> +  !$acc end parallel
> +end subroutine foo
> +
> +program main
> +  integer :: i, n = 32, a(32)
> +  integer :: b(32)
> +  !$acc parallel copyin(readonly: a(:32), b(:n))
> +  do i = 1,32
> +     !$acc cache (readonly: a(:), b(:n))
> +  enddo
> +  !$acc end parallel
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data 
> \\\[len: .+\\\]\\) .+ map\\(readonly,to:\\*\\(integer\\(kind=4\\)\\\[0:\\\] 
> \\*\\) parm.*data \\\[len: .+\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> map\\(readonly,to:a\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - 
> \\(integer\\(kind=8\\)\\) &a\\) / 4\\\] \\\[len: .+\\\]\\) .+ 
> map\\(readonly,to:b\\\[\\(\\(integer\\(kind=8\\)\\) parm.*data - 
> \\(integer\\(kind=8\\)\\) &b\\) / 4\\\] \\\[len: .+\\\]\\)" 1 "original" } }
> +! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache 
> \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: 
> .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data 
> \\\[len: .+\\\]\\);" 2 "original" } }

You're scanning only one of the two 'cache' directives?  If that's
intentional, please add a comment, why.  If not, add the missing
scanning.

Given the peculiarities of the Fortran parsing, where first all
directive's clauses are collected and then translated en bloc, I
suggest to extent the 'copyin' test cases to have several 'copyin'
clauses, some with, some without 'readonly' modifier, so we make sure
that 'readonly' is set only for the appropriate ones.

Generally, in addition to just 'parallel' compute construct, please
spread this out a bit, to also cover 'kernels', 'serial' compute
constructs, and the 'data' construct.

Generally, please also add testing for the 'declare' directive with
'copyin' with 'readonly' modifier -- and implement handling in case
that's not implicitly covered?  (..., but please don't let you be dragged
into a number of pre-existing issues with OpenACC 'declare' -- I hope the
'readonly' handling is straightforward to test for.)

Given that per the implementation in the front ends, the handling of
'readonly' obviously -- famous last words?  ;-) -- is specific to
'copyin', it's probably OK to not have test cases to verify that the
'readonly' modifier is rejected for other data clauses?

> --- a/gcc/tree-pretty-print.cc
> +++ b/gcc/tree-pretty-print.cc
> @@ -905,6 +905,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
> spc, dump_flags_t flags)
>
>      case OMP_CLAUSE_MAP:
>        pp_string (pp, "map(");
> +      if (OMP_CLAUSE_MAP_READONLY (clause))
> +     pp_string (pp, "readonly,");
>        switch (OMP_CLAUSE_MAP_KIND (clause))
>       {
>       case GOMP_MAP_ALLOC:
> @@ -1075,6 +1077,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
> spc, dump_flags_t flags)
>
>      case OMP_CLAUSE__CACHE_:
>        pp_string (pp, "(");
> +      if (OMP_CLAUSE__CACHE__READONLY (clause))
> +     pp_string (pp, "readonly:");
>        dump_generic_node (pp, OMP_CLAUSE_DECL (clause),
>                        spc, flags, false);
>        goto print_clause_size;

> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1813,6 +1813,14 @@ class auto_suppress_location_wrappers
>  #define OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE(NODE) \
>    (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)->base.addressable_flag)
>
> +/* Nonzero if OpenACC 'readonly' modifier set, used for 'copyin'.  */
> +#define OMP_CLAUSE_MAP_READONLY(NODE) \
> +  TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> +
> +/* Same as above, for use in OpenACC cache directives.  */
> +#define OMP_CLAUSE__CACHE__READONLY(NODE) \
> +  TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))

I'm not sure if these special accessor functions are actually useful, or
we should just directly use 'TREE_READONLY' instead?  We're only using
them in contexts where it's clear that the 'OMP_CLAUSE_SUBCODE_CHECK' is
satisfied, for example.

Also, for the new use for OMP clauses, update 'gcc/tree.h:TREE_READONLY',
and in 'gcc/tree-core.h' for 'readonly_flag' the
"table lists the uses of each of the above flags".


Setting 'TREE_READONLY' of the 'OMP_CLAUSE_DECL' instead of the clause
itself isn't the right thing to do -- or is it, and might already
indicate to the middle end the desired semantics?  But does it maybe
conflict with front end/language-level use of 'TREE_READONLY' for 'const'
etc. (I suppose), and thus diagnostics for mismatches?  I mean:

    int a;
    #pragma acc parallel copyin(readonly: a)
    {
      int *b = &a;

... should still continue to work (valid as long as '*b' isn't written
to), so should not raise any
"warning: initialization discards ‘const’ qualifier from pointer target type"
diagnostics.  But if that's not a problem (I don't know how
'TREE_READONLY' is used elsewhere), maybe that's something to give a
thought to?

Or, early in the middle end, propagate 'TREE_READONLY' from the clause to
its 'OMP_CLAUSE_DECL'?  Might need to 'unshare_expr' the latter for
modification and use in the associated region only?

Just some quick thoughts, obviously without any detailed analysis.  ;-)


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Reply via email to