Hi Tobias!

On 2019-10-23T22:34:42+0200, Tobias Burnus <tob...@codesourcery.com> wrote:
> Updated version attached. Changes:
> * Use "true" instead of "openacc" for the OpenACC-only "copy()" clause 
> (as not shared w/ OpenMP)
> * Add some documentation to gimplify.c
> * Use GOVD_FIRSTPRIVATE also for "kernel"

Thanks!

> The patch survived bootstrapping + regtesting on my laptop (no 
> offloading) and on a build server (with nvptx offloading).

OK for trunk, with the following few small items considered.  To record
the review effort, please include "Reviewed-by: Thomas Schwinge
<tho...@codesourcery.com>" in the commit log, see
<https://gcc.gnu.org/wiki/Reviewed-by>.


> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> I'll be quick to note that I don't have any first-hand experience with 
>> Fortran common blocks. :-P 
>
> To quote you from below: "So, please do study that closer. ;-P"

Haha!  ;-P (You don't want to know how long is my list of items that I
might/want/could look into...)

> I also do not have first-hand experience (as I started with Fortran 95 + 
> some of F2003), but common blocks were a nice idea of the early 1960 to 
> provide access to global memory, avoiding to pass all data as arguments 
> (which also has stack issues). They have been replaced by derived types 
> and variables declared at module level since Fortran 90. See 
> https://j3-fortran.org/doc/year/18/18-007r1.pdf or 
> https://web.stanford.edu/class/me200c/tutorial_77/13_common.html

..., and didn't "They have been replaced by [...]" go as far as that
they've actually been deprecated in recent Fortran standard revisions?
(I may be misremembering.)


Anyway:

> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>>> For OpenACC, gfortran already supports common blocks for 
>>> device_resident/usedevice/cache/flush/link.

>> I'll defer to your judgement there, but just one comment: I noticed 
>> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses 
>> except 'deviceptr' and 'present', the list argument may include a 
>> Fortran_common block_ name enclosed within slashes, if that _common 
>> block_ name also appears in a 'declare' directive 'link' clause".
>>
>> Are we already properly handling the aspect that requires that the 
>> "that _common block_ name also appears in a 'declare' directive 'link' 
>> clause"? 
>
> I don't know neither the OpenACC spec nor the GCC implementation well 
> enough to claim proper (!) handling. However, as stated above: 
> device_resident/usedevice/cache/flush/link do support common block 
> arguments.

(... in the front end at least.)

> Looking at the testsuite, link and device_resident are tested in 
> gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use 
> come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been 
> added.

(..., again, that'S all front end testing, so not sufficient to claim it
actually works for executing user code.)  ;-\

>> The libgomp execution test cases you're adding all state that "This test 
>> does not exercise ACC DECLARE", yet they supposedly already do work fine. Or 
>> am I understading the OpenACC specification wrongly here?
>
> You need to ask Cesar, who wrote the test case and that comment, why he 
> added it.

Well, Cesar is not working on GCC anymore, thus you've been asked to
adopt his patch, and fix it up, change it as necessary.

> The patch does not touch 'link'/'device_resident' clauses of 'declare', 
> hence, I think he didn't see a reason to add a run-time test case for 
> it.

(Or such testing didn't work, but there was no time/interest at that
point to make it work.)

> – That's independent from whether it is supported by the OpenACC 
> spec and whether it is "properly" implemented in GCC/gfortran.
>
>> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling 
>> so I guess my question here may be whether these test cases are valid after 
>> all?
>
> Well, you are the OpenACC specialist – both spec wise and 
> GCC-implementation wise.

Sure, I do know some things, but I'm certainly not all-knowing -- that's
why I needed you to look into this in more detail.

> However, as the test cases are currently 
> parsing-only test cases, I think they should be fine.

OK, and everything else we're thus delaying for later.  That's OK -- what
we got here now is certainly an improvement on its own.  I just wanted to
make sure that we're not missing something obvious.


>>> gcc/gimplify.c: oacc_default_clause contains some changes; there are 
>>> additionally two lines which only differ for ORT_ACC – Hence, it is an 
>>> OpenACC-only change!
>>> The ME change is about privatizing common blocks (I haven't studied this 
>>> part closer.)
>> So, please do study that closer.  ;-P
>>
>> In<http://mid.mail-archive.com/87efx6haep.fsf@euler.schwinge.homeip.net>
>> I raised some questions, got a bit of an answer, and in
>> <http://mid.mail-archive.com/87bms85kra.fsf@hertz.schwinge.homeip.net>
>> asked further, didn't get an answer.

By the way, in the mean time I also found the original GCC trunk
submission email:
<http://mid.mail-archive.com/a028d039-7e9e-792a-7424-ccab1bb425f4@codesourcery.com>.
(Mentioning that just in case that carries any additional information for
you.)


>> All the rationale from Cesar's original submission email should be
>> transferred into 'gcc/gimplify.c' as much as feasible, to make that
>> "voodoo code" better understandable.
>
> I have now added some comments to the patch.

Thanks.


> I also changed GOVD_MAP to 
> GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think 
> that makes sense in terms of what Cesar has written – but I am not 
> completely sure about this.

OK.  Given that this "abstractly" seems to make sense to both of us,
let's do it that way.

Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
faild (at run time, say, with aforementioned duplicate mapping errors, or
would contain "strange"/duplicate/conflicting mapping items in the
'-fdump-tree-gimple' dump)?


>>> Due to the wonders of GIT – when not requiring linear history and due to 
>>> rebasing with GCC9, it is also part of the OG9 commit 
>>> ac6c90812344f4f4cfe4d2f5901c1a9d038a4000
>> There's no Git magic involved there: somebody just (manually) merged
>> several these patches together into one, for no good reason.  ;-\
>
> Well, there is more. If you do not enforce linear history, you cannot 
> easily say to git: Give me all changes between this commit and that 
> commit – as they pass by in a sneak path. And by default, GIT merges 
> such that the private version is the "main" branch – and one merges the 
> other branch ("upstream") into the own branch. This can quickly become 
> quite confusing.

I'm not sure I'm following your argument there.

> In particular, it is not easy to see when/why some code disappeared. You 
> have some patch – someone else had merge problems, accidentally removed 
> it and then if you diff or do "log -p", it looks as if the code was 
> never there, unless you explicitly dig into the branch whose commits 
> were merged into the "main" branch.

There are "Diff Formatting" options like '-c', '--cc', '-m', '-t' which
may help.

Anyway, that's certainly not related to Fortran common block support.


> PS: I am a great fan of patch submissions by the authors – it avoids 
> later digging and guess work for reasons why someone else wrote 
> something in a particular way.

Absolutely agreed!

On the other hand, what you've now done, re-engineering the original
rationale etc., makes my review much easier, because that then gives
greater confidence in the changes, I can then trust that you didn't just
copy the original patch, but instead spent your own time thinking it
through.


> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7219,15 +7219,28 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, 
> tree decl, unsigned flags)
>  {
>    const char *rkind;
>    bool on_device = false;
> +  bool is_private = false;
>    bool declared = is_oacc_declared (decl);
>    tree type = TREE_TYPE (decl);
>  
>    if (lang_hooks.decls.omp_privatize_by_reference (decl))
>      type = TREE_TYPE (type);
>  
> +  /* For Fortran COMMON blocks, only used variables in those blocks are
> +     transfered and remapped.  The block itself will have a private clause to
> +     avoid transfering the data twice.
> +     The hook evaluates to false by default.  For a variable in Fortran's 
> COMMON
> +     or EQUIVALENCE block, returns 'true' (as we have shared=false) - as only
> +     the variables in such a COMMON/EQUIVALENCE block shall be privatized not
> +     the whole block.  For C++ and Fortran, it can also be true under certain
> +     other conditions, if DECL_HAS_VALUE_EXPR.  */
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>    if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>        && is_global_var (decl)
> -      && device_resident_p (decl))
> +      && device_resident_p (decl)
> +      && !is_private)
>      {
>        on_device = true;
>        flags |= GOVD_MAP_TO_ONLY;
> @@ -7238,7 +7251,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>      case ORT_ACC_KERNELS:
>        rkind = "kernels";
>  
> -      if (AGGREGATE_TYPE_P (type))
> +      if (is_private)
> +     flags |= GOVD_FIRSTPRIVATE;
> +      else if (AGGREGATE_TYPE_P (type))
>       {
>         /* Aggregates default to 'present_or_copy', or 'present'.  */
>         if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7255,7 +7270,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>      case ORT_ACC_PARALLEL:
>        rkind = "parallel";
>  
> -      if (on_device || declared)
> +      if (is_private)
> +     flags |= GOVD_FIRSTPRIVATE;
> +      else if (on_device || declared)
>       flags |= GOVD_MAP;
>        else if (AGGREGATE_TYPE_P (type))
>       {
> @@ -7321,7 +7338,11 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, 
> tree decl, bool in_code)
|        if (DECL_HAS_VALUE_EXPR_P (decl))
>       {
>         tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -       if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +       /* For OpenACC, defer expansion of value to avoid transfering
> +          privatized common block data instead of im-/explicitly transfered
> +          variables which are in common blocks.  */
> +       if (!(ctx->region_type & ORT_ACC)
> +           && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>           return omp_notice_threadprivate_variable (ctx, decl, value);
>       }

Wouldn't it be clearer if that latter one were written as follows:

    if (DECL_HAS_VALUE_EXPR_P (decl))
      {
        if (ctx->region_type & ORT_ACC)
          /* For OpenACC, defer expansion of value to avoid transfering
             privatized common block data instead of im-/explicitly transfered
             variables which are in common blocks.  */
          ;
        else
          {
            tree value = get_base_address (DECL_VALUE_EXPR (decl));
    
            if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
              return omp_notice_threadprivate_variable (ctx, decl, value);
          }
      }

> @@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
> decl, bool in_code)
>    n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>    if ((ctx->region_type & ORT_TARGET) != 0)
>      {
> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
> +      /* For OpenACC, as remarked above, defer expansion.  */
> +      shared = !(ctx->region_type & ORT_ACC);
> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

Also more explicit, easier to read:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, as remarked above, defer expansion.  */
      shared = false;
    else
      shared = true;

> @@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
> decl, bool in_code)
>      }
>  
>    shared = ((flags | n->value) & GOVD_SHARED) != 0;
> +  /* For OpenACC, cf. remark above regaring common blocks.  */
> +  if (ctx->region_type & ORT_ACC)
> +    shared = false;
>    ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

And again:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, cf. remark above regaring common blocks.  */
      shared = false;
    else
      shared = ((flags | n->value) & GOVD_SHARED) != 0;

(In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
point out the special case.)

It's still some kind of voodoo to me -- but at least, you've now also
reviewed this, and it's now documented what's going on.


> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90

> @@ -0,0 +1,69 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, validates early matching errors.
> +
> +subroutine subtest
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +end subroutine subtest
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in 
> OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v)

I note there is one single 'exit data' test, but no 'enter data'.

Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.

> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error 
> in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error 
> in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +end program test

Is there a reason for the duplicated 'deviceptr' testing?

Move 'data deviceptr' up a little bit, next to the other 'data' construct
testing?

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Similarly.


Grüße
 Thomas

Attachment: signature.asc
Description: PGP signature

Reply via email to