[Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-10-15 Thread Tobias Burnus

This OpenACC-only patch extends the support for /common/ blocks.

[In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported in 
copyin/copyprivate, in firstprivate/lastprivate/private/shared, in 
threadprivate and in declare target.]

For OpenACC, gfortran already supports common blocks for 
device_resident/usedevice/cache/flush/link.

This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
present_or_copy_out, present_or_create and self.
[Of those, only "copy()" is also an OpenMP clause name.]
[Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new since 
OpenACC 2.0.]



I think the Fortran part is obvious, once one agrees on the list of clauses; 
and OK from a Fortran's maintainer view.

gcc/gimplify.c: oacc_default_clause contains some changes; there are 
additionally two lines which only differ for ORT_ACC – Hence, it is an 
OpenACC-only change!
The ME change is about privatizing common blocks (I haven't studied this part 
closer.)


@Thomas: Please review
@Jakub, all: comments and approvals are welcome.

Tobias

PS: This patch is the rediffed OG9 (alias OG8) patch 
0793cef408c9937f4c4e2423dd1f7d6a97b9bed3 by Cesar Philippidis from 2016. (Which 
was on gomp-4_0-branch as r240165). Due to the wonders of GIT – when not 
requiring linear history and due to rebasing with GCC9, it is also part of the 
OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000 – which in addition also 
does some other things like handling OpenACC device pointers.

2019-10-15  Cesar Philippidis 
	Tobias Burnus  

	gcc/fortran/
	* openmp.c (gfc_match_omp_map_clause): Add and pass allow_commons
	argument.
	(gfc_match_omp_clauses): Update calls to permit common blocks for
	OpenACC's copy/copyin/copyout, create/delete, host,
	pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
	present_or_copy_out, present_or_create and self.

	gcc/
	* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
	(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
	common block decls.

	gcc/testsuite/
	* gfortran.dg/goacc/common-block-1.f90: New test.
	* gfortran.dg/goacc/common-block-2.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-fortran/common-block-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-3.f90: New test.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 5c91fcdfd31..dbcb647ea6a 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -926,10 +926,11 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool allow_common)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true)
   == MATCH_YES)
 {
   gfc_omp_namelist *n;
@@ -1051,7 +1052,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	  && gfc_match ("copy ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_TOFROM))
+	   OMP_MAP_TOFROM, openacc))
 	continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	{
@@ -1059,7 +1060,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-		   OMP_MAP_TO))
+		   OMP_MAP_TO, true))
 		continue;
 		}
 	  else if (gfc_match_omp_variable_list ("copyin (",
@@ -1070,7 +1071,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	  && gfc_match ("copyout ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FROM))
+	   OMP_MAP_FROM, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	  && gfc_match_omp_variable_list ("copyprivate (",
@@ -1080,7 +1081,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	  && gfc_match ("create ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_ALLOC))
+	   OMP_MAP_ALLOC, true))
 	continue;
 	  break;
 	case 'd':
@@ -1116,7 +1117,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_DELETE)
 	  && gfc_match ("delete ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_RELEASE))
+	   OMP_MAP_RELEASE, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_DEPEND)
 	  && gfc_match ("depend ( ") == MATCH_YES)
@@ -1168,12 +1169,12 @@ gfc_match_omp_cl

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-10-18 Thread Thomas Schwinge
Hi!

On 2019-10-15T23:32:32+0200, Tobias Burnus  wrote:
> This OpenACC-only patch extends the support for /common/ blocks.

I'll be quick to note that I don't have any first-hand experience with
Fortran common blocks.  :-P

> [In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported 
> in copyin/copyprivate, in firstprivate/lastprivate/private/shared, in 
> threadprivate and in declare target.]
>
> For OpenACC, gfortran already supports common blocks for 
> device_resident/usedevice/cache/flush/link.
>
> This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
> host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
> present_or_copy_out, present_or_create and self.
> [Of those, only "copy()" is also an OpenMP clause name.]

I'm confused: in
 Jakub stated
that "OpenMP doesn't have a copy clause, so I'd expect true here":

| @@ -1051,7 +1052,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
| if ((mask & OMP_CLAUSE_COPY)
| && gfc_match ("copy ( ") == MATCH_YES
| && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
| -OMP_MAP_TOFROM))
| +OMP_MAP_TOFROM, openacc))
|   continue;

> [Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new 
> since OpenACC 2.0.]
>
>
>
> I think the Fortran part is obvious, once one agrees on the list of clauses; 
> and OK from a Fortran's maintainer view.

I'll defer to your judgement there, but just one comment: I noticed that
OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses except
'deviceptr' and 'present', the list argument may include a Fortran
_common block_ name enclosed within slashes, if that _common block_ name
also appears in a 'declare' directive 'link' clause".

Are we already properly handling the aspect that requires that the "that
_common block_ name also appears in a 'declare' directive 'link' clause"?

The libgomp execution test cases you're adding all state that "This test
does not exercise ACC DECLARE", yet they supposedly already do work fine.
Or am I understading the OpenACC specification wrongly here?

I'm certainly aware of (big) deficiencies in the OpenACC 'declare'
handling, so I guess my question here may be whether these test cases are
valid after all?

> gcc/gimplify.c: oacc_default_clause contains some changes; there are 
> additionally two lines which only differ for ORT_ACC – Hence, it is an 
> OpenACC-only change!
> The ME change is about privatizing common blocks (I haven't studied this part 
> closer.)

So, please do study that closer.  ;-P

In 
I raised some questions, got a bit of an answer, and in

asked further, didn't get an answer.

All the rationale from Cesar's original submission email should be
transferred into 'gcc/gimplify.c' as much as feasible, to make that
"voodoo code" better understandable.

> @Jakub, all: comments and approvals are welcome.

Indeed.  :-)

>   gcc/
>   * gimplify.c (oacc_default_clause): Privatize fortran common blocks.
>   (omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
>   common block decls.

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7218,15 +7218,20 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, 
> tree decl, unsigned flags)
>  {
>const char *rkind;
>bool on_device = false;
> +  bool is_private = false;
>bool declared = is_oacc_declared (decl);
>tree type = TREE_TYPE (decl);
>  
>if (lang_hooks.decls.omp_privatize_by_reference (decl))
>  type = TREE_TYPE (type);
>  
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>&& is_global_var (decl)
> -  && device_resident_p (decl))
> +  && device_resident_p (decl)
> +  && !is_private)
>  {
>on_device = true;
>flags |= GOVD_MAP_TO_ONLY;
> @@ -7237,7 +7242,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>  case ORT_ACC_KERNELS:
>rkind = "kernels";
>  
> -  if (AGGREGATE_TYPE_P (type))
> +  if (is_private)
> + flags |= GOVD_MAP;
> +  else if (AGGREGATE_TYPE_P (type))
>   {
> /* Aggregates default to 'present_or_copy', or 'present'.  */
> if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7254,7 +7261,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>  case ORT_ACC_PARALLEL:
>rkind = "parallel";
>  
> -  if (on_device || declared)
> +  if (is_private)
> + flags |= GOVD_FIRSTPRIVATE;
> +  else if (on_device || declared)
>   flags |= GOVD_

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-10-23 Thread Tobias Burnus

Hi Thomas,

Updated version attached. Changes:
* Use "true" instead of "openacc" for the OpenACC-only "copy()" clause 
(as not shared w/ OpenMP)

* Add some documentation to gimplify.c
* Use GOVD_FIRSTPRIVATE also for "kernel"

The patch survived bootstrapping + regtesting on my laptop (no 
offloading) and on a build server (with nvptx offloading).


On 10/18/19 3:26 PM, Thomas Schwinge wrote:
I'll be quick to note that I don't have any first-hand experience with 
Fortran common blocks. :-P 


To quote you from below: "So, please do study that closer. ;-P"

I also do not have first-hand experience (as I started with Fortran 95 + 
some of F2003), but common blocks were a nice idea of the early 1960 to 
provide access to global memory, avoiding to pass all data as arguments 
(which also has stack issues). They have been replaced by derived types 
and variables declared at module level since Fortran 90. See 
https://j3-fortran.org/doc/year/18/18-007r1.pdf or 
https://web.stanford.edu/class/me200c/tutorial_77/13_common.html



On 10/18/19 3:26 PM, Thomas Schwinge wrote:

For OpenACC, gfortran already supports common blocks for 
device_resident/usedevice/cache/flush/link.
[…] [Of those, only "copy()" is also an OpenMP clause name.]

I'm confused: in […] "OpenMP doesn't have a copy clause, so I'd expect true 
here":


I concur – only "copyin" and "copyprivate" exist in OpenMP. (But thanks 
to "if (openacc)" no "openacc" is needed, either.)



I'll defer to your judgement there, but just one comment: I noticed 
that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses 
except 'deviceptr' and 'present', the list argument may include a 
Fortran_common block_ name enclosed within slashes, if that _common 
block_ name also appears in a 'declare' directive 'link' clause".


Are we already properly handling the aspect that requires that the 
"that _common block_ name also appears in a 'declare' directive 'link' 
clause"? 


I don't know neither the OpenACC spec nor the GCC implementation well 
enough to claim proper (!) handling. However, as stated above: 
device_resident/usedevice/cache/flush/link do support common block 
arguments.


Looking at the testsuite, link and device_resident are tested in 
gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use 
come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been 
added.




The libgomp execution test cases you're adding all state that "This test does not 
exercise ACC DECLARE", yet they supposedly already do work fine. Or am I 
understading the OpenACC specification wrongly here?


You need to ask Cesar, who wrote the test case and that comment, why he 
added it.


The patch does not touch 'link'/'device_resident' clauses of 'declare', 
hence, I think he didn't see a reason to add a run-time test case for 
it. – That's independent from whether it is supported by the OpenACC 
spec and whether it is "properly" implemented in GCC/gfortran.



I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling so 
I guess my question here may be whether these test cases are valid after all?


Well, you are the OpenACC specialist – both spec wise and 
GCC-implementation wise. However, as the test cases are currently 
parsing-only test cases, I think they should be fine.




gcc/gimplify.c: oacc_default_clause contains some changes; there are 
additionally two lines which only differ for ORT_ACC – Hence, it is an 
OpenACC-only change!
The ME change is about privatizing common blocks (I haven't studied this part 
closer.)

So, please do study that closer.  ;-P

In
I raised some questions, got a bit of an answer, and in

asked further, didn't get an answer.

All the rationale from Cesar's original submission email should be
transferred into 'gcc/gimplify.c' as much as feasible, to make that
"voodoo code" better understandable.



I have now added some comments to the patch. I also changed GOVD_MAP to 
GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think 
that makes sense in terms of what Cesar has written – but I am not 
completely sure about this.


Cross ref: The original email is 
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00950.html ; the review 
starts here https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00250.html 
(same email as mid.mail-archive.com link above).


BTW: That patch – rediffed for OG9 and augmented by several other 
patches (including deviceptr) – was then submitted at 
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html and first 
reviewed at https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00176.html and 
then committed to OG9 at 
https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00051.html




Due to the wonders of GIT – when not requiring linear history and due to 
rebasing with GCC9, it is also part of the OG9 commit 
ac6c90812344f4f4cfe4d2f5901c1a9d

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-10-25 Thread Thomas Schwinge
Hi Tobias!

On 2019-10-23T22:34:42+0200, Tobias Burnus  wrote:
> Updated version attached. Changes:
> * Use "true" instead of "openacc" for the OpenACC-only "copy()" clause 
> (as not shared w/ OpenMP)
> * Add some documentation to gimplify.c
> * Use GOVD_FIRSTPRIVATE also for "kernel"

Thanks!

> The patch survived bootstrapping + regtesting on my laptop (no 
> offloading) and on a build server (with nvptx offloading).

OK for trunk, with the following few small items considered.  To record
the review effort, please include "Reviewed-by: Thomas Schwinge
" in the commit log, see
.


> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> I'll be quick to note that I don't have any first-hand experience with 
>> Fortran common blocks. :-P 
>
> To quote you from below: "So, please do study that closer. ;-P"

Haha!  ;-P (You don't want to know how long is my list of items that I
might/want/could look into...)

> I also do not have first-hand experience (as I started with Fortran 95 + 
> some of F2003), but common blocks were a nice idea of the early 1960 to 
> provide access to global memory, avoiding to pass all data as arguments 
> (which also has stack issues). They have been replaced by derived types 
> and variables declared at module level since Fortran 90. See 
> https://j3-fortran.org/doc/year/18/18-007r1.pdf or 
> https://web.stanford.edu/class/me200c/tutorial_77/13_common.html

..., and didn't "They have been replaced by [...]" go as far as that
they've actually been deprecated in recent Fortran standard revisions?
(I may be misremembering.)


Anyway:

> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>>> For OpenACC, gfortran already supports common blocks for 
>>> device_resident/usedevice/cache/flush/link.

>> I'll defer to your judgement there, but just one comment: I noticed 
>> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses 
>> except 'deviceptr' and 'present', the list argument may include a 
>> Fortran_common block_ name enclosed within slashes, if that _common 
>> block_ name also appears in a 'declare' directive 'link' clause".
>>
>> Are we already properly handling the aspect that requires that the 
>> "that _common block_ name also appears in a 'declare' directive 'link' 
>> clause"? 
>
> I don't know neither the OpenACC spec nor the GCC implementation well 
> enough to claim proper (!) handling. However, as stated above: 
> device_resident/usedevice/cache/flush/link do support common block 
> arguments.

(... in the front end at least.)

> Looking at the testsuite, link and device_resident are tested in 
> gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use 
> come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been 
> added.

(..., again, that'S all front end testing, so not sufficient to claim it
actually works for executing user code.)  ;-\

>> The libgomp execution test cases you're adding all state that "This test 
>> does not exercise ACC DECLARE", yet they supposedly already do work fine. Or 
>> am I understading the OpenACC specification wrongly here?
>
> You need to ask Cesar, who wrote the test case and that comment, why he 
> added it.

Well, Cesar is not working on GCC anymore, thus you've been asked to
adopt his patch, and fix it up, change it as necessary.

> The patch does not touch 'link'/'device_resident' clauses of 'declare', 
> hence, I think he didn't see a reason to add a run-time test case for 
> it.

(Or such testing didn't work, but there was no time/interest at that
point to make it work.)

> – That's independent from whether it is supported by the OpenACC 
> spec and whether it is "properly" implemented in GCC/gfortran.
>
>> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling 
>> so I guess my question here may be whether these test cases are valid after 
>> all?
>
> Well, you are the OpenACC specialist – both spec wise and 
> GCC-implementation wise.

Sure, I do know some things, but I'm certainly not all-knowing -- that's
why I needed you to look into this in more detail.

> However, as the test cases are currently 
> parsing-only test cases, I think they should be fine.

OK, and everything else we're thus delaying for later.  That's OK -- what
we got here now is certainly an improvement on its own.  I just wanted to
make sure that we're not missing something obvious.


>>> gcc/gimplify.c: oacc_default_clause contains some changes; there are 
>>> additionally two lines which only differ for ORT_ACC – Hence, it is an 
>>> OpenACC-only change!
>>> The ME change is about privatizing common blocks (I haven't studied this 
>>> part closer.)
>> So, please do study that closer.  ;-P
>>
>> In
>> I raised some questions, got a bit of an answer, and in
>> 
>> asked further, didn't get an answer.

By the way, in the mean time I

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-10-25 Thread Tobias Burnus

Hi Thomas,

On 10/25/19 10:43 AM, Thomas Schwinge wrote:

OK for trunk, with the following few small items considered.


Committed as Rev. 277451 – after a fresh bootstrap and regtesting.

Changes:
* I have now a new test case 
libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at 
omplower.
* In the compile-time *{2,3} test case, there is now also a 'enter data' 
and 'update host/self/device' test.

* the libgomp tests have a 'dg-do run'.
* I modified the code in gimplify.c as proposed.


Regarding the new test case: Without the gcc/gimplify.c changes, one has 
(see last item before child fn):


    #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len: 
812]) [child fn …
    #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) 
map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4]) 
map(tofrom:block [len: 812])  [child fn …


With the changes of gcc/gimplify.c, one has:

    #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
    #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c 
[len: 4])  [child fn …



And without gimplify.c, the added run-tests indeed fail with:
libgomp: Trying to map into device [0x407100..0x407294) object when 
[0x407100..0x407290) is already mapped



Tobias

PS:

Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
faild (at run time, say, with aforementioned duplicate mapping errors, or
would contain "strange"/duplicate/conflicting mapping items in the
'-fdump-tree-gimple' dump)?


See new test case and result for the current tests.

Additionally, I have applied:


Wouldn't it be clearer if that latter one were written as follows:
 if (DECL_HAS_VALUE_EXPR_P (decl))
   {
 if (ctx->region_type & ORT_ACC)
   /* For OpenACC, defer expansion of value to avoid transfering
  privatized common block data instead of im-/explicitly transfered
  variables which are in common blocks.  */
   ;
 else
   {
 tree value = get_base_address (DECL_VALUE_EXPR (decl));
 
 if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))

   return omp_notice_threadprivate_variable (ctx, decl, value);
   }
   }


@@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
if ((ctx->region_type & ORT_TARGET) != 0)
  {
-  ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
+  /* For OpenACC, as remarked above, defer expansion.  */
+  shared = !(ctx->region_type & ORT_ACC);
+  ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

Also more explicit, easier to read:

 if (ctx->region_type & ORT_ACC)
   /* For OpenACC, as remarked above, defer expansion.  */
   shared = false;
 else
   shared = true;


@@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
  }
  
shared = ((flags | n->value) & GOVD_SHARED) != 0;

+  /* For OpenACC, cf. remark above regaring common blocks.  */
+  if (ctx->region_type & ORT_ACC)
+shared = false;
ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

And again:

 if (ctx->region_type & ORT_ACC)
   /* For OpenACC, cf. remark above regaring common blocks.  */
   shared = false;
 else
   shared = ((flags | n->value) & GOVD_SHARED) != 0;

(In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
point out the special case.)

It's still some kind of voodoo to me -- but at least, you've now also
reviewed this, and it's now documented what's going on.



And changed the test case based on:


+  !$acc exit data delete(/blockA/, /blockB/, e, v)

I note there is one single 'exit data' test, but no 'enter data'.

Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.


+  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in 
OpenMP variable list" }
+  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
+
+  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in 
OpenMP variable list" }
+  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
+end program test

Is there a reason for the duplicated 'deviceptr' testing?

Move 'data deviceptr' up a little bit, next to the other 'data' construct
testing?


--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Similarly.
commit 96d1e6235a5b7c81df7940c1c8727f87dc1b577a
Author: burnus 
Date:   Fri Oct 25 14:28:40 2019 +

[Fortran] OpenACC – permit common blocks in some clauses

2019-10-25  Cesar Philippidi

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-25 Thread Tobias Burnus

Hi Thomas,

sorry for the belated reply.

Some comments – and a patch modifying two test cases (see below).
Regarding the patch: OK for the trunk?

On 11/11/19 10:39 AM, Thomas Schwinge wrote:

By the way, do you know what's the status is for Fortran common blocks in
OpenMP: supported vs. expected per the specification?


No; however, I had a quick glance at the spec and at the test cases; 
both compile-time and run-time test have some coverage, although I 
didn't spot a run-time test for one 'omp target'.


Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 
'common /name/ x" (continues) define the common block named "name" by 
adding 'x' to it. While "common // y" or "common y" appends 'y' to the 
blank common.


In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as 
general rule in the definition of "list item" (which are inherited by 
"extended list item" and "locator-list item"). [There are also some 
constraints and notes regarding common blocks)]. It does not really tell 
whether blank commons are permitted or not; some description is 
explicitly for named-common variables, leaving blank-common ones out 
(and undefined). But later sections explicitly make reference to blank 
commons, hence, one can assume they are permitted unless explicitly 
stated that they are not.


And then very selectively for some items:
* allocate – only with default allocator.
* declare target – some restrictions and no blank commons
* depend clause – no common permitted
* threadprivate – some notes and explanation of the syntax (why?)
  also only here requirement regarding common blocks with bind(c)
  (why not also for declare target?)
* linear clause – no common permitted
* copyin – some notes
* copyprivate – some notes

As target test cases were suspiciously left out, I tries '!$omp target 
map(/name/)' which was rejected. I think one should add test cases for 
newer features – which mostly means 'omp target' and add the missing 
common-block checks. – And one has to find out why blank commons are not 
permitted and for the places where they are permitted, support has to be 
added.


Talking about blank common blocks, the current OpenACC implementation 
does not seem to like them (see below); the spec (2.7) does not mention 
blank common blocks at all. – It talks about name between two slashes, 
but leaves it open whether the name can also be an empty string.


common // x,y  !blank common
!$acc parallel copyin(//)
!$acc end parallel
end

fails with:

2 | !$acc parallel copyin(//)
  |   1
Error: Syntax error in OpenMP variable list at (1)


On 2019-10-25T16:36:10+0200, Tobias Burnus  wrote:


* I have now a new test case
libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
omplower.

Thanks. Curious: why 'omplower' instead of 'gimple' dump?


So far I found -fdump-tree-original, -fdump-omplower and 
-fdump-optimized quite useful – I have so far not used 
-fdump-tree-gimple, hence, I cannot state what's the advantage of the 
latter.


The original dump I like because it shows what the FE generates, the 
omplower dump has the result after lowering including the assignments to 
the omp_arr variables but it keeps a readable pragma line (avoids 
guessing what the kind value was again etc.) while the optimized dump 
really shows what ends up in the call (with the pro and con that it 
depends on the optimization option).


If you think it makes sense, one can switch.


--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
@@ -0,0 +1,39 @@
+! { dg-options "-fopenacc -fdump-tree-omplower" }

(For later: we usually just use 'dg-additional-options
"-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)


My impression was that it is not effective unless repeated – and I think 
I even tries it. In gcc/testsuite/gfortran.dg/gomp/ all 64 test cases 
with dg-options specify re-add the "-fopenmp".


And in gcc/testsuite/gfortran.dg/goacc, 4 test cases use dg-options, 3 
specify -fopenacc and one doesn't. I wouldn't call this 'usually' and I 
wonder whether -fopenacc is really active for goacc/pr84963.f90. (The 
file uses no directive at all!)


Hence, I wonder whether one should add it to goacc/pr84963.f90 – see 
attached patch.


Without the patch, I get:

Executing on host: …/gfortran/../../gfortran -B…/gfortran/../../ 
-B…/./libgfortran/ …/goacc/pr84963.f90 -fno-diagnostics-show-caret 
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never  
-fdiagnostics-urls=never    -O  -O2 -S -o pr84963.s    (timeout = 300)


And with the patch, I get
… -fdiagnostics-urls=never -O -fopenacc -O2 -S -o pr84963.s …



+  integer :: i, j
+  real ::  a(n) = 0, b(n) = 0, c, d
+  real ::  x(n) = 0, y(n), z
+  common /BLOCK/ a, b, c, j, d
+  common /KERNELS_BLOCK/ x, y, z

For my understanding: the several unused variables in the common blocks
are to make sure that they don't cause any issues, don't get mapped at
all?


I think that's

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-26 Thread Tobias Burnus

Hi Thomas,

I now played also around common blocks with "!$acc declare 
device_resident (/block/)". [See attached test-case diff.]


Observations:

* !$acc declare has to come after the declaration of the common block. 
In terms of the spec, it just needs to be in the declaration section, 
i.e. it could also be before. – Seems as if one needs to split parsing 
and resolving clauses.


* If I just use '!$acc parallel', the used variables are copied in 
according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of 
OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays 
are map(fromto:). – Does this behaviour match the spec or should this 
automatically mapped to, e.g., no_create as the 'device_resident' is 
known? [Side remark: the module file does contain 
"OACC_DECLARE_DEVICE_RESIDENT".]


* If I explicitly use '!$acc parallel present(/block/)' that fails 
because present() does not permit common blocks.
(OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and 
present, the list argument may include a Fortran common block name 
enclosed within slashes"). I could use no_create, but that's not yet 
supported.


Cheers,

Tobias

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
index 3ab91147e07..fd46b02abf9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
@@ -1,29 +1,106 @@
 ! { dg-do run }
 
 module vars
   implicit none
   real b
- !$acc declare device_resident (b)
+  !$acc declare device_resident (b)
+
+  integer :: x, y, z
+  common /block/ x, y, z
+  !$acc declare device_resident (/block/)
 end module vars
 
+subroutine set()
+  use openacc
+  implicit none
+  integer :: a(5), b(1), c, vals(7)
+  common /another/ a, b, c
+  !$acc declare device_resident (/another/)
+  if (.not. acc_is_present (a)) stop 10
+  if (.not. acc_is_present (b)) stop 11
+  if (.not. acc_is_present (c)) stop 12
+
+  vals = 99
+  !$acc parallel copyout(vals) present(a, b, c) ! OK
+! but w/o 'present', 'c' is firstprivate and a+b are 'map(fromto:'
+! additionally, OpenACC 2.7 does not permit present(/another/)
+! and no_create is not yet in the trunk (but submitted)
+a = [11,12,13,14,15]
+b = 16
+c = 47
+vals(1:5) = a
+vals(6:6) = b
+vals(7) = c
+  !$acc end parallel
+
+  if (.not. acc_is_present (a)) stop 13
+  if (.not. acc_is_present (b)) stop 14
+  if (.not. acc_is_present (c)) stop 15
+
+  if (any (vals /= [11,12,13,14,15,16,47])) stop 16
+end subroutine set
+
+subroutine check()
+  use openacc
+  implicit none
+  integer :: g, h(3), i(3)
+  common /another/ g, h, i
+  integer :: val(7)
+  !$acc declare device_resident (/another/)
+  if (.not. acc_is_present (g)) stop 20
+  if (.not. acc_is_present (h)) stop 21
+  if (.not. acc_is_present (i)) stop 22
+
+  val = 99
+  !$acc parallel copyout(val) present(g, h, i)
+val(5:7) = i
+val(1) = g
+val(2:4) = h
+  !$acc end parallel
+
+  if (.not. acc_is_present (g)) stop 23
+  if (.not. acc_is_present (h)) stop 24
+  if (.not. acc_is_present (i)) stop 25
+
+
+  !print *, val
+  if (any (val /= [11,12,13,14,15,16,47])) stop 26
+end subroutine check
+
+
 program test
   use vars
   use openacc
   implicit none
   real a
+  integer :: k
 
-  if (acc_is_present (b) .neqv. .true.) STOP 1
+  call set()
+  call check()
+
+  if (.not. acc_is_present (b)) stop 1
+  if (.not. acc_is_present (x)) stop 2
+  if (.not. acc_is_present (y)) stop 3
+  if (.not. acc_is_present (z)) stop 4
 
   a = 2.0
+  k = 42
 
-  !$acc parallel copy (a)
+  !$acc parallel copy (a, k)
 b = a
 a = 1.0
 a = a + b
+x = k
+y = 7*k - 2*x
+z = 3*y
+k = k - z + y
!$acc end parallel
 
-  if (acc_is_present (b) .neqv. .true.) STOP 2
-
-  if (a .ne. 3.0) STOP 3
+  if (.not. acc_is_present (b)) stop 5
+  if (.not. acc_is_present (x)) stop 6
+  if (.not. acc_is_present (y)) stop 7
+  if (.not. acc_is_present (z)) stop 8
 
+  if (a /= 3.0) stop 3
+  if (k /= -378) stop 3
 end program test


Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-28 Thread Thomas Schwinge
Hi Tobias!

On 2019-11-25T15:02:16+0100, Tobias Burnus  wrote:
> sorry for the belated reply.

Eh, no worries -- I'm way more behind on things...


> On 11/11/19 10:39 AM, Thomas Schwinge wrote:
>> By the way, do you know what's the status is for Fortran common blocks in
>> OpenMP: supported vs. expected per the specification?
>
> No; however, I had a quick glance at the spec and at the test cases; 
> both compile-time and run-time test have some coverage, although I 
> didn't spot a run-time test for one 'omp target'.

Thanks.

> Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 
> 'common /name/ x" (continues) define the common block named "name" by 
> adding 'x' to it. While "common // y" or "common y" appends 'y' to the 
> blank common.

Thanks for the concise summary.

> In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as 
> general rule in the definition of "list item" (which are inherited by 
> "extended list item" and "locator-list item"). [There are also some 
> constraints and notes regarding common blocks)]. It does not really tell 
> whether blank commons are permitted or not; some description is 
> explicitly for named-common variables, leaving blank-common ones out 
> (and undefined). But later sections explicitly make reference to blank 
> commons, hence, one can assume they are permitted unless explicitly 
> stated that they are not.

Yes, I go by the assumption that everything contained in the base
languages of OpenACC/OpenMP (so, the respective C, C++, Fortran
standards), should also work in an OpenACC/OpenMP context in a sensible
manner (detailed/clarified in the respective specification as necessary),
and if not supported then that ought to be spelled out explicitly.  (For
example, see either the "catch-all" notes in OpenACC 3.0,
1.7. "References", or the more in-detail notes in specific sections.)
Anything else I'd consider a bug in the respective specification, which
should be reported/fixed.

That said, if you think OpenMP needs to clarify whether Fortran blank
common blocks are supported or not, then file an issue or directly submit
a pull request against the specification on 
(once we've got access).

> And then very selectively for some items:
> * allocate – only with default allocator.
> * declare target – some restrictions and no blank commons
> * depend clause – no common permitted
> * threadprivate – some notes and explanation of the syntax (why?)
>also only here requirement regarding common blocks with bind(c)
>(why not also for declare target?)
> * linear clause – no common permitted
> * copyin – some notes
> * copyprivate – some notes
>
> As target test cases were suspiciously left out, I tries '!$omp target 
> map(/name/)' which was rejected. I think one should add test cases for 
> newer features – which mostly means 'omp target' and add the missing 
> common-block checks. – And one has to find out why blank commons are not 
> permitted and for the places where they are permitted, support has to be 
> added.

ACK.  Instead of "burying" such things in long emails, I like to see GCC
PRs filed, which can then be actioned on individually.

> Talking about blank common blocks, the current OpenACC implementation 
> does not seem to like them (see below); the spec (2.7) does not mention 
> blank common blocks at all. – It talks about name between two slashes, 
> but leaves it open whether the name can also be an empty string.

My assumption would thus be: yes, ought to be supported -- but I haven't
thought through whether that makes sense, so...

> common // x,y  !blank common
> !$acc parallel copyin(//)
> !$acc end parallel
> end
>
> fails with:
>
>  2 | !$acc parallel copyin(//)
>|   1
> Error: Syntax error in OpenMP variable list at (1)

..., please test with the PGI compiler (just to get more data), and
determine whether that makes sense to support in an OpenACC context
(likewise for OpenMP, of course), and then (once you've got access)
either file an issue, or (better) directly submit a pull request for
 to clarify that.  Sometimes
it's as easy as replacing non-standard text ("name between two slashes")
with the corresponding standard text (whatever the Fortran specification
calls this).


> On 2019-10-25T16:36:10+0200, Tobias Burnus  wrote:
>
>>> * I have now a new test case
>>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>>> omplower.
>> Thanks. Curious: why 'omplower' instead of 'gimple' dump?
>
> So far I found -fdump-tree-original, -fdump-omplower and 
> -fdump-optimized quite useful – I have so far not used 
> -fdump-tree-gimple, hence, I cannot state what's the advantage of the 
> latter.

My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
test for that stuff in the 'gimple' dump (which is between 'original' and
'omplower').

> The original dump I like because it shows 

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-28 Thread Thomas Schwinge
Hi Tobias!

On 2019-11-26T15:02:34+0100, Tobias Burnus  wrote:
> I now played also around common blocks with "!$acc declare 
> device_resident (/block/)". [See attached test-case diff.]

If you'd like to, please commit that, to document the status quo.  (I
have not reviewed.)


There are several issues with the OpenACC 'declare' implementation, so
that one generally needs to be re-visited as some point.  Basically
everything from the front ends handling, to middle end handling, to nvptx
back end handling (supposedly?; see ), to
libgomp handling.  So, you're adding here some more.  ;-)

> Observations:
>
> * !$acc declare has to come after the declaration of the common block. 
> In terms of the spec, it just needs to be in the declaration section, 
> i.e. it could also be before. – Seems as if one needs to split parsing 
> and resolving clauses.

Good find -- purely a Fortran front end issue, as I understand.  Please
file a GCC PR, unless there is a reason (implementation complexity?) to
be more "strict" ("referenced variable/common block needs to be lexically
in scope", or something like that?), and the OpenACC specification should
be changed instead?

> * If I just use '!$acc parallel', the used variables are copied in 
> according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of 
> OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays 
> are map(fromto:). – Does this behaviour match the spec or should this 
> automatically mapped to, e.g., no_create as the 'device_resident' is 
> known? [Side remark: the module file does contain 
> "OACC_DECLARE_DEVICE_RESIDENT".]

Not sure at this point.

> * If I explicitly use '!$acc parallel present(/block/)' that fails 
> because present() does not permit common blocks.
> (OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and 
> present, the list argument may include a Fortran common block name 
> enclosed within slashes").

Do you understand the rationale behind that restriction, by the way?  I'm
not sure I do.  Is it because we don't know/can't be sure that *all* of
the common block has been mapped (per the rules set elsewhere)?  That
would make sense in context of this:

> I could use no_create

... which basically means 'present' but don't complain if not actually
present.

> but that's not yet 
> supported.

But will be soon.  :-)


Grüße
 Thomas


> --- a/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> @@ -1,29 +1,106 @@
>  ! { dg-do run }
>  
>  module vars
>implicit none
>real b
> - !$acc declare device_resident (b)
> +  !$acc declare device_resident (b)
> +
> +  integer :: x, y, z
> +  common /block/ x, y, z
> +  !$acc declare device_resident (/block/)
>  end module vars
>  
> +subroutine set()
> +  use openacc
> +  implicit none
> +  integer :: a(5), b(1), c, vals(7)
> +  common /another/ a, b, c
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (a)) stop 10
> +  if (.not. acc_is_present (b)) stop 11
> +  if (.not. acc_is_present (c)) stop 12
> +
> +  vals = 99
> +  !$acc parallel copyout(vals) present(a, b, c) ! OK
> +! but w/o 'present', 'c' is 
> firstprivate and a+b are 'map(fromto:'
> +! additionally, OpenACC 2.7 
> does not permit present(/another/)
> +! and no_create is not yet 
> in the trunk (but submitted)
> +a = [11,12,13,14,15]
> +b = 16
> +c = 47
> +vals(1:5) = a
> +vals(6:6) = b
> +vals(7) = c
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (a)) stop 13
> +  if (.not. acc_is_present (b)) stop 14
> +  if (.not. acc_is_present (c)) stop 15
> +
> +  if (any (vals /= [11,12,13,14,15,16,47])) stop 16
> +end subroutine set
> +
> +subroutine check()
> +  use openacc
> +  implicit none
> +  integer :: g, h(3), i(3)
> +  common /another/ g, h, i
> +  integer :: val(7)
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (g)) stop 20
> +  if (.not. acc_is_present (h)) stop 21
> +  if (.not. acc_is_present (i)) stop 22
> +
> +  val = 99
> +  !$acc parallel copyout(val) present(g, h, i)
> +val(5:7) = i
> +val(1) = g
> +val(2:4) = h
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (g)) stop 23
> +  if (.not. acc_is_present (h)) stop 24
> +  if (.not. acc_is_present (i)) stop 25
> +
> +
> +  !print *, val
> +  if (any (val /= [11,12,13,14,15,16,47])) stop 26
> +end subroutine check
> +
> +
>  program test
>use vars
>use openacc
>implicit none
>real a
> +  integer :: k
>  
> -  if (acc_is_present (b) .neqv. .true.) STOP 1
> +  call set()
> +  call check()
> +
> +  if (.not. acc_is_present (b)) stop 1
> +  if (.not. acc_is_present (x)) stop 2
> +  if (.not. acc_is_present (y)) stop 3
> +  if (.not. acc_is_present (z)) stop 4
>  
>a = 2.0
> +  k = 42
>

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-29 Thread Tobias Burnus

Hi Thomas,

On 11/28/19 6:01 PM, Thomas Schwinge wrote:
Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 


I just want to add the following, which came into my mind after thinking 
more about device_resident (the other email in this thread). Fortran 
(here: 2018, 8.10.2.5) has:


"Named common blocks of the same name shall be of the same size in all 
scoping units of a program in which they appear, but blank common blocks 
may be of different sizes."


* * *

Depending on the use of a common block (see other email in the same 
thread, to be send shortly), that's fine or not. If the common block 
only exists on the device (i.e. in a device routine / 'target' 
procedure) or only on the host, everything is fine. — In this case, the 
connection between host and target is done by single variables – and no 
one cares whether they are in a common block or not.


It only becomes interesting if the same(-named) common block is known to 
both the host and the device – in that case, it is important that the 
size matches, otherwise either the copying to the device or (via 
'update') from the device to the host will write beyond the static 
variable! — Also in the latter case, it makes sense that 
'copy(/block_name/)' will map the whole common block and not only the 
directly used variables (which might be none).


* * *

OpenACC: Does one need device_resident to allocate 'static' global 
memory on the device? If not, then its only use would be for same-named 
common blocks, existing on both the device and the host. If it is 
needed, then one needs to think about the semantic – will it declare a 
common block which exists only on the device or one which exists on both 
device and host with the same name. — I think that needs to be spelt out 
in the spec clearly; at the moment, it is ambiguous. In any case, it 
influences how copy(/block_name/) acts.


For OpenMP, my impression is that the spec is completely silent on 
device-located common blocks. And if a common block is only on the host, 
copy(/block/) just maps the used (common-block) variables to the target 
– which is fine. — Seems as if some spec work is needed as well.


* * *


I go by the assumption that everything contained in the base
languages of OpenACC/OpenMP ([…] C, C++, Fortran
standards), should also work in an OpenACC/OpenMP context in a sensible
manner […]

I concur.
ACK. Instead of "burying" such things in long emails, I like to see 
GCC PRs filed, which can then be actioned on individually.


Well, I think one first needs to understand what's supposed to be in the 
standard. Having said this, I have now filled – PR 92728 + PR 92730.


[blank commons]

assumption would thus be: yes, ought to be supported -- but I haven't 
thought through whether that makes sense, so...


By itself, using blank commons make sense if one maps variables from a 
common block but not if one maps the whole common block.


[Blank commons + PGI]

I will later play around with the PGI compiler; but I think it is really 
a spec issue and I care less what a specific compiler does. (Even 
though, with OpenACC, it is kind of the reference compiler.)



determine whether that makes sense to support in an OpenACC context


I think that needs discussion about what one wants to achieve instead of 
directly patching the spec.



* I have now a new test case
libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
omplower.
Actually, I think this should be: 
gcc/testsuite/gfortran.dg/goacc/common-block-3.f90

Thanks. Curious: why 'omplower' instead of 'gimple' dump?

[…]

My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
test for that stuff in the 'gimple' dump (which is between 'original' and
'omplower').

[switching to gimple]

I think it does (but please argue if it doesn't to you), but that's not
high priority, of course.


Hmm, that might be more specific to other parts – but with optional 
arguments, I had constantly to look at what has been passed on to 
libgomp via omp_arr – even though the code was produced directly by the 
front end. The 'pragma' simply didn't tell the whole story – omp-low.c 
added and removed some '*' and '&' which were crucial.


Probably, 'map' as parsed here doesn't change any more between 
gimplify.c and the early stages of omp-low.c, but I feel safer if the 
wanted result survived until the end of omp-low.c and does not get 
modified in an unintended way later on.



Note that I said 'dg-additional-options', not 'dg-options', so please
re-consider.


Ups. Yes, dg-additional-options should work :-)

[oacc/pr84963.f90]

Good find. […] to change 'dg-options "-O2"' to 'dg-additional-options 
"-O2"
Please verify, and then commit that to trunk, gcc-8-branch, gcc-7-branch, 


Done so – except I committed to GCC 9 instead of GCC 7, which is now 
closed :-)

[Question about a test case]


Then let's please document that in the test case sources, for that's not
quite obvious.


I have now committed

Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-29 Thread Tobias Burnus

Hi Thomas,

I have started with this email – and then stopped and replied to the 
other email in this thread: 
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02678.html – which covers 
parts which otherwise would belong into this email.


On 11/28/19 6:02 PM, Thomas Schwinge wrote:
[Test case which uses common blocks in device_resident.]

If you'd like to, please commit that, to document the status quo.  (I
have not reviewed.)


Did so as r278845 with a slightly updated comment.


Observations:
* !$acc declare has to come after the declaration of the common block.
That's now tracked in PR fortran/92728 for OpenMP/OpenACC – together 
with blank commons.
Good find -- purely a Fortran front end issue […] is a reason 
(implementation complexity?)


Having any order makes it feel more Fortran like; the complexity comes 
from splitting matching and checking the clauses, but it shouldn't be 
rocket science.



* If I just use '!$acc parallel', the used variables are copied in
according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of
OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays
are map(fromto:). – Does this behaviour match the spec or should this
automatically mapped to, e.g., no_create as the 'device_resident' is
known? [Side remark: the module file does contain
"OACC_DECLARE_DEVICE_RESIDENT".]

Not sure at this point.


s/OpenMP 4.0/OpenMP 4.5/

Regarding the mapping: Both OpenACC and OpenMP agree and it is fine (cf. 
OpenACC 2.7 last 'Description' paragraph in parallel/kernels, 2.5.1 + 
2.5.2). And OpenMP 4.5, Sect. 2.15.5 (esp. last three bullet points) or 
OpenMP 5, Sect. 2.19.7. (Missing omp bits see PR fortran/92568.).


Regarding device_resident: It is not fully clear to me what the intent 
is – and "The host may not be able to access variables in a 
device_resident clause." does not make it clearer.


In terms of the spec, the mapping with firstprivate/[copy alias tofrom] 
is fine – as is the explict use of present. However, if commons exists 
on both device + host, 'copy(/block/)' should work and also copy 
common-block variables, which are not referrenced in the 
parallel/kernels block – which currently does not work.



* If I explicitly use '!$acc parallel present(/block/)' that fails
because present() does not permit common blocks.
(OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and
present, the list argument may include a Fortran common block name
enclosed within slashes").

Do you understand the rationale behind that restriction, by the way?  I'm
not sure I do.


Regarding 'present', I don't: If copy/no_create is fine, why should 
present be a problem? (And vice versa.)


For 'deviceptr', it kind of does make sense – unless one wants to store 
the pointer as 'intptr_t' in an integer variable or want to have a 
pointer (i.e. Fortran attribute) in 'common' which will cause mapping 
problems for the common block. — In any case, the 'dummy argument' 
constraint prevents common blocks. – BTW: Those constraints do not make 
sense but seem to be same as for OpenMP's is_device_ptr. (They are both 
too loose and to strict; I miss type(c_ptr) [as local var + as dummy w/ 
value attribute].)


Cheers,

Tobias



Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-12-02 Thread Tobias Burnus

Hi Thomas,

for completeness, I tried now *blank commons* with OpenACC in *PGI's 
pgfortran.*


From the error message, it looks as if the parser does not handle blank 
commons at all. (Matches the current parser in gfortran.) pgfortran is 
also not very good at diagnostics as nonexisting common block names are 
not diagnosed.


Cheers,

Tobias



Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-12-03 Thread Thomas Schwinge
Hi Tobias!

On 2019-11-29T18:47:12+0100, Tobias Burnus  wrote:
> On 11/28/19 6:02 PM, Thomas Schwinge wrote:
> [Test case which uses common blocks in device_resident.]
>> If you'd like to, please commit that, to document the status quo.  (I
>> have not reviewed.)
>
> Did so as r278845 with a slightly updated comment.

Testing with nvptx offloading on two different systems, on both I'm
seeing 'STOP 10' execution test FAILure for all optimization levels:

PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  
execution test
PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  
execution test
PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
execution test
PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
execution test
PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  
execution test
PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  
execution test


Grüße
 Thomas


signature.asc
Description: PGP signature


Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-12-03 Thread Tobias Burnus

Hi Thomas,

Quick version: The attached patch seems to work, kind of,  but fails at 
run time with:
libgomp: Trying to map into device [0x407218..0x40721c) object when 
[0x407210..0x40721c) is already mapped
This marks the common-block decl but not the common-block vars as 
'device resident' (alias "omp declare target").


The attached with '#if 0'  set to '1' does not work as it gives an ICE 
in lto1. – If one only marks the common-block variables, it fails as the 
ME check complains "variable 'block' has been referenced in offloaded 
code but hasn't been marked to be included in the offloaded code" –


I think the first version is fine, but it seems as if the ME needs to 
use pcreate and not create for those. I think that's also the reason for 
the odd is-program check mentioned at the very bottom.


Tobias

PS: Hmm, I really wonder why it seemed to have passed before. Looking at 
the code, it cannot have passed — more below. That goes wrong since 
r272453 for PR85221 (well, it can't before). I don't quickly see whether 
it also affects OpenMP or other clauses.


I think for a proper fix it would be very useful to know some more 
details about the intention of 'declare device_resident' (existing only 
on the device, existing on both host and device etc.). Cf. previous email.


In terms of this issue, if one does:  "integer :: a, b, c; common /name/ 
a,b,c; !$acc declare device_resident(a)", should this make all of the 
common-block variables as device resident or not? Internally, one gets 
for declare-5.f90 the following, i.e. /another/ is the common name and 
g, h and i are common-name variables:


  static integer(kind=4) g [value-expr: another.g];
  static integer(kind=4) h[3] [value-expr: another.h];
  static integer(kind=4) i[3] [value-expr: another.i];

For the test case, the issue is that 'gfc_get_symbol_decl' only called 
after it's tree representation (sym->backend_decl) has already been 
created; this happens for common blocks. – The attached patch fixes 
this, marking the common block decl and all its variables as declare 
device_resident.


One could think of handling other attributes (which ones?). For 
EQUIVALENCE in commons, the attributes are collected using 
accumulate_equivalence_attributes – and for normal variables, it is 
handled in trans-decl.c's add_attributes_to_decl


 * * *

Additionally, and unrelated to the test case, the following code looks 
very suspicious (from finish_oacc_declare in fortran/trans-decl.c):


  module_oacc_clauses = NULL;
  gfc_traverse_ns (ns, find_module_oacc_declare_clauses);
  if (module_oacc_clauses && sym->attr.flavor == FL_PROGRAM)

First, it very much looks like memory leak – one creates a linked list,
but always dumps it w/o using or freeing it if one is currently not
processing the main program. Additionally, it assumes that the main program
has a full view of module-declared 'declare device_resident' variables, but
it is trivial to construct programs where the main program does not see this
property. Most trivial example is:
  subroutine foo()
use module_w_device_decl
  end subroutine
independent whether that function exists as such or as module procedure or
(at some place) is a procedure contained in another procedure. A general
assumption is also that the whole program is compiled with -fopenacc
and that the main program is written in Fortran and not, e.g., in C or C++.

Index: gcc/fortran/trans-common.c
===
--- gcc/fortran/trans-common.c	(revision 278936)
+++ gcc/fortran/trans-common.c	(working copy)
@@ -706,6 +706,15 @@
 	}
 }
 
+  bool is_device_resident = false;
+  for (s = head; s; s = next_s = s->next)
+ is_device_resident |= s->sym->attr.oacc_declare_device_resident;
+
+  if (is_device_resident)
+DECL_ATTRIBUTES (decl)
+  = tree_cons (get_identifier ("omp declare target"),
+		   NULL_TREE, DECL_ATTRIBUTES (decl));
+
   /* Build component reference for each variable.  */
   for (s = head; s; s = next_s)
 {
@@ -750,6 +759,12 @@
 	  GFC_DECL_ASSIGN_ADDR (var_decl) = GFC_DECL_ASSIGN_ADDR (s->field);
 	}
 
+#if 0
+  if (is_device_resident)
+	DECL_ATTRIBUTES (var_decl)
+	  = tree_cons (get_identifier ("omp declare target"),
+		   NULL_TREE, DECL_ATTRIBUTES (var_decl));
+#endif
   s->sym->backend_decl = var_decl;
 
   next_s = s->next;


Re: [Patch][Fortran] OpenACC – permit common blocks in some clauses

2019-11-11 Thread Thomas Schwinge
Hi Tobias!

By the way, do you know what's the status is for Fortran common blocks in
OpenMP: supported vs. expected per the specification?


On 2019-10-25T16:36:10+0200, Tobias Burnus  wrote:
> On 10/25/19 10:43 AM, Thomas Schwinge wrote:
>> Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
>> faild (at run time, say, with aforementioned duplicate mapping errors, or
>> would contain "strange"/duplicate/conflicting mapping items in the
>> '-fdump-tree-gimple' dump)?

> * I have now a new test case 
> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at 
> omplower.

Thanks.

Curious: why 'omplower' instead of 'gimple' dump?


> Regarding the new test case: Without the gcc/gimplify.c changes, one has 
> (see last item before child fn):
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len: 
> 812]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) 
> map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4]) 
> map(tofrom:block [len: 812])  [child fn …
>
> With the changes of gcc/gimplify.c, one has:
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c 
> [len: 4])  [child fn …
>
>
> And without gimplify.c, the added run-tests indeed fail with:
> libgomp: Trying to map into device [0x407100..0x407294) object when 
> [0x407100..0x407290) is already mapped

OK, good, my suspicion was thus right that there's something "strange"
there.  ;-)

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> @@ -0,0 +1,39 @@
> +! { dg-options "-fopenacc -fdump-tree-omplower" }

(For later: we usually just use 'dg-additional-options
"-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)

> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +program main
> +  use consts
> +  implicit none
> +
> +  integer :: i, j
> +  real ::  a(n) = 0, b(n) = 0, c, d
> +  real ::  x(n) = 0, y(n), z
> +  common /BLOCK/ a, b, c, j, d
> +  common /KERNELS_BLOCK/ x, y, z
> +
> +  c = 1.0
> +  !$acc parallel loop copy(/BLOCK/)
> +  do i = 1, n
> + a(i) = b(i) + c
> +  end do
> +  !$acc kernels
> +  do i = 1, n
> + x(i) = y(i) + c
> +  end do
> +  !$acc end kernels
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel 
> .*map\\(tofrom:a \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel 
> .*map\\(tofrom:b \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel 
> .*map\\(tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels 
> .*map\\(force_tofrom:i \\\[len: 4\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels 
> .*map\\(tofrom:x \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels 
> .*map\\(tofrom:y \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels 
> .*map\\(force_tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-not "map\\(.*:block\\)" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block\\)" "omplower" } }

For my understanding: the several unused variables in the common blocks
are to make sure that they don't cause any issues, don't get mapped at
all?

I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
for the upcoming OpenACC 'serial' construct (which basically equals the
OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
test case, or add a new, separate common block?

Or, asking the other way round: why aren't in the current test case,
'parallel' and 'kernels' using the same common block, and both explicitly
'copy' the common block vs. not do that?


> * In the compile-time *{2,3} test case, there is now also a 'enter data' 
> and 'update host/self/device' test.

;-) Heh, 'update' got inside the 'parallel' region:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> @@ -0,0 +1,74 @@
> +[...]
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc update device(/blockA/)
> +  !$acc update self(/blockB/, v)
> +  !$acc update host(/blockA/, e, /blockB/)
> +  !$acc end parallel
> +[...]

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Likewise.

As obvoius; see attached, committed "Fix OpenACC directives nesting in
'gfortran.dg/goacc/common-block-1.f90',
'gfortran.dg/goacc/common-block-2.f90'" to trunk in r278047.


Grüße
 Thomas


From 068b41bc6db50d6f