Re: [patch,opencc] Don't mark OpenACC auto loops as independent inside acc parallel regions

2018-12-03 Thread Julian Brown
On Thu, 20 Sep 2018 09:49:43 -0700
Cesar Philippidis  wrote:

> OpenACC has a concept of loop independence, in which independent loops
> may be executed in parallel across gangs, workers and vectors. Inside
> acc parallel regions, if a loop isn't explicitly marked seq or auto,
> it is predetermined to be independent.
> 
> This patch corrects a bug where acc loops marked as auto were being
> mistakenly promoted to independent. That's bad because it can generate
> bogus results if a dependency exists.
> 
> Note that this patch depends on the following patches for
> -fnote-info-omp-optimized, which is used in a test case.
> 
>   * Add user-friendly OpenACC diagnostics regarding detected
> parallelism.
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01652.html
> 
>   * Correct the reported line number in fortran combined OpenACC
> directives
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01554.html
> 
>   * Correct the reported line number in c++ combined OpenACC
> directives https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html
> 
> Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux
> with nvptx offloading.

LGTM, FWIW.

Thanks,

Julian


[patch,opencc] Don't mark OpenACC auto loops as independent inside acc parallel regions

2018-09-20 Thread Cesar Philippidis
OpenACC has a concept of loop independence, in which independent loops
may be executed in parallel across gangs, workers and vectors. Inside
acc parallel regions, if a loop isn't explicitly marked seq or auto, it
is predetermined to be independent.
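
To illustrate the rule above, here is a minimal sketch (not taken from the
patch or its test cases; the function names scale and prefix_sum are
hypothetical). The first loop carries no clause and is therefore
predetermined independent inside acc parallel, which is safe because its
iterations do not share data. The second loop is marked auto and carries a
dependency from one iteration to the next, so the compiler is expected to
analyze it rather than assume independence. Built without -fopenacc, the
pragmas are ignored and both loops simply run sequentially.

```c
/* Hypothetical example; only the OpenACC directives reflect the text.  */

/* No seq or auto clause: inside acc parallel this loop is
   predetermined independent.  Each iteration touches only a[i].  */
void
scale (int *a, int n)
{
#pragma acc parallel loop copy (a[0:n])
  for (int i = 0; i < n; i++)
    a[i] = 2 * a[i];
}

/* auto clause: the compiler decides whether the loop may be
   partitioned.  Iteration i reads a[i - 1], written by the previous
   iteration, so treating this loop as independent would be wrong.  */
void
prefix_sum (int *a, int n)
{
#pragma acc parallel loop auto copy (a[0:n])
  for (int i = 1; i < n; i++)
    a[i] += a[i - 1];
}
```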

This patch corrects a bug where acc loops marked as auto were being
mistakenly promoted to independent. That's bad because it can generate
bogus results if a dependency exists.
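
The change in behavior can be modeled in isolation. The sketch below is an
assumption-laden stand-in for the logic in lower_oacc_head_mark: the OLF_*
names come from the patch, but the bit values, the in_parallel parameter
(standing in for the enclosing-context check) and the helper names mark_old
and mark_new are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical bit values; only the names mirror the patch.  */
enum
{
  OLF_SEQ = 1 << 0,
  OLF_AUTO = 1 << 1,
  OLF_INDEPENDENT = 1 << 2
};

/* Old behavior: every loop in a parallel region is tagged
   independent, even one the user marked auto or seq.  */
unsigned
mark_old (unsigned tag, bool in_parallel)
{
  if (in_parallel)
    tag |= OLF_INDEPENDENT;
  return tag;
}

/* New behavior: only loops without seq and auto are tagged
   independent.  */
unsigned
mark_new (unsigned tag, bool in_parallel)
{
  if (in_parallel && !(tag & (OLF_SEQ | OLF_AUTO)))
    tag |= OLF_INDEPENDENT;
  return tag;
}
```

With these definitions, mark_old (OLF_AUTO, true) sets OLF_INDEPENDENT while
mark_new (OLF_AUTO, true) leaves the tag unchanged, which is the promotion
this patch removes.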

Note that this patch depends on the following patches for
-fnote-info-omp-optimized, which is used in a test case.

  * Add user-friendly OpenACC diagnostics regarding detected
parallelism.
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01652.html

  * Correct the reported line number in fortran combined OpenACC
directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01554.html

  * Correct the reported line number in c++ combined OpenACC directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html

Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux with
nvptx offloading.

Thanks,
Cesar

[OpenACC] Don't mark OpenACC auto loops as independent inside acc parallel regions

2018-XX-YY  Cesar Philippidis  

	gcc/
	* omp-low.c (lower_oacc_head_mark): Don't mark OpenACC auto
	loops as independent inside acc parallel regions.

	gcc/testsuite/
	* c-c++-common/goacc/loop-auto-1.c: Adjust test case to conform to
	the new behavior of the auto clause in OpenACC 2.5.
	* c-c++-common/goacc/loop-auto-2.c: Likewise.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.
	* c-c++-common/goacc/loop-auto-3.c: New test.
	* gfortran.dg/goacc/loop-auto-1.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust test case
	to conform to the new behavior of the auto clause in OpenACC 2.5.

(cherry picked from gomp-4_0-branch r247569, 6d30b542f29)

---
 gcc/omp-low.c |  5 +-
 .../c-c++-common/goacc/loop-auto-1.c  | 50 +--
 .../c-c++-common/goacc/loop-auto-2.c  |  4 +-
 .../c-c++-common/goacc/loop-auto-3.c  | 78 
 .../gcc.dg/goacc/loop-processing-1.c  |  2 +-
 .../gfortran.dg/goacc/loop-auto-1.f90 | 88 +++
 .../libgomp.oacc-c-c++-common/loop-auto-1.c   | 20 ++---
 7 files changed, 207 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/loop-auto-3.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-1.f90

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index fdabf67249b..24685fd012c 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -5647,9 +5647,10 @@ lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses,
   tag |= OLF_GANG_STATIC;
 }
 
-  /* In a parallel region, loops are implicitly INDEPENDENT.  */
+  /* In a parallel region, loops without auto and seq clauses are
+     implicitly INDEPENDENT.  */
   omp_context *tgt = enclosing_target_ctx (ctx);
-  if (!tgt || is_oacc_parallel (tgt))
+  if ((!tgt || is_oacc_parallel (tgt)) && !(tag & (OLF_SEQ | OLF_AUTO)))
     tag |= OLF_INDEPENDENT;
 
   if (tag & OLF_TILE)
diff --git a/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c b/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
index 124befc4002..dcad07f11c8 100644
--- a/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
@@ -10,7 +10,7 @@ void Foo ()
 #pragma acc loop seq
 	for (int jx = 0; jx < 10; jx++) {}
 
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
 	for (int jx = 0; jx < 10; jx++) {}
   }
 
@@ -20,7 +20,7 @@ void Foo ()
 #pragma acc loop auto
 	for (int jx = 0; jx < 10; jx++) {}
 
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
 	for (int jx = 0; jx < 10; jx++)
 	  {
 #pragma acc loop vector
@@ -51,7 +51,7 @@ void Foo ()
 #pragma acc loop vector
 	for (int jx = 0; jx < 10; jx++)
 	  {
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
 	for (int kx = 0; kx < 10; kx++) {}
 	  }
 
@@ -64,27 +64,27 @@ void Foo ()
 
   }
 
-#pragma acc loop auto
+#pragma acc loop auto independent
 for (int ix = 0; ix < 10; ix++)
   {
-#pragma acc loop auto
+#pragma acc loop auto independent
 	for (int jx = 0; jx < 10; jx++)
 	  {
-#pragma acc loop auto
+#pragma acc loop auto independent
 	for (int kx = 0; kx < 10; kx++) {}
 	  }
   }
 
-#pragma acc loop auto
+#pragma acc loop auto independent
 for (int ix = 0; ix < 10; ix++)
   {
-#pragma acc loop auto
+#pragma acc loop auto independent
 	for (int jx = 0; jx < 10; jx++)
 	  {
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
 	for (int kx = 0; kx