OpenACC has a concept of loop independence, in which independent loops
may be executed in parallel across gangs, workers and vectors. Inside
acc parallel regions, a loop that isn't explicitly marked seq or auto
is predetermined to be independent.
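
To make the distinction concrete, here is a minimal sketch in C. It is not taken from the patch or its test cases; the function names double_elements and running_sum are made up for illustration, and num_gangs (1) is an assumption added so that a sequential fallback is not executed redundantly by several gangs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not part of the patch.  The loop has no seq or
   auto clause, so inside the parallel region it is predetermined to be
   independent.  That is safe here: every iteration touches only a[i].  */
void
double_elements (int *a, int n)
{
  assert (a != NULL);

#pragma acc parallel loop copy (a[0:n])
  for (int i = 0; i < n; i++)
    a[i] *= 2;
}

/* Hypothetical example, not part of the patch.  Iteration i reads
   a[i - 1], which iteration i - 1 wrote, so the loop carries a
   dependency.  With auto, the compiler is expected to parallelize only
   if it can prove independence; treating this loop as independent
   would be wrong.  */
void
running_sum (int *a, int n)
{
  assert (a != NULL);

#pragma acc parallel loop auto num_gangs (1) copy (a[0:n])
  for (int i = 1; i < n; i++)
    a[i] += a[i - 1];
}
```

When built without -fopenacc the pragmas are ignored and both loops run sequentially, which gives the reference results that an offloaded build should reproduce.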
This patch corrects a bug where acc loops marked as auto were being
mistakenly promoted to independent. That's bad because it can generate
bogus results if a loop-carried dependency exists.
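
The change in behavior can be sketched as a standalone C model of the tagging decision. This is only an illustration of the condition changed in lower_oacc_head_mark, not GCC code: the SKETCH_* flag values and both function names are assumptions made for this example, and the boolean parameter stands in for the "!tgt || is_oacc_parallel (tgt)" test.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the OLF_* loop flags; the actual values
   in GCC are not assumed here.  */
enum
{
  SKETCH_SEQ = 1 << 0,
  SKETCH_AUTO = 1 << 1,
  SKETCH_INDEPENDENT = 1 << 2
};

/* Old behavior: every loop in a parallel region (or with no enclosing
   target context) is tagged independent, including auto loops.  */
unsigned
mark_before_patch (unsigned tag, bool parallel_or_no_target)
{
  if (parallel_or_no_target)
    tag |= SKETCH_INDEPENDENT;
  return tag;
}

/* New behavior: loops carrying a seq or auto clause are left alone, so
   an auto loop is only parallelized if later analysis allows it.  */
unsigned
mark_after_patch (unsigned tag, bool parallel_or_no_target)
{
  if (parallel_or_no_target && !(tag & (SKETCH_SEQ | SKETCH_AUTO)))
    tag |= SKETCH_INDEPENDENT;
  return tag;
}
```

In this model, an auto loop passed to mark_before_patch comes back with the independent bit set, while mark_after_patch returns it unchanged; a loop with no clauses is tagged independent by both.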
Note that this patch depends on the following patches for
-fnote-info-omp-optimized, which is used in a test case:
* Add user-friendly OpenACC diagnostics regarding detected
parallelism.
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01652.html
* Correct the reported line number in fortran combined OpenACC
directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01554.html
* Correct the reported line number in c++ combined OpenACC directives
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01552.html
Is this OK for trunk? I bootstrapped and regtested on x86_64 Linux with
nvptx offloading.
Thanks,
Cesar
[OpenACC] Don't mark OpenACC auto loops as independent inside acc parallel regions
2018-XX-YY Cesar Philippidis
gcc/
* omp-low.c (lower_oacc_head_mark): Don't mark OpenACC auto
loops as independent inside acc parallel regions.
gcc/testsuite/
* c-c++-common/goacc/loop-auto-1.c: Adjust test case to conform to
the new behavior of the auto clause in OpenACC 2.5.
* c-c++-common/goacc/loop-auto-2.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* c-c++-common/goacc/loop-auto-3.c: New test.
* gfortran.dg/goacc/loop-auto-1.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust test case
to conform to the new behavior of the auto clause in OpenACC 2.5.
(cherry picked from gomp-4_0-branch r247569, 6d30b542f29)
---
gcc/omp-low.c | 5 +-
.../c-c++-common/goacc/loop-auto-1.c | 50 +--
.../c-c++-common/goacc/loop-auto-2.c | 4 +-
.../c-c++-common/goacc/loop-auto-3.c | 78
.../gcc.dg/goacc/loop-processing-1.c | 2 +-
.../gfortran.dg/goacc/loop-auto-1.f90 | 88 +++
.../libgomp.oacc-c-c++-common/loop-auto-1.c | 20 ++---
7 files changed, 207 insertions(+), 40 deletions(-)
create mode 100644 gcc/testsuite/c-c++-common/goacc/loop-auto-3.c
create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-1.f90
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index fdabf67249b..24685fd012c 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -5647,9 +5647,10 @@ lower_oacc_head_mark (location_t loc, tree ddvar, tree clauses,
tag |= OLF_GANG_STATIC;
}
- /* In a parallel region, loops are implicitly INDEPENDENT. */
+ /* In a parallel region, loops without auto and seq clauses are
+ implicitly INDEPENDENT. */
omp_context *tgt = enclosing_target_ctx (ctx);
- if (!tgt || is_oacc_parallel (tgt))
+ if ((!tgt || is_oacc_parallel (tgt)) && !(tag & (OLF_SEQ | OLF_AUTO)))
tag |= OLF_INDEPENDENT;
if (tag & OLF_TILE)
diff --git a/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c b/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
index 124befc4002..dcad07f11c8 100644
--- a/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/loop-auto-1.c
@@ -10,7 +10,7 @@ void Foo ()
#pragma acc loop seq
for (int jx = 0; jx < 10; jx++) {}
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
for (int jx = 0; jx < 10; jx++) {}
}
@@ -20,7 +20,7 @@ void Foo ()
#pragma acc loop auto
for (int jx = 0; jx < 10; jx++) {}
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
for (int jx = 0; jx < 10; jx++)
{
#pragma acc loop vector
@@ -51,7 +51,7 @@ void Foo ()
#pragma acc loop vector
for (int jx = 0; jx < 10; jx++)
{
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
for (int kx = 0; kx < 10; kx++) {}
}
@@ -64,27 +64,27 @@ void Foo ()
}
-#pragma acc loop auto
+#pragma acc loop auto independent
for (int ix = 0; ix < 10; ix++)
{
-#pragma acc loop auto
+#pragma acc loop auto independent
for (int jx = 0; jx < 10; jx++)
{
-#pragma acc loop auto
+#pragma acc loop auto independent
for (int kx = 0; kx < 10; kx++) {}
}
}
-#pragma acc loop auto
+#pragma acc loop auto independent
for (int ix = 0; ix < 10; ix++)
{
-#pragma acc loop auto
+#pragma acc loop auto independent
for (int jx = 0; jx < 10; jx++)
{
-#pragma acc loop auto /* { dg-warning "insufficient partitioning" } */
+#pragma acc loop auto independent /* { dg-warning "insufficient partitioning" } */
for (int kx = 0; kx