Re: [gomp4] reduction bug fix
Hi Cesar! On Wed, 8 Oct 2014 09:57:22 -0700, Cesar Philippidis ce...@codesourcery.com wrote: On 10/08/2014 02:38 AM, Thomas Schwinge wrote: On Fri, 3 Oct 2014 09:22:52 -0700, Cesar Philippidis ce...@codesourcery.com wrote: There is a reduction bug [...] This patch also includes a runtime test case. I won't apply it to gomp-4_0-branch just yet. But I wanted to demonstrate a test case nonetheless. You can add it as a compile test, and I'll toggle to a run test as part of the merge into our internal development branch. OK. I've added a compile test to gcc.dg/goaccc/. Note that this test depends on -std=c99, so I couldn't put it in the c-c++-common directory. Oh, sorry if that was unclear. I meant to have it already now added to libgomp/testsuite/, but as a compile-only test, and then I would toggle that to an execution test later on. I've changed that... Is this OK for gomp-4_0-branch? Yes, with the following addressed: --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -10140,11 +10140,20 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, [...]/source-gcc/gcc/omp-low.c: In function 'void _ZL22process_reduction_dataPP21gimple_statement_baseS1_S1_P11omp_context.isra.167.constprop.180(gimple_statement_base**, gimple_statement_base**, gimple_statement_base**, gimple)': [...]/source-gcc/gcc/omp-low.c:10172:14: warning: 'inner' may be used uninitialized in this function [-Wmaybe-uninitialized] gimple_seq inner; ^ ..., and addressed this in r216040: commit d77ae8538ddf8a550f431f7849da735cba8f37bb Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu Oct 9 16:30:40 2014 + Follow-up on reduction bug fix. gcc/ * omp-low.c (process_reduction_data): Initialize variable inner. gcc/testsuite/ * gcc.dg/goacc/collapse.c: Move file to libgomp/testsuite/libgomp.oacc-c/collapse-4.c. libgomp/ * testsuite/libgomp.oacc-c/collapse-4.c: New file, moved from gcc/testsuite/gcc.dg/goacc/collapse.c. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@216040 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 4 gcc/omp-low.c| 2 +- gcc/testsuite/ChangeLog.gomp | 5 + libgomp/ChangeLog.gomp | 5 + .../collapse.c = libgomp/testsuite/libgomp.oacc-c/collapse-4.c | 1 + 5 files changed, 16 insertions(+), 1 deletion(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index 76f8d2b..ce0dcc7 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,3 +1,7 @@ +2014-10-09 Thomas Schwinge tho...@codesourcery.com + + * omp-low.c (process_reduction_data): Initialize variable inner. + 2014-10-08 Cesar Philippidis ce...@codesourcery.com * omp-low.c (lower_reduction_clauses): Clarify comment. diff --git gcc/omp-low.c gcc/omp-low.c index 42e84b0..b8022c2 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -10013,7 +10013,7 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, gcc_assert (is_gimple_omp_oacc_specifically (ctx-stmt)); gimple_stmt_iterator gsi; - gimple_seq inner; + gimple_seq inner = NULL; gimple stmt; /* A collapse clause may have inserted a new bind block. */ diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp index 01bc225..48651ad 100644 --- gcc/testsuite/ChangeLog.gomp +++ gcc/testsuite/ChangeLog.gomp @@ -1,3 +1,8 @@ +2014-10-09 Thomas Schwinge tho...@codesourcery.com + + * gcc.dg/goacc/collapse.c: Move file to + libgomp/testsuite/libgomp.oacc-c/collapse-4.c. + 2014-10-08 Cesar Philippidis ce...@codesourcery.com * gcc.dg/goacc/collapse.c: New test. diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp index abc38a6..0acc4ca 100644 --- libgomp/ChangeLog.gomp +++ libgomp/ChangeLog.gomp @@ -1,3 +1,8 @@ +2014-10-09 Thomas Schwinge tho...@codesourcery.com + + * testsuite/libgomp.oacc-c/collapse-4.c: New file, moved from + gcc/testsuite/gcc.dg/goacc/collapse.c. + 2014-09-23 Thomas Schwinge tho...@codesourcery.com * libgomp.map (OACC_2.0): Add acc_on_device, acc_on_device_. diff --git gcc/testsuite/gcc.dg/goacc/collapse.c libgomp/testsuite/libgomp.oacc-c/collapse-4.c similarity index 83% rename from gcc/testsuite/gcc.dg/goacc/collapse.c rename to libgomp/testsuite/libgomp.oacc-c/collapse-4.c index 1ec20a4..7df0de2 100644 --- gcc/testsuite/gcc.dg/goacc/collapse.c +++ libgomp/testsuite/libgomp.oacc-c/collapse-4.c @@ -1,3 +1,4 @@ +/* TODO: change to a run test once libgomp supports all data clauses. */ /* { dg-do compile } */ /* { dg-options -O2 -std=c99 } */ Grüße, Thomas pgpCnNohaBcTO.pgp Description: PGP signature
Re: [gomp4] reduction bug fix
Hi Cesar! On Fri, 3 Oct 2014 09:22:52 -0700, Cesar Philippidis ce...@codesourcery.com wrote: There is a reduction bug [...] Thanks for looking into this! This is a problem because initialize_reduction_data originally expected a GIMPLE_BIND at the very beginning of a parallel block. [...] Does your patch also fix the issue that Jim saw in his C++ work? This patch also includes a runtime test case. I won't apply it to gomp-4_0-branch just yet. But I wanted to demonstrate a test case nonetheless. You can add it as a compile test, and I'll toggle to a run test as part of the merge into our internal development branch. Also, note that part of this patch also changes a comment. I found some typos in the original comment, so I took the opportunity to fix them, I hope. Sure, and it's also always fine to separately apply such patches under the obvious rule. Is this OK for gomp-4_0-branch? Yes, with the following addressed: --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -10140,11 +10140,20 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, [...]/source-gcc/gcc/omp-low.c: In function 'void _ZL22process_reduction_dataPP21gimple_statement_baseS1_S1_P11omp_context.isra.167.constprop.180(gimple_statement_base**, gimple_statement_base**, gimple_statement_base**, gimple)': [...]/source-gcc/gcc/omp-low.c:10172:14: warning: 'inner' may be used uninitialized in this function [-Wmaybe-uninitialized] gimple_seq inner; ^ --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c/collapse-4.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options -O2 -std=c99 } */ + +#include string.h +#include stdlib.h +#include stdio.h + +int +main (void) +{ + int i2, l = 0, r = 0; + int a[3][3][3]; + int b[3][3]; A lot of these variables are unused. +printf(a %p\n, a[0][0][0]); +printf(i2 %p\n, i2); +printf(l %p\n, l); +printf(r %p\n, r); Please remove stdio.h include and the printfs, or at least put those into DEBUG conditionals. Grüße, Thomas pgpjFoJdfpG5c.pgp Description: PGP signature
Re: [gomp4] reduction bug fix
On 10/08/2014 02:38 AM, Thomas Schwinge wrote: On Fri, 3 Oct 2014 09:22:52 -0700, Cesar Philippidis ce...@codesourcery.com wrote: There is a reduction bug [...] Thanks for looking into this! This is a problem because initialize_reduction_data originally expected a GIMPLE_BIND at the very beginning of a parallel block. [...] Does your patch also fix the issue that Jim saw in his C++ work? Yes it does. This patch also includes a runtime test case. I won't apply it to gomp-4_0-branch just yet. But I wanted to demonstrate a test case nonetheless. You can add it as a compile test, and I'll toggle to a run test as part of the merge into our internal development branch. OK. I've added a compile test to gcc.dg/goaccc/. Note that this test depends on -std=c99, so I couldn't put it in the c-c++-common directory. Also, note that part of this patch also changes a comment. I found some typos in the original comment, so I took the opportunity to fix them, I hope. Sure, and it's also always fine to separately apply such patches under the obvious rule. Is this OK for gomp-4_0-branch? Yes, with the following addressed: --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -10140,11 +10140,20 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, [...]/source-gcc/gcc/omp-low.c: In function 'void _ZL22process_reduction_dataPP21gimple_statement_baseS1_S1_P11omp_context.isra.167.constprop.180(gimple_statement_base**, gimple_statement_base**, gimple_statement_base**, gimple)': [...]/source-gcc/gcc/omp-low.c:10172:14: warning: 'inner' may be used uninitialized in this function [-Wmaybe-uninitialized] gimple_seq inner; ^ --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c/collapse-4.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options -O2 -std=c99 } */ + +#include string.h +#include stdlib.h +#include stdio.h + +int +main (void) +{ + int i2, l = 0, r = 0; + int a[3][3][3]; + int b[3][3]; A lot of these variables are unused. +printf(a %p\n, a[0][0][0]); +printf(i2 %p\n, i2); +printf(l %p\n, l); +printf(r %p\n, r); Please remove stdio.h include and the printfs, or at least put those into DEBUG conditionals. I removed those thanks. This updated patch has been committed. Thanks, Cesar 2014-10-08 Cesar Philippidis ce...@codesourcery.com gcc/ * omp-low.c (lower_reduction_clauses): Clarify comment. (process_reduction_data): Scan for nonempty bind statements at the beginning of parallel blocks. gcc/testsuite/ * gcc.dg/goacc/collapse.c: New test. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 3bb6a24..5c452c6 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -4434,10 +4434,10 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx) else { /* The atomic add at the end of the sum creates unnecessary - write contention on accelerators. To work around that, - create an array or vector_length and assign an element to - each thread. Later, in lower_omp_for (for openacc), the - values of array will be combined. */ + write contention on accelerators. To work around this, + create an array to store the partial reductions. Later, in + lower_omp_for (for openacc), the values of array will be + combined. */ tree t = NULL_TREE, array, nthreads; tree type = get_base_type (var); @@ -10140,11 +10140,20 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, gimple stmt; /* A collapse clause may have inserted a new bind block. */ - stmt = gimple_seq_first (*body); - if (stmt gimple_code (stmt) == GIMPLE_BIND) + gsi = gsi_start (*body); + while (!gsi_end_p (gsi)) { - inner = gimple_bind_body (gimple_seq_first (*body)); - body = inner; + stmt = gsi_stmt (gsi); + if (gimple_code (stmt) == GIMPLE_BIND) + { + inner = gimple_bind_body (stmt); + body = inner; + gsi = gsi_start (*body); + } + else if (gimple_code (stmt) == GIMPLE_OMP_FOR) + break; + else + gsi_next (gsi); } for (gsi = gsi_start (*body); !gsi_end_p (gsi); gsi_next (gsi)) diff --git a/gcc/testsuite/gcc.dg/goacc/collapse.c b/gcc/testsuite/gcc.dg/goacc/collapse.c new file mode 100644 index 000..1ec20a4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/goacc/collapse.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -std=c99 } */ + +#include string.h + +int +main (void) +{ + int l = 0; + int b[3][3]; + + memset (b, '\0', sizeof (b)); + +#pragma acc parallel copy(b[0:3][0:3]) copy(l) +{ +#pragma acc loop collapse(2) reduction(+:l) + for (int i = 0; i 2; i++) + for (int j = 0; j 2; j++) + if (b[i][j] != 16) + l += 1; +} + + return 0; +}
[gomp4] reduction bug fix
There is a reduction bug exposed in the following parallel block. #pragma acc parallel copy(b[0:3][0:3]) copy(l) { #pragma acc loop collapse(2) reduction(+:l) for (int i = 0; i 2; i++) for (int j = 0; j 2; j++) if (b[i][j] != 16) l += 1; } Because i and j are local, the collapsed loop is lowered into something as follows #pragma acc parallel ... { int i { int j { #pragma acc loop ... This is a problem because initialize_reduction_data originally expected a GIMPLE_BIND at the very beginning of a parallel block. I also made the assumption that collapse would create a single bind. Looking at this some more, I may need revise initialize_reduction_data to scan for multiple acc loops within a parallel block. E.g., #pragma acc parallel { #pragma acc loop reduction (+:foo) { } ... #pragma acc loop reduction (-:bar) { } } I'll address this issue in a follow up patch. This patch also includes a runtime test case. I won't apply it to gomp-4_0-branch just yet. But I wanted to demonstrate a test case nonetheless. Also, note that part of this patch also changes a comment. I found some typos in the original comment, so I took the opportunity to fix them, I hope. Is this OK for gomp-4_0-branch? Thanks, Cesar 2014-10-02 Cesar Philippidis ce...@codesourcery.com gcc/ * omp-low.c (lower_reduction_clauses): Clarify comment. (process_reduction_data): Scan for nonempty bind statements at the beginning of parallel blocks. libgomp/ * testsuite/libgomp.oacc-c/collapse-4.c: New test. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 3bb6a24..5c452c6 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -4434,10 +4434,10 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx) else { /* The atomic add at the end of the sum creates unnecessary - write contention on accelerators. To work around that, - create an array or vector_length and assign an element to - each thread. Later, in lower_omp_for (for openacc), the - values of array will be combined. */ + write contention on accelerators. To work around this, + create an array to store the partial reductions. Later, in + lower_omp_for (for openacc), the values of array will be + combined. */ tree t = NULL_TREE, array, nthreads; tree type = get_base_type (var); @@ -10140,11 +10140,20 @@ process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp, gimple stmt; /* A collapse clause may have inserted a new bind block. */ - stmt = gimple_seq_first (*body); - if (stmt gimple_code (stmt) == GIMPLE_BIND) + gsi = gsi_start (*body); + while (!gsi_end_p (gsi)) { - inner = gimple_bind_body (gimple_seq_first (*body)); - body = inner; + stmt = gsi_stmt (gsi); + if (gimple_code (stmt) == GIMPLE_BIND) + { + inner = gimple_bind_body (stmt); + body = inner; + gsi = gsi_start (*body); + } + else if (gimple_code (stmt) == GIMPLE_OMP_FOR) + break; + else + gsi_next (gsi); } for (gsi = gsi_start (*body); !gsi_end_p (gsi); gsi_next (gsi)) diff --git a/libgomp/testsuite/libgomp.oacc-c/collapse-4.c b/libgomp/testsuite/libgomp.oacc-c/collapse-4.c new file mode 100644 index 000..b08115a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c/collapse-4.c @@ -0,0 +1,34 @@ +/* { dg-do run } */ +/* { dg-options -O2 -std=c99 } */ + +#include string.h +#include stdlib.h +#include stdio.h + +int +main (void) +{ + int i2, l = 0, r = 0; + int a[3][3][3]; + int b[3][3]; + +printf(a %p\n, a[0][0][0]); +printf(i2 %p\n, i2); +printf(l %p\n, l); +printf(r %p\n, r); + + memset (a, '\0', sizeof (a)); + memset (b, '\0', sizeof (b)); + + +#pragma acc parallel copy(b[0:3][0:3]) copy(l) +{ +#pragma acc loop collapse(2) reduction(+:l) + for (int i = 0; i 2; i++) + for (int j = 0; j 2; j++) + if (b[i][j] != 16) + l += 1; +} + + return 0; +}