On 2016/5/31 3:28 PM, Thomas Schwinge wrote: > Hi! > > On Mon, 30 May 2016 18:53:41 +0200, Jakub Jelinek <ja...@redhat.com> wrote: >> On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote: >>> Hi, a previous patch of Cesar's has made the middle-end omp-lowering >>> automatically create and insert a tofrom (i.e. present_or_copy) map for >>> parallel reductions. This allowed the user to not need explicit >>> clauses to copy out the reduction result, but because reduction arguments >>> are not marked addressable, async does not work as expected, >>> i.e. the asynchronous copy-out results are not used in the compiler >>> generated code. >> >> If you need it only for async parallel/kernels? regions, can't you do that >> only for those and not for others?
That is achievable, but not in line with how we currently treat all other data clause OMP_CLAUSE_MAPs, which are all marked addressable. Is this special case handling really better here? > Also, please add comments to the source code to document the need for > such special handling. > >>> This patch fixes this in the front-ends, I've tested this patch without >>> new regressions, and fixes some C++ OpenACC tests that regressed after >>> my last OpenACC async patch. Is this okay for trunk? >> >> Testcases in the testsuite or others? If the latter, we should add them. > > The r236772 commit "[PATCH, libgomp] Rewire OpenACC async", > <http://news.gmane.org/find-root.php?message_id=%3C56FA4F69.1060101%40codesourcery.com%3E> > regressed (or, triggered/exposed the existing wrong behavior?) > libgomp.oacc-c++/template-reduction.C execution testing for nvptx > offloading. (Please always send email about such known regressions, and > XFAIL them with your commit -- that would have saved me an hour > yesterday, when I bisected recent changes to figure out why that test > suddenly fails.) Sorry, Thomas. I was going to quickly send this follow-up patch, so glossed over XFAILing. > For reference, here is a test case, a reduced C version of > libgomp/testsuite/libgomp.oacc-c++/template-reduction.C. This test case > works (without Chung-Lin's "[PATCH, OpenACC] Make reduction arguments > addressable" patch) if I enable "POCs", which surprises me a bit, because > I thought after Cesar's recent changes, the gimplifier is doing the same > thing of adding a data clause next to the reduction clause. Probably > it's not doing the exactly same thing, though. Should it? Cesar, do you > have any comments on this? For example (just guessing), should > TREE_ADDRESSABLE be set where the gimplifier does its work, instead of in > the three front ends? There's really nothing wrong about Cesar's patch. The marking addressable needs to be done earlier, or it may be too late during gimplification. I already tried to fix this in gimplify.c before, but didn't completely work. I'll add more testcases for this before I commit any final patches. Thanks, Chung-Lin > > // Reduced C version of > libgomp/testsuite/libgomp.oacc-c++/template-reduction.C. > > const int n = 100; > > // Check present and async and an explicit firstprivate > > int > async_sum (int c) > { > int s = 0; > > #define POCs //present_or_copy(s) > #pragma acc parallel loop num_gangs (10) gang reduction (+:s) POCs > firstprivate (c) async > for (int i = 0; i < n; i++) > s += i+c; > > #pragma acc wait > > return s; > } > > int > main() > { > int result = 0; > > for (int i = 0; i < n; i++) > { > result += i+1; > } > > if (async_sum (1) != result) > __builtin_abort (); > > return 0; > } > > > Grüße > Thomas >