On 09/11/15 16:35, Tom de Vries wrote:
Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

      1    Insert new exit block only when needed in
         transform_to_exit_first_loop_alt
      2    Make create_parallel_loop return void
      3    Ignore reduction clause on kernels directive
      4    Implement -foffload-alias
      5    Add in_oacc_kernels_region in struct loop
      6    Add pass_oacc_kernels
      7    Add pass_dominator_oacc_kernels
      8    Add pass_ch_oacc_kernels
      9    Add pass_parallelize_loops_oacc_kernels
     10    Add pass_oacc_kernels pass group in passes.def
     11    Update testcases after adding kernels pass group
     12    Handle acc loop directive
     13    Add c-c++-common/goacc/kernels-*.c
     14    Add gfortran.dg/goacc/kernels-*.f95
     15    Add libgomp.oacc-c-c++-common/kernels-*.c
     16    Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Build and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.

this patch addresses the problem that once the offloading region has been split off from the original function, alias analysis can no longer use information available in the original function that would allow it to do a more precise analysis for the offloading function. [ At some point we could use fipa-pta for that, as discussed in PR46032, but that's not feasible now. ]

The basic idea behind the patch is that for typical usage, the base pointers used in an offloaded region are non-aliasing. The patch works by adding restrict to the types of the fields used to pass data to an offloading region.


The patch implements a new option
-foffload-alias=<none|pointer|all>.

The option -foffload-alias=none instructs the compiler to assume that
object references and pointer dereferences in an offload region do not
alias.

The option -foffload-alias=pointer instructs the compiler to assume that objects references in an offload region do not alias.

The option -foffload-alias=all instructs the compiler to make no
assumptions about aliasing in offload regions.

The default value is -foffload-alias=none.

Thanks,
- Tom

Implement -foffload-alias

2015-11-03  Tom de Vries  <t...@codesourcery.com>

	* common.opt (foffload-alias): New option.
	* flag-types.h (enum offload_alias): New enum.
	* omp-low.c (install_var_field): Handle flag_offload_alias.
	* doc/invoke.texi (@item Code Generation Options): Add -foffload-alias.
	(@item -foffload-alias): New item.

	* c-c++-common/goacc/kernels-loop-offload-alias-none.c: New test.
	* c-c++-common/goacc/kernels-loop-offload-alias-ptr.c: New test.
---
 gcc/common.opt                                     | 16 ++++++
 gcc/doc/invoke.texi                                | 11 ++++
 gcc/flag-types.h                                   |  7 +++
 gcc/omp-low.c                                      | 28 +++++++++-
 .../goacc/kernels-loop-offload-alias-none.c        | 61 ++++++++++++++++++++++
 .../goacc/kernels-loop-offload-alias-ptr.c         | 44 ++++++++++++++++
 6 files changed, 165 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 961a1b6..7135b1a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1735,6 +1735,22 @@ Enum(offload_abi) String(ilp32) Value(OFFLOAD_ABI_ILP32)
 EnumValue
 Enum(offload_abi) String(lp64) Value(OFFLOAD_ABI_LP64)
 
+foffload-alias=
+Common Joined RejectNegative Enum(offload_alias) Var(flag_offload_alias) Init(OFFLOAD_ALIAS_NONE)
+-foffload-alias=[all|pointer|none]     Assume non-aliasing in an offload region
+
+Enum
+Name(offload_alias) Type(enum offload_alias) UnknownError(unknown offload aliasing %qs)
+
+EnumValue
+Enum(offload_alias) String(all) Value(OFFLOAD_ALIAS_ALL)
+
+EnumValue
+Enum(offload_alias) String(pointer) Value(OFFLOAD_ALIAS_POINTER)
+
+EnumValue
+Enum(offload_alias) String(none) Value(OFFLOAD_ALIAS_NONE)
+
 fomit-frame-pointer
 Common Report Var(flag_omit_frame_pointer) Optimization
 When possible do not generate stack frames.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2e5953b..6928efd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1143,6 +1143,7 @@ See S/390 and zSeries Options.
 -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol
 -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{} @gol
 -fno-common  -fno-ident @gol
+-foffload-alias=@r{[}none@r{|}pointer@r{|}all@r{]} @gol
 -fpcc-struct-return  -fpic  -fPIC -fpie -fPIE -fno-plt @gol
 -fno-jump-tables @gol
 -frecord-gcc-switches @gol
@@ -23852,6 +23853,16 @@ The options @option{-ftrapv} and @option{-fwrapv} override each other, so using
 using @option{-ftrapv} @option{-fwrapv} @option{-fno-wrapv} on the command-line
 results in @option{-ftrapv} being effective.
 
+@item -foffload-alias=@r{[}none@r{|}pointer@r{|}all@r{]}
+@opindex -foffload-alias
+The option @option{-foffload-alias=none} instructs the compiler to assume that
+objects references and pointer dereferences in an offload region do not alias.
+The option @option{-foffload-alias=pointer} instruct the compiler to assume that
+objects references in an offload region do not alias.  The option
+@option{-foffload-alias=all} instructs the compiler to make no assumptions about
+aliasing in offload regions.  The default value is
+@option{-foffload-alias=none}.
+
 @item -fexceptions
 @opindex fexceptions
 Enable exception handling.  Generates extra code needed to propagate
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 6301cea..87b1677 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -293,5 +293,12 @@ enum gfc_convert
   GFC_FLAG_CONVERT_LITTLE
 };
 
+enum offload_alias
+{
+  OFFLOAD_ALIAS_ALL,
+  OFFLOAD_ALIAS_POINTER,
+  OFFLOAD_ALIAS_NONE
+};
+
 
 #endif /* ! GCC_FLAG_TYPES_H */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 45d1927..d052c13 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1371,6 +1371,14 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx)
   tree field, type, sfield = NULL_TREE;
   splay_tree_key key = (splay_tree_key) var;
 
+  /* We use flag_offload_alias only for the oacc kernels region for the
+     moment.  */
+  bool offload_alias_p = is_oacc_kernels (ctx);
+  bool no_alias_var_p
+    = offload_alias_p && flag_offload_alias != OFFLOAD_ALIAS_ALL;
+  bool no_alias_ptr_p
+    = offload_alias_p && flag_offload_alias == OFFLOAD_ALIAS_NONE;
+
   if ((mask & 8) != 0)
     {
       key = (splay_tree_key) &DECL_UID (var);
@@ -1387,10 +1395,26 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx)
   if (mask & 4)
     {
       gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
-      type = build_pointer_type (build_pointer_type (type));
+
+      type = build_pointer_type (type);
+      if (no_alias_var_p)
+	type = build_qualified_type (type, TYPE_QUAL_RESTRICT);
+
+      type = build_pointer_type (type);
+      if (no_alias_var_p)
+	type = build_qualified_type (type, TYPE_QUAL_RESTRICT);
     }
   else if (by_ref)
-    type = build_pointer_type (type);
+    {
+      if (no_alias_ptr_p
+	  && POINTER_TYPE_P (type))
+	type = build_qualified_type (type, TYPE_QUAL_RESTRICT);
+
+      type = build_pointer_type (type);
+
+      if (no_alias_var_p)
+	type = build_qualified_type (type, TYPE_QUAL_RESTRICT);
+    }
   else if ((mask & 3) == 1 && is_reference (var))
     type = TREE_TYPE (type);
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c
new file mode 100644
index 0000000..79d8daa
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c
@@ -0,0 +1,61 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+/* { dg-additional-options "-fdump-tree-alias-all" } */
+/* { dg-additional-options "-foffload-alias=none" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+static void
+foo (unsigned int *a, unsigned int *b, unsigned int *c)
+{
+  for (COUNTERTYPE i = 0; i < N; i++)
+    a[i] = i * 2;
+
+  for (COUNTERTYPE i = 0; i < N; i++)
+    b[i] = i * 4;
+
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  {
+    for (COUNTERTYPE ii = 0; ii < N; ii++)
+      c[ii] = a[ii] + b[ii];
+  }
+
+  for (COUNTERTYPE i = 0; i < N; i++)
+    if (c[i] != a[i] + b[i])
+      abort ();
+}
+
+int
+main (void)
+{
+  unsigned int *a;
+  unsigned int *b;
+  unsigned int *c;
+
+  a = (unsigned int *)malloc (N * sizeof (unsigned int));
+  b = (unsigned int *)malloc (N * sizeof (unsigned int));
+  c = (unsigned int *)malloc (N * sizeof (unsigned int));
+
+  foo (a, b, c);
+
+  free (a);
+  free (b);
+  free (c);
+
+  return 0;
+}
+
+/* Check that the loop has been split off into a function.  */
+/* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*\\._omp_fn\\.0" 1 "optimized" } } */
+
+/* { dg-final { scan-tree-dump-times "clique 1 base 1" 3 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 2" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 3" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 4" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 5" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 6" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 7" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "(?n)clique .* base .*" 9 "alias" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c
new file mode 100644
index 0000000..de4f45a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c
@@ -0,0 +1,44 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+/* { dg-additional-options "-fdump-tree-alias-all" } */
+/* { dg-additional-options "-foffload-alias=pointer" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+unsigned int a[N];
+unsigned int b[N];
+unsigned int c[N];
+
+int
+main (void)
+{
+  for (COUNTERTYPE i = 0; i < N; i++)
+    a[i] = i * 2;
+
+  for (COUNTERTYPE i = 0; i < N; i++)
+    b[i] = i * 4;
+
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  {
+    for (COUNTERTYPE ii = 0; ii < N; ii++)
+      c[ii] = a[ii] + b[ii];
+  }
+
+  for (COUNTERTYPE i = 0; i < N; i++)
+    if (c[i] != a[i] + b[i])
+      abort ();
+
+  return 0;
+}
+
+/* Check that the loop has been split off into a function.  */
+/* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
+
+/* { dg-final { scan-tree-dump-times "clique 1 base 1" 3 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 2" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 3" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "clique 1 base 4" 1 "alias" } } */
+/* { dg-final { scan-tree-dump-times "(?n)clique .* base .*" 6 "alias" } } */
-- 
1.9.1

Reply via email to