[gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Ilya Verbin
Hi,

This patch introduces new versions of GOMP_target{,_data,_update} for OpenMP 4.1
with unsigned short for map kinds, but without new async arguments yet.

make check-target-libgomp and bootstrap passed, ok for gomp-4_1-branch?


gcc/
* builtin-types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR): Remove.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
* omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with
GOMP_target1.
(BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with
GOMP_target_data1.
(BUILT_IN_GOMP_TARGET_UPDATE): Replace GOMP_target_update with
GOMP_target_update1.
(BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New.
* omp-low.c (expand_omp_target): Use
BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA
and GF_OMP_TARGET_KIND_EXIT_DATA.
Do not pass obsolete pointer to new builtins.
(lower_omp_target): Always use unsigned short for map kinds.
gcc/fortran/
* types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR): Remove.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
libgomp/
* libgomp.map (GOMP_4.1): Add GOMP_target1, GOMP_target_data1,
GOMP_target_update1, GOMP_target_enter_exit_data.
* libgomp_g.h: Declare GOMP_target1, GOMP_target_data1,
GOMP_target_update1, GOMP_target_enter_exit_data.
* target.c (resolve_device): Call gomp_init_device here instead of
GOMP_target*.
(get_kind): Rename is_openacc to short_mapkind.
(gomp_map_vars): Likewise.
(gomp_unmap_vars): Likewise.
(gomp_update): Likewise.
(gomp_target_fallback): New static function.
(gomp_get_target_fn_addr): New static function.
(GOMP_target): Move host fallback and fn lookup to the new functions.
(GOMP_target1): New function.
(gomp_target_data_fallback): New static function.
(GOMP_target_data): Move host fallback to the new function.
(GOMP_target_data1): New function.
(GOMP_target_update): Do not call gomp_init_device.
(GOMP_target_update1): New function.
(GOMP_target_enter_exit_data): New function.


diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 492ca63..3c4b9e3 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -524,8 +524,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT,
 BT_INT)
 DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE,
 BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT)
-DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
-BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR,
+BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
+BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
@@ -534,9 +535,6 @@ DEF_FUNCTION_TYPE_7 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR,
 BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG,
 BT_ULONGLONG, BT_ULONGLONG,
 BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
-DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR,
-BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_PTR, BT_SIZE,
-BT_PTR, BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_8 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_LONG_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index c0d3989..18f81e6 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -189,8 +189,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT,
 BT_INT)
 DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE,
 BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT)
-DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
-BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR,
+BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
+BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT,
  BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
@@ -199,9 +200,6 @@ DEF_FUNCTION_TYPE_7 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR,
 BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG,
 BT_ULONGLONG, BT_ULONGLONG,
 BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
-DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR,
-BT_VOID, BT_INT, BT_PTR_FN_VOID

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-10-13 Thread Ilya Verbin
On Mon, Jun 15, 2015 at 22:48:50 +0300, Ilya Verbin wrote:
> @@ -950,50 +997,41 @@ GOMP_target (int device, void (*fn) (void *), const 
> void *unused,
> ...
> +  devicep->run_func (devicep->target_id, fn_addr, (void *) 
> tgt_vars->tgt_start);

If mapnum is 0, tgt_vars->tgt_start is uninitialized.  This is not a big bug,
because in this case the target function doesn't use this pointer, however
valgrind warns about sending uninitialized data to target.
OK for gomp-4_1-branch?


libgomp/
* target.c (gomp_map_vars): Zero tgt->tgt_start when mapnum is 0.


diff --git a/libgomp/target.c b/libgomp/target.c
index 95360d1..c4e3323 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -323,6 +323,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
   struct splay_tree_key_s cur_node;
   struct target_mem_desc *tgt
 = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
+  tgt->tgt_start = 0;
   tgt->list_count = mapnum;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;


  -- Ilya


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-10-13 Thread Jakub Jelinek
On Tue, Oct 13, 2015 at 05:48:11PM +0300, Ilya Verbin wrote:
> On Mon, Jun 15, 2015 at 22:48:50 +0300, Ilya Verbin wrote:
> > @@ -950,50 +997,41 @@ GOMP_target (int device, void (*fn) (void *), const 
> > void *unused,
> > ...
> > +  devicep->run_func (devicep->target_id, fn_addr, (void *) 
> > tgt_vars->tgt_start);
> 
> If mapnum is 0, tgt_vars->tgt_start is uninitialized.  This is not a big bug,
> because in this case the target function doesn't use this pointer, however
> valgrind warns about sending uninitialized data to target.
> OK for gomp-4_1-branch?
> 
> 
> libgomp/
>   * target.c (gomp_map_vars): Zero tgt->tgt_start when mapnum is 0.

gomp-4_1-branch is frozen.  I'd prefer to initialize tgt_start and tgt_end
to 0 just in the
  if (mapnum == 0)
return tgt;
case.  With that change it is ok for trunk.

> diff --git a/libgomp/target.c b/libgomp/target.c
> index 95360d1..c4e3323 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -323,6 +323,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
> mapnum,
>struct splay_tree_key_s cur_node;
>struct target_mem_desc *tgt
>  = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
> +  tgt->tgt_start = 0;
>tgt->list_count = mapnum;
>tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
>tgt->device_descr = devicep;

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Jakub Jelinek
On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> This patch introduces new versions of GOMP_target{,_data,_update} for OpenMP 
> 4.1
> with unsigned short for map kinds, but without new async arguments yet.

I think I'd prefer (for now) to suffix the functions with _41 instead of 1
(and we'll see if we can come up with better names when async support is
added).  Do we need to change GOMP_target_update though (at least right
now)?  I mean, the construct only allows to and from clauses, not the map
clause, and those don't really have an always modifier, nor release/delete
semantics etc., so at least for now I think using the current
GOMP_target_update should be ok.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Ilya Verbin
On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote:
> On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> > This patch introduces new versions of GOMP_target{,_data,_update} for 
> > OpenMP 4.1
> > with unsigned short for map kinds, but without new async arguments yet.
> 
> I think I'd prefer (for now) to suffix the functions with _41 instead of 1
> (and we'll see if we can come up with better names when async support is
> added).

OK.

> Do we need to change GOMP_target_update though (at least right
> now)?  I mean, the construct only allows to and from clauses, not the map
> clause, and those don't really have an always modifier, nor release/delete
> semantics etc., so at least for now I think using the current
> GOMP_target_update should be ok.

I thought that it wouldn't look good, since without GOMP_target_update_41 we
will need to keep this obsolete parts:

-  switch (start_ix)
-{
-case BUILT_IN_GOMP_TARGET_UPDATE:
-  /* This const void * is part of the current ABI, but we're not actually
-using it.  */
-  args.quick_push (build_zero_cst (ptr_type_node));
-  break;
-case BUILT_IN_GOMP_TARGET:
-case BUILT_IN_GOMP_TARGET_DATA:
-case BUILT_IN_GOACC_DATA_START:
-case BUILT_IN_GOACC_ENTER_EXIT_DATA:
-case BUILT_IN_GOACC_PARALLEL:
-case BUILT_IN_GOACC_UPDATE:
-  break;
-default:
-  gcc_unreachable ();
-}

and

-  tree tkind_type;
-  int talign_shift;
-  if (is_gimple_omp_oacc (stmt))
-   {
- tkind_type = short_unsigned_type_node;
- talign_shift = 8;
-   }
-  else
-   {
- tkind_type = unsigned_char_type_node;
- talign_shift = 3;
-   }
+  tree tkind_type = short_unsigned_type_node;
+  int talign_shift = 8;

  -- Ilya


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Jakub Jelinek
On Mon, Jun 15, 2015 at 07:18:27PM +0300, Ilya Verbin wrote:
> On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote:
> > On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> > > This patch introduces new versions of GOMP_target{,_data,_update} for 
> > > OpenMP 4.1
> > > with unsigned short for map kinds, but without new async arguments yet.
> > 
> > I think I'd prefer (for now) to suffix the functions with _41 instead of 1
> > (and we'll see if we can come up with better names when async support is
> > added).
> 
> OK.

Thanks.

> > Do we need to change GOMP_target_update though (at least right
> > now)?  I mean, the construct only allows to and from clauses, not the map
> > clause, and those don't really have an always modifier, nor release/delete
> > semantics etc., so at least for now I think using the current
> > GOMP_target_update should be ok.
> 
> I thought that it wouldn't look good, since without GOMP_target_update_41 we
> will need to keep this obsolete parts:

I'd prefer to keep it for now, perhaps later on we'll switch to 16-bit kinds
even for that, but better figure out first what to do with the async stuff,
handle the enter/exit data correctly, change the library for OpenMP 4.1 to
do the fully refcounted model.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Ilya Verbin
On Mon, Jun 15, 2015 at 18:25:28 +0200, Jakub Jelinek wrote:
> On Mon, Jun 15, 2015 at 07:18:27PM +0300, Ilya Verbin wrote:
> > On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote:
> > > On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> > > > This patch introduces new versions of GOMP_target{,_data,_update} for 
> > > > OpenMP 4.1
> > > > with unsigned short for map kinds, but without new async arguments yet.
> > > 
> > > I think I'd prefer (for now) to suffix the functions with _41 instead of 1
> > > (and we'll see if we can come up with better names when async support is
> > > added).
> > 
> > OK.
> 
> Thanks.
> 
> > > Do we need to change GOMP_target_update though (at least right
> > > now)?  I mean, the construct only allows to and from clauses, not the map
> > > clause, and those don't really have an always modifier, nor release/delete
> > > semantics etc., so at least for now I think using the current
> > > GOMP_target_update should be ok.
> > 
> > I thought that it wouldn't look good, since without GOMP_target_update_41 we
> > will need to keep this obsolete parts:
> 
> I'd prefer to keep it for now, perhaps later on we'll switch to 16-bit kinds
> even for that, but better figure out first what to do with the async stuff,
> handle the enter/exit data correctly, change the library for OpenMP 4.1 to
> do the fully refcounted model.

Here is the new patch.  OK to commit?


gcc/
* builtin-types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
* omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with
GOMP_target_41.
(BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with
GOMP_target_data_41.
(BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New.
* omp-low.c (expand_omp_target): Use
BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA
and GF_OMP_TARGET_KIND_EXIT_DATA.
Do not pass obsolete pointer to new builtins.
(lower_omp_target): Use unsigned short for map kinds, except
BUILT_IN_GOMP_TARGET_UPDATE.
gcc/fortran/
* types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
libgomp/
* libgomp.map (GOMP_4.1): Add GOMP_target_41, GOMP_target_data_41,
GOMP_target_enter_exit_data.
* libgomp_g.h: Declare GOMP_target_41, GOMP_target_data_41,
GOMP_target_enter_exit_data.
* target.c (resolve_device): Call gomp_init_device here instead of
GOMP_target*.
(get_kind): Rename is_openacc to short_mapkind.
(gomp_map_vars): Likewise.
(gomp_unmap_vars): Likewise.
(gomp_update): Likewise.
(gomp_target_fallback): New static function.
(gomp_get_target_fn_addr): New static function.
(GOMP_target): Move host fallback and fn lookup to the new functions.
(GOMP_target_41): New function.
(gomp_target_data_fallback): New static function.
(GOMP_target_data): Move host fallback to the new function.
(GOMP_target_data_41): New function.
(GOMP_target_update): Do not call gomp_init_device.
(GOMP_target_enter_exit_data): New function.


diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 492ca63..870c957 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -526,6 +526,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, 
BT_BOOL, BT_SIZE,
 BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
 BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR,
+BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
+BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
@@ -534,9 +537,6 @@ DEF_FUNCTION_TYPE_7 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR,
 BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG,
 BT_ULONGLONG, BT_ULONGLONG,
 BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
-DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR,
-BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_PTR, BT_SIZE,
-BT_PTR, BT_PTR, BT_PTR)
 
 DEF_FUNCTION_TYPE_8 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_LONG_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT,
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index c0d3989..a830235 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -189,6 +189,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT,
 BT_INT)
 DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE,
 BT_VOLATILE_PTR, BT_PT

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-15 Thread Jakub Jelinek
On Mon, Jun 15, 2015 at 10:48:50PM +0300, Ilya Verbin wrote:
> Here is the new patch.  OK to commit?
> 
> 
> gcc/
>   * builtin-types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
>   (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
>   * omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with
>   GOMP_target_41.
>   (BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with
>   GOMP_target_data_41.
>   (BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New.
>   * omp-low.c (expand_omp_target): Use
>   BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA
>   and GF_OMP_TARGET_KIND_EXIT_DATA.
>   Do not pass obsolete pointer to new builtins.
>   (lower_omp_target): Use unsigned short for map kinds, except
>   BUILT_IN_GOMP_TARGET_UPDATE.
> gcc/fortran/
>   * types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
>   (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
> libgomp/
>   * libgomp.map (GOMP_4.1): Add GOMP_target_41, GOMP_target_data_41,
>   GOMP_target_enter_exit_data.
>   * libgomp_g.h: Declare GOMP_target_41, GOMP_target_data_41,
>   GOMP_target_enter_exit_data.
>   * target.c (resolve_device): Call gomp_init_device here instead of
>   GOMP_target*.
>   (get_kind): Rename is_openacc to short_mapkind.
>   (gomp_map_vars): Likewise.
>   (gomp_unmap_vars): Likewise.
>   (gomp_update): Likewise.
>   (gomp_target_fallback): New static function.
>   (gomp_get_target_fn_addr): New static function.
>   (GOMP_target): Move host fallback and fn lookup to the new functions.
>   (GOMP_target_41): New function.
>   (gomp_target_data_fallback): New static function.
>   (GOMP_target_data): Move host fallback to the new function.
>   (GOMP_target_data_41): New function.
>   (GOMP_target_update): Do not call gomp_init_device.
>   (GOMP_target_enter_exit_data): New function.

Ok, thanks.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-19 Thread Ilya Verbin
Given that a mapped variable in 4.1 can have different kinds across nested data
regions, we need to store map-type not only for each var, but also for each
structured mapping.  Here is my WIP patch, is it sane? :)
Attached testcase works OK on the device with non-shared memory.


diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index f8efbdd..88623ac 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -107,6 +107,12 @@ enum gomp_map_kind
 #define GOMP_MAP_POINTER_P(X) \
   ((X) == GOMP_MAP_POINTER)
 
+#define GOMP_MAP_ALWAYS_TO_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_TO) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
+#define GOMP_MAP_ALWAYS_FROM_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_FROM) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
 
 /* Asynchronous behavior.  Keep in sync with
libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 87d6c40..8e6d4ac 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -636,6 +636,15 @@ typedef struct splay_tree_node_s *splay_tree_node;
 typedef struct splay_tree_s *splay_tree;
 typedef struct splay_tree_key_s *splay_tree_key;
 
+struct target_var_desc {
+  /* Splay key.  */
+  splay_tree_key key;
+  /* True if data should be copied from device to host at the end.  */
+  bool copy_from;
+  /* True if data always should be copied from device to host at the end.  */
+  bool always_copy_from;
+};
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
@@ -655,9 +664,9 @@ struct target_mem_desc {
   /* Corresponding target device descriptor.  */
   struct gomp_device_descr *device_descr;
 
-  /* List of splay keys to remove (or decrease refcount)
+  /* List of target items to remove (or decrease refcount)
  at the end of region.  */
-  splay_tree_key list[];
+  struct target_var_desc list[];
 };
 
 struct splay_tree_key_s {
@@ -673,8 +682,6 @@ struct splay_tree_key_s {
   uintptr_t refcount;
   /* Asynchronous reference count.  */
   uintptr_t async_refcount;
-  /* True if data should be copied from device to host at the end.  */
-  bool copy_from;
 };
 
 #include "splay-tree.h"
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 90d43eb..c0fcb07 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -651,7 +651,7 @@ gomp_acc_remove_pointer (void *h, bool force_copyfrom, int 
async, int mapnum)
 }
 
   if (force_copyfrom)
-t->list[0]->copy_from = 1;
+t->list[0].copy_from = 1;
 
   gomp_mutex_unlock (&acc_dev->lock);
 
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index d899946..8ea3dd1 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -135,8 +135,8 @@ GOACC_parallel (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-devaddrs[i] = (void *) (tgt->list[i]->tgt->tgt_start
-   + tgt->list[i]->tgt_offset);
+devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
+   + tgt->list[i].key->tgt_offset);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, sizes, 
kinds,
  num_gangs, num_workers, vector_length, async,
diff --git a/libgomp/target.c b/libgomp/target.c
index fb8487a..6829ff4 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -161,6 +161,12 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, 
splay_tree_key oldn,
  (void *) newn->host_start, (void *) newn->host_end,
  (void *) oldn->host_start, (void *) oldn->host_end);
 }
+
+  if (GOMP_MAP_ALWAYS_TO_P (kind))
+devicep->host2dev_func (devicep->target_id,
+   (void *) (oldn->tgt->tgt_start + oldn->tgt_offset),
+   (void *) newn->host_start,
+   newn->host_end - newn->host_start);
   oldn->refcount++;
 }
 
@@ -260,7 +266,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
   int kind = get_kind (short_mapkind, kinds, i);
   if (hostaddrs[i] == NULL)
{
- tgt->list[i] = NULL;
+ tgt->list[i].key = NULL;
  continue;
}
   cur_node.host_start = (uintptr_t) hostaddrs[i];
@@ -271,12 +277,15 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
   splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
   if (n)
{
- tgt->list[i] = n;
+ tgt->list[i].key = n;
+ tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
+ tgt->list[i].always_copy_from
+   = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
  gomp_map_vars_existing (devicep, n, &cur_node, kind & typemask);
}
   else
{
- tgt->list[i] = NULL;
+ tgt->list[i].key = NULL;
 
  size_t align = (size_t) 1 << (kind >> rshift);
  not_found_cnt++;
@@ -297,7 +306,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Ilya Verbin
On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> Given that a mapped variable in 4.1 can have different kinds across nested 
> data
> regions, we need to store map-type not only for each var, but also for each
> structured mapping.  Here is my WIP patch, is it sane? :)
> Attached testcase works OK on the device with non-shared memory.

A bit updated version with a fix for GOMP_MAP_TO_PSET.
make check-target-libgomp passed.


include/gcc/
* gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
GOMP_MAP_ALWAYS_FROM_P): Define.
libgomp/
* libgomp.h (struct target_var_desc): New.
(struct target_mem_desc): Replace array of splay_tree_key with array of
target_var_desc.
(struct splay_tree_key_s): Move copy_from to target_var_desc.
* oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
target_var_desc.
* oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
* target.c (gomp_map_vars_existing): Copy data to device if map-type is
'always to' or 'always tofrom'.
(gomp_map_vars): Use key from target_var_desc.  Set copy_from and
always_copy_from.
(gomp_copy_from_async): Use key and copy_from from target_var_desc.
(gomp_unmap_vars): Copy data from device if always_copy_from is set.
(gomp_offload_image_to_device): Do not use copy_from.
* testsuite/libgomp.c/target-11.c: New test.


diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index 1849478..42bec04 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -107,6 +107,12 @@ enum gomp_map_kind
 #define GOMP_MAP_POINTER_P(X) \
   ((X) == GOMP_MAP_POINTER)
 
+#define GOMP_MAP_ALWAYS_TO_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_TO) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
+#define GOMP_MAP_ALWAYS_FROM_P(X) \
+  (((X) == GOMP_MAP_ALWAYS_FROM) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
+
 
 /* Asynchronous behavior.  Keep in sync with
libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 87d6c40..8e6d4ac 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -636,6 +636,15 @@ typedef struct splay_tree_node_s *splay_tree_node;
 typedef struct splay_tree_s *splay_tree;
 typedef struct splay_tree_key_s *splay_tree_key;
 
+struct target_var_desc {
+  /* Splay key.  */
+  splay_tree_key key;
+  /* True if data should be copied from device to host at the end.  */
+  bool copy_from;
+  /* True if data always should be copied from device to host at the end.  */
+  bool always_copy_from;
+};
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
@@ -655,9 +664,9 @@ struct target_mem_desc {
   /* Corresponding target device descriptor.  */
   struct gomp_device_descr *device_descr;
 
-  /* List of splay keys to remove (or decrease refcount)
+  /* List of target items to remove (or decrease refcount)
  at the end of region.  */
-  splay_tree_key list[];
+  struct target_var_desc list[];
 };
 
 struct splay_tree_key_s {
@@ -673,8 +682,6 @@ struct splay_tree_key_s {
   uintptr_t refcount;
   /* Asynchronous reference count.  */
   uintptr_t async_refcount;
-  /* True if data should be copied from device to host at the end.  */
-  bool copy_from;
 };
 
 #include "splay-tree.h"
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 90d43eb..c0fcb07 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -651,7 +651,7 @@ gomp_acc_remove_pointer (void *h, bool force_copyfrom, int 
async, int mapnum)
 }
 
   if (force_copyfrom)
-t->list[0]->copy_from = 1;
+t->list[0].copy_from = 1;
 
   gomp_mutex_unlock (&acc_dev->lock);
 
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index d899946..8ea3dd1 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -135,8 +135,8 @@ GOACC_parallel (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-devaddrs[i] = (void *) (tgt->list[i]->tgt->tgt_start
-   + tgt->list[i]->tgt_offset);
+devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
+   + tgt->list[i].key->tgt_offset);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, sizes, 
kinds,
  num_gangs, num_workers, vector_length, async,
diff --git a/libgomp/target.c b/libgomp/target.c
index fb8487a..b1640c1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -161,6 +161,12 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, 
splay_tree_key oldn,
  (void *) newn->host_start, (void *) newn->host_end,
  (void *) oldn->host_start, (void *) oldn->host_end);
 }
+
+  if (GOMP_MAP_ALWAYS_TO_P (kind))
+devicep->host2dev_func (devicep->target_id,
+   (void *) (oldn->tgt->tgt_start + oldn->tgt_offset),
+   (void *) newn->host_start,
+

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:40:43PM +0300, Ilya Verbin wrote:
> On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> > Given that a mapped variable in 4.1 can have different kinds across nested 
> > data
> > regions, we need to store map-type not only for each var, but also for each
> > structured mapping.  Here is my WIP patch, is it sane? :)
> > Attached testcase works OK on the device with non-shared memory.
> 
> A bit updated version with a fix for GOMP_MAP_TO_PSET.
> make check-target-libgomp passed.

Ok, thanks.

> include/gcc/
>   * gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
>   GOMP_MAP_ALWAYS_FROM_P): Define.
> libgomp/
>   * libgomp.h (struct target_var_desc): New.
>   (struct target_mem_desc): Replace array of splay_tree_key with array of
>   target_var_desc.
>   (struct splay_tree_key_s): Move copy_from to target_var_desc.
>   * oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
>   target_var_desc.
>   * oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
>   * target.c (gomp_map_vars_existing): Copy data to device if map-type is
>   'always to' or 'always tofrom'.
>   (gomp_map_vars): Use key from target_var_desc.  Set copy_from and
>   always_copy_from.
>   (gomp_copy_from_async): Use key and copy_from from target_var_desc.
>   (gomp_unmap_vars): Copy data from device if always_copy_from is set.
>   (gomp_offload_image_to_device): Do not use copy_from.
>   * testsuite/libgomp.c/target-11.c: New test.

> +  /* Set dd on target to 0 for the further check.  */
> +  #pragma omp target map(always to: dd)
> + { dd; }

This reminds me that:
  if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
remove = true;
in gimplify.c is not what we want, if it is has GOMP_MAP_KIND_ALWAYS,
then we shouldn't remove it even when it is not mentioned inside of the
region's body, because it then has side-effects.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Ilya Verbin
On Tue, Jun 23, 2015 at 13:51:39 +0200, Jakub Jelinek wrote:
> > +  /* Set dd on target to 0 for the further check.  */
> > +  #pragma omp target map(always to: dd)
> > +   { dd; }
> 
> This reminds me that:
>   if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
> remove = true;
> in gimplify.c is not what we want, if it is has GOMP_MAP_KIND_ALWAYS,
> then we shouldn't remove it even when it is not mentioned inside of the
> region's body, because it then has side-effects.

OK for gomp-4_1-branch?


gcc/
* gimplify.c (gimplify_adjust_omp_clauses): Don't remove map clause if
it has map-type-modifier always.
libgomp/
* testsuite/libgomp.c/target-11.c (main): Remove dd from target region.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9b2347a..74fe60b 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6870,7 +6870,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p)
  if (!DECL_P (decl))
break;
  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
- if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
+ if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN)
+ && !(OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS))
remove = true;
  else if (DECL_SIZE (decl)
   && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST
diff --git a/libgomp/testsuite/libgomp.c/target-11.c 
b/libgomp/testsuite/libgomp.c/target-11.c
index 4562d88..0fd183b 100644
--- a/libgomp/testsuite/libgomp.c/target-11.c
+++ b/libgomp/testsuite/libgomp.c/target-11.c
@@ -13,7 +13,7 @@ int main ()
 
   /* Set dd on target to 0 for the further check.  */
   #pragma omp target map(always to: dd)
-   { dd; }
+   ;
 
   dd = 1;
   #pragma omp target map(tofrom: aa) map(always to: bb) \


  -- Ilya


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-23 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 05:54:48PM +0300, Ilya Verbin wrote:
> On Tue, Jun 23, 2015 at 13:51:39 +0200, Jakub Jelinek wrote:
> > > +  /* Set dd on target to 0 for the further check.  */
> > > +  #pragma omp target map(always to: dd)
> > > + { dd; }
> > 
> > This reminds me that:
> >   if (ctx->region_type == ORT_TARGET && !(n->value & GOVD_SEEN))
> > remove = true;
> > in gimplify.c is not what we want, if it is has GOMP_MAP_KIND_ALWAYS,
> > then we shouldn't remove it even when it is not mentioned inside of the
> > region's body, because it then has side-effects.
> 
> OK for gomp-4_1-branch?
> 
> 
> gcc/
>   * gimplify.c (gimplify_adjust_omp_clauses): Don't remove map clause if
>   it has map-type-modifier always.
> libgomp/
>   * testsuite/libgomp.c/target-11.c (main): Remove dd from target region.

GOMP_MAP_RELEASE uses the GOMP_MAP_FLAG_ALWAYS for something different from
always, because always release and always delete is not meaningful.
But as neither release nor delete can appear on map clause in target region,
it doesn't matter (at least for now).
So the patch is ok, thanks.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-24 Thread Jakub Jelinek
On Tue, Jun 23, 2015 at 02:40:43PM +0300, Ilya Verbin wrote:
> On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> > Given that a mapped variable in 4.1 can have different kinds across nested 
> > data
> > regions, we need to store map-type not only for each var, but also for each
> > structured mapping.  Here is my WIP patch, is it sane? :)
> > Attached testcase works OK on the device with non-shared memory.
> 
> A bit updated version with a fix for GOMP_MAP_TO_PSET.
> make check-target-libgomp passed.

Thinking about this more, for always modifier this isn't really sufficient.
Consider:
void
foo (int *p)
{
  #pragma omp target data (alloc:p[0:32])
  {
#pragma omp target data (always, from:p[7:9])
{
  ...
}
  }
}
If all we record is the corresponding splay_tree and the flags
(from/always_from), then this would try to copy from the device
the whole array section, rather than just the small portion of it.
So, supposedly in addition to the splay_tree for always from case we also
need to remember e.g. [relative offset, length] within the splay tree
object.

Jakub


Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-24 Thread Ilya Verbin
On Wed, Jun 24, 2015 at 13:39:03 +0200, Jakub Jelinek wrote:
> Thinking about this more, for always modifier this isn't really sufficient.
> Consider:
> void
> foo (int *p)
> {
>   #pragma omp target data (alloc:p[0:32])
>   {
> #pragma omp target data (always, from:p[7:9])
> {
>   ...
> }
>   }
> }
> If all we record is the corresponding splay_tree and the flags
> (from/always_from), then this would try to copy from the device
> the whole array section, rather than just the small portion of it.
> So, supposedly in addition to the splay_tree for always from case we also
> need to remember e.g. [relative offset, length] within the splay tree
> object.

Indeed, here is the fix, make check-target-libgomp passed.


libgomp/
* libgomp.h (struct target_var_desc): Add offset and length.
* target.c (gomp_map_vars_existing): New argument tgt_var, fill it.
(gomp_map_vars): Move filling of tgt->list[i] into
gomp_map_vars_existing.  Add missed case GOMP_MAP_ALWAYS_FROM.
(gomp_unmap_vars): Add list[i].offset to host and target addresses,
use list[i].length instead of k->host_end - k->host_start.
* testsuite/libgomp.c/target-11.c: Extend for testing array sections.


diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index bd17828..c48e708 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -644,6 +644,12 @@ struct target_var_desc {
   bool copy_from;
   /* True if data always should be copied from device to host at the end.  */
   bool always_copy_from;
+  /* Used for unmapping of array sections, can be nonzero only when
+ always_copy_from is true.  */
+  uintptr_t offset;
+  /* Used for unmapping of array sections, can be less than the size of the
+ whole object only when always_copy_from is true.  */
+  uintptr_t length;
 };
 
 struct target_mem_desc {
diff --git a/libgomp/target.c b/libgomp/target.c
index b1640c1..a394e95 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -149,8 +149,15 @@ resolve_device (int device_id)
 
 static inline void
 gomp_map_vars_existing (struct gomp_device_descr *devicep, splay_tree_key oldn,
-   splay_tree_key newn, unsigned char kind)
+   splay_tree_key newn, struct target_var_desc *tgt_var,
+   unsigned char kind)
 {
+  tgt_var->key = oldn;
+  tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
+  tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
+  tgt_var->offset = newn->host_start - oldn->host_start;
+  tgt_var->length = newn->host_end - newn->host_start;
+
   if ((kind & GOMP_MAP_FLAG_FORCE)
   || oldn->host_start > newn->host_start
   || oldn->host_end < newn->host_end)
@@ -276,13 +283,8 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
cur_node.host_end = cur_node.host_start + sizeof (void *);
   splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
   if (n)
-   {
- tgt->list[i].key = n;
- tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
- tgt->list[i].always_copy_from
-   = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
- gomp_map_vars_existing (devicep, n, &cur_node, kind & typemask);
-   }
+   gomp_map_vars_existing (devicep, n, &cur_node, &tgt->list[i],
+   kind & typemask);
   else
{
  tgt->list[i].key = NULL;
@@ -367,13 +369,8 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
  k->host_end = k->host_start + sizeof (void *);
splay_tree_key n = splay_tree_lookup (mem_map, k);
if (n)
- {
-   tgt->list[i].key = n;
-   tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
-   tgt->list[i].always_copy_from
- = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
-   gomp_map_vars_existing (devicep, n, k, kind & typemask);
- }
+ gomp_map_vars_existing (devicep, n, k, &tgt->list[i],
+ kind & typemask);
else
  {
size_t align = (size_t) 1 << (kind >> rshift);
@@ -385,6 +382,8 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
tgt->list[i].always_copy_from
  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
+   tgt->list[i].offset = 0;
+   tgt->list[i].length = k->host_end - k->host_start;
k->refcount = 1;
k->async_refcount = 0;
tgt->refcount++;
@@ -397,6 +396,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t 
mapnum,
  case GOMP_MAP_FROM:
  case GOMP_MAP_FORCE_ALLOC:
  case GOMP_MAP_FORCE_FROM:
+ case GOMP_MAP_ALWAYS_FROM:
break;
  case GOMP

Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data

2015-06-24 Thread Jakub Jelinek
On Wed, Jun 24, 2015 at 11:11:12PM +0300, Ilya Verbin wrote:
> Indeed, here is the fix, make check-target-libgomp passed.
> 
> 
> libgomp/
>   * libgomp.h (struct target_var_desc): Add offset and length.
>   * target.c (gomp_map_vars_existing): New argument tgt_var, fill it.
>   (gomp_map_vars): Move filling of tgt->list[i] into
>   gomp_map_vars_existing.  Add missed case GOMP_MAP_ALWAYS_FROM.
>   (gomp_unmap_vars): Add list[i].offset to host and target addresses,
>   use list[i].length instead of k->host_end - k->host_start.
>   * testsuite/libgomp.c/target-11.c: Extend for testing array sections.

Ok, thanks.

Jakub


OpenACC async clause regressions (was: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data)

2015-10-19 Thread Thomas Schwinge
Hi!

Chung-Lin, would you please have a look at the following (on
gomp-4_0-branch)?  Also, anyone else got any ideas off-hand?

On Tue, 23 Jun 2015 13:51:39 +0200, Jakub Jelinek  wrote:
> On Tue, Jun 23, 2015 at 02:40:43PM +0300, Ilya Verbin wrote:
> > On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> > > Given that a mapped variable in 4.1 can have different kinds across 
> > > nested data
> > > regions, we need to store map-type not only for each var, but also for 
> > > each
> > > structured mapping.  Here is my WIP patch, is it sane? :)
> > > Attached testcase works OK on the device with non-shared memory.
> > 
> > A bit updated version with a fix for GOMP_MAP_TO_PSET.
> > make check-target-libgomp passed.
> 
> Ok, thanks.
> 
> > include/gcc/
> > * gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
> > GOMP_MAP_ALWAYS_FROM_P): Define.
> > libgomp/
> > * libgomp.h (struct target_var_desc): New.
> > (struct target_mem_desc): Replace array of splay_tree_key with array of
> > target_var_desc.
> > (struct splay_tree_key_s): Move copy_from to target_var_desc.
> > * oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
> > target_var_desc.
> > * oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
> > * target.c (gomp_map_vars_existing): Copy data to device if map-type is
> > 'always to' or 'always tofrom'.
> > (gomp_map_vars): Use key from target_var_desc.  Set copy_from and
> > always_copy_from.
> > (gomp_copy_from_async): Use key and copy_from from target_var_desc.
> > (gomp_unmap_vars): Copy data from device if always_copy_from is set.
> > (gomp_offload_image_to_device): Do not use copy_from.
> > * testsuite/libgomp.c/target-11.c: New test.

(That's gomp-4_1-branch r224838.  The attached
gomp-4_1-branch-r224838.patch is a variant that applies on top of
gomp-4_0-branch r228972.)  This change introduces regressions in OpenACC
async clause handling.

Testing on gomp-4_1-branch r224838:

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

Same for C++.

Testing on gomp-4_0-branch r228972 plus the attached
gomp-4_1-branch-r224838.patch:

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none (test for 
excess errors)
[-PASS:-]{+FAIL:+} 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none execution 
test

Same for C++.

As I mentioned in
,
all three regressions are visible when testing on trunk r228777.  I have
not analyzed why the three different branches show different sets of
regressions -- I'm hoping they're all manifestations of the same
underlying problem: they're all using the OpenACC async clause.

Looking at gomp-4_0-branch r228972 plus the attached
gomp-4_1-branch-r224838.patch, clearly there is "some kind of data
corruption":

$ gdb -q a.out 
Reading symbols from a.out...done.
(gdb) start
[...]
25  a = (float *) malloc (nbytes);
(gdb) n
26  b = (float *) malloc (nbytes);
(gdb) print a
$1 = (float *) 0xab12c0
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x004015d2 in main (argc=1, argv=0x7fffd408) at 
source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c:133
133 if (a[i] != 3.0)
(gdb) print a
$2 = (float *) 0x500680620

0x500680620 looks like a nvptx device pointer to me, which is a) wrong
(after the "malloc", "a" shouldn't change its value throughout program
execution), and b) that "explains" the segmentation fault (device pointer
dereferenced in host code).

So, maybe data is erroneously being copied back to the host from device,
or from libgomp internal data structures.  Maybe some copy_from flag
handling needs to be adjusted or added in the OpenACC code in libgomp?


I have no idea whether that's related, but I noticed that currently we're
not in any way handling async_refcount in libgomp/oacc-*.c -- do we have
to?  (Its name certainly makes me believe it's related to asynchronous
data (un-)mapping.)  Should we be able to drop some of the
OpenACC-specific async implementation in libgomp, and use new/generic
target.c code instead?


Please note that there will be further libgomp changes (target.c, and
other files) coming in l

Re: OpenACC async clause regressions (was: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data)

2015-10-19 Thread Ilya Verbin
On Mon, Oct 19, 2015 at 18:24:35 +0200, Thomas Schwinge wrote:
> Chung-Lin, would you please have a look at the following (on
> gomp-4_0-branch)?  Also, anyone else got any ideas off-hand?
> 
> PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

Maybe it was caused by this change in gomp_unmap_vars?
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01376.html

Looking at the code, I don't see any difference in async_refcount handling, but
I was unable to test it without having hardware :(

  -- Ilya


Re: OpenACC async clause regressions (was: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data)

2015-10-20 Thread Jakub Jelinek
On Mon, Oct 19, 2015 at 07:43:59PM +0300, Ilya Verbin wrote:
> On Mon, Oct 19, 2015 at 18:24:35 +0200, Thomas Schwinge wrote:
> > Chung-Lin, would you please have a look at the following (on
> > gomp-4_0-branch)?  Also, anyone else got any ideas off-hand?
> > 
> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> > [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-2.c 
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> > [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/data-3.c 
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> 
> Maybe it was caused by this change in gomp_unmap_vars?
> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01376.html
> 
> Looking at the code, I don't see any difference in async_refcount handling, 
> but
> I was unable to test it without having hardware :(

I think that is the only patch that could have affected it.
The copy_from change is from the old behavior, where basically all
concurrent mappings ored into the copy_from flag and when refcount went to
0, if there were any mappings with from or tofrom, it copied back,
the OpenMP 4.5 behavior is that whether data is copied from the device
is determined solely by the mapping kind of the mapping that performs the
refcount decrease to 0.  Plus there is the always flag which requests
the data copying operation always, no matter what the refcount is (either on
the mapping/refcount increase side, or unmapping/refcount decrease size).

Jakub


[gomp4] OpenACC async clause regressions (was: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data)

2015-10-22 Thread Thomas Schwinge
Hi!

On Mon, 19 Oct 2015 18:24:35 +0200, I wrote:
> Chung-Lin, would you please have a look at the following (on
> gomp-4_0-branch)?  Also, anyone else got any ideas off-hand?

Ilya, Jakub, thanks for your comments!

> On Tue, 23 Jun 2015 13:51:39 +0200, Jakub Jelinek  wrote:
> > On Tue, Jun 23, 2015 at 02:40:43PM +0300, Ilya Verbin wrote:
> > > On Sat, Jun 20, 2015 at 00:35:14 +0300, Ilya Verbin wrote:
> > > > Given that a mapped variable in 4.1 can have different kinds across 
> > > > nested data
> > > > regions, we need to store map-type not only for each var, but also for 
> > > > each
> > > > structured mapping.  Here is my WIP patch, is it sane? :)
> > > > Attached testcase works OK on the device with non-shared memory.
> > > 
> > > A bit updated version with a fix for GOMP_MAP_TO_PSET.
> > > make check-target-libgomp passed.
> > 
> > Ok, thanks.
> > 
> > > include/gcc/
> > >   * gomp-constants.h (GOMP_MAP_ALWAYS_TO_P,
> > >   GOMP_MAP_ALWAYS_FROM_P): Define.
> > > libgomp/
> > >   * libgomp.h (struct target_var_desc): New.
> > >   (struct target_mem_desc): Replace array of splay_tree_key with array of
> > >   target_var_desc.
> > >   (struct splay_tree_key_s): Move copy_from to target_var_desc.
> > >   * oacc-mem.c (gomp_acc_remove_pointer): Use copy_from from
> > >   target_var_desc.
> > >   * oacc-parallel.c (GOACC_parallel): Use copy_from from target_var_desc.
> > >   * target.c (gomp_map_vars_existing): Copy data to device if map-type is
> > >   'always to' or 'always tofrom'.
> > >   (gomp_map_vars): Use key from target_var_desc.  Set copy_from and
> > >   always_copy_from.
> > >   (gomp_copy_from_async): Use key and copy_from from target_var_desc.
> > >   (gomp_unmap_vars): Copy data from device if always_copy_from is set.
> > >   (gomp_offload_image_to_device): Do not use copy_from.
> > >   * testsuite/libgomp.c/target-11.c: New test.
> 
> (That's gomp-4_1-branch r224838.  The attached
> gomp-4_1-branch-r224838.patch is a variant that applies on top of
> gomp-4_0-branch r228972.)  This change introduces regressions in OpenACC
> async clause handling.

> Testing on gomp-4_0-branch r228972 plus the attached
> gomp-4_1-branch-r224838.patch:
> 
> PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none (test for 
> excess errors)
> [-PASS:-]{+FAIL:+} 
> libgomp.oacc-c/../libgomp.oacc-c-c++-common/asyncwait-1.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none execution 
> test
> 
> Same for C++.

With an XFAIL added (Chung-Lin, please remove that one once you come up
with a fix), and merge conflicts resolved as follows, I have now merged
gomp-4_1-branch r224838 in gomp-4_0-branch r229178:

commit cbef8ef8e3b6bf7ea3705b1fae5462be9e619a56
Merge: 3596aeb a568354
Author: tschwinge 
Date:   Thu Oct 22 17:50:08 2015 +

svn merge -r 224607:224838 
svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_1-branch


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229178 
138bc75d-0d04-0410-961f-82ee72b054a4

 include/ChangeLog.gomp41   |   5 +
 include/gomp-constants.h   |   6 ++
 libgomp/ChangeLog.gomp41   |  18 
 libgomp/libgomp.h  |  15 ++-
 libgomp/oacc-mem.c |   2 +-
 libgomp/oacc-parallel.c|   6 +-
 libgomp/target.c   | 106 +
 libgomp/testsuite/libgomp.c/target-11.c|  51 ++
 .../libgomp.oacc-c-c++-common/asyncwait-1.c|   2 +
 9 files changed, 162 insertions(+), 49 deletions(-)

diff --cc libgomp/oacc-mem.c
index 7fcf199,c0fcb07..a90c912
--- libgomp/oacc-mem.c
+++ libgomp/oacc-mem.c
@@@ -685,7 -650,8 +685,7 @@@ gomp_acc_remove_pointer (void *h, bool 
}
  }
  
-   t->list[0]->copy_from = force_copyfrom ? 1 : 0;
 -  if (force_copyfrom)
 -t->list[0].copy_from = 1;
++  t->list[0].copy_from = force_copyfrom ? 1 : 0;
  
gomp_mutex_unlock (&acc_dev->lock);
  
diff --cc libgomp/oacc-parallel.c
index 2b90c9f,8ea3dd1..e4ecc87
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@@ -261,16 -135,12 +261,16 @@@ GOACC_parallel_keyed (int device, void 
  
devaddrs = gomp_alloca (sizeof (void *) * mapnum);
for (i = 0; i < mapnum; i++)
 -devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
 -  + tgt->list[i].key->tgt_offset);
 +{
-   if (tgt->list[i] != NULL)
-   devaddrs[i] = (void *) (tgt->list[i]->tgt->tgt_start
-   + tgt->list[i]->tgt_offset);
++  if (tgt->list[i].key != NULL)
++  devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
++  + tgt->list[i].key->tgt_offset);
 +  else
 +  devaddrs[i] = NULL;
 +}
  
 -  acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, sizes, 
ki