Hi Thomas,
thanks for the first review. I'm still working on another revision,
but wanted to respond to some of the issues you raised first:

On 2019/11/7 8:48 AM, Thomas Schwinge wrote:
(1) The simplest solution: implement a processing which searches and reverts 
such
non-contiguous array map entries in GOACC_parallel_keyed.
(note: I have implemented this in the current attached "v2" patch)

(2) Make the GOACC_parallel_keyed code to not make short cuts for host-modes;
i.e. still do the proper gomp_map_vars processing for all cases.

(3) Modify the non-contiguous array map conventions: a possible solution is to 
use
two maps placed together: one for the array pointer, another for the array 
descriptor (as
opposed to the current style of using only one map) This needs more further 
elaborate
compiler/runtime work.

The first two options will pessimize host-mode performance somewhat. The third 
I have
some WIP patches, but it's still buggy ATM. Seeking your opinion on what we 
should do.
I'll have to think about it some more, but variant (1) doesn't seem so
bad actually, for a first take.  While it's not nice to pessimize in
particular directives with 'if (false)' clauses, at least it does work,
the run-time overhead should not be too bad (also compared to variant
(2), I suppose), and variant (3) can still be implemented later.

The issue is that (1),(2) vs (3) have different binary interfaces, so a 
decision has to be
made first, lest we again have compatibility issues later.

Also, (1) vs (2) also may be somewhat different do to the memory copying 
effects of
gomp_map_vars()  (possible semantic difference versus the usual shared memory 
expectations?)

I'm currently working on another way of implementing something similar to (3),
but using the variadic arguments of GOACC_parallel_keyed instead of maps, WDYT?

@@ -13238,6 +13247,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
        unsigned int num = types.length (), i;
        tree t, side_effects = NULL_TREE, size = NULL_TREE;
        tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
        maybe_zero_len = true;
@@ -13261,6 +13271,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
            length = fold_convert (sizetype, length);
          if (low_bound == NULL_TREE)
            low_bound = integer_zero_node;
+
+         if (non_contiguous)
+           {
+             ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+             continue;
+           }
+
          if (!maybe_zero_len && i > first_non_one)
            {
              if (integer_nonzerop (low_bound))
I'm not at all familiar with this array sections code, will trust your
understanding that we don't need any of the processing that you're
skipping here ('continue'): 'TREE_SIDE_EFFECTS' handling for the length
expressions, and other things.

I will re-check on this.

Ditto for the other minor issues you raised.

          if (DECL_P (decl))
            {
              if (DECL_SIZE (decl)
@@ -2624,6 +2830,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
        gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
      }

+  /* If is OpenACC construct, put non-contiguous array clauses (if any)
+     in front of clause chain. The runtime can then test the first to see
+     if the additional map processing for them is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));
Should that be deemed unsuitable for any reason, then add a new
'GOACC_FLAG_*' flag to indicate existance of non-contiguous arrays.

I'm considering using that convention unconditionally, not sure if it's faster
though, since that means we can't do the 'early breaking' you mentioned when
scanning through maps looking for GOMP_MAP_NONCONTIG_ARRAY_P.

--- include/gomp-constants.h    (revision 277827)
+++ include/gomp-constants.h    (working copy)
@@ -40,6 +40,7 @@
  #define GOMP_MAP_FLAG_SPECIAL_0               (1 << 2)
  #define GOMP_MAP_FLAG_SPECIAL_1               (1 << 3)
  #define GOMP_MAP_FLAG_SPECIAL_2               (1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3                (1 << 5)
  #define GOMP_MAP_FLAG_SPECIAL         (GOMP_MAP_FLAG_SPECIAL_1 \
                                         | GOMP_MAP_FLAG_SPECIAL_0)
  /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -127,6 +128,26 @@ enum gomp_map_kind
      /* Decrement usage count and deallocate if zero.  */
      GOMP_MAP_RELEASE =                        (GOMP_MAP_FLAG_SPECIAL_2
                                         | GOMP_MAP_DELETE),
+    /* Mapping kinds for non-contiguous arrays.  */
+    GOMP_MAP_NONCONTIG_ARRAY =         (GOMP_MAP_FLAG_SPECIAL_3),
+    GOMP_MAP_NONCONTIG_ARRAY_TO =      (GOMP_MAP_NONCONTIG_ARRAY
+                                        | GOMP_MAP_TO),
+    GOMP_MAP_NONCONTIG_ARRAY_FROM =    (GOMP_MAP_NONCONTIG_ARRAY
+                                        | GOMP_MAP_FROM),
+    GOMP_MAP_NONCONTIG_ARRAY_TOFROM =  (GOMP_MAP_NONCONTIG_ARRAY
+                                        | GOMP_MAP_TOFROM),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO =        (GOMP_MAP_NONCONTIG_ARRAY_TO
+                                        | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM =      (GOMP_MAP_NONCONTIG_ARRAY_FROM
+                                                | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM =    (GOMP_MAP_NONCONTIG_ARRAY_TOFROM
+                                                | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_ALLOC =           (GOMP_MAP_NONCONTIG_ARRAY
+                                                | GOMP_MAP_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC =     (GOMP_MAP_NONCONTIG_ARRAY
+                                                | GOMP_MAP_FORCE_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT =   (GOMP_MAP_NONCONTIG_ARRAY
+                                                | GOMP_MAP_FORCE_PRESENT),
Just an idea: instead of this long list, would it maybe be better (if
feasible at all?) to have a single "lead-in" mapping
'GOMP_MAP_NONCONTIG_ARRAY_MODE', which specifies how many of the
following (normal) mappings belong to that "non-contiguous array mode".
(Roughly similar to what 'GOMP_MAP_TO_PSET' is doing with any
'GOMP_MAP_POINTER's following it.)  Might that make some things simpler,
or even more complicated (more internal state to keep)?

I prefer not, wrangling with multiple-map sequences in the complex 
gomp_map_vars code
is proving to be a tedious task; my now given-up version of method (3) above 
tried using
two map kinds (an 'array' and an 'array descriptor'). Haven't yet got it to 
work properly.

Also, a non-contiguous array is just a data clause specification feature, and 
should support
all modes (copy/in/out,present,alloc,etc.) Using a whole 
GOMP_MAP_FLAG_SPECIAL_3 bit in
combination with other flags independently should be warranted.


--- libgomp/oacc-parallel.c     (revision 277827)
+++ libgomp/oacc-parallel.c     (working copy)
+static inline void
+revert_noncontig_array_map_pointers (size_t mapnum, void **hostaddrs,
+                                    unsigned short *kinds)
+{
+  for (int i = 0; i < mapnum; i++)
+    {
+      if (GOMP_MAP_NONCONTIG_ARRAY_P (kinds[i] & 0xff))
+       hostaddrs[i] = *((void **)hostaddrs[i]);
Can we be (or, do we make) sure that 'hostaddrs' will never be in
read-only memory?

And, it's permissible to alter 'hostaddrs'?

Ah, other code (including 'libgomp/target.c') is doing such things, too,
so it must be fine.

The hostaddrs[] array is the 'receiver' record built on stack by omp-low,
so it should always be safe to modify, I think.

Thanks again for the review!
Chung-Lin

Reply via email to