[PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-04 Thread Kevin Buettner
Consider the following OpenMP program:

void foo (int a1) {}

int
main (void)
{
  static int s1 = -41;
  int i1 = 11, i2;

  for (i2 = 1; i2 <= 2; i2++)
{
  int pass = i2;
#pragma omp parallel num_threads (2) firstprivate (i1)
  {
foo (i1);
  }
  foo(pass);
}
  foo (s1); foo (i2);
}

At the moment, when debugging such a program, GDB is not able to find
and print the values of s1, i2, and pass.

My changes to omp-low.c and omp-expand.c, in conjunction with several
other patches, allow GDB to find and print these values.

This is the current behavior when debugging in GDB:

(gdb) b 14
Breakpoint 1 at 0x400617: file ex3.c, line 14.
(gdb) run
Starting program: /mesquite2/.ironwood2/omp-tests/k/ex3-trunk
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x773ca700 (LWP 32628)]

Thread 1 "ex3-trunk" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
14  foo (i1);
(gdb) p s1
No symbol "s1" in current context.
(gdb) p i1
$1 = 11
(gdb) p i2
No symbol "i2" in current context.
(gdb) p pass
No symbol "pass" in current context.
(gdb) c
Continuing.
[Switching to Thread 0x773ca700 (LWP 32628)]

Thread 2 "ex3-trunk" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
14  foo (i1);
(gdb) p s1
No symbol "s1" in current context.
(gdb) p i1
$2 = 11
(gdb) p i2
No symbol "i2" in current context.
(gdb) p pass
No symbol "pass" in current context.
(gdb) bt
#0  main._omp_fn.0 () at ex3.c:14
#1  0x77bc4926 in gomp_thread_start (xdata=)
at gcc/libgomp/team.c:122
#2  0x7799761a in start_thread () from /lib64/libpthread.so.0
#3  0x776d159d in clone () from /lib64/libc.so.6

Note that GDB is unable to find s1, i2, or pass for either thread.

I show the backtrace for thread 2 because it's the more difficult case
to handle due to the stack trace stopping at clone().  The stack frame
for main(), which is where the variables of interest reside, is not a
part of this stack.

When we run this example using the patches associated with this change
along with several other patches, GDB's behavior looks like this:

(gdb) b 14
Breakpoint 1 at 0x400617: file ex3.c, line 14.
(gdb) run
Starting program: /mesquite2/.ironwood2/omp-tests/k/ex3-new
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x773ca700 (LWP 32643)]

Thread 1 "ex3-new" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
14  foo (i1);
(gdb) p s1
$1 = -41
(gdb) p i1
$2 = 11
(gdb) p i2
$3 = 1
(gdb) p pass
$4 = 1
(gdb) c
Continuing.
[Switching to Thread 0x773ca700 (LWP 32643)]

Thread 2 "ex3-new" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
14  foo (i1);
(gdb) p s1
$5 = -41
(gdb) p i1
$6 = 11
(gdb) p i2
$7 = 1
(gdb) p pass
$8 = 1

I didn't show the stack here.  It's the same as before.  (I would,
however, like to be able to make GDB display a unified stack.)

Note that GDB is now able to find and print values for s1, i2, and
pass.

GCC constructs a new function for executing the parallel code.
The debugging information entry for this function is presently
placed at the same level as that of the function in which the
"#pragma omp parallel" directive appeared.

This is partial output from "readelf -v" for the (non-working) example
shown above:

 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
<2e>   DW_AT_external: 1
<2e>   DW_AT_name: (indirect string, offset: 0x80): main
<32>   DW_AT_decl_file   : 1
<33>   DW_AT_decl_line   : 4
<34>   DW_AT_prototyped  : 1
<34>   DW_AT_type: <0x9d>
<38>   DW_AT_low_pc  : 0x400591
<40>   DW_AT_high_pc : 0x71
<48>   DW_AT_frame_base  : 1 byte block: 9c (DW_OP_call_frame_cfa)
<4a>   DW_AT_GNU_all_tail_call_sites: 1
<4a>   DW_AT_sibling : <0x9d>
 <2><4e>: Abbrev Number: 3 (DW_TAG_variable)
<4f>   DW_AT_name: s1
<52>   DW_AT_decl_file   : 1
<53>   DW_AT_decl_line   : 6
<54>   DW_AT_type: <0x9d>
<58>   DW_AT_location: 9 byte block: 3 30 10 60 0 0 0 0 0   
(DW_OP_addr: 601030)
 ...
 <2><7c>: Abbrev Number: 4 (DW_TAG_lexical_block)
<7d>   DW_AT_low_pc  : 0x4005a9
<85>   DW_AT_high_pc : 0x31
 <3><8d>: Abbrev Number: 5 (DW_TAG_variable)
<8e>   DW_AT_name: (indirect string, offset: 0x55): pass
<92>   DW_AT_decl_file   : 1
<93>   DW_AT_decl_line   : 11
<94>   DW_AT_type: <0x9d>
<98>   DW_AT_location: 2 byte block: 91 64  (DW_OP_fbreg: -28)
 ...
 <1>: Abbrev Number: 7 (DW_TAG_subprogram)
   DW_AT_name: 

Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-08-07 Thread Kevin Buettner
On Wed, 10 May 2017 17:24:27 +0200
Jakub Jelinek  wrote:
 
> What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
> of the child function for all kinds of outlined functions, but then you just
> choose one of the many places and add it into the BLOCK tree.  Any reason
> why the DECL_CONTEXT change can't be done in a helper function together
> with all the changes you've added into omp-expand.c, and then call it from
> expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
> you can call it easily also for other constructs, either now or later on?

I've worked out a way to do the DECL_CONTEXT and the scope change
together.  The helper function should be usable for other constructs,
though I have not tested this yet.

> Also, is there any rationale on appending the FUNCTION_DECL to BLOCK_VARS
> instead of prepending it there (which is cheaper)?  Does the debugger
> care about relative order of those artificial functions vs. other
> variables in the lexical scope?

To the best of my knowledge, the debugger doesn't care about the
order.  I've changed the code to prepend the FUNCTION_DECL to
BLOCK_VARS instead.

How does this new version (below) look?

I've done a "make bootstrap" and "make -k check". No regressions found
for x86_64.

gcc/ChangeLog:

* omp-expand.c (adjust_context_scope): New function.
(expand_parallel_call): Call adjust_context_scope.
---
 gcc/omp-expand.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index d6755cd..9eb0a89 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -498,6 +498,42 @@ parallel_needs_hsa_kernel_p (struct omp_region *region)
   return false;
 }
 
+/* Change DECL_CONTEXT of CHILD_FNDECL to that of the parent function.
+   Add CHILD_FNDECL to decl chain of the supercontext of the block
+   ENTRY_BLOCK - this is the block which originally contained the
+   code from which CHILD_FNDECL was created.
+   
+   Together, these actions ensure that the debug info for the outlined
+   function will be emitted with the correct lexical scope.  */
+
+static void
+adjust_context_and_scope (tree entry_block, tree child_fndecl)
+{
+  if (entry_block != NULL_TREE && TREE_CODE (entry_block) == BLOCK)
+{
+  tree b = BLOCK_SUPERCONTEXT (entry_block);
+
+  if (TREE_CODE (b) == BLOCK)
+{
+ tree parent_fndecl;
+
+ /* Follow supercontext chain until the parent fndecl
+is found.  */
+ for (parent_fndecl = BLOCK_SUPERCONTEXT (b);
+  TREE_CODE (parent_fndecl) == BLOCK;
+  parent_fndecl = BLOCK_SUPERCONTEXT (parent_fndecl))
+   ;
+
+ gcc_assert (TREE_CODE (parent_fndecl) == FUNCTION_DECL);
+
+ DECL_CONTEXT (child_fndecl) = parent_fndecl;
+
+ DECL_CHAIN (child_fndecl) = BLOCK_VARS (b);
+ BLOCK_VARS (b) = child_fndecl;
+   }
+}
+}
+
 /* Build the function calls to GOMP_parallel_start etc to actually
generate the parallel operation.  REGION is the parallel region
being expanded.  BB is the block where to insert the code.  WS_ARGS
@@ -667,6 +703,8 @@ expand_parallel_call (struct omp_region *region, 
basic_block bb,
   tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
   t2 = build_fold_addr_expr (child_fndecl);
 
+  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
+
   vec_alloc (args, 4 + vec_safe_length (ws_args));
   args->quick_push (t2);
   args->quick_push (t1);


Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-04 Thread Kevin Buettner
Ahem...  I forgot to note that:

I have bootstrapped and regression tested my patch on x86_64-pc-linux-gnu.

Kevin

On Thu, 4 May 2017 17:45:51 -0700
Kevin Buettner  wrote:

> Consider the following OpenMP program:
> 
> void foo (int a1) {}
> 
> int
> main (void)
> {
>   static int s1 = -41;
>   int i1 = 11, i2;
> 
>   for (i2 = 1; i2 <= 2; i2++)
> {
>   int pass = i2;
> #pragma omp parallel num_threads (2) firstprivate (i1)
>   {
> foo (i1);
>   }
>   foo(pass);
> }
>   foo (s1); foo (i2);
> }
> 
> At the moment, when debugging such a program, GDB is not able to find
> and print the values of s1, i2, and pass.
> 
> My changes to omp-low.c and omp-expand.c, in conjunction with several
> other patches, allow GDB to find and print these values.
> 
> This is the current behavior when debugging in GDB:
> 
> (gdb) b 14
> Breakpoint 1 at 0x400617: file ex3.c, line 14.
> (gdb) run
> Starting program: /mesquite2/.ironwood2/omp-tests/k/ex3-trunk
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x773ca700 (LWP 32628)]
> 
> Thread 1 "ex3-trunk" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
> 14  foo (i1);
> (gdb) p s1
> No symbol "s1" in current context.
> (gdb) p i1
> $1 = 11
> (gdb) p i2
> No symbol "i2" in current context.
> (gdb) p pass
> No symbol "pass" in current context.
> (gdb) c
> Continuing.
> [Switching to Thread 0x773ca700 (LWP 32628)]
> 
> Thread 2 "ex3-trunk" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
> 14  foo (i1);
> (gdb) p s1
> No symbol "s1" in current context.
> (gdb) p i1
> $2 = 11
> (gdb) p i2
> No symbol "i2" in current context.
> (gdb) p pass
> No symbol "pass" in current context.
> (gdb) bt
> #0  main._omp_fn.0 () at ex3.c:14
> #1  0x77bc4926 in gomp_thread_start (xdata=)
> at gcc/libgomp/team.c:122
> #2  0x7799761a in start_thread () from /lib64/libpthread.so.0
> #3  0x776d159d in clone () from /lib64/libc.so.6
> 
> Note that GDB is unable to find s1, i2, or pass for either thread.
> 
> I show the backtrace for thread 2 because it's the more difficult case
> to handle due to the stack trace stopping at clone().  The stack frame
> for main(), which is where the variables of interest reside, is not a
> part of this stack.
> 
> When we run this example using the patches associated with this change
> along with several other patches, GDB's behavior looks like this:
> 
> (gdb) b 14
> Breakpoint 1 at 0x400617: file ex3.c, line 14.
> (gdb) run
> Starting program: /mesquite2/.ironwood2/omp-tests/k/ex3-new
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x773ca700 (LWP 32643)]
> 
> Thread 1 "ex3-new" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
> 14  foo (i1);
> (gdb) p s1
> $1 = -41
> (gdb) p i1
> $2 = 11
> (gdb) p i2
> $3 = 1
> (gdb) p pass
> $4 = 1
> (gdb) c
> Continuing.
> [Switching to Thread 0x773ca700 (LWP 32643)]
> 
> Thread 2 "ex3-new" hit Breakpoint 1, main._omp_fn.0 () at ex3.c:14
> 14  foo (i1);
> (gdb) p s1
> $5 = -41
> (gdb) p i1
> $6 = 11
> (gdb) p i2
> $7 = 1
> (gdb) p pass
> $8 = 1
> 
> I didn't show the stack here.  It's the same as before.  (I would,
> however, like to be able to make GDB display a unified stack.)
> 
> Note that GDB is now able to find and print values for s1, i2, and
> pass.
> 
> GCC constructs a new function for executing the parallel code.
> The debugging information entry for this function is presently
> placed at the same level as that of the function in which the
> "#pragma omp parallel" directive appeared.
> 
> This is partial output from "readelf -v" for the (non-working) example
> shown above:
> 
>  <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
> <2e>   DW_AT_external: 1
> <2e>   DW_AT_name: (indirect string, offset: 0x80): main
> <32>   DW_AT_decl_file   : 1
> <33>   DW_AT_decl_line   : 4
> <34>   DW_AT_prototyped  : 1
> <34>   DW_AT_type: <0x9d>
> <38>   DW_AT_low_pc  : 0x400591
> <40>   DW_AT_high_pc : 0x71
> <48>   DW_AT_frame_base  : 1 byte block: 9c (DW_OP_call_frame_cfa)
> <4a>   DW_AT_GNU_all_tail_call_sites: 1
> <4a>   DW_AT_sibling : <0x9d>
>  <2><4e>: Abbrev Number: 3 (DW_TAG_variable)
> <4f>   DW_AT_name: s1
> <52>   DW_AT_decl_file   : 1
> <53>   DW_AT_decl_line   : 6
> <54>   DW_AT_type: <0x9d>
> <58>   DW_AT_location: 9 byte block: 3 30 10 60 0 0 0 0 0   
> (DW_OP_addr: 601030)
>  ...
>  <2><7c>: Abbrev Number: 4 

Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-05 Thread Alexander Monakov
On Thu, 4 May 2017, Kevin Buettner wrote:
> diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> index 5c48b78..7029951 100644
> --- a/gcc/omp-expand.c
> +++ b/gcc/omp-expand.c
> @@ -667,6 +667,25 @@ expand_parallel_call (struct omp_region *region, 
> basic_block bb,

Outlined functions are also used for 'omp task' and 'omp target' regions, but
here only 'omp parallel' is handled. Will this code need to be duplicated for
those region types?

>tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
>t2 = build_fold_addr_expr (child_fndecl);
>  
> +  if (gimple_block (entry_stmt) != NULL_TREE
> +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)

Here and also below, ...

> +{
> +  tree b = BLOCK_SUPERCONTEXT (gimple_block (entry_stmt));
> +
> +  /* Add child_fndecl to var chain of the supercontext of the
> +block corresponding to entry_stmt.  This ensures that debug
> +info for the outlined function will be emitted for the correct
> +lexical scope.  */
> +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)

... here, I'm curious why the conditionals are necessary -- I don't see why the
conditions can be sometimes true and sometimes false.  Sorry if I'm missing
something obvious.

Thanks.
Alexander


Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-05 Thread Kevin Buettner
On Fri, 5 May 2017 14:23:14 +0300 (MSK)
Alexander Monakov  wrote:

> On Thu, 4 May 2017, Kevin Buettner wrote:
> > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> > index 5c48b78..7029951 100644
> > --- a/gcc/omp-expand.c
> > +++ b/gcc/omp-expand.c
> > @@ -667,6 +667,25 @@ expand_parallel_call (struct omp_region *region, 
> > basic_block bb,  
> 
> Outlined functions are also used for 'omp task' and 'omp target' regions, but
> here only 'omp parallel' is handled. Will this code need to be duplicated for
> those region types?

For 'omp task' and 'omp target', I think it's possible or even likely
that the original context which started these parallel tasks will no
longer exist.  So, it might not make sense to do something equivalent
for 'task' and 'target'.

That said, I have not yet given the matter much study.  There may be
cases where having scoped debug info might still prove useful.

The short answer is, "I don't know."

> >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
> >t2 = build_fold_addr_expr (child_fndecl);
> >  
> > +  if (gimple_block (entry_stmt) != NULL_TREE
> > +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)  
> 
> Here and also below, ...
> 
> > +{
> > +  tree b = BLOCK_SUPERCONTEXT (gimple_block (entry_stmt));
> > +
> > +  /* Add child_fndecl to var chain of the supercontext of the
> > +block corresponding to entry_stmt.  This ensures that debug
> > +info for the outlined function will be emitted for the correct
> > +lexical scope.  */
> > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  
> 
> ... here, I'm curious why the conditionals are necessary -- I don't see why 
> the
> conditions can be sometimes true and sometimes false.  Sorry if I'm missing
> something obvious.

I'm not especially knowledgeable about gcc internals.  It may be the
case these conditionals that you noted are unnecessary and might
perhaps better be handled via the use of an assert.

I will note that when I originally coded it, I had fewer tests.  The code
still worked for the cases that I tried.  Later, when I reviewed it
for posting here, I decided to add some more checks.

I'll explain my reasoning for each of them...

> > +  if (gimple_block (entry_stmt) != NULL_TREE

If we have NULL_TREE here, dereferencing gimple_block (entry_stmt) further
won't work.

> > +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)  

It seemed to me that a having a BLOCK is necessary in order to later
use BLOCK_SUPERCONTEXT.

> > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  

I check to make sure that b is a block so that I can later refer to
BLOCK_VARS (b).

Again, it may be the case that these should always evaluate to true.
If so, then use of an assert might be better here.

Kevin


Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-10 Thread Jakub Jelinek
On Fri, May 05, 2017 at 10:23:59AM -0700, Kevin Buettner wrote:
> On Fri, 5 May 2017 14:23:14 +0300 (MSK)
> Alexander Monakov  wrote:
> 
> > On Thu, 4 May 2017, Kevin Buettner wrote:
> > > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> > > index 5c48b78..7029951 100644
> > > --- a/gcc/omp-expand.c
> > > +++ b/gcc/omp-expand.c
> > > @@ -667,6 +667,25 @@ expand_parallel_call (struct omp_region *region, 
> > > basic_block bb,  
> > 
> > Outlined functions are also used for 'omp task' and 'omp target' regions, 
> > but
> > here only 'omp parallel' is handled. Will this code need to be duplicated 
> > for
> > those region types?
> 
> For 'omp task' and 'omp target', I think it's possible or even likely
> that the original context which started these parallel tasks will no
> longer exist.  So, it might not make sense to do something equivalent
> for 'task' and 'target'.

It depends.  E.g. for #pragma omp taskloop without nogroup clause, it acts the
same as #pragma omp parallel in the nesting regard, the GOMP_taskloop* function 
will
not return until all the tasks finished.  Or if you have #pragma omp task and 
#pragma omp taskwait on the next line, or #pragma omp taskgroup
around it, or #pragma omp target without nowait clause, it will behave the same.
Then there are cases where the encountering function will still be around,
but already not all the lexical scopes (or inline functions), e.g. if there
is #pragma omp taskwait or taskgroup etc. outside of the innermost lexical
scope(s), but still somewhere in the function.  What the debugger should do
in that case is that it should figure out that the spot the task has been
created in has passed, so not show vars in the lexical scopes already left,
but still show others?  Then of course if there is nothing waiting for the
task or async target in the current function, the function's frame could be
left, perhaps multiple callers too.

> > >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
> > >t2 = build_fold_addr_expr (child_fndecl);
> > >  
> > > +  if (gimple_block (entry_stmt) != NULL_TREE
> > > +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)  
> > 
> > Here and also below, ...
> > 
> > > +{
> > > +  tree b = BLOCK_SUPERCONTEXT (gimple_block (entry_stmt));
> > > +
> > > +  /* Add child_fndecl to var chain of the supercontext of the
> > > +block corresponding to entry_stmt.  This ensures that debug
> > > +info for the outlined function will be emitted for the correct
> > > +lexical scope.  */
> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  
> > 
> > ... here, I'm curious why the conditionals are necessary -- I don't see why 
> > the
> > conditions can be sometimes true and sometimes false.  Sorry if I'm missing
> > something obvious.

gimple_block can be NULL.  And, most calls of gimple_block that want to
ensure it is a BLOCK actually do verify it is a BLOCK, while it is unlikely
and it is usually just LTO that screws things up, I'd keep it.

> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)  
> 
> I check to make sure that b is a block so that I can later refer to
> BLOCK_VARS (b).

I believe BLOCK_SUPERCONTEXT of a BLOCK should always be non-NULL, either
another BLOCK, or FUNCTION_DECL.  Thus I think b != NULL_TREE && is
redundant here.

What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
of the child function for all kinds of outlined functions, but then you just
choose one of the many places and add it into the BLOCK tree.  Any reason
why the DECL_CONTEXT change can't be done in a helper function together
with all the changes you've added into omp-expand.c, and then call it from
expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
you can call it easily also for other constructs, either now or later on?
Also, is there any rationale on appending the FUNCTION_DECL to BLOCK_VARS
instead of prepending it there (which is cheaper)?  Does the debugger
care about relative order of those artificial functions vs. other
variables in the lexical scope?

Jakub


Re: [PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-05-11 Thread Richard Biener
On Wed, May 10, 2017 at 5:24 PM, Jakub Jelinek  wrote:
> On Fri, May 05, 2017 at 10:23:59AM -0700, Kevin Buettner wrote:
>> On Fri, 5 May 2017 14:23:14 +0300 (MSK)
>> Alexander Monakov  wrote:
>>
>> > On Thu, 4 May 2017, Kevin Buettner wrote:
>> > > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
>> > > index 5c48b78..7029951 100644
>> > > --- a/gcc/omp-expand.c
>> > > +++ b/gcc/omp-expand.c
>> > > @@ -667,6 +667,25 @@ expand_parallel_call (struct omp_region *region, 
>> > > basic_block bb,
>> >
>> > Outlined functions are also used for 'omp task' and 'omp target' regions, 
>> > but
>> > here only 'omp parallel' is handled. Will this code need to be duplicated 
>> > for
>> > those region types?
>>
>> For 'omp task' and 'omp target', I think it's possible or even likely
>> that the original context which started these parallel tasks will no
>> longer exist.  So, it might not make sense to do something equivalent
>> for 'task' and 'target'.
>
> It depends.  E.g. for #pragma omp taskloop without nogroup clause, it acts the
> same as #pragma omp parallel in the nesting regard, the GOMP_taskloop* 
> function will
> not return until all the tasks finished.  Or if you have #pragma omp task and
> #pragma omp taskwait on the next line, or #pragma omp taskgroup
> around it, or #pragma omp target without nowait clause, it will behave the 
> same.
> Then there are cases where the encountering function will still be around,
> but already not all the lexical scopes (or inline functions), e.g. if there
> is #pragma omp taskwait or taskgroup etc. outside of the innermost lexical
> scope(s), but still somewhere in the function.  What the debugger should do
> in that case is that it should figure out that the spot the task has been
> created in has passed, so not show vars in the lexical scopes already left,
> but still show others?  Then of course if there is nothing waiting for the
> task or async target in the current function, the function's frame could be
> left, perhaps multiple callers too.
>
>> > >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
>> > >t2 = build_fold_addr_expr (child_fndecl);
>> > >
>> > > +  if (gimple_block (entry_stmt) != NULL_TREE
>> > > +  && TREE_CODE (gimple_block (entry_stmt)) == BLOCK)
>> >
>> > Here and also below, ...
>> >
>> > > +{
>> > > +  tree b = BLOCK_SUPERCONTEXT (gimple_block (entry_stmt));
>> > > +
>> > > +  /* Add child_fndecl to var chain of the supercontext of the
>> > > +block corresponding to entry_stmt.  This ensures that debug
>> > > +info for the outlined function will be emitted for the correct
>> > > +lexical scope.  */
>> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)
>> >
>> > ... here, I'm curious why the conditionals are necessary -- I don't see 
>> > why the
>> > conditions can be sometimes true and sometimes false.  Sorry if I'm missing
>> > something obvious.
>
> gimple_block can be NULL.  And, most calls of gimple_block that want to
> ensure it is a BLOCK actually do verify it is a BLOCK, while it is unlikely
> and it is usually just LTO that screws things up, I'd keep it.

gimple_block should always be either NULL or a BLOCK.  It's *_CONTEXT that
can be types, decls or blocks.

>> > > +  if (b != NULL_TREE && TREE_CODE (b) == BLOCK)
>>
>> I check to make sure that b is a block so that I can later refer to
>> BLOCK_VARS (b).
>
> I believe BLOCK_SUPERCONTEXT of a BLOCK should always be non-NULL, either
> another BLOCK, or FUNCTION_DECL.  Thus I think b != NULL_TREE && is
> redundant here.

Yes.

>
> What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
> of the child function for all kinds of outlined functions, but then you just
> choose one of the many places and add it into the BLOCK tree.  Any reason
> why the DECL_CONTEXT change can't be done in a helper function together
> with all the changes you've added into omp-expand.c, and then call it from
> expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
> you can call it easily also for other constructs, either now or later on?
> Also, is there any rationale on appending the FUNCTION_DECL to BLOCK_VARS
> instead of prepending it there (which is cheaper)?  Does the debugger
> care about relative order of those artificial functions vs. other
> variables in the lexical scope?
>
> Jakub


[PING][PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-09-21 Thread Kevin Buettner
Ping.

On Mon, 7 Aug 2017 17:51:38 -0700
Kevin Buettner  wrote:

> On Wed, 10 May 2017 17:24:27 +0200
> Jakub Jelinek  wrote:
>  
> > What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
> > of the child function for all kinds of outlined functions, but then you just
> > choose one of the many places and add it into the BLOCK tree.  Any reason
> > why the DECL_CONTEXT change can't be done in a helper function together
> > with all the changes you've added into omp-expand.c, and then call it from
> > expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
> > you can call it easily also for other constructs, either now or later on?  
> 
> I've worked out a way to do the DECL_CONTEXT and the scope change
> together.  The helper function should be usable for other constructs,
> though I have not tested this yet.
> 
> > Also, is there any rationale on appending the FUNCTION_DECL to BLOCK_VARS
> > instead of prepending it there (which is cheaper)?  Does the debugger
> > care about relative order of those artificial functions vs. other
> > variables in the lexical scope?  
> 
> To the best of my knowledge, the debugger doesn't care about the
> order.  I've changed the code to prepend the FUNCTION_DECL to
> BLOCK_VARS instead.
> 
> How does this new version (below) look?
> 
> I've done a "make bootstrap" and "make -k check". No regressions found
> for x86_64.
> 
> gcc/ChangeLog:
> 
>   * omp-expand.c (adjust_context_scope): New function.
>   (expand_parallel_call): Call adjust_context_scope.
> ---
>  gcc/omp-expand.c | 38 ++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> index d6755cd..9eb0a89 100644
> --- a/gcc/omp-expand.c
> +++ b/gcc/omp-expand.c
> @@ -498,6 +498,42 @@ parallel_needs_hsa_kernel_p (struct omp_region *region)
>return false;
>  }
>  
> +/* Change DECL_CONTEXT of CHILD_FNDECL to that of the parent function.
> +   Add CHILD_FNDECL to decl chain of the supercontext of the block
> +   ENTRY_BLOCK - this is the block which originally contained the
> +   code from which CHILD_FNDECL was created.
> +   
> +   Together, these actions ensure that the debug info for the outlined
> +   function will be emitted with the correct lexical scope.  */
> +
> +static void
> +adjust_context_and_scope (tree entry_block, tree child_fndecl)
> +{
> +  if (entry_block != NULL_TREE && TREE_CODE (entry_block) == BLOCK)
> +{
> +  tree b = BLOCK_SUPERCONTEXT (entry_block);
> +
> +  if (TREE_CODE (b) == BLOCK)
> +{
> +   tree parent_fndecl;
> +
> +   /* Follow supercontext chain until the parent fndecl
> +  is found.  */
> +   for (parent_fndecl = BLOCK_SUPERCONTEXT (b);
> +TREE_CODE (parent_fndecl) == BLOCK;
> +parent_fndecl = BLOCK_SUPERCONTEXT (parent_fndecl))
> + ;
> +
> +   gcc_assert (TREE_CODE (parent_fndecl) == FUNCTION_DECL);
> +
> +   DECL_CONTEXT (child_fndecl) = parent_fndecl;
> +
> +   DECL_CHAIN (child_fndecl) = BLOCK_VARS (b);
> +   BLOCK_VARS (b) = child_fndecl;
> + }
> +}
> +}
> +
>  /* Build the function calls to GOMP_parallel_start etc to actually
> generate the parallel operation.  REGION is the parallel region
> being expanded.  BB is the block where to insert the code.  WS_ARGS
> @@ -667,6 +703,8 @@ expand_parallel_call (struct omp_region *region, 
> basic_block bb,
>tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
>t2 = build_fold_addr_expr (child_fndecl);
>  
> +  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
> +
>vec_alloc (args, 4 + vec_safe_length (ws_args));
>args->quick_push (t2);
>args->quick_push (t1);


Re: [PING][PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-09-22 Thread Jakub Jelinek
On Thu, Sep 21, 2017 at 01:08:01PM -0700, Kevin Buettner wrote:
> Ping.

Ok, thanks.

> On Mon, 7 Aug 2017 17:51:38 -0700
> Kevin Buettner  wrote:
> 
> > On Wed, 10 May 2017 17:24:27 +0200
> > Jakub Jelinek  wrote:
> >  
> > > What I don't like is that the patch is inconsistent, it sets DECL_CONTEXT
> > > of the child function for all kinds of outlined functions, but then you 
> > > just
> > > choose one of the many places and add it into the BLOCK tree.  Any reason
> > > why the DECL_CONTEXT change can't be done in a helper function together
> > > with all the changes you've added into omp-expand.c, and then call it from
> > > expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
> > > you can call it easily also for other constructs, either now or later on? 
> > >  
> > 
> > I've worked out a way to do the DECL_CONTEXT and the scope change
> > together.  The helper function should be usable for other constructs,
> > though I have not tested this yet.
> > 
> > > Also, is there any rationale on appending the FUNCTION_DECL to BLOCK_VARS
> > > instead of prepending it there (which is cheaper)?  Does the debugger
> > > care about relative order of those artificial functions vs. other
> > > variables in the lexical scope?  
> > 
> > To the best of my knowledge, the debugger doesn't care about the
> > order.  I've changed the code to prepend the FUNCTION_DECL to
> > BLOCK_VARS instead.
> > 
> > How does this new version (below) look?
> > 
> > I've done a "make bootstrap" and "make -k check". No regressions found
> > for x86_64.
> > 
> > gcc/ChangeLog:
> > 
> > * omp-expand.c (adjust_context_scope): New function.
> > (expand_parallel_call): Call adjust_context_scope.
> > ---
> >  gcc/omp-expand.c | 38 ++
> >  1 file changed, 38 insertions(+)
> > 
> > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> > index d6755cd..9eb0a89 100644
> > --- a/gcc/omp-expand.c
> > +++ b/gcc/omp-expand.c
> > @@ -498,6 +498,42 @@ parallel_needs_hsa_kernel_p (struct omp_region *region)
> >return false;
> >  }
> >  
> > +/* Change DECL_CONTEXT of CHILD_FNDECL to that of the parent function.
> > +   Add CHILD_FNDECL to decl chain of the supercontext of the block
> > +   ENTRY_BLOCK - this is the block which originally contained the
> > +   code from which CHILD_FNDECL was created.
> > +   
> > +   Together, these actions ensure that the debug info for the outlined
> > +   function will be emitted with the correct lexical scope.  */
> > +
> > +static void
> > +adjust_context_and_scope (tree entry_block, tree child_fndecl)
> > +{
> > +  if (entry_block != NULL_TREE && TREE_CODE (entry_block) == BLOCK)
> > +{
> > +  tree b = BLOCK_SUPERCONTEXT (entry_block);
> > +
> > +  if (TREE_CODE (b) == BLOCK)
> > +{
> > + tree parent_fndecl;
> > +
> > + /* Follow supercontext chain until the parent fndecl
> > +is found.  */
> > + for (parent_fndecl = BLOCK_SUPERCONTEXT (b);
> > +  TREE_CODE (parent_fndecl) == BLOCK;
> > +  parent_fndecl = BLOCK_SUPERCONTEXT (parent_fndecl))
> > +   ;
> > +
> > + gcc_assert (TREE_CODE (parent_fndecl) == FUNCTION_DECL);
> > +
> > + DECL_CONTEXT (child_fndecl) = parent_fndecl;
> > +
> > + DECL_CHAIN (child_fndecl) = BLOCK_VARS (b);
> > + BLOCK_VARS (b) = child_fndecl;
> > +   }
> > +}
> > +}
> > +
> >  /* Build the function calls to GOMP_parallel_start etc to actually
> > generate the parallel operation.  REGION is the parallel region
> > being expanded.  BB is the block where to insert the code.  WS_ARGS
> > @@ -667,6 +703,8 @@ expand_parallel_call (struct omp_region *region, 
> > basic_block bb,
> >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
> >t2 = build_fold_addr_expr (child_fndecl);
> >  
> > +  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
> > +
> >vec_alloc (args, 4 + vec_safe_length (ws_args));
> >args->quick_push (t2);
> >args->quick_push (t1);

Jakub


Re: [PING][PATCH] Output DIEs for outlined OpenMP functions in correct lexical scope

2017-10-01 Thread Kevin Buettner
On Fri, 22 Sep 2017 18:08:39 +0200
Jakub Jelinek  wrote:

> On Thu, Sep 21, 2017 at 01:08:01PM -0700, Kevin Buettner wrote:
> > Ping.  
> 
> Ok, thanks.

Committed.

Kevin
 
> > On Mon, 7 Aug 2017 17:51:38 -0700
> > Kevin Buettner  wrote:
> >   
> > > On Wed, 10 May 2017 17:24:27 +0200
> > > Jakub Jelinek  wrote:
> > >
> > > > What I don't like is that the patch is inconsistent, it sets 
> > > > DECL_CONTEXT
> > > > of the child function for all kinds of outlined functions, but then you 
> > > > just
> > > > choose one of the many places and add it into the BLOCK tree.  Any 
> > > > reason
> > > > why the DECL_CONTEXT change can't be done in a helper function together
> > > > with all the changes you've added into omp-expand.c, and then call it 
> > > > from
> > > > expand_omp_parallel (with the child_fn and entry_stmt arguments) so that
> > > > you can call it easily also for other constructs, either now or later 
> > > > on?
> > > 
> > > I've worked out a way to do the DECL_CONTEXT and the scope change
> > > together.  The helper function should be usable for other constructs,
> > > though I have not tested this yet.
> > >   
> > > > Also, is there any rationale on appending the FUNCTION_DECL to 
> > > > BLOCK_VARS
> > > > instead of prepending it there (which is cheaper)?  Does the debugger
> > > > care about relative order of those artificial functions vs. other
> > > > variables in the lexical scope?
> > > 
> > > To the best of my knowledge, the debugger doesn't care about the
> > > order.  I've changed the code to prepend the FUNCTION_DECL to
> > > BLOCK_VARS instead.
> > > 
> > > How does this new version (below) look?
> > > 
> > > I've done a "make bootstrap" and "make -k check". No regressions found
> > > for x86_64.
> > > 
> > > gcc/ChangeLog:
> > > 
> > >   * omp-expand.c (adjust_context_scope): New function.
> > >   (expand_parallel_call): Call adjust_context_scope.
> > > ---
> > >  gcc/omp-expand.c | 38 ++
> > >  1 file changed, 38 insertions(+)
> > > 
> > > diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
> > > index d6755cd..9eb0a89 100644
> > > --- a/gcc/omp-expand.c
> > > +++ b/gcc/omp-expand.c
> > > @@ -498,6 +498,42 @@ parallel_needs_hsa_kernel_p (struct omp_region 
> > > *region)
> > >return false;
> > >  }
> > >  
> > > +/* Change DECL_CONTEXT of CHILD_FNDECL to that of the parent function.
> > > +   Add CHILD_FNDECL to decl chain of the supercontext of the block
> > > +   ENTRY_BLOCK - this is the block which originally contained the
> > > +   code from which CHILD_FNDECL was created.
> > > +   
> > > +   Together, these actions ensure that the debug info for the outlined
> > > +   function will be emitted with the correct lexical scope.  */
> > > +
> > > +static void
> > > +adjust_context_and_scope (tree entry_block, tree child_fndecl)
> > > +{
> > > +  if (entry_block != NULL_TREE && TREE_CODE (entry_block) == BLOCK)
> > > +{
> > > +  tree b = BLOCK_SUPERCONTEXT (entry_block);
> > > +
> > > +  if (TREE_CODE (b) == BLOCK)
> > > +{
> > > +   tree parent_fndecl;
> > > +
> > > +   /* Follow supercontext chain until the parent fndecl
> > > +  is found.  */
> > > +   for (parent_fndecl = BLOCK_SUPERCONTEXT (b);
> > > +TREE_CODE (parent_fndecl) == BLOCK;
> > > +parent_fndecl = BLOCK_SUPERCONTEXT (parent_fndecl))
> > > + ;
> > > +
> > > +   gcc_assert (TREE_CODE (parent_fndecl) == FUNCTION_DECL);
> > > +
> > > +   DECL_CONTEXT (child_fndecl) = parent_fndecl;
> > > +
> > > +   DECL_CHAIN (child_fndecl) = BLOCK_VARS (b);
> > > +   BLOCK_VARS (b) = child_fndecl;
> > > + }
> > > +}
> > > +}
> > > +
> > >  /* Build the function calls to GOMP_parallel_start etc to actually
> > > generate the parallel operation.  REGION is the parallel region
> > > being expanded.  BB is the block where to insert the code.  WS_ARGS
> > > @@ -667,6 +703,8 @@ expand_parallel_call (struct omp_region *region, 
> > > basic_block bb,
> > >tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
> > >t2 = build_fold_addr_expr (child_fndecl);
> > >  
> > > +  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
> > > +
> > >vec_alloc (args, 4 + vec_safe_length (ws_args));
> > >args->quick_push (t2);
> > >args->quick_push (t1);  
> 
>   Jakub