Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-07 Thread Dominik Vogt
On Wed, Jul 06, 2016 at 02:01:06PM +0200, Bernd Schmidt wrote:
> There's one thing I don't quite understand and which seems to have
> changed since v1:
> 
> On 07/04/2016 02:19 PM, Dominik Vogt wrote:
> >@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> >stack_vars_data *data)
> >
> >   /* If there were any, allocate space.  */
> >   if (large_size > 0)
> >-large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
> >-   large_align, true);
> >+{
> >+  large_allocsize = GEN_INT (large_size);
> >+  get_dynamic_stack_size (_allocsize, 0, large_align, NULL);
> >+}
> > }
> >
> >   for (si = 0; si < n; ++si)
> >@@ -1186,6 +1190,19 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> >stack_vars_data *data)
> >   /* Large alignment is only processed in the last pass.  */
> >   if (pred)
> > continue;
> >+
> >+  if (large_allocsize && ! large_allocation_done)
> >+{
> >+  /* Allocate space the virtual stack vars area in the
> >+ prologue.  */
> >+  HOST_WIDE_INT loffset;
> >+
> >+  loffset = alloc_stack_frame_space
> >+(INTVAL (large_allocsize),
> >+ PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
> >+  large_base = get_dynamic_stack_base (loffset, large_align);
> >+  large_allocation_done = true;
> >+}
> >   gcc_assert (large_base != NULL);
> >
> 
> Why is this code split between the two places here?

This is a bugfix from v1 to v2.
I think I had to move this code to the second loop so that the
space for dynamically aligned variables is allocated at the "end"
of the space for stack variables.  I cannot remember what the bug
was, but maybe it was that the variables with fixed and static
alignment ended up at the same address.

Anyway, now that the new allocation strategy is used
unconditionally, it should be possible to move the
get_dynamic_stack_size call down to the second loop and simplify
the code somewhat.  I'll look into that.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-06 Thread Bernd Schmidt
There's one thing I don't quite understand and which seems to have 
changed since v1:


On 07/04/2016 02:19 PM, Dominik Vogt wrote:

@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct 
stack_vars_data *data)

   /* If there were any, allocate space.  */
   if (large_size > 0)
-   large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
-  large_align, true);
+   {
+ large_allocsize = GEN_INT (large_size);
+ get_dynamic_stack_size (_allocsize, 0, large_align, NULL);
+   }
 }

   for (si = 0; si < n; ++si)
@@ -1186,6 +1190,19 @@ expand_stack_vars (bool (*pred) (size_t), struct 
stack_vars_data *data)
  /* Large alignment is only processed in the last pass.  */
  if (pred)
continue;
+
+ if (large_allocsize && ! large_allocation_done)
+   {
+ /* Allocate space the virtual stack vars area in the
+prologue.  */
+ HOST_WIDE_INT loffset;
+
+ loffset = alloc_stack_frame_space
+   (INTVAL (large_allocsize),
+PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
+ large_base = get_dynamic_stack_base (loffset, large_align);
+ large_allocation_done = true;
+   }
  gcc_assert (large_base != NULL);



Why is this code split between the two places here? v1 seems to have 
done it all in the first piece of code where we now only set 
large_allocsize.



Bernd



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-04 Thread Andreas Krebbel
On 07/04/2016 02:19 PM, Dominik Vogt wrote:
> Version 4 with the following change:
> 
>  * Rebased on top of the "Minor cleanup to
>allocate_dynamic_stack_space" patch.  The "Drop excess size
>used for run time allocated stack variables." path needs an
>update because it touches the dsame code as the patch in this
>message.
> 
> Ran the testsuite on s390x biarch, s390 and x86_64.
> 
> On Fri, Jun 24, 2016 at 01:30:44PM +0100, Dominik Vogt wrote:
>>> The only open question I'm aware of is the
>>> stack-usage-2.c test.  I guess foo3() will not generate
>>>
>>>   stack usage might be ... bytes
>>>
>>> On any target anymore, and using alloca() with a constant size
>>> results in "unbounded".  It's unclear to me whether that message
>>> is ever generated, and if so, how to trigger it.
> 
> This point is still open.  If nobody has more comments Andreas
> will commit the (afaik already approved) patch soon and we can
> clean up the test case in a follow up patch.

I would like to see an explicit approval before doing the commit.  I think it 
would also make sense
to let other target maintainers have a look whether this might cause any 
problems.

Bye,

-Andreas-


> 
> Ciao
> 
> Dominik ^_^  ^_^
> 



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-04 Thread Dominik Vogt
Version 4 with the following change:

 * Rebased on top of the "Minor cleanup to
   allocate_dynamic_stack_space" patch.  The "Drop excess size
   used for run time allocated stack variables." path needs an
   update because it touches the dsame code as the patch in this
   message.

Ran the testsuite on s390x biarch, s390 and x86_64.

On Fri, Jun 24, 2016 at 01:30:44PM +0100, Dominik Vogt wrote:
> > The only open question I'm aware of is the
> > stack-usage-2.c test.  I guess foo3() will not generate
> > 
> >   stack usage might be ... bytes
> > 
> > On any target anymore, and using alloca() with a constant size
> > results in "unbounded".  It's unclear to me whether that message
> > is ever generated, and if so, how to trigger it.

This point is still open.  If nobody has more comments Andreas
will commit the (afaik already approved) patch soon and we can
clean up the test case in a follow up patch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* cfgexpand.c (expand_stack_vars): Implement synamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog

* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.
>From 83fafd37883e1af3deb2ff13b9fcaefc9d3b7c7e Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 Nov 2015 09:31:19 +0100
Subject: [PATCH] Allocate constant size dynamic stack space in the
 prologue ...

... and place it in the virtual stack vars area, if the platform supports it.
On S/390 this saves adjusting the stack pointer twice and forcing the frame
pointer into existence.  It also removes the warning with -mwarn-dynamicstack
that is triggered by cfun->calls_alloca == 1.

This fixes a problem with the Linux kernel which aligns the page structure to
16 bytes at run time using inefficient code and issuing a bogus warning.
---
 gcc/cfgexpand.c|  21 +-
 gcc/explow.c   | 226 ++---
 gcc/explow.h   |   8 +
 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c  |  14 ++
 gcc/testsuite/gcc.dg/stack-usage-2.c   |   4 +-
 .../gcc.target/s390/warn-dynamicstack-1.c  |  17 ++
 6 files changed, 209 insertions(+), 81 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/warn-dynamicstack-1.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 56ef71d..f0ef80f 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1052,7 +1052,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_allocsize = NULL;
   unsigned large_align = 0;
+  bool large_allocation_done = false;
   tree decl;
 
   /* Determine if there are any variables requiring "large" alignment.
@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 
   /* If there were any, allocate space.  */
   if (large_size > 0)
-	large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
-		   large_align, true);
+	{
+	  large_allocsize = GEN_INT (large_size);
+	  get_dynamic_stack_size (_allocsize, 0, large_align, NULL);
+	}
 }
 
   for (si = 0; si < n; ++si)
@@ -1186,6 +1190,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Large alignment is only processed in the last pass.  */
 	  if (pred)
 	continue;
+
+	  if (large_allocsize && ! large_allocation_done)
+	{
+	  /* Allocate space the virtual stack vars area in the
+	 prologue.  */
+	  HOST_WIDE_INT loffset;
+
+	  loffset = alloc_stack_frame_space
+		(INTVAL (large_allocsize),
+		 PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
+	  large_base = get_dynamic_stack_base (loffset, large_align);
+	  large_allocation_done = true;
+	}
 	  gcc_assert (large_base != NULL);
 
 	  large_alloc += alignb - 1;
diff --git a/gcc/explow.c b/gcc/explow.c
index 09a0330..d505e98 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1146,82 +1146,55 @@ record_new_stack_level (void)
 update_sjlj_context ();
 }
 
-/* Return an rtx representing the address of an area of memory dynamically
-   pushed on the stack.
+/* Return an rtx doing runtime alignment to REQUIRED_ALIGN on 

Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-06-24 Thread Dominik Vogt
On Thu, Jun 23, 2016 at 04:48:14PM +0100, Dominik Vogt wrote:
> Third version of the patch.  Changes:
> 
>  * Corrected a typo in a test case comment.
>  * Verify that stack variable alignment does not force the frame
>pointer into existence (with -fomit-frame-pointer)
> 
> The test should hopefully run on all targets.  Tested on s390,
> s390x biarch, x86_64.  The only open question I'm aware of is the
> stack-usage-2.c test.  I guess foo3() will not generate
> 
>   stack usage might be ... bytes
> 
> On any target anymore, and using alloca() with a constant size
> results in "unbounded".  It's unclear to me whether that message
> is ever generated, and if so, how to trigger it.

> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> stack_vars_data *data)
> /* Large alignment is only processed in the last pass.  */
> if (pred)
>   continue;
> +
> +   if (large_allocsize && ! large_allocation_done)
> + {
> +   /* Allocate space the virtual stack vars area in the
> +  prologue.  */
> +   HOST_WIDE_INT loffset;
> +
> +   loffset = alloc_stack_frame_space (INTVAL (large_allocsize),
> +  PREFERRED_STACK_BOUNDARY);
 

Uh, this must be PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-06-23 Thread Dominik Vogt
Third version of the patch.  Changes:

 * Corrected a typo in a test case comment.
 * Verify that stack variable alignment does not force the frame
   pointer into existence (with -fomit-frame-pointer)

The test should hopefully run on all targets.  Tested on s390,
s390x biarch, x86_64.  The only open question I'm aware of is the
stack-usage-2.c test.  I guess foo3() will not generate

  stack usage might be ... bytes

On any target anymore, and using alloca() with a constant size
results in "unbounded".  It's unclear to me whether that message
is ever generated, and if so, how to trigger it.

> >>diff --git a/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c 
> >>b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> >>new file mode 100644
> >>index 000..e06a16c
> >>--- /dev/null
> >>+++ b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> >>@@ -0,0 +1,14 @@
> >>+/* Verify that run time aligned local variables are aloocated in the 
> >>prologue
> >>+   in one pass together with normal local variables.  */
> >>+/* { dg-do compile } */
> >>+/* { dg-options "-O0" } */
> >>+
> >>+extern void bar (void *, void *, void *);
> >>+void foo (void)
> >>+{
> >>+  int i;
> >>+  __attribute__ ((aligned(65536))) char runtime_aligned_1[512];
> >>+  __attribute__ ((aligned(32768))) char runtime_aligned_2[1024];
> >>+  bar (, _aligned_1, _aligned_2);
> >>+}
> >>+/* { dg-final { scan-assembler-times "cfi_def_cfa_offset" 2 { target { 
> >>s390*-*-* } } } } */
> >
> >I've no idea how to test this on other targets, or how to express
> >the test in a target independent way.  The scan-assembler-times
> >does not work on x86_64.
> I wonder if you could force -fomit-frame-pointer and see if we still
> end up with a frame pointer?
> 
> jeff

On Fri, May 06, 2016 at 10:37:47AM +0100, Dominik Vogt wrote:
> Updated version of the patch described below.  Apart from fixing a
> bug and adding a test, the new logic is now used always, for all
> targets.  The discussion of the original patch starts here:
>
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03052.html
>
> The new patch has been bootstrapped and regression tested on s390,
> s390x and x86_64, but please check the questions/comments in the
> follow up message.
>
> On Wed, Nov 25, 2015 at 01:56:10PM +0100, Dominik Vogt wrote:
> > The attached patch fixes a warning during Linux kernel compilation
> > on S/390 due to -mwarn-dynamicstack and runtime alignment of stack
> > variables with constant size causing cfun->calls_alloca to be set
> > (even if alloca is not used at all).  The patched code places
> > constant size runtime aligned variables in the "virtual stack
> > vars" area instead of creating a "virtual stack dynamic" area.
> >
> > This behaviour is activated by defining
> >
> >   #define ALLOCATE_DYNAMIC_STACK_SPACE_IN_PROLOGUE 1
> >
> > in the backend; otherwise the old logic is used.
> >
> > The kernel uses runtime alignment for the page structure (aligned
> > to 16 bytes), and apart from triggereing the alloca warning
> > (-mwarn-dynamicstack), the current Gcc also generates inefficient
> > code like
> >
> >   aghi %r15,-160  # prologue: create stack frame
> >   lgr %r11,%r15   # prologue: generate frame pointer
> >   aghi %r15,-32   # space for dynamic stack
> >
> > which could be simplified to
> >
> >   aghi %r15,-192
> >
> > (if later optimization passes are able to get rid of the frame
> > pointer).  Is there a specific reason why the patched behaviour
> > shouldn't be used for all platforms?
> >
> > --
> >
> > As the placement of runtime aligned stack variables with constant
> > size is done completely in the middleend, I don't see a way to fix
> > this in the backend.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* cfgexpand.c (expand_stack_vars): Implement synamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog

* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.
>From 8e48e4a8ff063e7894a375724ed5eddb57018c03 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 Nov 2015 09:31:19 +0100
Subject: [PATCH] Allocate constant size dynamic stack space in the
 prologue ...

... and place it in the virtual stack vars area, if the platform supports it.
On S/390 this saves adjusting the stack pointer twice and forcing the frame
pointer into existence.  It also removes the warning with -mwarn-dynamicstack
that is 

Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-06-22 Thread Jeff Law

On 05/06/2016 03:44 AM, Dominik Vogt wrote:

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 21f21c9..4d48afd 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c

...

@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct 
stack_vars_data *data)

   /* If there were any, allocate space.  */
   if (large_size > 0)
-   large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
-  large_align, true);
+   {
+ large_allocsize = GEN_INT (large_size);
+ get_dynamic_stack_size (_allocsize, 0, large_align, NULL);

...

See below.


@@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct 
stack_vars_data *data)
  /* Large alignment is only processed in the last pass.  */
  if (pred)
continue;
+
+ if (large_allocsize && ! large_allocation_done)
+   {
+ /* Allocate space the virtual stack vars area in the prologue.
+  */
+ HOST_WIDE_INT loffset;
+
+ loffset = alloc_stack_frame_space (INTVAL (large_allocsize),
+PREFERRED_STACK_BOUNDARY);


1) Should this use PREFERRED_STACK_BOUNDARY or just STACK_BOUNDARY?
2) Is this the right place for rounding up, or should
   it be done above, maybe in get_dynamic_stack_size?

I think PREFERRED_STACK_BOUNDARY is the correct one to use.

I think rounding in either place is fine.  We'd like to avoid multiple 
roundings, but otherwise I don't think it really matters.


jeff


Not sure whether this is the right


+ large_base = get_dynamic_stack_base (loffset, large_align);
+ large_allocation_done = true;
+   }
  gcc_assert (large_base != NULL);

  large_alloc += alignb - 1;



diff --git a/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c 
b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
new file mode 100644
index 000..e06a16c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
@@ -0,0 +1,14 @@
+/* Verify that run time aligned local variables are aloocated in the prologue
+   in one pass together with normal local variables.  */
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+
+extern void bar (void *, void *, void *);
+void foo (void)
+{
+  int i;
+  __attribute__ ((aligned(65536))) char runtime_aligned_1[512];
+  __attribute__ ((aligned(32768))) char runtime_aligned_2[1024];
+  bar (, _aligned_1, _aligned_2);
+}
+/* { dg-final { scan-assembler-times "cfi_def_cfa_offset" 2 { target { 
s390*-*-* } } } } */


I've no idea how to test this on other targets, or how to express
the test in a target independent way.  The scan-assembler-times
does not work on x86_64.
I wonder if you could force -fomit-frame-pointer and see if we still end 
up with a frame pointer?


jeff



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-06-22 Thread Jeff Law

On 05/06/2016 03:37 AM, Dominik Vogt wrote:

Updated version of the patch described below.  Apart from fixing a
bug and adding a test, the new logic is now used always, for all
targets.  The discussion of the original patch starts here:

https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03052.html

The new patch has been bootstrapped and regression tested on s390,
s390x and x86_64, but please check the questions/comments in the
follow up message.

On Wed, Nov 25, 2015 at 01:56:10PM +0100, Dominik Vogt wrote:

> The attached patch fixes a warning during Linux kernel compilation
> on S/390 due to -mwarn-dynamicstack and runtime alignment of stack
> variables with constant size causing cfun->calls_alloca to be set
> (even if alloca is not used at all).  The patched code places
> constant size runtime aligned variables in the "virtual stack
> vars" area instead of creating a "virtual stack dynamic" area.
>
> This behaviour is activated by defining
>
>   #define ALLOCATE_DYNAMIC_STACK_SPACE_IN_PROLOGUE 1
Is there some reason why we don't just to this unconditionally?  ie, if 
we know the size of dynamic space, why not just always handle that in 
the prologue?  Seems like a useful optimization for a variety of reasons.


Of course if we do this unconditionally, we definitely need to find a 
way to test it better.


In reality, I don't see where/how the patch uses 
ALLOCATE_DYNAMIC_STACK_SPACE_IN_PROLOGUE anyway and it seems to be 
enabled for all targets, which is what I want :-)





>
> in the backend; otherwise the old logic is used.
>
> The kernel uses runtime alignment for the page structure (aligned
> to 16 bytes), and apart from triggereing the alloca warning
> (-mwarn-dynamicstack), the current Gcc also generates inefficient
> code like
>
>   aghi %r15,-160  # prologue: create stack frame
>   lgr %r11,%r15   # prologue: generate frame pointer
>   aghi %r15,-32   # space for dynamic stack
>
> which could be simplified to
>
>   aghi %r15,-192
>
> (if later optimization passes are able to get rid of the frame
> pointer).  Is there a specific reason why the patched behaviour
> shouldn't be used for all platforms?
>
> --
>
> As the placement of runtime aligned stack variables with constant
> size is done completely in the middleend, I don't see a way to fix
> this in the backend.

Ciao

Dominik ^_^  ^_^

-- Dominik Vogt IBM Germany


0001-v2-ChangeLog


gcc/ChangeLog

* cfgexpand.c (expand_stack_vars): Implement synamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog

* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.


0001-v2-Allocate-constant-size-dynamic-stack-space-in-the-pr.patch


From e76a7e02f7862681d1b5344e64aca1b0a62cdc2c Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 Nov 2015 09:31:19 +0100
Subject: [PATCH] Allocate constant size dynamic stack space in the
 prologue ...

... and place it in the virtual stack vars area, if the platform supports it.
On S/390 this saves adjusting the stack pointer twice and forcing the frame
pointer into existence.  It also removes the warning with -mwarn-dynamicstack
that is triggered by cfun->calls_alloca == 1.

This fixes a problem with the Linux kernel which aligns the page structure to
16 bytes at run time using inefficient code and issuing a bogus warning.



@@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct 
stack_vars_data *data)
  /* Large alignment is only processed in the last pass.  */
  if (pred)
continue;
+
+ if (large_allocsize && ! large_allocation_done)
+   {
+ /* Allocate space the virtual stack vars area in the prologue.
+  */

Line wrapping nit here.  Bring "prologue" down to the next line.

I really like this.  I think the big question is how do we test it.  I 
suspect our bootstrap and regression suite probably don't exercise this 
code is any significant way.


Jeff


Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-05-06 Thread Dominik Vogt
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 21f21c9..4d48afd 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
...
> @@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> stack_vars_data *data)
>  
>/* If there were any, allocate space.  */
>if (large_size > 0)
> - large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
> -large_align, true);
> + {
> +   large_allocsize = GEN_INT (large_size);
> +   get_dynamic_stack_size (_allocsize, 0, large_align, NULL);
...

See below.

> @@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct 
> stack_vars_data *data)
> /* Large alignment is only processed in the last pass.  */
> if (pred)
>   continue;
> +
> +   if (large_allocsize && ! large_allocation_done)
> + {
> +   /* Allocate space the virtual stack vars area in the prologue.
> +*/
> +   HOST_WIDE_INT loffset;
> +
> +   loffset = alloc_stack_frame_space (INTVAL (large_allocsize),
> +  PREFERRED_STACK_BOUNDARY);

1) Should this use PREFERRED_STACK_BOUNDARY or just STACK_BOUNDARY?
2) Is this the right place for rounding up, or should 
   it be done above, maybe in get_dynamic_stack_size?

Not sure whether this is the right 

> +   large_base = get_dynamic_stack_base (loffset, large_align);
> +   large_allocation_done = true;
> + }
> gcc_assert (large_base != NULL);
>  
> large_alloc += alignb - 1;

> diff --git a/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c 
> b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> new file mode 100644
> index 000..e06a16c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
> @@ -0,0 +1,14 @@
> +/* Verify that run time aligned local variables are aloocated in the prologue
> +   in one pass together with normal local variables.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +extern void bar (void *, void *, void *);
> +void foo (void)
> +{
> +  int i;
> +  __attribute__ ((aligned(65536))) char runtime_aligned_1[512];
> +  __attribute__ ((aligned(32768))) char runtime_aligned_2[1024];
> +  bar (, _aligned_1, _aligned_2);
> +}
> +/* { dg-final { scan-assembler-times "cfi_def_cfa_offset" 2 { target { 
> s390*-*-* } } } } */

I've no idea how to test this on other targets, or how to express
the test in a target independent way.  The scan-assembler-times
does not work on x86_64.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-05-06 Thread Dominik Vogt
Updated version of the patch described below.  Apart from fixing a
bug and adding a test, the new logic is now used always, for all
targets.  The discussion of the original patch starts here:

https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03052.html

The new patch has been bootstrapped and regression tested on s390,
s390x and x86_64, but please check the questions/comments in the
follow up message.

On Wed, Nov 25, 2015 at 01:56:10PM +0100, Dominik Vogt wrote:
> The attached patch fixes a warning during Linux kernel compilation
> on S/390 due to -mwarn-dynamicstack and runtime alignment of stack
> variables with constant size causing cfun->calls_alloca to be set
> (even if alloca is not used at all).  The patched code places
> constant size runtime aligned variables in the "virtual stack
> vars" area instead of creating a "virtual stack dynamic" area.
> 
> This behaviour is activated by defining
> 
>   #define ALLOCATE_DYNAMIC_STACK_SPACE_IN_PROLOGUE 1
> 
> in the backend; otherwise the old logic is used.
> 
> The kernel uses runtime alignment for the page structure (aligned
> to 16 bytes), and apart from triggereing the alloca warning
> (-mwarn-dynamicstack), the current Gcc also generates inefficient
> code like
> 
>   aghi %r15,-160  # prologue: create stack frame
>   lgr %r11,%r15   # prologue: generate frame pointer
>   aghi %r15,-32   # space for dynamic stack
> 
> which could be simplified to
> 
>   aghi %r15,-192
> 
> (if later optimization passes are able to get rid of the frame
> pointer).  Is there a specific reason why the patched behaviour
> shouldn't be used for all platforms?
> 
> --
> 
> As the placement of runtime aligned stack variables with constant
> size is done completely in the middleend, I don't see a way to fix
> this in the backend.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* cfgexpand.c (expand_stack_vars): Implement synamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog

* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.
>From e76a7e02f7862681d1b5344e64aca1b0a62cdc2c Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 Nov 2015 09:31:19 +0100
Subject: [PATCH] Allocate constant size dynamic stack space in the
 prologue ...

... and place it in the virtual stack vars area, if the platform supports it.
On S/390 this saves adjusting the stack pointer twice and forcing the frame
pointer into existence.  It also removes the warning with -mwarn-dynamicstack
that is triggered by cfun->calls_alloca == 1.

This fixes a problem with the Linux kernel which aligns the page structure to
16 bytes at run time using inefficient code and issuing a bogus warning.
---
 gcc/cfgexpand.c|  20 +-
 gcc/explow.c   | 232 ++---
 gcc/explow.h   |   9 +
 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c  |  14 ++
 gcc/testsuite/gcc.dg/stack-usage-2.c   |   4 +-
 .../gcc.target/s390/warn-dynamicstack-1.c  |  17 ++
 6 files changed, 212 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/warn-dynamicstack-1.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 21f21c9..4d48afd 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1052,7 +1052,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_allocsize = NULL;
   unsigned large_align = 0;
+  bool large_allocation_done = false;
   tree decl;
 
   /* Determine if there are any variables requiring "large" alignment.
@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 
   /* If there were any, allocate space.  */
   if (large_size > 0)
-	large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
-		   large_align, true);
+	{
+	  large_allocsize = GEN_INT (large_size);
+	  get_dynamic_stack_size (_allocsize, 0, large_align, NULL);
+	}
 }
 
   for (si = 0; si < n; ++si)
@@ -1186,6 +1190,18 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Large alignment is only processed in the last pass.  */