Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 5:51 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
You might have a reason why you want the entry stack address instead of the
frame address, but you didn't really explain I think?  Or I missed it.
 
  What would a C program do with this, that it cannot do with the frame
  address, that would be useful and cannot be much better done in straight
  assembler?  Do you actually want to expose the argument pointer, maybe?
 
  Yes, we want to use the argument pointer as shown in testcases
  included in my patch.

 Where do we stand on this?  We need the hard stack address at
 function entry for x86 without using frame pointer.   I added
 __builtin_stack_top since __builtin_frame_address can't give
 us what we want.  Should __builtin_stack_top be added to
 middle-end or x86 backend?

 Sorry for not following up; I thought my suggestion was obvious.

 Can you do a __builtin_argument_pointer instead?  That should work
 for all targets, afaics?

To me, stack top is easier to understand and argument pointer isn't
very clear.  Does argument pointer exist when there is no argument?

But I can live with it.  I will update my patch.

Thanks.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Tue, Aug 4, 2015 at 1:50 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 1:45 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
 There is another issue with x86, maybe other targets.  You
 can't get the real stack top when stack is realigned and
 -maccumulate-outgoing-args isn't used since ix86_expand_prologue
 will create and return another stack frame for
 __builtin_frame_address and __builtin_return_address.
 It will be wrong for __builtin_stack_top, which should
 return the real stack address.

 That's why I asked:

   You might have a reason why you want the entry stack address instead of the
   frame address, but you didn't really explain I think?  Or I missed it.

 What would a C program do with this, that it cannot do with the frame
 address, that would be useful and cannot be much better done in straight
 assembler?  Do you actually want to expose the argument pointer, maybe?


 Yes, we want to use the argument pointer as shown in testcases
 included in my patch.


Where do we stand on this?  We need the hard stack address at
function entry for x86 without using frame pointer.   I added
__builtin_stack_top since __builtin_frame_address can't give
us what we want.  Should __builtin_stack_top be added to
middle-end or x86 backend?

Thanks.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
You might have a reason why you want the entry stack address instead of the
frame address, but you didn't really explain I think?  Or I missed it.
 
  What would a C program do with this, that it cannot do with the frame
  address, that would be useful and cannot be much better done in straight
  assembler?  Do you actually want to expose the argument pointer, maybe?
 
  Yes, we want to use the argument pointer as shown in testcases
  included in my patch.
 
 Where do we stand on this?  We need the hard stack address at
 function entry for x86 without using frame pointer.   I added
 __builtin_stack_top since __builtin_frame_address can't give
 us what we want.  Should __builtin_stack_top be added to
 middle-end or x86 backend?

Sorry for not following up; I thought my suggestion was obvious.

Can you do a __builtin_argument_pointer instead?  That should work
for all targets, afaics?


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 6:00 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Aug 19, 2015 at 5:51 AM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
You might have a reason why you want the entry stack address instead of the
frame address, but you didn't really explain I think?  Or I missed it.
 
  What would a C program do with this, that it cannot do with the frame
  address, that would be useful and cannot be much better done in straight
  assembler?  Do you actually want to expose the argument pointer, maybe?
 
  Yes, we want to use the argument pointer as shown in testcases
  included in my patch.

 Where do we stand on this?  We need the hard stack address at
 function entry for x86 without using frame pointer.   I added
 __builtin_stack_top since __builtin_frame_address can't give
 us what we want.  Should __builtin_stack_top be added to
 middle-end or x86 backend?

 Sorry for not following up; I thought my suggestion was obvious.

 Can you do a __builtin_argument_pointer instead?  That should work
 for all targets, afaics?

 To me, stack top is easier to understand and argument pointer isn't
 very clear.  Does argument pointer exist when there is no argument?

 But I can live with it.  I will update my patch.


Here is a patch to add __builtin_argument_pointer.  I only have

 -- Built-in Function: void * __builtin_argument_pointer (void)
 This function returns the argument pointer.

as documentation.  Can you suggest a better description so that it can
be implemented also by other compilers?

Thanks.

-- 
H.J.
From 9af08fdda587e1876e09840499000e35cc841e96 Mon Sep 17 00:00:00 2001
From: H.J. Lu hjl.to...@gmail.com
Date: Tue, 21 Jul 2015 14:32:09 -0700
Subject: [PATCH] Add __builtin_argument_pointer

When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer register is required, which
wastes one register and 2 instructions.  For x86-32, one fewer register
means a significant negative impact on performance.  This patch adds a
new builtin function, __builtin_argument_pointer.  It returns the
argument pointer, which, on x86, can be used to compute the stack
address at function entry by subtracting the size of an integer
register.

gcc/

	PR target/66960
	* builtin-types.def (BT_FN_PTR_VOID): New function type.
	* builtins.c (expand_builtin): Handle BUILT_IN_ARGUMENT_POINTER.
	(is_simple_builtin): Likewise.
	* ipa-pure-const.c (special_builtin_state): Likewise.
	* builtins.def: Add BUILT_IN_ARGUMENT_POINTER.
	* function.h (function): Add argument_pointer_taken.
	* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
	used and the argument pointer has been taken.
	* doc/extend.texi: Document __builtin_argument_pointer.

gcc/testsuite/

	PR target/66960
	* gcc.target/i386/pr66960-1.c: New test.
	* gcc.target/i386/pr66960-2.c: Likewise.
	* gcc.target/i386/pr66960-3.c: Likewise.
	* gcc.target/i386/pr66960-4.c: Likewise.
	* gcc.target/i386/pr66960-5.c: Likewise.
---
 gcc/builtin-types.def                     |  1 +
 gcc/builtins.c                            |  5 +
 gcc/builtins.def                          |  1 +
 gcc/config/i386/i386.c                    |  6 ++
 gcc/doc/extend.texi                       |  4
 gcc/function.h                            |  3 +++
 gcc/ipa-pure-const.c                      |  1 +
 gcc/testsuite/gcc.target/i386/pr66960-1.c | 34 +++
 gcc/testsuite/gcc.target/i386/pr66960-2.c | 34 +++
 gcc/testsuite/gcc.target/i386/pr66960-3.c | 18
 gcc/testsuite/gcc.target/i386/pr66960-4.c | 22
 gcc/testsuite/gcc.target/i386/pr66960-5.c | 22
 12 files changed, 151 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-5.c

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 0e34531..2b6b5ab 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -177,6 +177,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_COMPLEX_LONGDOUBLE_LONGDOUBLE,
 		 BT_COMPLEX_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_UINT, BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_SIZE, BT_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_VOID, BT_PTR, BT_VOID)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_UINT, BT_INT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_LONG, BT_INT, BT_LONG)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 31969ca..b1cfa44 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6206,6 +6206,10 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 case BUILT_IN_CONSTANT_P:
   return 

Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 9:58 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 08:25:49AM -0700, H.J. Lu wrote:
 Here is a patch to add __builtin_argument_pointer.  I only have

 Sorry to be a pain but...  all the other builtins use _address
 instead of _pointer, it's probably best to follow that.

  -- Built-in Function: void * __builtin_argument_pointer (void)
  This function returns the argument pointer.

 as documentation.  Can you suggest a better description so that it can
 be implemented also by other compilers?

 Maybe something like (heavily cut'n'pasted):


 @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
 This function is similar to @code{__builtin_frame_address} with an
 argument of 0, but it returns the address of the incoming arguments to
 the current function rather than the address of its frame.

This doesn't make sense when there is no argument or arguments
are passed in registers.  To me, argument pointer is a virtual concept
and an implementation detail internal to GCC.  I am not sure if another
compiler can implement it based on this description.

 The exact definition of this address depends upon the processor and the
 calling convention.  Usually some arguments are passed in registers and
 the rest on the stack, and this builtin returns the address of the first
 argument that is on the stack.


 +  /* Can't use DRAP if the stack address has been taken.  */
 +  if (cfun->argument_pointer_taken)
 +    sorry ("%<__builtin_argument_pointer%> not supported with stack "
 +           "realignment.  This may be worked around by adding "
 +           "-maccumulate-outgoing-args.");

 This doesn't work with DRAP?  Pity :-(

With DRAP,  we do

  /* Replicate the return address on the stack so that return
     address can be reached via (argp - 1) slot.  This is needed
     to implement macro RETURN_ADDR_RTX and intrinsic function
     expand_builtin_return_addr etc.  */
  t = plus_constant (Pmode, crtl->drap_reg, -UNITS_PER_WORD);
  t = gen_frame_mem (word_mode, t);
  insn = emit_insn (gen_push (t));
  RTX_FRAME_RELATED_P (insn) = 1;

  /* For the purposes of frame and register save area addressing,
     we've started over with a new frame.  */
  m->fs.sp_offset = INCOMING_FRAME_SP_OFFSET;
  m->fs.realigned = true;

which doesn't work for __builtin_argument_pointer.

 The patch looks plausible, but I of course can not approve it.


Thanks.


-- 
H.J.
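
For readers following the arithmetic: the "(argp - 1) slot" in the comment
above and the "subtract the size of an integer register" rule in the patch
description describe the same relationship on x86.  A minimal sketch,
assuming the argument pointer is the address of the first stack-passed
argument and that a pointer is as wide as an integer register (true for
-m32 and -m64, not for x32).  The helper name entry_stack_address is
hypothetical, and __builtin_argument_pointer is only the builtin proposed
in this thread, so the sketch takes the pointer as a plain parameter:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  ARGP stands for the value the
   proposed __builtin_argument_pointer would return on x86: the address
   of the first argument passed on the stack.  The return address sits
   one word below it, and that slot is where the stack pointer pointed
   at function entry.  Assumes sizeof (void *) equals the width of an
   integer register.  */
static void *
entry_stack_address (void *argp)
{
  assert (argp != NULL);
  return (char *) argp - sizeof (void *);
}
```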


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
  Maybe something like (heavily cut'n'pasted):
 
 
  @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
  This function is similar to @code{__builtin_frame_address} with an
  argument of 0, but it returns the address of the incoming arguments to
  the current function rather than the address of its frame.
 
 This doesn't make sense when there is no argument or arguments
 are passed in registers.

Sure, but see the weasel-words below (The exact...)

 To me, argument pointer is a virtual concept
 and an implementation detail internal to GCC.  I am not sure if another
 compiler can implement it based on this description.

The same is true for frame_address, on many machines.

  The exact definition of this address depends upon the processor and the
  calling convention.  Usually some arguments are passed in registers and
  the rest on the stack, and this builtin returns the address of the first
  argument that is on the stack.


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 10:48 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
  Maybe something like (heavily cut'n'pasted):
 
 
  @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
  This function is similar to @code{__builtin_frame_address} with an
  argument of 0, but it returns the address of the incoming arguments to
  the current function rather than the address of its frame.

 This doesn't make sense when there is no argument or arguments
 are passed in registers.

 Sure, but see the weasel-words below (The exact...)

 To me, argument pointer is a virtual concept
 and an implementation detail internal to GCC.  I am not sure if another
 compiler can implement it based on this description.

 The same is true for frame_address, on many machines.

Stack frame is well understood unlike argument pointer which is
pretty vague.

  The exact definition of this address depends upon the processor and the
  calling convention.  Usually some arguments are passed in registers and
  the rest on the stack, and this builtin returns the address of the first
  argument that is on the stack.


 Segher



-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2015 at 08:25:49AM -0700, H.J. Lu wrote:
 Here is a patch to add __builtin_argument_pointer.  I only have

Sorry to be a pain but...  all the other builtins use _address
instead of _pointer, it's probably best to follow that.

  -- Built-in Function: void * __builtin_argument_pointer (void)
  This function returns the argument pointer.
 
 as documentation.  Can you suggest a better description so that it can
 be implemented also by other compilers?

Maybe something like (heavily cut'n'pasted):


@deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the first
argument that is on the stack.


 +  /* Can't use DRAP if the stack address has been taken.  */
 +  if (cfun->argument_pointer_taken)
 +    sorry ("%<__builtin_argument_pointer%> not supported with stack "
 +           "realignment.  This may be worked around by adding "
 +           "-maccumulate-outgoing-args.");

This doesn't work with DRAP?  Pity :-(

The patch looks plausible, but I of course can not approve it.

Thanks,


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2015 at 03:18:46PM -0700, H.J. Lu wrote:
 @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
 This function is similar to @code{__builtin_frame_address} with an
 argument of 0, but it returns the address of the incoming arguments to
 the current function rather than the address of its frame.
 
 The exact definition of this address depends upon the processor and the
 calling convention.  Usually some arguments are passed in registers and
 the rest on the stack, and this builtin returns the address of the
 first argument which would be passed on the stack.
 @end deftypefn

That is fine by me.  Thanks!


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 3:10 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 02:53:47PM -0700, H.J. Lu wrote:
 How about this

 @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
 This function is similar to @code{__builtin_frame_address} with an
 argument of 0, but it returns the address of the incoming arguments to
 the current function rather than the address of its frame.  Unlike
 @code{__builtin_frame_address}, the frame pointer register isn't
 required.

 That last line isn't true, if your port uses INITIAL_FRAME_POINTER_RTX.
 Maybe it shouldn't be true otherwise either (but currently a hard frame
 pointer is forced, indeed).  Have we gone full circle now? ;-)

Let's drop it:


@deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the
first argument which would be passed on the stack.
@end deftypefn


-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread H.J. Lu
On Wed, Aug 19, 2015 at 10:53 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Aug 19, 2015 at 10:48 AM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
  Maybe something like (heavily cut'n'pasted):
 
 
  @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
  This function is similar to @code{__builtin_frame_address} with an
  argument of 0, but it returns the address of the incoming arguments to
  the current function rather than the address of its frame.

 This doesn't make sense when there is no argument or arguments
 are passed in registers.

 Sure, but see the weasel-words below (The exact...)

 To me, argument pointer is a virtual concept
 and an implementation detail internal to GCC.  I am not sure if another
 compiler can implement it based on this description.

 The same is true for frame_address, on many machines.

 Stack frame is well understood unlike argument pointer which is
 pretty vague.


How about this

@deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.  Unlike
@code{__builtin_frame_address}, the frame pointer register isn't
required.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the
first argument which would be passed on the stack.
@end deftypefn

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2015 at 02:53:47PM -0700, H.J. Lu wrote:
 How about this
 
 @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
 This function is similar to @code{__builtin_frame_address} with an
 argument of 0, but it returns the address of the incoming arguments to
 the current function rather than the address of its frame.  Unlike
 @code{__builtin_frame_address}, the frame pointer register isn't
 required.

That last line isn't true, if your port uses INITIAL_FRAME_POINTER_RTX.
Maybe it shouldn't be true otherwise either (but currently a hard frame
pointer is forced, indeed).  Have we gone full circle now? ;-)

 The exact definition of this address depends upon the processor and the
 calling convention.  Usually some arguments are passed in registers and
 the rest on the stack, and this builtin returns the address of the
 first argument which would be passed on the stack.
 @end deftypefn


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Mike Stump
On Aug 4, 2015, at 5:30 AM, H.J. Lu hjl.to...@gmail.com wrote:
 Where does this feature belong?

I prefer the middle end.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump mikest...@comcast.net wrote:
 On Aug 4, 2015, at 5:30 AM, H.J. Lu hjl.to...@gmail.com wrote:
 Where does this feature belong?

 I prefer the middle end.

Any comments on my middle-end patch?

Thanks.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 10:43 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
  Any comments on my middle-end patch?
 
  So, if the answer is the same as frame_address (0), why not have the 
  fallback just expand to that?  Then, one can use this builtin everywhere 
  that frame address is used today.  People that want a faster, tighter port 
  can then implement the hook and achieve higher performance.

 The motivation of __builtin_stack_top is that frame_address requires a
 frame pointer register, which isn't desirable for x86.  __builtin_stack_top
 doesn't require a frame pointer register.

 If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
 you don't get crtl->accesses_prior_frames set either, and as far as I can
 see everything works fine?  For __builtin_frame_address(0).

 You might have a reason why you want the entry stack address instead of the
 frame address, but you didn't really explain I think?  Or I missed it.


expand_builtin_return_addr sets

crtl->accesses_prior_frames = 1;

for __builtin_frame_address, which requires a frame pointer register.
__builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
pointer register isn't required.

-- 
H.J.
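
To see the cost described above, one can compile a small function that
calls __builtin_frame_address (0) and inspect the generated code.  A
minimal sketch; the function name current_frame is made up for
illustration, and the exact code generated depends on the GCC version
and target:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only.  noinline keeps the call to the builtin in its own
   function so its prologue is easy to find in the assembly output.  */
__attribute__ ((noinline)) void *
current_frame (void)
{
  /* Existing GCC builtin: the frame address of the current function.  */
  void *frame = __builtin_frame_address (0);
  assert (frame != NULL);
  return frame;
}
```

Compiling this with something like "gcc -O2 -fomit-frame-pointer -S" on
x86 should still show the frame pointer being set up in current_frame,
which is the overhead __builtin_stack_top was meant to avoid.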


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Mike Stump
On Aug 4, 2015, at 8:44 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump mikest...@comcast.net wrote:
 On Aug 4, 2015, at 5:30 AM, H.J. Lu hjl.to...@gmail.com wrote:
 Where does this feature belong?
 
 I prefer the middle end.
 
 Any comments on my middle-end patch?

So, if the answer is the same as frame_address (0), why not have the fallback 
just expand to that?  Then, one can use this builtin everywhere that frame 
address is used today.  People that want a faster, tighter port can then 
implement the hook and achieve higher performance.
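
The fallback idea above can be approximated in user code.  A rough
sketch, assuming a compiler that provides __has_builtin (GCC 10 and
later, Clang).  __builtin_stack_top is only the builtin proposed in this
thread and is not known to exist in any released compiler, so the
fallback branch is the one expected to be taken.  The names STACK_TOP
and stack_top_here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Older compilers lack __has_builtin; treat every builtin as missing.  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

/* Use the proposed builtin where a compiler provides it, otherwise fall
   back to the frame address, as suggested for the middle-end default.  */
#if __has_builtin(__builtin_stack_top)
# define STACK_TOP() __builtin_stack_top ()
#else
# define STACK_TOP() __builtin_frame_address (0)
#endif

/* Example use: return the address the macro yields in this function.  */
static void *
stack_top_here (void)
{
  void *top = STACK_TOP ();
  assert (top != NULL);
  return top;
}
```

As the later messages in the thread point out, the two values are not
the same on x86, so this mirrors only the shape of the fallback, not its
semantics.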

Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 10:16 AM, Mike Stump mikest...@comcast.net wrote:
 On Aug 4, 2015, at 8:44 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump mikest...@comcast.net wrote:
 On Aug 4, 2015, at 5:30 AM, H.J. Lu hjl.to...@gmail.com wrote:
 Where does this feature belong?

 I prefer the middle end.

 Any comments on my middle-end patch?

 So, if the answer is the same as frame_address (0), why not have the fallback 
 just expand to that?  Then, one can use this builtin everywhere that frame 
 address is used today.  People that want a faster, tighter port can then 
 implement the hook and achieve higher performance.

The motivation of __builtin_stack_top is that frame_address requires a
frame pointer register, which isn't desirable for x86.  __builtin_stack_top
doesn't require a frame pointer register.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
  Any comments on my middle-end patch?
 
  So, if the answer is the same as frame_address (0), why not have the 
  fallback just expand to that?  Then, one can use this builtin everywhere 
  that frame address is used today.  People that want a faster, tighter port 
  can then implement the hook and achieve higher performance.
 
 The motivation of __builtin_stack_top is that frame_address requires a
 frame pointer register, which isn't desirable for x86.  __builtin_stack_top
 doesn't require a frame pointer register.

If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
you don't get crtl->accesses_prior_frames set either, and as far as I can
see everything works fine?  For __builtin_frame_address(0).

You might have a reason why you want the entry stack address instead of the
frame address, but you didn't really explain I think?  Or I missed it.


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 11:50 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 10:43 AM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
  Any comments on my middle-end patch?
 
  So, if the answer is the same as frame_address (0), why not have the 
  fallback just expand to that?  Then, one can use this builtin everywhere 
  that frame address is used today.  People that want a faster, tighter 
  port can then implement the hook and achieve higher performance.

 The motivation of __builtin_stack_top is that frame_address requires a
 frame pointer register, which isn't desirable for x86.  __builtin_stack_top
 doesn't require a frame pointer register.

 If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
 you don't get crtl->accesses_prior_frames set either, and as far as I can
 see everything works fine?  For __builtin_frame_address(0).

 You might have a reason why you want the entry stack address instead of the
 frame address, but you didn't really explain I think?  Or I missed it.


 expand_builtin_return_addr sets

 crtl->accesses_prior_frames = 1;

 for __builtin_frame_address, which requires a frame pointer register.
 __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
 pointer register isn't required.


BTW, x86 doesn't define INITIAL_FRAME_ADDRESS_RTX.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 12:29 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Tue, Aug 04, 2015 at 11:50:00AM -0700, H.J. Lu wrote:
  The motivation of __builtin_stack_top is that frame_address requires a
  frame pointer register, which isn't desirable for x86.  __builtin_stack_top
  doesn't require a frame pointer register.
 
  If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
  you don't get crtl->accesses_prior_frames set either, and as far as I can
  see everything works fine?  For __builtin_frame_address(0).
 
  You might have a reason why you want the entry stack address instead of the
  frame address, but you didn't really explain I think?  Or I missed it.
 

 expand_builtin_return_addr sets

 crtl->accesses_prior_frames = 1;

 for __builtin_frame_address, which requires a frame pointer register.
 __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
 pointer register isn't required.

 Not if you have INITIAL_FRAME_ADDRESS_RTX.  I don't see why the generic code
 cannot just use frame_pointer_rtx (instead of hard_frame_pointer_rtx) for
 a count of 0; but making it target-specific is certainly more conservative.

 You say i386 doesn't have that target macro defined currently.  Yes I know;
 so change that?  Or change the generic code, but that is much more testing.

There is another issue with x86, maybe other targets.  You
can't get the real stack top when stack is realigned and
-maccumulate-outgoing-args isn't used since ix86_expand_prologue
will create and return another stack frame for
__builtin_frame_address and __builtin_return_address.
It will be wrong for __builtin_stack_top, which should
return the real stack address.

-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread H.J. Lu
On Tue, Aug 4, 2015 at 1:45 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
 There is another issue with x86, maybe other targets.  You
 can't get the real stack top when stack is realigned and
 -maccumulate-outgoing-args isn't used since ix86_expand_prologue
 will create and return another stack frame for
 __builtin_frame_address and __builtin_return_address.
 It will be wrong for __builtin_stack_top, which should
 return the real stack address.

 That's why I asked:

   You might have a reason why you want the entry stack address instead of the
   frame address, but you didn't really explain I think?  Or I missed it.

 What would a C program do with this, that it cannot do with the frame
 address, that would be useful and cannot be much better done in straight
 assembler?  Do you actually want to expose the argument pointer, maybe?


Yes, we want to use the argument pointer as shown in testcases
included in my patch.


-- 
H.J.


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
 There is another issue with x86, maybe other targets.  You
 can't get the real stack top when stack is realigned and
 -maccumulate-outgoing-args isn't used since ix86_expand_prologue
 will create and return another stack frame for
 __builtin_frame_address and __builtin_return_address.
 It will be wrong for __builtin_stack_top, which should
 return the real stack address.

That's why I asked:

   You might have a reason why you want the entry stack address instead of the
   frame address, but you didn't really explain I think?  Or I missed it.

What would a C program do with this, that it cannot do with the frame
address, that would be useful and cannot be much better done in straight
assembler?  Do you actually want to expose the argument pointer, maybe?


Segher


Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
On Tue, Aug 04, 2015 at 11:50:00AM -0700, H.J. Lu wrote:
  The motivation of __builtin_stack_top is that frame_address requires a
  frame pointer register, which isn't desirable for x86.  __builtin_stack_top
  doesn't require a frame pointer register.
 
  If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
  you don't get crtl->accesses_prior_frames set either, and as far as I can
  see everything works fine?  For __builtin_frame_address(0).
 
  You might have a reason why you want the entry stack address instead of the
  frame address, but you didn't really explain I think?  Or I missed it.
 
 
 expand_builtin_return_addr sets
 
 crtl->accesses_prior_frames = 1;
 
 for __builtin_frame_address, which requires a frame pointer register.
 __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
 pointer register isn't required.

Not if you have INITIAL_FRAME_ADDRESS_RTX.  I don't see why the generic code
cannot just use frame_pointer_rtx (instead of hard_frame_pointer_rtx) for
a count of 0; but making it target-specific is certainly more conservative.

You say i386 doesn't have that target macro defined currently.  Yes I know;
so change that?  Or change the generic code, but that is much more testing.


Segher


Re: [PATCH] Add __builtin_stack_top to x86 backend

2015-08-03 Thread Uros Bizjak
On Thu, Jul 30, 2015 at 8:41 PM, H.J. Lu hongjiu...@intel.com wrote:
 On Tue, Jul 21, 2015 at 02:45:39PM -0700, H.J. Lu wrote:
 When __builtin_frame_address is used to retrieve the address of the
 function stack frame, the frame pointer is always kept, which wastes one
 register and 2 instructions.  For x86-32, one less register means
 significant negative impact on performance.  This patch adds a new
 builtin function, __builtin_ia32_stack_top, to x86 backend.  It
 returns the stack address when the function is called.

 Any comments or feedback?


 Although this function is generic, its implementation is target
 specific.  I submitted a generic patch:

 https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01859.html

 So far there is no interest from other backends.  Here is a patch
 to implement __builtin_stack_top in x86 backend.  We can update x86
 backend after it is added to middle-end.  OK for trunk?

I think that the discussion about generic implementation should come
to some conclusion first. From the discussion, there was no resolution
on which way to go.

Uros.