Re: System V Application Binary Interface 0.99.5

2013-02-04 Thread H.J. Lu
On Sun, Feb 3, 2013 at 12:09 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On 02/01/2013 12:38 AM, Jan Hubicka wrote:
  Doing the extensions at caller side always is however IMO a performance bug in
  GCC.  We can definitely drop them at -Os, for non-PRS targets and for calls
  within compilation unit where we know that GCC is not really producing
  code like in Michael's testcase.

 Well we can, yeah, at the cost of breaking interworking with LLVM.
 Do we care?  ;-)

 Yes, we (at least I) care ;)
 The bug ought to be fixed on the LLVM side; is there a PR filed?  For the time
 being we can optimize local calls.  I remember that in the past I ran across
 benchmarks where these extra casts were making us lose compared to other
 compilers, so it is not a 100% uninteresting detail.


There are a couple bugs:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44490
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44532
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46942

-- 
H.J.


Re: System V Application Binary Interface 0.99.5

2013-02-03 Thread Jan Hubicka
 On 02/01/2013 12:38 AM, Jan Hubicka wrote:
  Doing the extensions at caller side always is however IMO a performance bug in
  GCC.  We can definitely drop them at -Os, for non-PRS targets and for calls
  within compilation unit where we know that GCC is not really producing
  code like in Michael's testcase.
 
 Well we can, yeah, at the cost of breaking interworking with LLVM.
 Do we care?  ;-)

Yes, we (at least I) care ;)
The bug ought to be fixed on the LLVM side; is there a PR filed?  For the time
being we can optimize local calls.  I remember that in the past I ran across
benchmarks where these extra casts were making us lose compared to other
compilers, so it is not a 100% uninteresting detail.

Honza
 
 Andrew.


Re: System V Application Binary Interface 0.99.5

2013-02-01 Thread Andrew Haley
On 02/01/2013 12:38 AM, Jan Hubicka wrote:
 Doing the extensions at caller side always is however IMO a performance bug in
 GCC.  We can definitely drop them at -Os, for non-PRS targets and for calls
 within compilation unit where we know that GCC is not really producing
 code like in Michael's testcase.

Well we can, yeah, at the cost of breaking interworking with LLVM.
Do we care?  ;-)

Andrew.



Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
  Well, it's hardly an optimization if it's incorrect, and it seems to be
  incorrect.  As the old saying goes, I can make your code infinitely fast
  if you don't care about the results.
 
 It's incorrect to rely on the extension taking place.  It's not incorrect to
 do the extension.

The behaviour of extending everything twice (on caller and callee side) has been
in the i386 backend forever, definitely since before the concept of partial
register stalls was invented.  My guess is that it all comes from the history
that the original i386 psABI was written with K&R C in mind.  I tried to remove
the caller-side extend in the 90s, but it never got into mainline.

Honza
 
 Richard.
 
  Andrew.
 
 


Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
 On 01/30/2013 04:49 PM, Michael Matz wrote:
  Hmm?  GCC generates code that doesn't rely on the extension taking place.
 
 Sure, I didn't mean to suggest it was: it's LLVM that's incorrect.

Yes, that is an LLVM bug.  I am surprised that it went unnoticed for so long,
but I guess it is difficult to get a non-extended call except via libffi
and hand-written asm code.

Doing the extensions at caller side always is however IMO a performance bug in
GCC.  We can definitely drop them at -Os, for non-PRS targets and for calls
within compilation unit where we know that GCC is not really producing
code like in Michael's testcase.

Honza


Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Michael Matz
Hi,

On Wed, 30 Jan 2013, Andrew Haley wrote:

 I'm looking at Section 3.2.3, Parameter Passing.
 http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/
 
 I still cannot tell whether parameters should or should not be sign- or
 zero-extended when they are moved into registers at a call.  I'm guessing
 not.

It's intentionally unspecified.

 Which is it?  This is important for interworking.

How?  You aren't allowed to access the bits outside the specified argument 
type (which must match on caller and callee side), so you can't observe 
them, so it's not required to specify their content.


Ciao,
Michael.


Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Andrew Haley
Hi,

On 01/30/2013 02:18 PM, Michael Matz wrote:

 On Wed, 30 Jan 2013, Andrew Haley wrote:
 
 I'm looking at Section 3.2.3, Parameter Passing.
 http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/

 I still cannot tell whether parameters should or should not be sign- or
 zero-extended when they are moved into registers at a call.  I'm guessing
 not.
 
 It's intentionally unspecified.

Aha!  It would have been nice if the psABI said so explicitly.  Quite
a few people have spent time trying to find this information.

 Which is it?  This is important for interworking.
 
 How?  You aren't allowed to access the bits outside the specified argument 
 type (which must match on caller and callee side), so you can't observe 
 them, so it's not required to specify their content.

OK, thanks.  It's clear now.

The problem is that LLVM assumes that values are extended at a call.  GCC
does that, but libffi doesn't.  So, calls via libffi to LLVM don't work
correctly.
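
A minimal sketch of the failure mode described above, written as plain C
rather than as real compiler output. It models an argument register as a
uint32_t; the function names is_digit_trusting and is_digit_defensive are
hypothetical and only stand in for "a callee that assumes the caller
extended" and "a callee that extends for itself". It is an illustration
under those assumptions, not the code LLVM or GCC actually emit.

```c
#include <assert.h>
#include <stdint.h>

/* Callee that trusts the caller to have zero-extended a uint8 argument:
   it compares the whole 32-bit register. */
static int is_digit_trusting(uint32_t edi) {
    return edi < 10u;
}

/* Callee that ignores the unspecified upper bits by truncating to the
   declared argument type before using the value. */
static int is_digit_defensive(uint32_t edi) {
    return (uint8_t)edi < 10u;
}

/* Both callees agree when the caller extended, and disagree when the
   upper 24 bits hold leftovers, as a libffi-style caller may leave them. */
static void check_trusting_vs_defensive(void) {
    uint32_t extended = 0x00000007u;   /* caller zero-extended the 7 */
    uint32_t leftover = 0xABCDEF07u;   /* same low byte, garbage above */

    assert(is_digit_trusting(extended) == 1);
    assert(is_digit_defensive(extended) == 1);

    assert(is_digit_trusting(leftover) == 0);   /* wrong answer */
    assert(is_digit_defensive(leftover) == 1);  /* still correct */
}
```

Calling check_trusting_vs_defensive() exercises both cases; only the
trusting variant changes its answer when the upper bits are not extended.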

Andrew.



Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Richard Biener
On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote:
 Hi,

 On 01/30/2013 02:18 PM, Michael Matz wrote:

 On Wed, 30 Jan 2013, Andrew Haley wrote:

 I'm looking at Section 3.2.3, Parameter Passing.
 http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/

 I still cannot tell whether parameters should or should not be sign- or
 zero-extended when they are moved into registers at a call.  I'm guessing
 not.

 It's intentionally unspecified.

 Aha!  It would have been nice if the psABI said so explicitly.  Quite
 a few people have spent time trying to find this information.

 Which is it?  This is important for interworking.

 How?  You aren't allowed to access the bits outside the specified argument
 type (which must match on caller and callee side), so you can't observe
 them, so it's not required to specify their content.

 OK, thanks.  It's clear now.

 The problem is that LLVM assumes that values are extended at a call.  GCC
 does that, but libffi doesn't.  So, calls via libffi to LLVM don't work
 correctly.

It's an optimization to do so to avoid partial register stalls.

Richard.

 Andrew.



Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Andrew Haley
On 01/30/2013 03:46 PM, Richard Biener wrote:
 On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote:
 Hi,

 On 01/30/2013 02:18 PM, Michael Matz wrote:

 On Wed, 30 Jan 2013, Andrew Haley wrote:

 I'm looking at Section 3.2.3, Parameter Passing.
 http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/

 I still cannot tell whether parameters should or should not be sign- or
 zero-extended when they are moved into registers at a call.  I'm guessing
 not.

 It's intentionally unspecified.

 Aha!  It would have been nice if the psABI said so explicitly.  Quite
 a few people have spent time trying to find this information.

 Which is it?  This is important for interworking.

 How?  You aren't allowed to access the bits outside the specified argument
 type (which must match on caller and callee side), so you can't observe
 them, so it's not required to specify their content.

 OK, thanks.  It's clear now.

 The problem is that LLVM assumes that values are extended at a call.  GCC
 does that, but libffi doesn't.  So, calls via libffi to LLVM don't work
 correctly.
 
 It's an optimization to do so to avoid partial register stalls.

Well, it's hardly an optimization if it's incorrect, and it seems to be
incorrect.  As the old saying goes, I can make your code infinitely fast
if you don't care about the results.

Andrew.




Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Richard Biener
On Wed, Jan 30, 2013 at 4:49 PM, Andrew Haley a...@redhat.com wrote:
 On 01/30/2013 03:46 PM, Richard Biener wrote:
 On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote:
 Hi,

 On 01/30/2013 02:18 PM, Michael Matz wrote:

 On Wed, 30 Jan 2013, Andrew Haley wrote:

 I'm looking at Section 3.2.3, Parameter Passing.
 http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/

 I still cannot tell whether parameters should or should not be sign- or
 zero-extended when they are moved into registers at a call.  I'm guessing
 not.

 It's intentionally unspecified.

 Aha!  It would have been nice if the psABI said so explicitly.  Quite
 a few people have spent time trying to find this information.

 Which is it?  This is important for interworking.

 How?  You aren't allowed to access the bits outside the specified argument
 type (which must match on caller and callee side), so you can't observe
 them, so it's not required to specify their content.

 OK, thanks.  It's clear now.

 The problem is that LLVM assumes that values are extended at a call.  GCC
 does that, but libffi doesn't.  So, calls via libffi to LLVM don't work
 correctly.

 It's an optimization to do so to avoid partial register stalls.

 Well, it's hardly an optimization if it's incorrect, and it seems to be
 incorrect.  As the old saying goes, I can make your code infinitely fast
 if you don't care about the results.

It's incorrect to rely on the extension taking place.  It's not incorrect to
do the extension.

Richard.

 Andrew.




Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Andrew Haley
On 01/30/2013 03:51 PM, Richard Biener wrote:
 On Wed, Jan 30, 2013 at 4:49 PM, Andrew Haley a...@redhat.com wrote:
 On 01/30/2013 03:46 PM, Richard Biener wrote:
 On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote:

 The problem is that LLVM assumes that values are extended at a call.  GCC
 does that, but libffi doesn't.  So, calls via libffi to LLVM don't work
 correctly.

 It's an optimization to do so to avoid partial register stalls.

 Well, it's hardly an optimization if it's incorrect, and it seems to be
 incorrect.  As the old saying goes, I can make your code infinitely fast
 if you don't care about the results.
 
 It's incorrect to rely on the extension taking place.  It's not incorrect to
 do the extension.

Sure, I understand that, but I am completely baffled as to how
extending at a call site avoids partial register stalls if a callee
cannot assume that a value is already extended.

Andrew.


Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Andrew Haley
On 01/30/2013 03:55 PM, Andrew Haley wrote:
  
  It's incorrect to rely on the extension taking place.  It's not incorrect to
  do the extension.
 Sure, I understand that, but I am completely baffled as to how
 extending at a call site avoids partial register stalls if a callee
 cannot assume that a value is already extended.

Ah, sorry.  Thinking about it some more, if the register is extended
at the call site, the partial register stall will be avoided whether
or not the callee extends.  So, we're correct to extend at the call
site, and correct to extend in the callee.  LLVM is incorrect not
to extend in the callee.

Thanks,
Andrew.



Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Michael Matz
Hi,

On Wed, 30 Jan 2013, Andrew Haley wrote:

  It's an optimization to do so to avoid partial register stalls.
 
  Well, it's hardly an optimization if it's incorrect, and it seems to 
  be incorrect.

Hmm?  GCC generates code that doesn't rely on the extension taking place.

  As the old saying goes, I can make your code 
  infinitely fast if you don't care about the results.
  
  It's incorrect to rely on the extension taking place.  It's not 
  incorrect to do the extension.
 
 Sure, I understand that, but I am completely baffled as to how extending 
 at a call site avoids partial register stalls if a callee cannot assume 
 that a value is already extended.

Accessing the whole register in the callee doesn't mean that the callee also 
relies on the content of the upper bits; it can (and must) ignore them.  Such 
an access would induce a partial reg stall if the caller hadn't written to 
the whole register first, and extending is one way of doing that write.

E.g. assume this function:

  uint8 andme (uint8 a, uint8 b) { return a & b; }

One correct implementation for this function is:

        movl    %esi, %eax
        andl    %edi, %eax
        ret

So the function accesses the upper unspecified bits, but doesn't rely on 
them (because the upper bits of the return value are also unspecified).
If the caller had set only the low 8 bits of the arguments (as it 
is allowed to do), that write would have incurred a partial reg stall, or 
alternatively the above two reads of the full reg would have incurred such 
a stall (on some architectures).  So GCC chooses to overwrite the full 
register by some unspecified method (extending simply happens to be the 
most straightforward method).
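
The argument above can be checked with a small C model; this is a sketch
under stated assumptions, not a statement about any compiler's output. The
three-instruction body is modelled on uint32_t values standing in for
%edi, %esi and %eax, and the helper names andme_full_width and
observe_uint8 are hypothetical. It shows that the low 8 bits of the result
are the same whatever the upper 24 bits of the arguments contain.

```c
#include <assert.h>
#include <stdint.h>

/* Models: movl %esi, %eax ; andl %edi, %eax ; ret
   The function reads and writes the full 32-bit registers. */
static uint32_t andme_full_width(uint32_t edi, uint32_t esi) {
    uint32_t eax = esi;
    eax &= edi;
    return eax;
}

/* The caller of a uint8-returning function may only look at the low
   8 bits of the return register. */
static uint8_t observe_uint8(uint32_t eax) {
    return (uint8_t)eax;
}

static void check_upper_bits_do_not_matter(void) {
    uint8_t a = 0xF0, b = 0x3C;                 /* a & b == 0x30 */

    /* Arguments with clean upper bits, then with arbitrary upper bits. */
    uint32_t clean = andme_full_width(a, b);
    uint32_t dirty = andme_full_width(0xDEADBE00u | a, 0x12345600u | b);

    /* The observable result is identical in both cases ... */
    assert(observe_uint8(clean) == 0x30);
    assert(observe_uint8(dirty) == 0x30);

    /* ... even though the full registers differ. */
    assert(clean != dirty);
}
```

The model says nothing about stalls themselves; it only confirms the
correctness half of the argument, namely that reading the full register
is harmless as long as the upper bits never influence an observed value.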


Ciao,
Michael.


Re: System V Application Binary Interface 0.99.5

2013-01-30 Thread Andrew Haley
On 01/30/2013 04:49 PM, Michael Matz wrote:
 Hmm?  GCC generates code that doesn't rely on the extension taking place.

Sure, I didn't mean to suggest it was: it's LLVM that's incorrect.

Thanks for the explanation.

Andrew.