Re: System V Application Binary Interface 0.99.5
On Sun, Feb 3, 2013 at 12:09 PM, Jan Hubicka hubi...@ucw.cz wrote: On 02/01/2013 12:38 AM, Jan Hubicka wrote: Doing the extensions at caller side always is however IMO a performance bug in GCC. We can definitely drop them at -Os, for non-PRS targets and for calls within a compilation unit where we know that GCC is not really producing code like in Michael's testcase. Well we can, yeah, at the cost of breaking interworking with LLVM. Do we care? ;-) Yes, we (at least I) care ;) The bug ought to be fixed on the LLVM side; is there a PR filed? For the time being we can optimize local calls. I remember that in the past I ran into benchmarks where these extra casts were making us lose compared to other compilers, so it is not a 100% uninteresting detail. There are a couple of bugs: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44490 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44532 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46942 -- H.J.
Re: System V Application Binary Interface 0.99.5
On 02/01/2013 12:38 AM, Jan Hubicka wrote: Doing the extensions at caller side always is however IMO a performance bug in GCC. We can definitely drop them at -Os, for non-PRS targets and for calls within a compilation unit where we know that GCC is not really producing code like in Michael's testcase. Well we can, yeah, at the cost of breaking interworking with LLVM. Do we care? ;-) Yes, we (at least I) care ;) The bug ought to be fixed on the LLVM side; is there a PR filed? For the time being we can optimize local calls. I remember that in the past I ran into benchmarks where these extra casts were making us lose compared to other compilers, so it is not a 100% uninteresting detail. Honza Andrew.
Re: System V Application Binary Interface 0.99.5
On 02/01/2013 12:38 AM, Jan Hubicka wrote: Doing the extensions at caller side always is however IMO a performance bug in GCC. We can definitely drop them at -Os, for non-PRS targets and for calls within a compilation unit where we know that GCC is not really producing code like in Michael's testcase. Well we can, yeah, at the cost of breaking interworking with LLVM. Do we care? ;-) Andrew.
Re: System V Application Binary Interface 0.99.5
Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. As the old saying goes, I can make your code infinitely fast if you don't care about the results. It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. The behaviour of extending everything twice (on caller and callee side) has been in the i386 backend forever. Definitely before the concept of partial register stalls was invented. My guess is that it all comes from the history that the original i386 psABI was written with K&R C in mind. I tried to remove the caller-side extend in the 90s, but it never got into mainline. Honza Richard. Andrew.
Re: System V Application Binary Interface 0.99.5
On 01/30/2013 04:49 PM, Michael Matz wrote: Hmm? GCC generates code that doesn't rely on the extension taking place. Sure, I didn't mean to suggest it was: it's LLVM that's incorrect. Yes, that is an LLVM bug. I am surprised that it went unnoticed for so long, but I guess it is difficult to get a call without the extension except through libffi and hand-written asm code. Doing the extensions at caller side always is however IMO a performance bug in GCC. We can definitely drop them at -Os, for non-PRS targets and for calls within a compilation unit where we know that GCC is not really producing code like in Michael's testcase. Honza
Re: System V Application Binary Interface 0.99.5
Hi, On Wed, 30 Jan 2013, Andrew Haley wrote: I'm looking at Section 3.2.3, Parameter Passing. http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/ I still cannot tell whether parameters should or should not be sign- or zero-extended when they are moved into registers at a call. I'm guessing not. It's intentionally unspecified. Which is it? This is important for interworking. How? You aren't allowed to access the bits outside the specified argument type (which must match on caller and callee side), so you can't observe them, so it's not required to specify their content. Ciao, Michael.
Re: System V Application Binary Interface 0.99.5
Hi, On 01/30/2013 02:18 PM, Michael Matz wrote: On Wed, 30 Jan 2013, Andrew Haley wrote: I'm looking at Section 3.2.3, Parameter Passing. http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/ I still cannot tell whether parameters should or should not be sign- or zero-extended when they are moved into registers at a call. I'm guessing not. It's intentionally unspecified. Aha! It would have been nice if the psABI said so explicitly. Quite a few people have spent time trying to find this information. Which is it? This is important for interworking. How? You aren't allowed to access the bits outside the specified argument type (which must match on caller and callee side), so you can't observe them, so it's not required to specify their content. OK, thanks. It's clear now. The problem is that LLVM assumes that values are extended at a call. GCC does that, but libffi doesn't. So, calls via libffi to LLVM don't work correctly. Andrew.
Re: System V Application Binary Interface 0.99.5
On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote: Hi, On 01/30/2013 02:18 PM, Michael Matz wrote: On Wed, 30 Jan 2013, Andrew Haley wrote: I'm looking at Section 3.2.3, Parameter Passing. http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/ I still cannot tell whether parameters should or should not be sign- or zero-extended when they are moved into registers at a call. I'm guessing not. It's intentionally unspecified. Aha! It would have been nice if the psABI said so explicitly. Quite a few people have spent time trying to find this information. Which is it? This is important for interworking. How? You aren't allowed to access the bits outside the specified argument type (which must match on caller and callee side), so you can't observe them, so it's not required to specify their content. OK, thanks. It's clear now. The problem is that LLVM assumes that values are extended at a call. GCC does that, but libffi doesn't. So, calls via libffi to LLVM don't work correctly. It's an optimization to do so to avoid partial register stalls. Richard. Andrew.
Re: System V Application Binary Interface 0.99.5
On 01/30/2013 03:46 PM, Richard Biener wrote: On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote: Hi, On 01/30/2013 02:18 PM, Michael Matz wrote: On Wed, 30 Jan 2013, Andrew Haley wrote: I'm looking at Section 3.2.3, Parameter Passing. http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/ I still cannot tell whether parameters should or should not be sign- or zero-extended when they are moved into registers at a call. I'm guessing not. It's intentionally unspecified. Aha! It would have been nice if the psABI said so explicitly. Quite a few people have spent time trying to find this information. Which is it? This is important for interworking. How? You aren't allowed to access the bits outside the specified argument type (which must match on caller and callee side), so you can't observe them, so it's not required to specify their content. OK, thanks. It's clear now. The problem is that LLVM assumes that values are extended at a call. GCC does that, but libffi doesn't. So, calls via libffi to LLVM don't work correctly. It's an optimization to do so to avoid partial register stalls. Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. As the old saying goes, I can make your code infinitely fast if you don't care about the results. Andrew.
Re: System V Application Binary Interface 0.99.5
On Wed, Jan 30, 2013 at 4:49 PM, Andrew Haley a...@redhat.com wrote: On 01/30/2013 03:46 PM, Richard Biener wrote: On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote: Hi, On 01/30/2013 02:18 PM, Michael Matz wrote: On Wed, 30 Jan 2013, Andrew Haley wrote: I'm looking at Section 3.2.3, Parameter Passing. http://artfiles.org/kernel.org/pub/scm/devel/binutils/hjl/x86-64-psabi.git/ I still cannot tell whether parameters should or should not be sign- or zero-extended when they are moved into registers at a call. I'm guessing not. It's intentionally unspecified. Aha! It would have been nice if the psABI said so explicitly. Quite a few people have spent time trying to find this information. Which is it? This is important for interworking. How? You aren't allowed to access the bits outside the specified argument type (which must match on caller and callee side), so you can't observe them, so it's not required to specify their content. OK, thanks. It's clear now. The problem is that LLVM assumes that values are extended at a call. GCC does that, but libffi doesn't. So, calls via libffi to LLVM don't work correctly. It's an optimization to do so to avoid partial register stalls. Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. As the old saying goes, I can make your code infinitely fast if you don't care about the results. It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. Richard. Andrew.
Re: System V Application Binary Interface 0.99.5
On 01/30/2013 03:51 PM, Richard Biener wrote: On Wed, Jan 30, 2013 at 4:49 PM, Andrew Haley a...@redhat.com wrote: On 01/30/2013 03:46 PM, Richard Biener wrote: On Wed, Jan 30, 2013 at 3:22 PM, Andrew Haley a...@redhat.com wrote: The problem is that LLVM assumes that values are extended at a call. GCC does that, but libffi doesn't. So, calls via libffi to LLVM don't work correctly. It's an optimization to do so to avoid partial register stalls. Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. As the old saying goes, I can make your code infinitely fast if you don't care about the results. It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. Sure, I understand that, but I am completely baffled as to how extending at a call site avoids partial register stalls if a callee cannot assume that a value is already extended. Andrew.
Re: System V Application Binary Interface 0.99.5
On 01/30/2013 03:55 PM, Andrew Haley wrote: It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. Sure, I understand that, but I am completely baffled as to how extending at a call site avoids partial register stalls if a callee cannot assume that a value is already extended. Ah, sorry. Thinking about it some more, if the register is extended at the call site, the partial register stall will be avoided whether or not the callee extends. So, we're correct to extend at the call site, and correct to extend in the callee. LLVM isn't correct not to extend at the callee. Thanks, Andrew.
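To make the conclusion above concrete, here is a minimal sketch in C that models an argument register as a 64-bit integer. All names (pass_extended, pass_unextended, widen_defensive, widen_trusting) are hypothetical and exist only for illustration; this is not code from GCC, LLVM, or libffi. It only models the two caller behaviours and the two callee behaviours discussed in the thread.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit value stands in for an argument register. */

/* Caller that extends: the whole register holds the zero-extended value. */
static uint64_t pass_extended(uint8_t v) { return (uint64_t)v; }

/* Caller that writes only the low 8 bits; `stale` models whatever was
 * left in the upper bits of the register beforehand. */
static uint64_t pass_unextended(uint8_t v, uint64_t stale) {
    return (stale & ~(uint64_t)0xFF) | v;
}

/* Callee that extends for itself before widening: correct with either caller. */
static uint64_t widen_defensive(uint64_t reg) { return (uint64_t)(uint8_t)reg; }

/* Callee that trusts the caller to have extended: only correct when paired
 * with pass_extended, which is the assumption the thread calls incorrect. */
static uint64_t widen_trusting(uint64_t reg) { return reg; }
```

In this model, widen_defensive returns the intended value for both callers, while widen_trusting returns the stale upper bits along with the value when paired with pass_unextended. That is the shape of the failure described for calls that arrive without caller-side extension.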
Re: System V Application Binary Interface 0.99.5
Hi, On Wed, 30 Jan 2013, Andrew Haley wrote: It's an optimization to do so to avoid partial register stalls. Well, it's hardly an optimization if it's incorrect, and it seems to be incorrect. Hmm? GCC generates code that doesn't rely on the extension taking place. As the old saying goes, I can make your code infinitely fast if you don't care about the results. It's incorrect to rely on the extension taking place. It's not incorrect to do the extension. Sure, I understand that, but I am completely baffled as to how extending at a call site avoids partial register stalls if a callee cannot assume that a value is already extended.

Accessing the whole register in the callee (which would induce a partial reg stall if the caller hadn't written to the whole register first, one way being by extending) doesn't mean that it also relies on the content of those bits; it can (and must) ignore them. E.g. assume this function:

    uint8 andme (uint8 a, uint8 b) { return a & b; }

One correct implementation for this function is:

    movl %esi, %eax
    andl %edi, %eax
    ret

So the function accesses the upper unspecified bits, but doesn't rely on them (because the upper bits of the return value are also unspecified). If the caller had set only the low 8 bits of the arguments (as it is allowed to do), that write would have incurred a partial reg stall, or alternatively the above two reads of the full reg would have incurred such a stall (on some architectures). So GCC chooses to overwrite the full register by some unspecified method (extending simply happens to be the most straightforward method).

Ciao, Michael.
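The andme example can be checked with a small C model. This is only a sketch under stated assumptions: 32-bit integers stand in for %edi, %esi and %eax, and the helper names andme_reg and result_u8 are hypothetical, introduced here for illustration. It shows that the full-width AND reads the unspecified upper bits, yet the low 8 bits of the result do not depend on them.

```c
#include <stdint.h>

/* Models "movl %esi, %eax; andl %edi, %eax; ret": a full 32-bit AND
 * that touches the upper bits of both argument registers. */
static uint32_t andme_reg(uint32_t edi, uint32_t esi) {
    uint32_t eax = esi;   /* movl %esi, %eax */
    eax &= edi;           /* andl %edi, %eax */
    return eax;           /* upper 24 bits of the result are unspecified */
}

/* The only bits a caller may read back for a uint8 return value. */
static uint8_t result_u8(uint32_t eax) { return (uint8_t)eax; }
```

With arguments whose low bytes are 0xF0 and 0x3C, result_u8 yields 0x30 whether the upper bits of the inputs are zero or arbitrary, although the raw value returned by andme_reg differs between the two cases.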
Re: System V Application Binary Interface 0.99.5
On 01/30/2013 04:49 PM, Michael Matz wrote: Hmm? GCC generates code that doesn't rely on the extension taking place. Sure, I didn't mean to suggest it was: it's LLVM that's incorrect. Thanks for the explanation. Andrew.