Thanks David. I thought -mminimal-toc might have been a better workaround as well :-)
Is there a Bugzilla number for this issue?

-----Original Message-----
From: David Edelsohn [mailto:dje....@gmail.com]
Sent: Tuesday, July 28, 2009 9:46 AM
To: Tim Crook
Subject: Re: How to figure out the gcc -dP output?

Tim,

I do not fully understand the complete explanation of the original problem.
You mention extraneous lwz and TOC. I think you are referring to a bug in
GCC 4.1 that incorrectly emitted loads after the TOC already had been changed
for an indirect call.

GCSE probably is producing code that requires a constant and GCC needs to
place that constant in the TOC. The late creation of the TOC reference is not
scheduled correctly.

GCSE is an optimization. -mminimal-toc is an option to avoid TOC overflow.
Disabling GCSE and enabling -mminimal-toc are both work-arounds to the
problem. Disabling GCSE probably will slow down the application.
-mminimal-toc probably will have less of a performance impact.

As I mentioned to Chris when I spoke with him last week, I would recommend
upgrading to a newer version of GCC because GCC 4.1 no longer is maintained.
Many bug fixes, such as one for the problem you are encountering, are
incorporated into newer releases.

David

> I found a possible compiler workaround, compiling with -mminimal-toc. Would
> I get better performance by using this, instead of turning off gcse?

On Fri, Jul 24, 2009 at 4:34 PM, Tim Crook<tcr...@adobe.com> wrote:
> Hello there.
>
> I am trying to track down a problem with gcc 4.1 which has to do with
> inlining and templates on PowerPC. Is there any documentation I can look at
> related to the output generated with -fdump? I am getting extraneous lwz
> (load word and zero extend) instructions inserted when calling various
> methods - after $toc (r2) has been switched to the destination method's
> global data, just before the method call with the bctrl instruction. This lwz
> instruction causes a crash on IBM AIX when 32-bit shared libraries are loaded
> non-contiguously in memory. It looks like various code blocks are not being
> combined correctly when code is inlined - the extra lwz is being left behind.
>
> I have figured out that turning off gcse optimizations will stop this
> behavior, but doing this causes a performance hit. I would prefer not to
> upgrade the compiler at this time. With the compiler dump using -fdump, I am
> looking for a better way to work around this problem.
>
> Tim Crook.
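
For anyone reading this thread later, the kind of code being discussed can be sketched roughly as below: a small inlined template that makes an indirect call. This is only a hypothetical reduction, not the code from the original report, and the names add_one, apply and call_through_pointer are invented for illustration. It is not known to reproduce the GCC 4.1 bug; it only shows where the TOC switch and bctrl would appear on 32-bit AIX.

```cpp
// Hypothetical reduction, not the original code from this thread.
// On 32-bit AIX a function pointer refers to a function descriptor
// (entry address, TOC pointer, environment pointer). An indirect call
// loads the callee's TOC into r2 and then branches with bctrl.

typedef int (*callback_t)(int);

// Callee that is reached only through a pointer in this example.
int add_one(int x) { return x + 1; }

// Small template helper that the compiler is likely to inline.
template <typename T>
inline T apply(T (*fn)(T), T value) {
    // Indirect call: on AIX, r2 is switched just before the bctrl.
    // Any TOC-relative load placed between the switch and the bctrl
    // would read from the wrong TOC.
    return fn(value);
}

// Caller into which apply<int> is expected to be inlined.
int call_through_pointer(callback_t fn, int value) {
    return apply<int>(fn, value);
}
```

To compare the generated code, one could compile a file like this (the file name is an assumption) with something like `g++ -O2 -S -dP example.cpp`, then repeat with `-fno-gcse` or `-mminimal-toc` added, and look at the instructions between the load of r2 and the bctrl.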