Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
tbp wrote:
> On 3/13/06, Andrew Pinski [EMAIL PROTECTED] wrote:
>> Actually the best way of improving the inline heuristics is to get a
>> real testcase (and not some benchmark) where the inline heuristics are
>> messed up.
>
> Ah, you mean a brand new testcase because PR-21195 wasn't good enough?

Wait wait. PR/21195 is about inlining the SSE builtins. These are special because, for example, you would probably prefer GDB not to step into them, but just execute them. As Andrew said, it is only an implementation choice (subject to revision) that they are implemented as inline functions at all. For example, if an older GCC had a similar bug with Altivec intrinsics, it would have shown up only in C++ (because Altivec intrinsics were never implemented as inlines in C) and would not show up anymore in GCC 4.1 except for a handful of intrinsics (because most Altivec intrinsics are not inlines at all anymore).

memset/memcpy is different from SSE builtins because the choice of whether to inline or not is target dependent, and because glibc also decides whether or not to provide its own inlining, depending on the GCC version you're using. So the best way to report the problem is to file a *preprocessed* testcase in Bugzilla, i.e. the output of

    gcc -E testcase.c > testcase.i

or equivalently of gcc -save-temps testcase.c, and to include the output of gcc -v testcase.c -O2 in the bug report. Using preprocessed source code at least makes sure that the glibc choices are not influencing the comparison between 3.4.x and 4.0.x. This information is present in the "how to file a bug" chapter of the manual.

Your case seems to be different, because it involves inlining user routines. Again, you need to give us the preprocessed source code for us to look at your bug effectively.

Paolo
Re: [PATCH] Add new target-hook truncated_to_mode
>> bool
>> truncated_to_mode (enum machine_mode mode, rtx x)
>> {
>>   if (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
>>     return true;
>>   gcc_assert (!TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (mode),
>>                                       GET_MODE_BITSIZE (GET_MODE (x))));
>>   return num_sign_bit_copies (x, GET_MODE (x))
>>          > GET_MODE_BITSIZE (GET_MODE (x)) - GET_MODE_BITSIZE (mode);
>> }
>>
>> In the MIPS case, you would have n_s_b_c (x, GET_MODE (x)) > 64 - 32.
>
> This wouldn't work for DI->HI truncation for example. There too only
> the upper 33 bits have to match for the TRUNCATE to be unnecessary.
> See comment around truncsdi in mips.md.

If this is so, SImode should be passed to reg_truncated_to_mode as well, instead of HImode, shouldn't it? What about this logic:

    int n = num_sign_bit_copies (x, GET_MODE (x));
    int dest_bits;
    enum machine_mode next_mode = mode;

    do
      {
        mode = next_mode;
        dest_bits = GET_MODE_BITSIZE (mode);
        /* If it is a no-op to truncate to MODE from a wider mode (e.g. to
           HI from SI on MIPS), we can check a weaker condition.  */
        next_mode = GET_MODE_WIDER_MODE (mode);
      }
    while (next_mode != VOIDmode
           && TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (next_mode), dest_bits));

    return (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
           || n > GET_MODE_BITSIZE (GET_MODE (x)) - dest_bits;

On MIPS, we would not test HImode but SImode since TRULY_NOOP_TRUNCATION (32, 16) == true.

To me, this is a clue that the TRULY_NOOP_TRUNCATION macro is insufficient and could be replaced by another one. For example (for MIPS -- SHmedia is the same with s/MIPS64/SHMEDIA/):

    /* Return the mode to which we should truncate an INMODE value before
       operating on it in OUTMODE.  For example, on MIPS we should truncate
       a 64-bit value to 32-bits when operating on it in SImode or a
       narrower mode.  We return INMODE if no such truncation is necessary
       and we can just pretend that the value is already truncated.  */
    #define WIDEST_NECESSARY_TRUNCATION(outmode, inmode) \
      (TARGET_MIPS64 \
       && GET_MODE_BITSIZE (inmode) >= 32 \
       && GET_MODE_BITSIZE (outmode) <= 32 ? SImode : inmode)

Since all uses of TRULY_NOOP_TRUNCATION (except one in convert.c which could be changed to use TYPE_MODE) are of the form

    TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (x), GET_MODE_BITSIZE (y))

you could change them to

    WIDEST_NECESSARY_TRUNCATION (x, y) != y

We could also take the occasion to remove all the defines of TRULY_NOOP_TRUNCATION to 1, and put a default definition in defaults.h!

You can then proceed to implement truncated_to_mode as

    mode = WIDEST_NECESSARY_TRUNCATION (mode, GET_MODE (x));
    gcc_assert (mode != GET_MODE (x));
    return (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
           || num_sign_bit_copies (x, GET_MODE (x))
              > GET_MODE_BITSIZE (GET_MODE (x)) - GET_MODE_BITSIZE (mode);

What do you think?

Paolo
bootstrap broken on tunk for combined source tree
bootstrap compiler: gcc-4.1.0
binutils-2.16.1
build system: i686-pc-linux-gnu

Configuring stage 2 in ./libiberty
configure: creating cache ./config.cache
checking whether to enable maintainer-specific portions of Makefiles... no
checking for makeinfo... makeinfo --split-size=500 --split-size=500
checking for perl... perl
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for i686-pc-linux-gnu-ar... ar
checking for i686-pc-linux-gnu-ranlib... ranlib
checking for i686-pc-linux-gnu-gcc... /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/xgcc -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/ -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/i686-pc-linux-gnu/bin/
checking for suffix of object files... configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details.
gmake[2]: *** [configure-stage2-libiberty] Error 1
gmake[2]: Leaving directory `/disk1/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2'
gmake[1]: *** [stage2-bubble] Error 2
gmake[1]: Leaving directory `/disk1/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2'
gmake: *** [all] Error 2

config.log in libiberty contains:

configure:2272: /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/xgcc -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/ -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/i686-pc-linux-gnu/bin/ -c -g -O2 conftest.c >&5
lt-as-new: error while loading shared libraries: libbfd-2.16.1.so: cannot open shared object file: No such file or directory

It looks like we have a wrong LD_LIBRARY_PATH setting. Any thoughts?

Rainer

--
Rainer Emrich
TECOSIM GmbH
Im Eichsfeld 3
65428 Rüsselsheim
Phone: +49(0)6142/8272 12
Mobile: +49(0)163/56 949 20
Fax.: +49(0)6142/8272 49
Web: www.tecosim.com
Re: bootstrap broken on tunk for combined source tree
> config.log in libiberty contains:
>
> configure:2272: /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/xgcc -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/ -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/i686-pc-linux-gnu/bin/ -c -g -O2 conftest.c >&5
> lt-as-new: error while loading shared libraries: libbfd-2.16.1.so: cannot open shared object file: No such file or directory
>
> It looks like we have a wrong LD_LIBRARY_PATH setting.

It should work; I surely tested it before enabling toplevel bootstrap. The toplevel configure also has

    HOST_LIB_PATH_bfd = \
      $$r/$(HOST_SUBDIR)/bfd/.:$$r/$(HOST_SUBDIR)/prev-bfd/.:

Could you try sticking an echo $LD_LIBRARY_PATH in the libiberty configure script?

Paolo
Re: [PATCH] Add new target-hook truncated_to_mode
Paolo Bonzini [EMAIL PROTECTED] writes:
>>> bool
>>> truncated_to_mode (enum machine_mode mode, rtx x)
>>> {
>>>   if (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
>>>     return true;
>>>   gcc_assert (!TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (mode),
>>>                                       GET_MODE_BITSIZE (GET_MODE (x))));
>>>   return num_sign_bit_copies (x, GET_MODE (x))
>>>          > GET_MODE_BITSIZE (GET_MODE (x)) - GET_MODE_BITSIZE (mode);
>>> }
>>>
>>> In the MIPS case, you would have n_s_b_c (x, GET_MODE (x)) > 64 - 32.
>>
>> This wouldn't work for DI->HI truncation for example. There too only
>> the upper 33 bits have to match for the TRUNCATE to be unnecessary.
>> See comment around truncsdi in mips.md.
>
> If this is so, SImode should be passed to reg_truncated_to_mode as
> well, instead of HImode, shouldn't it? What about this logic:
>
>     int n = num_sign_bit_copies (x, GET_MODE (x));
>     int dest_bits;
>     enum machine_mode next_mode = mode;
>
>     do
>       {
>         mode = next_mode;
>         dest_bits = GET_MODE_BITSIZE (mode);
>         /* If it is a no-op to truncate to MODE from a wider mode (e.g. to
>            HI from SI on MIPS), we can check a weaker condition.  */
>         next_mode = GET_MODE_WIDER_MODE (mode);
>       }
>     while (next_mode != VOIDmode
>            && TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (next_mode), dest_bits));
>
>     return (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
>            || n > GET_MODE_BITSIZE (GET_MODE (x)) - dest_bits;

It looks like you're introducing a new assumption here: that we can ignore TRULY_NOOP_TRUNCATION (X, Y) if the upper X-Y bits are all filled with sign bits. I realise that's true for both SH and MIPS, but the current documentation of TRULY_NOOP_TRUNCATION doesn't guarantee it. For example, I could imagine some future port wanting to preserve zero extension instead of sign extension. That still fits TRULY_NOOP_TRUNCATION as currently defined, but the code above would then be wrong. And...

> On MIPS, we would not test HImode but SImode since
> TRULY_NOOP_TRUNCATION (32, 16) == true.
>
> To me, this is a clue that the TRULY_NOOP_TRUNCATION macro is
> insufficient and could be replaced by another one. For example (for
> MIPS -- SHmedia is the same with s/MIPS64/SHMEDIA/):
>
>     /* Return the mode to which we should truncate an INMODE value before
>        operating on it in OUTMODE.  For example, on MIPS we should truncate
>        a 64-bit value to 32-bits when operating on it in SImode or a
>        narrower mode.  We return INMODE if no such truncation is necessary
>        and we can just pretend that the value is already truncated.  */
>     #define WIDEST_NECESSARY_TRUNCATION(outmode, inmode) \
>       (TARGET_MIPS64 \
>        && GET_MODE_BITSIZE (inmode) >= 32 \
>        && GET_MODE_BITSIZE (outmode) <= 32 ? SImode : inmode)
>
> Since all uses of TRULY_NOOP_TRUNCATION (except one in convert.c which
> could be changed to use TYPE_MODE) are of the form
>
>     TRULY_NOOP_TRUNCATION (GET_MODE_BITSIZE (x), GET_MODE_BITSIZE (y))
>
> you could change them to
>
>     WIDEST_NECESSARY_TRUNCATION (x, y) != y
>
> We could also take the occasion to remove all the defines of
> TRULY_NOOP_TRUNCATION to 1, and put a default definition in defaults.h!
>
> You can then proceed to implement truncated_to_mode as
>
>     mode = WIDEST_NECESSARY_TRUNCATION (mode, GET_MODE (x));
>     gcc_assert (mode != GET_MODE (x));
>     return (REG_P (x) && rtl_hooks.reg_truncated_to_mode (mode, x))
>            || num_sign_bit_copies (x, GET_MODE (x))
>               > GET_MODE_BITSIZE (GET_MODE (x)) - GET_MODE_BITSIZE (mode);
>
> What do you think?

...I think the same applies to this macro too. That's one reason why I prefer the alternative hook that I described: it makes the sign extension explicit. The other reason is that it would allow the middle-end to remove redundant sign extensions.

(Note that WIDEST_NECESSARY_TRUNCATION (X, Y) == Z does _not_ imply that sign-extension of a Z-bit value to X bits comes for free. On MIPS, it isn't true for X==128, just X==64.)

Richard
Re: bootstrap broken on tunk for combined source tree
Paolo Bonzini wrote:
>> config.log in libiberty contains:
>>
>> configure:2272: /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/xgcc -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/ -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/i686-pc-linux-gnu/bin/ -c -g -O2 conftest.c >&5
>> lt-as-new: error while loading shared libraries: libbfd-2.16.1.so: cannot open shared object file: No such file or directory
>>
>> It looks like we have a wrong LD_LIBRARY_PATH setting.
>
> It should work; I surely tested it before enabling toplevel bootstrap.
> The toplevel configure also has
>
>     HOST_LIB_PATH_bfd = \
>       $$r/$(HOST_SUBDIR)/bfd/.:$$r/$(HOST_SUBDIR)/prev-bfd/.:
>
> Could you try sticking an echo $LD_LIBRARY_PATH in the libiberty
> configure script?
>
> Paolo

You're right, the LD_LIBRARY_PATH includes ./bfd/. and ./prev-bfd/., but the shared library is in ./prev-bfd/.libs:

    ./prev-bfd/.libs/libbfd-2.16.1.so

Rainer

--
Rainer Emrich
TECOSIM GmbH
Im Eichsfeld 3
65428 Rüsselsheim
Phone: +49(0)6142/8272 12
Mobile: +49(0)163/56 949 20
Fax.: +49(0)6142/8272 49
Web: www.tecosim.com
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Paolo Bonzini [EMAIL PROTECTED] wrote:
> Wait wait. PR/21195 is about inlining the SSE builtins.

No. PR/21195 was really about the inline heuristic going ballistic. Those intrinsics are thin wrappers around builtins, and ultimately resolve to a couple of operations. Typical C++ (accessors/ctors) also presents lots of such small functions. And guess what, same cause, same symptom. There's no sensible metric by which the code i quoted in my previous mail makes sense. Size? Nope. Execution time? Certainly not. Again, whether or not SSE ops are involved was and still is irrelevant.

> Your case seems to be different, because it involves inlining user
> routines. Again, you need to give us the preprocessed source code for
> us to look at your bug effectively.

Thanks for the tip, but i'll pass. I've done my duty already. Months ago there were 2 options for fixing PR/21195:
a) Fix the inlining heuristic.
b) Kludge all intrinsics with always_inline.
I've tried to argue a bit but to no avail. So, while you remain convinced everything's fine with the inliner, i'll keep tagging every function in my code with always_inline/noinline where performance matters.
Re: bootstrap broken on tunk for combined source tree
Paolo Bonzini wrote:
>> config.log in libiberty contains:
>>
>> configure:2272: /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/xgcc -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.2/gcc-4.2/./prev-gcc/ -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/i686-pc-linux-gnu/bin/ -c -g -O2 conftest.c >&5
>> lt-as-new: error while loading shared libraries: libbfd-2.16.1.so: cannot open shared object file: No such file or directory
>>
>> It looks like we have a wrong LD_LIBRARY_PATH setting.
>
> It should work; I surely tested it before enabling toplevel bootstrap.
> The toplevel configure also has
>
>     HOST_LIB_PATH_bfd = \
>       $$r/$(HOST_SUBDIR)/bfd/.:$$r/$(HOST_SUBDIR)/prev-bfd/.:
>
> Could you try sticking an echo $LD_LIBRARY_PATH in the libiberty
> configure script?
>
> Paolo

And the same is true for prev-opcodes:

    prev-opcodes/.libs/libopcodes-2.16.1.so

Rainer

--
Rainer Emrich
TECOSIM GmbH
Im Eichsfeld 3
65428 Rüsselsheim
Phone: +49(0)6142/8272 12
Mobile: +49(0)163/56 949 20
Fax.: +49(0)6142/8272 49
Web: www.tecosim.com
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, Paolo Bonzini [EMAIL PROTECTED] wrote: Wait wait. PR/21195 is about inlining the SSE builtins. No. PR/21195 was really about inline heuristic going ballistic. Those intrinsics are thin wrappers around builtins, and ultimately resolve to a couple of operations. Typical C++ (accessors/ctors) also presents lots of such small functions. And guess what, same cause same symptom. Starting with gcc 4.1.0 we have inline heuristics in place that will _always_ inline such simple wrappers. So, if this still happens, there is a bug in the heuristics and that should be reported. Before 4.1.0 the heuristics were bogus and wrappers were not inlined all the time. So, can you verify you are happy with the heuristics in 4.1.0 (not talking about inlining of memcpy/memset that are really not function inlining, but the SSE/altivec inline function implementations). Richard.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote:
> Starting with gcc 4.1.0 we have inline heuristics in place that will
> _always_ inline such simple wrappers. So, if this still happens, there
> is a bug in the heuristics and that should be reported. Before 4.1.0
> the heuristics were bogus and wrappers were not inlined all the time.
> So, can you verify you are happy with the heuristics in 4.1.0

No i'm not, and i've used a pristine 4.1.0 in
http://gcc.gnu.org/ml/gcc/2006-03/msg00410.html

I haven't tried that particular testcase on 4.2.x, but some weeks ago i had to go through all my code again to put always_inline in some forgotten places because i was seeing even empty ctors not being inlined (to the effect of having a call to a ret). So in this regard, 4.1.0 and 4.2.x still exhibit that kind of behaviour. It seems to trigger when some particular threshold is met, either for a function or a unit; then nothing at all gets inlined except functions tagged with always_inline, and of course a major performance regression ensues.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: Starting with gcc 4.1.0 we have inline heuristics in place that will _always_ inline such simple wrappers. So, if this still happens, there is a bug in the heuristics and that should be reported. Before 4.1.0 the heuristics were bogus and wrappers were not inlined all the time. So, can you verify you are happy with the heuristics in 4.1.0 No i'm not, and i've used a pristine 4.1.0 in http://gcc.gnu.org/ml/gcc/2006-03/msg00410.html For the testcase in this message, I get (I removed the always_inline) all wrappers inlined to bloatit. Of course bloatit does not get inlined w/o always_inline - it's a huge function and not a simple wrapper. With always_inline on it, the wrappers are no longer inlined - this is a bug and should be reported. Of course from 4.1.0 on you can easier stick an __attribute__((flatten)) on the function you want everything inlined to (finalblow) and get everything inlined into it. Can you report a bugzilla for the bad interaction between always_inline and inlining of simple wrappers? Thanks, Richard.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: Starting with gcc 4.1.0 we have inline heuristics in place that will _always_ inline such simple wrappers. So, if this still happens, there is a bug in the heuristics and that should be reported. Before 4.1.0 the heuristics were bogus and wrappers were not inlined all the time. So, can you verify you are happy with the heuristics in 4.1.0 No i'm not, and i've used a pristine 4.1.0 in http://gcc.gnu.org/ml/gcc/2006-03/msg00410.html For the testcase in this message, I get (I removed the always_inline) all wrappers inlined to bloatit. Of course bloatit does not get inlined w/o always_inline - it's a huge function and not a simple wrapper. With always_inline on it, the wrappers are no longer inlined - this is a bug and should be reported. Of course from 4.1.0 on you can easier stick an __attribute__((flatten)) on the function you want everything inlined to (finalblow) and get everything inlined into it. I see the bug and will have a fix in a moment. Richard.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: Of course from 4.1.0 on you can easier stick an __attribute__((flatten)) on the function you want everything inlined to (finalblow) and get everything inlined into it. But that's not really what i'm after: i expect trivial functions to get inlined no matter what at a given -Ox. With always_inline on it, the wrappers are no longer inlined - this is a bug and should be reported. Can you report a bugzilla for the bad interaction between always_inline and inlining of simple wrappers? I will report it again then.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: I see the bug and will have a fix in a moment. You made my day. Or you're about to. Unless you're lying and i'll have to curse you for 7 generations.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: I see the bug and will have a fix in a moment. You made my day. Or you're about to. Unless you're lying and i'll have to curse you for 7 generations. http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00739.html ;)
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00739.html /me ventilates. You're my hero.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, tbp [EMAIL PROTECTED] wrote:
> On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote:
>> http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00739.html
>
> /me ventilates. You're my hero.

A double+ hero on top of that.
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00737.html
I think i've hit that one too; reported here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26650

Well, i can always dream.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, tbp [EMAIL PROTECTED] wrote: On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00739.html /me ventilates. You're my hero. A double+ hero on top of that. http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00737.html I think i've hit that one that one too; reported here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26650 I don't think this is related, and a quick check with the patch shows still unaligned moves to the stack. Richard.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote:
> I don't think this is related, and a quick check with the patch shows
> still unaligned moves to the stack.

Patience is a virtue i guess :)
Is there a good chance your inlining fix will hit mainline soon?
[M32C-ELF] : Improper follow-up of bss section
Hi,

I have downloaded the latest GCC and Binutils sources from the FSF for the M32C port. Using these sources, I could successfully build the cross toolchain, i.e. m32c-elf-*. I have observed the following behavior while building an application.

Case 1 - No initialized global variables are present in the application (the data section is empty).

If I specify the locations of the data and bss sections in the linker script in the following manner and build the application,

    .data 0x0400 :
    {
      _data = .;
      *(.data)
      *(.data.*)
      _edata = .;
    }
    .bss :
    {
      _bss = .;
      *(.bss)
      *(COMMON)
      _ebss = .;
      _end = .;
    }

the bss section is located at 0x00 instead of 0x000400. The value of the variable _bss is 0x00. The value of the variables _ebss and _end is 0x00. This can be verified from the map file. In this case the location counter is not incremented properly. With the H8 and SH tool chains, the bss section follows the data section correctly.

Case 2 - One initialized global variable is present in the application (e.g. int i = 1;).

If I build the application with the above linker script, the bss section is located at 0x000402. The value of the variable _bss is 0x000402. The value of the variables _ebss and _end is 0x000402. This can be verified from the map file. Thus, for the bss section to follow the data section properly (i.e. for the location counter to be incremented correctly), the data section must not be empty. Is this behavior expected?

Case 3 - No initialized global variable is present in the application (the data section is empty) but the following linker script is used:

    MEMORY
    {
      ram (rw) : o = 0x400, l = 31k
      rom (rx) : o = 0x000E000, l = 256k
    }
    .data 0x0400 :
    {
      _data = .;
      *(.data)
      *(.data.*)
      _edata = .;
    } > ram
    .bss :
    {
      _bss = .;
      *(.bss)
      *(COMMON)
      _ebss = .;
      _end = .;
    } > ram

In this case the bss section follows the data section correctly, i.e. the bss section is located at address 0x000400 and not 0x00 (as in case 1). In this case the location counter is incremented correctly.

The above behavior is observed for all m32c targets, i.e. r8c, m16c, m32c and m32cm. Is this behavior expected? A linker script similar to the one in case 1 works properly with the H8 and SH tool chains (modified according to their memory maps). Why does it not work with the M32C tool chain? Do I need to use the MEMORY command in the linker script as in case 3?

Thanks in advance.

Regards,
Ina Pandit
KPIT Cummins InfoSystems Ltd.
Pune, India
Re: [M32C-ELF] : Improper follow-up of bss section
This is all binutils specific, nothing to do with gcc as such. Please re-post your queries to the binutils list. Andrew.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
Is there a bugzilla entry describing the bug Richard is fixing? If not, it'd be nice to have, if for no other reason than it would show up naturally when people look for bugs fixed in gcc-4.1.1. I can create one, but it'd be better if someone actually involved in the action did. - Dan -- Wine for Windows ISVs: http://kegel.com/wine/isv
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Dan Kegel [EMAIL PROTECTED] wrote: Is there a bugzilla entry describing the bug Richard is fixing? If not, it'd be nice to have, if for no other reason than it would show up naturally when people look for bugs fixed in gcc-4.1.1. I can create one, but it'd be better if someone actually involved in the action did. I can do it. Richard.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Richard Guenther [EMAIL PROTECTED] wrote: On 3/13/06, Dan Kegel [EMAIL PROTECTED] wrote: Is there a bugzilla entry describing the bug Richard is fixing? If not, it'd be nice to have, if for no other reason than it would show up naturally when people look for bugs fixed in gcc-4.1.1. I can create one, but it'd be better if someone actually involved in the action did. I can do it. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26667 Richard.
emit-rtl.c: 5048 assert
All-

I'm having problems with an assert on line 5048 of emit-rtl.c:

    gcc_assert (i < MAX_RECOG_OPERANDS);

The assert is in the copy_insn_1() function and fires when the number of copied scratch registers exceeds MAX_RECOG_OPERANDS. For my particular machine (IA-64) this number is 30. This happens when I make a call to duplicate_block() in my code. I've just started the debugging process and was wondering if there is a simple way (besides recursing through the expression) to check the number of scratch registers used by an instruction?

Thanks,
Chad
Re: Problem with pex-win32.c
Here is a sample program which does the right thing (no spurious console windows, all output visible) when run either from a console or from a console-free environment, such as a Cygwin xterm. This is the code we'll be working into libiberty -- unless someone has a better solution!

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713

#include <windows.h>
#include <stdio.h>
#include <string.h>

int
main ()
{
  HANDLE stdin_handle;
  HANDLE stdout_handle;
  HANDLE stderr_handle;
  DWORD dwCreationFlags;
  OSVERSIONINFO version_info;
  STARTUPINFO si;
  PROCESS_INFORMATION pi;

  /* Replace these with handles for pipes, etc.  */
  stdin_handle = GetStdHandle (STD_INPUT_HANDLE);
  stdout_handle = GetStdHandle (STD_OUTPUT_HANDLE);
  stderr_handle = GetStdHandle (STD_ERROR_HANDLE);

  version_info.dwOSVersionInfoSize = sizeof (version_info);
  GetVersionEx (&version_info);
  if (version_info.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS)
    /* On Windows 95/98/ME the CREATE_NO_WINDOW flag is not supported,
       so we cannot avoid creating a console window.  */
    dwCreationFlags = 0;
  else
    {
      HANDLE conout_handle;

      /* Determine whether or not we have an associated console.  */
      conout_handle = CreateFile ("CONOUT$",
                                  GENERIC_WRITE,
                                  FILE_SHARE_WRITE,
                                  /*lpSecurityAttributes=*/NULL,
                                  OPEN_EXISTING,
                                  FILE_ATTRIBUTE_NORMAL,
                                  /*hTemplateFile=*/NULL);
      if (conout_handle == INVALID_HANDLE_VALUE)
        /* There is no console associated with this process.  Since the
           child is a console process, the OS would normally create a
           new console Window for the child.  Since we'll be redirecting
           the child's standard streams, we do not need the console
           window.  */
        dwCreationFlags = CREATE_NO_WINDOW;
      else
        {
          /* There is a console associated with the process, so the OS
             will not create a new console.  And, if we use
             CREATE_NO_WINDOW in this situation, the child will have no
             associated console.  Therefore, if the child's standard
             streams are connected to the console, the output will be
             discarded.  */
          CloseHandle (conout_handle);
          dwCreationFlags = 0;
        }
    }

  /* Since the child will be a console process, it will, by default,
     connect standard input/output to its console.  However, we want
     the child to use the handles specifically designated above.  In
     addition, if there is no console (such as when we are running in a
     Cygwin X window), then we must redirect the child's input/output,
     as there is no console for the child to use.  */
  memset (&si, 0, sizeof (si));
  si.cb = sizeof (si);
  si.dwFlags = STARTF_USESTDHANDLES;
  si.hStdInput = stdin_handle;
  si.hStdOutput = stdout_handle;
  si.hStdError = stderr_handle;

  fprintf (stderr, "About to invoke child.\n");
  fflush (stderr);

  /* Start the child.  */
  CreateProcess ("child.exe", "child.exe",
                 NULL, NULL,
                 /*bInheritHandles=*/TRUE,
                 dwCreationFlags,
                 /*lpEnvironment=*/NULL,
                 /*lpCurrentDirectory=*/NULL,
                 &si, &pi);
  WaitForSingleObject (pi.hProcess, INFINITE);
  CloseHandle (pi.hProcess);
  CloseHandle (pi.hThread);

  fprintf (stderr, "Child done.\n");
  fflush (stderr);
}
Re: GCC Port (gcc backend) for Microchip PICMicro microcontroller
On Mar 13, 2006, at 5:29 AM, Colm O' Flaherty wrote:
> I've been thinking a bit more about this (no code yet: I was busy
> trying to find and fix a bug in gpsim), and I'm still not sure what the
> optimal development mode is.. by this, I mean.. what should the
> proposed PIC port of GCC produce?

If 100% of the ports produce assembly files, then you'll want to produce assembly files. 100% of the ports produce assembly.

> There are pros and cons to both approaches. Producing a hex file is (a
> lot?) more work, and would duplicate the work of gputils, but would
> leave gcc as a standalone tool, which I presume is desirable!

Nope.

> The issue here is that gcc would then become bound to gputils,

Not a problem, though we'd prefer that you did up a binutils port as well. The reason is that those utilities have a certain feature set that other tools don't have, and that feature set is used by and is useful to the compiler and users. Also, it is possible to do up a port first to gputils and then later enhance it to target binutils, while retaining the ability to still target gputils, if people find that interesting.

> The real issue here, for me, is the level of duplication / overlap
> with the SDCC project.

Don't worry, they can come join us and stop duplicating our work after you get a port going.
Re: 100x perfomance regression between gcc 3.4.5 and gcc 4.X
On Mar 13, 2006, at 12:16 AM, Paolo Bonzini wrote: PR/21195 is about inlining the SSE builtins. These are special because, for example, you probably would prefer GDB to not step into them, but just execute them. :-) We have an APPLE LOCAL patch to remove the debug information associated with them so that the debugger never steps `into' them. :- ( attr (__nodebug)
Re: Problem with pex-win32.c
Mark Mitchell wrote at http://gcc.gnu.org/ml/gcc/2006-03/msg00441.html
> Here is a sample program which does the right thing (no spurious
> console windows, all output visible) when run either from a console
> or from a console-free environment, such as a Cygwin xterm. This is
> the code we'll be working into libiberty -- unless someone has a
> better solution!

In my experience, the following test is not necessary. Win9x just ignores the CREATE_NO_WINDOW flag, so setting it is a harmless no-op on these platforms.

    version_info.dwOSVersionInfoSize = sizeof (version_info);
    GetVersionEx (&version_info);
    if (version_info.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS)
      /* On Windows 95/98/ME the CREATE_NO_WINDOW flag is not supported,
         so we cannot avoid creating a console window.  */
      dwCreationFlags = 0;

See also http://gcc.gnu.org/ml/java-patches/2003-q4/msg00260.html

Danny
Re: Problem with pex-win32.c
Danny Smith wrote: In my experience, the following test is not necessary. Win9x just ignores the CREATE_NO_WINDOW flag, so setting it is a harmless no-op on these platforms. It's OK with me not to do it; I just didn't have those platforms to use for testing, and it seems more pedantically correct to check the version. But, I'm sure not going to argue for keeping that block of code in there if that stands in the way of making progress! See also http://gcc.gnu.org/ml/java-patches/2003-q4/msg00260.html Lovely, we're all reinventing the same wheels. All the more reason to get this into libiberty... :-) Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Question about use of C++ copy constructor
Hi, Didn't see a reply yet, so I'll chime in. The relevant text appears in gcc-3.4's release notes: When binding an rvalue of class type to a reference, the copy constructor of the class must be accessible. PR 12226 seems to be the mother bug related to this (many dupes). Fang foo.cc: In function 'void foo(const B&)': foo.cc:3: error: 'B::B(const B&)' is private foo.cc:13: error: within this context I don't understand why, as I don't see the copy constructor being used anywhere. It seems to me this code should create a temporary for the duration of the statement, and pass the temporary as a reference to b.fn. This code compiles with icc with no errors. What is wrong with this code? class B { private: B(const B&); void operator=(const B&); public: B(); void fn(const B& other) const; }; void foo (const B& b) { b.fn(B()); }
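For anyone hitting the same diagnostic, one way around it is to bind the reference to a named object instead of a temporary, since binding an lvalue to a const reference never involves the copy constructor. The following is only a sketch under assumptions: the bodies of B() and fn are invented here so the fragment is complete, and fn returns an int purely so the call can be observed.

```cpp
// Sketch of a workaround, not the original poster's code: the member
// bodies below are hypothetical and exist only to make this self-contained.
class B {
private:
    B(const B&);               // private and never defined, as in the report
    void operator=(const B&);  // likewise
public:
    B() {}
    // Returns 1 when `other` is a different object from *this.
    int fn(const B& other) const { return &other != this ? 1 : 0; }
};

int foo(const B& b) {
    B tmp;             // named lvalue: no rvalue is bound to a reference
    return b.fn(tmp);  // so the copy constructor's access is never checked
}
```

The cost is one extra named local; the lifetime of tmp is the enclosing block rather than the single statement.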
Re: -fmudflap and -fmudflapth
Rafael Espíndola [EMAIL PROTECTED] writes: Use `-fmudflapth' instead of `-fmudflap' to compile and to link if your program is multi-threaded. [...but...] gate_mudflap (void) { return flag_mudflap != 0; } Maybe something broke this, but -fmudflapth used to imply setting both flag_mudflap and flag_mudflap_threads. - FChE
Re: Question about use of C++ copy constructor
Also see CWG issue 391: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#391 which will make our behavior non-conforming in C++0X. -Howard On Mar 13, 2006, at 4:02 PM, David Fang wrote: Hi, Didn't see a reply yet, so I'll chime in. The relevant text appears in gcc-3.4's release notes: When binding an rvalue of class type to a reference, the copy constructor of the class must be accessible. PR 12226 seems to be the mother bug related to this (many dupes). Fang foo.cc: In function 'void foo(const B&)': foo.cc:3: error: 'B::B(const B&)' is private foo.cc:13: error: within this context I don't understand why, as I don't see the copy constructor being used anywhere. It seems to me this code should create a temporary for the duration of the statement, and pass the temporary as a reference to b.fn. This code compiles with icc with no errors. What is wrong with this code? class B { private: B(const B&); void operator=(const B&); public: B(); void fn(const B& other) const; }; void foo (const B& b) { b.fn(B()); }
Re: gcc 4.1
The appropriate place for such stuff is gcc@gcc.gnu.org On Monday, 13.03.06 at 17:19, Helge Hess wrote: Hi, new gcc release, new warnings ;-) Am I the only one who gets those: DOMElement.m:283: warning: pointer type mismatch in conditional expression For stuff like: objs[1] = _ns ? _ns : (id)null; or return [pathes isNotNull] ? pathes : nil; Bug to be reported or a feature? With libFoundation I get similar things for constant NSString's, like: myString ? myString : @ (apparently not so with gstep-base) Greets, Helge -- http://docs.opengroupware.org/Members/helge/ OpenGroupware.org
Re: Ada subtypes and base types
On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote: What do you mean by abuse? TYPE_MAX_VALUE means maximal value allowed by given type. As long as you're *absolutely* clear that a variable with a restricted range can never hold a value outside that restricted range in a conforming program, then I'll back off the abuse label and merely call it pointless :-) The scheme you're using, promoting to a base type before all arithmetic, creates lots of useless type conversions and means that the optimizers can't utilize TYPE_MIN_VALUE/TYPE_MAX_VALUE anyway. ie, you don't gain anything over keeping that knowledge in the front-end. jeff
Re: scripting interface to GCC ?
Mike == Mike Mattie [EMAIL PROTECTED] writes: Mike Has anyone ever tried to build a scripting interface into the guts of Mike GCC with something like SWIG ? I've heard of a couple efforts along these lines -- once with Scheme and once with Python. I don't know if either used SWIG. Neither one was submitted to GCC. Both were obscure enough that, even now, I can't really be sure they ever existed :-) Tom
Line insn notes in modulo-sched
Hi Ayal, The SMS implementation in GCC, in modulo-sched.c, uses line notes to find insn locations, see find_line_note. Why are you using line notes instead of insn locators? Line notes are on the list of Things That Should Not Be, and insn locators replace them. Is there a reason for modulo-sched to rely on loop notes, or is this just an oversight? Gr. Steven
Re: Question about use of C++ copy constructor
David Fang [EMAIL PROTECTED] writes: The relevant text appears in gcc-3.4's release notes: When binding an rvalue of class type to a reference, the copy constructor of the class must be accessible. Thanks. I see that I have managed to ask about a frequently reported bug. Sorry about the noise. Ian
Re: Line insn notes in modulo-sched
Hi Ayal, The SMS implementation in GCC, in modulo-sched.c, uses line notes to find insn locations, see find_line_note. Why are you using line notes instead of insn locators? Line notes are on the list of Things That Should Not Be, and insn locators replace them. Is there a reason for modulo-sched to rely on loop notes, or is this just an oversight? And in addition the line notes should not exist anymore at modulo-sched time Honza Gr. Steven
Re: gcc 4.1
On Mar 13, 2006, at 2:05 PM, Lars Sonchocky-Helldorf wrote: The appropriate place for such stuff is gcc@gcc.gnu.org No, not really. gcc-help is more appropriate. Am I the only one who gets those: DOMElement.m:283: warning: pointer type mismatch in conditional expression I doubt it. For stuff like: objs[1] = _ns ? _ns : (id)null; or return [pathes isNotNull] ? pathes : nil; And here all information that I can use to answer the question has been stripped.
Re: Ada subtypes and base types
Jeffrey A Law wrote: On Mon, 2006-02-27 at 20:08 +0100, Waldek Hebisch wrote: What do you mean by abuse? TYPE_MAX_VALUE means maximal value allowed by given type. As long as you're *absolutely* clear that a variable with a restricted range can never hold a value outside that restricted range in a conforming program, then I'll back off the abuse label and merely call it pointless :-) The scheme you're using, promoting to a base type before all arithmetic, creates lots of useless type conversions and means that the optimizers can't utilize TYPE_MIN_VALUE/TYPE_MAX_VALUE anyway. ie, you don't gain anything over keeping that knowledge in the front-end. Pascal arithmetic essentially is untyped: operators take integer arguments and are supposed to give the mathematically correct result (provided all intermediate results are representable in machine arithmetic; overflow is treated as user error). OTOH for proper type checking the front end has to track the ranges associated with variables. So useless type conversions are needed due to the Pascal standard and backend constraints. I think that it is easy for the back end to make good use of TYPE_MIN_VALUE/TYPE_MAX_VALUE. Namely, consider the assignment x := y + z * w; where variables y, z and w have values in the interval [0,7] and x has values in [0,1000]. Pascal converts the above to the following C-like code: int tmp = (int) y + (int) z * (int) w; x = (tmp < 0 || tmp > 1000)? (Range_Check_Error (), 0) : tmp; I expect VRP to deduce that tmp will have values in [0..56] and eliminate the range check. Also, it should be clear that in the assignment above arithmetic can be done using any convenient precision. In principle the Pascal front end could deduce more precise types (ranges), but that would create some extra type conversions and a lot of extra types. Moreover, I assume that VRP can do a better job at tracking ranges than the Pascal front end. -- Waldek Hebisch [EMAIL PROTECTED]
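The arithmetic behind the [0..56] claim can be checked by hand: with y, z and w each in [0,7], z * w is at most 49 and y + z * w is at most 7 + 49 = 56, so the comparison against 1000 can never fire. The sketch below is a C++ rendering of the lowered assignment, not the Pascal front end's actual output; checked_assign is a hypothetical name and the exception stands in for Range_Check_Error.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the lowered Pascal assignment x := y + z * w.
// Callers are assumed to pass y, z, w already constrained to [0,7].
int checked_assign(int y, int z, int w) {
    int tmp = y + z * w;           // arithmetic done in the base type
    if (tmp < 0 || tmp > 1000) {   // range check for x in [0,1000]
        throw std::range_error("value outside [0,1000]");
    }
    return tmp;
}
```

A range-propagation pass that knows the operand intervals can prove the condition false and delete the check, which is the optimization being asked for in the message.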
gcc autovectorization question
Hi All, I am trying to use the latest autovectorization gcc code to generate functionally correct SSE instructions, and I have the following questions: Where is the latest stable gcc version with autovector? (is this 4.1.0?) and where is the latest development code for this? (off of the SVN?) Initially, I want to use code that will generate functional code for my application. After I familiarize myself with the existing code, I plan to contribute to this work. thx Tom Yeh
Re: scripting interface to GCC ?
On Mon, 2006-03-13 at 16:25 -0700, Tom Tromey wrote: Mike == Mike Mattie [EMAIL PROTECTED] writes: Mike Has anyone ever tried to build a scripting interface into the guts of Mike GCC with something like SWIG ? I've heard of a couple efforts along these lines -- once with Scheme and once with Python. I don't know if either used SWIG. Neither one was submitted to GCC. Both were obscure enough that, even now, I can't really be sure they ever existed :-) I will admit i SWIG'd gcc, once, and generated python wrappers, but this was before tree-ssa, so it wasn't all that useful. There is some appeal to being able to prototype passes, etc, in python. It's a ton of work to get the bindings going though. Maybe SWIG is much better than it was, in which case, cool! Tom
[Bug target/26662] optimization on hashtable iterator produces bad code
--- Comment #2 from fang at csl dot cornell dot edu 2006-03-13 08:47 --- forgot to add: works with i686-suse-linux g++-3.3.3 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26662
[Bug middle-end/26643] Linux matroxfb_probe miscompiled
--- Comment #9 from matz at suse dot de 2006-03-13 08:57 --- -fno-ivopts fixes it. The problem is how bitfield refs are dealt with, it seems. With -fno-ivopts the final_cleanup pass looks like so at the interesting point: D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295; ... if ((BIT_FIELD_REF <*b, 32, 0> & 4294967295) != D.2180) goto L3; else goto L1; ivopts leads to this code at that point: D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295; ... if ((MEM[base: (long unsigned int *) b] & 4294967295) != D.2180) goto L3; else goto L1; Now BIT_FIELD_REF <*b, 32, 0> extracts exactly the 32 bits at address 'b'. But MEM[base: (long unsigned int *) b] extracts the 64 bits at that address. The masking afterwards selects the lower 32 bits from that, but ppc being a big endian target this extracts the wrong half. Let's CC Zdenek for this. -- matz at suse dot de changed: What|Removed|Added; CC||rakdver at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26643
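The byte-order argument in this comment can be illustrated without a ppc machine by assembling the bytes the way a big-endian load would. This is an illustration only, not GCC code; load_be and the two wrapper names are made up for the example.

```cpp
#include <cstdint>

// Assemble nbytes starting at p the way a big-endian load would:
// the byte at the lowest address becomes the most significant byte.
std::uint64_t load_be(const unsigned char* p, int nbytes) {
    std::uint64_t v = 0;
    for (int i = 0; i < nbytes; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// What the 32-bit BIT_FIELD_REF at offset 0 denotes: the first four bytes.
std::uint32_t field_via_32bit_load(const unsigned char* b) {
    return static_cast<std::uint32_t>(load_be(b, 4));
}

// What the 64-bit MEM plus mask computes: the low half of eight bytes,
// which on a big-endian target is the *second* four bytes.
std::uint32_t field_via_64bit_load_and_mask(const unsigned char* b) {
    return static_cast<std::uint32_t>(load_be(b, 8) & 0xFFFFFFFFu);
}
```

With the bytes 11 22 33 44 55 66 77 88 at address b, the first function yields 0x11223344 and the second 0x55667788, which is the "wrong half" described above. On a little-endian target the two would agree, which is presumably why the problem only shows on ppc.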
[Bug middle-end/26630] [4.0 Regression] Incorrect result when subtracting, casting to short and back to int, adding and multiplying
--- Comment #7 from rguenth at gcc dot gnu dot org 2006-03-13 09:02 --- Subject: Bug 26630 Author: rguenth Date: Mon Mar 13 09:02:40 2006 New Revision: 111990 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111990 Log: 2006-03-13 Richard Guenther [EMAIL PROTECTED] PR middle-end/26630 * gcc.dg/torture/pr26630.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr26630.c Modified: trunk/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26630
[Bug middle-end/26630] [4.0 Regression] Incorrect result when subtracting, casting to short and back to int, adding and multiplying
--- Comment #8 from rguenth at gcc dot gnu dot org 2006-03-13 09:09 --- Subject: Bug 26630 Author: rguenth Date: Mon Mar 13 09:09:13 2006 New Revision: 111992 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111992 Log: 2006-03-13 Richard Guenther [EMAIL PROTECTED] PR middle-end/26630 * gcc.dg/torture/pr26630.c: New testcase. Added: branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/torture/pr26630.c Modified: branches/gcc-4_1-branch/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26630
[Bug middle-end/26659] [4.2 Regression] gcc.target/powerpc/ppc-vector-memset.c fails on the mainline
--- Comment #4 from rguenth at gcc dot gnu dot org 2006-03-13 09:30 --- Interestingly, on i686 (which has BIGGEST_ALIGNMENT 128), I get with the testcase: Breakpoint 3, get_pointer_alignment (exp=0xb7c8b320, max_align=128) (gdb) call debug_tree(exp) addr_expr 0xb7c8b320 type pointer_type 0xb7c95c38 type integer_type 0xb7c95284 int sizes-gimplified public SI size integer_cst 0xb7c863f0 constant invariant 32 unit size integer_cst 0xb7c86180 constant invariant 4 align 32 symtab 0 alias set -1 precision 32 min integer_cst 0xb7c863a8 -2147483648 max integer_cst 0xb7c863c0 2147483647 pointer_to_this pointer_type 0xb7c95c38 unsigned SI size integer_cst 0xb7c863f0 32 unit size integer_cst 0xb7c86180 4 align 32 symtab 0 alias set -1 invariant arg 0 var_decl 0xb7c92108 x type array_type 0xb7d2da10 type integer_type 0xb7c95284 int sizes-gimplified BLK size integer_cst 0xb7c86db0 constant invariant 256 unit size integer_cst 0xb7c86168 constant invariant 32 align 32 symtab 0 alias set -1 domain integer_type 0xb7cebac8 addressable used BLK file t.c line 5 size integer_cst 0xb7c86db0 256 unit size integer_cst 0xb7c86168 32 align 32 context function_decl 0xb7d2bd80 foo attributes tree_list 0xb7d38048 (mem/s/c:BLK (plus:SI (reg/f:SI 54 virtual-stack-vars) (const_int -32 [0xffe0])) [0 x+0 S32 A32]) i.e. the decl does not have alignment of 128, but 32. And we get the same alignment before and after the patch (32). Off to a ppc machine... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26659
[Bug middle-end/26659] [4.2 Regression] gcc.target/powerpc/ppc-vector-memset.c fails on the mainline
--- Comment #5 from rguenth at gcc dot gnu dot org 2006-03-13 10:01 --- On powerpc64-linux I get (gdb) call debug_tree(exp) addr_expr 0xf7d7a620 type pointer_type 0xf7ea0070 type integer_type 0xf7e90460 int sizes-gimplified public SI size integer_cst 0xf7e929e0 constant invariant 32 unit size integer_cst 0xf7e926a0 constant invariant 4 align 32 symtab 0 alias set -1 precision 32 min integer_cst 0xf7e92980 -2147483648 max integer_cst 0xf7e929a0 2147483647 pointer_to_this pointer_type 0xf7ea0070 unsigned DI size integer_cst 0xf7e92b00 constant invariant 64 unit size integer_cst 0xf7e92b20 constant invariant 8 align 64 symtab 0 alias set -1 invariant arg 0 var_decl 0xf7d7b460 x type array_type 0xf7d7b3f0 type integer_type 0xf7e90460 int sizes-gimplified BLK size integer_cst 0xf7e9e660 constant invariant 256 unit size integer_cst 0xf7e92680 constant invariant 32 align 32 symtab 0 alias set -1 domain integer_type 0xf7f16ee0 addressable used BLK file t.c line 5 size integer_cst 0xf7e9e660 256 unit size integer_cst 0xf7e92680 32 align 128 context function_decl 0xf7f7d400 foo attributes tree_list 0xf7d7a5c0 (mem/s/c:BLK (reg/f:DI 115 virtual-stack-vars) [0 x+0 S32 A128]) so the decl has an alignment of 128 -- I wonder why i686 is different here. So, I have a fix. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26659
[Bug target/26662] optimization on hashtable iterator produces bad code
--- Comment #3 from rguenth at gcc dot gnu dot org 2006-03-13 10:40 --- You should report this to Apple, because as 4.0 and 4.1 are not affected and both the 3.3 and the 3.4 branch are now closed, this PR will be just closed as fixed. -- rguenth at gcc dot gnu dot org changed: What|Removed|Added; Status|UNCONFIRMED|RESOLVED; Resolution||FIXED; Target Milestone|---|4.0.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26662
[Bug middle-end/26659] [4.2 Regression] gcc.target/powerpc/ppc-vector-memset.c fails on the mainline
--- Comment #6 from patchapp at dberlin dot org 2006-03-13 11:20 --- Subject: Bug number PR26659 A patch for this bug has been added to the patch tracker. The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00737.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26659
[Bug rtl-optimization/26254] [4.2 Regression] FAIL: gcc.c-torture/compile/20011109-1.c,-O1
--- Comment #8 from rakdver at gcc dot gnu dot org 2006-03-13 12:28 --- Subject: Bug 26254 Author: rakdver Date: Mon Mar 13 12:28:09 2006 New Revision: 111998 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111998 Log: PR rtl-optimization/26254 * loop-invariant.c (seq_insns_valid_p): New function. (move_invariant_reg): Only emit new code if it is valid. Modified: trunk/gcc/ChangeLog trunk/gcc/loop-invariant.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26254
[Bug rtl-optimization/26663] New: Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6)
The attached C++ code converts a floating point number from our internal representation to IEEE format. When run, the program should print a value of 144. This works with gcc-3.3.6 at any optimization level (-O1 -- -O3). With gcc-3.4.x and optimization level -O2 or -O3, it prints 42 instead. I.e., it seems the iFloat variable in CMdilFloat::AsFloat() is never modified. Here's the output of the code with various vanilla gcc's, along with their configuration: $ uname -a Linux xxx.xxx.com 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux $ gcc-3.3.6/bin/g++ -v Reading specs from /home/steffenz/gcc/install/gcc-3.3.6/lib/gcc-lib/i686-pc-linux-gnu/3.3.6/specs Configured with: ../gcc-3.3.6/configure --prefix=/home/steffenz/gcc/install/gcc-3.3.6 --disable-nls --enable-languages=c,c++ Thread model: posix gcc version 3.3.6 $ gcc-3.3.6/bin/g++ -O1 -o test test.cpp ./test 144 $ gcc-3.3.6/bin/g++ -O2 -o test test.cpp ./test 144 $ gcc-3.3.6/bin/g++ -O3 -o test test.cpp ./test 144 $ gcc-3.4.0/bin/g++ -v Reading specs from /home/steffenz/gcc/install/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.0/specs Configured with: ../gcc-3.4.0/configure --prefix=/home/steffenz/gcc/install/gcc-3.4.0 --disable-nls --enable-languages=c,c++ Thread model: posix gcc version 3.4.0 $ gcc-3.4.0/bin/g++ -O1 -o test test.cpp ./test 144 $ gcc-3.4.0/bin/g++ -O2 -o test test.cpp ./test 42 $ gcc-3.4.0/bin/g++ -O3 -o test test.cpp ./test 42 $ gcc-3.4.6/bin/g++ -v Reading specs from /home/steffenz/gcc/install/gcc-3.4.6/lib/gcc/i686-pc-linux-gnu/3.4.6/specs Configured with: ../gcc-3.4.6/configure --prefix=/home/steffenz/gcc/install/gcc-3.4.6 --disable-nls --enable-languages=c,c++ Thread model: posix gcc version 3.4.6 $ gcc-3.4.6/bin/g++ -O1 -o test test.cpp ./test 144 $ gcc-3.4.6/bin/g++ -O2 -o test test.cpp ./test 42 $ gcc-3.4.6/bin/g++ -O3 -o test test.cpp ./test 42 -- Summary: Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6) Product: gcc Version: 3.4.6 Status: 
UNCONFIRMED Severity: critical Priority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: steffen dot zimmermann at philips dot com GCC build triplet: i686-pc-linux-gnu GCC host triplet: i686-pc-linux-gnu GCC target triplet: i686-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26663
[Bug rtl-optimization/26663] Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6)
--- Comment #1 from steffen dot zimmermann at philips dot com 2006-03-13 13:12 --- Created an attachment (id=11038) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11038&action=view) C++ source code to reproduce the bug -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26663
[Bug rtl-optimization/26663] Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6)
--- Comment #2 from rguenth at gcc dot gnu dot org 2006-03-13 13:37 --- This is invalid as it violates C/C++ aliasing rules. *** This bug has been marked as a duplicate of 21920 *** -- rguenth at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26663
[Bug c/21920] alias violating
--- Comment #86 from rguenth at gcc dot gnu dot org 2006-03-13 13:37 --- *** Bug 26663 has been marked as a duplicate of this bug. *** -- rguenth at gcc dot gnu dot org changed: What|Removed |Added CC||steffen dot zimmermann at ||philips dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21920
[Bug rtl-optimization/26254] [4.2 Regression] FAIL: gcc.c-torture/compile/20011109-1.c,-O1
--- Comment #9 from pinskia at gcc dot gnu dot org 2006-03-13 13:54 --- Fixed. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26254
[Bug target/26664] New: use of rjmp on devices with more than 2kb flash
I compile the following program with avr-gcc -mmcu=atmega8515 -o rjmp-test rjmp-test.c program: const char dummy[7000] __attribute__((__progmem__)); int main() {} - Error: The atmega8515 has 8Kb of flash memory. So gcc should not use the rjmp/rcall instructions since they can only access 2Kb. But avr-objdump shows this: Disassembly of section .text: __vectors: 0: bc cd rjmp.-1160 ; 0xfb7a __eeprom_e 2: d5 cd rjmp.-1110 ; 0xfbae __eeprom_e 4: d4 cd rjmp.-1112 ; 0xfbae __eeprom_e 6: d3 cd rjmp.-1114 ; 0xfbae __eeprom_e 8: d2 cd rjmp.-1116 ; 0xfbae __eeprom_e a: d1 cd rjmp.-1118 ; 0xfbae __eeprom_e c: d0 cd rjmp.-1120 ; 0xfbae __eeprom_e ... Discussion: All interrupt handlers are situated behind the 7000 byte array. Therefore the jumps of the interrupt vectors have to span at least 7000 bytes. That is not possible with the rjmp instruction which can span at most 2Kb. This results in the strange negative offsets. Solution: Jumps which span more than 2Kb have to be encoded using jmp not rjmp. -- Summary: use of rjmp on devices with more than 2kb flash Product: gcc Version: 4.0.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: hochstein at algo dot informatik dot tu-darmstadt dot de GCC build triplet: i686-pc-linux GCC host triplet: avr GCC target triplet: avr http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26664
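The "strange negative offsets" in the listing can be decoded by hand. The sketch below assumes the RJMP encoding from the AVR instruction set (1100 kkkk kkkk kkkk, with k a signed 12-bit offset in words relative to the following instruction) and assumes that the program counter wraps at the flash size on parts with 8 KiB of flash or less. The function names are invented for the example, and the 65536 modulus merely reproduces the 16-bit address that objdump prints.

```cpp
#include <cstdint>

// Byte offset encoded in an RJMP opcode: sign-extend the low 12 bits
// (a word offset) and convert words to bytes.
int rjmp_byte_offset(std::uint16_t opcode) {
    int k = opcode & 0x0FFF;
    if (k & 0x0800) {
        k -= 0x1000;  // sign-extend from 12 bits
    }
    return k * 2;
}

// Jump target in bytes, reduced modulo the given address-space size.
// The offset is relative to the instruction after the RJMP (pc + 2).
unsigned rjmp_target(unsigned pc_bytes, std::uint16_t opcode, unsigned space_bytes) {
    long target = static_cast<long>(pc_bytes) + 2 + rjmp_byte_offset(opcode);
    long wrapped = target % static_cast<long>(space_bytes);
    if (wrapped < 0) {
        wrapped += space_bytes;
    }
    return static_cast<unsigned>(wrapped);
}
```

The bytes "bc cd" at address 0 are the little-endian opcode 0xcdbc, which decodes to an offset of -1160 bytes, matching the listing. Reduced modulo 65536 the target is 0xfb7a, as objdump shows; reduced modulo 8192 (the flash size under the wrap-around assumption) it is 0x1b7a, which lies just past the 7000-byte array. If that assumption holds, the backwards jump is a wrap-around to the intended handler rather than a miscompilation.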
[Bug rtl-optimization/26207] [4.2 Regression] file names in rtl dumps don't match reality
--- Comment #2 from pinskia at gcc dot gnu dot org 2006-03-13 14:07 --- Fixed by: 2006-03-13 Kazu Hirata [EMAIL PROTECTED] * doc/invoke.texi: Update dump file names. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26207
[Bug rtl-optimization/26663] Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6)
--- Comment #3 from pinskia at gcc dot gnu dot org 2006-03-13 14:15 --- * (u_32 *) pFloat = 0x7FFF; // NaN That is violating C/C++ aliasing rules. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26663
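For readers wondering what to write instead of the cast, the usual well-defined way to reinterpret the bits of a float is to copy them with memcpy, which compilers generally turn into a plain register move. This is a generic sketch, not the reporter's code: float_from_bits and bits_of are hypothetical helpers, and a 32-bit IEEE-754 float is assumed.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Build a float from a raw bit pattern without violating aliasing rules.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Read the raw bit pattern of a float.
std::uint32_t bits_of(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compiling the original code with -fno-strict-aliasing is the other common escape hatch, at the cost of disabling type-based alias analysis for the whole translation unit.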
[Bug middle-end/18859] [4.0/4.1/4.2 Regression] ACATS ICE c37305a at -O0: in tree_low_cst, at tree.c:3839
--- Comment #13 from ebotcazou at gcc dot gnu dot org 2006-03-13 14:18 --- Subject: Bug 18859 Author: ebotcazou Date: Mon Mar 13 14:18:24 2006 New Revision: 112000 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112000 Log: PR middle-end/18859 * gimplify.c (gimplify_switch_expr): Discard empty ranges. * stmt.c (expand_case): Likewise. Added: trunk/gcc/testsuite/gcc.dg/switch-9.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimplify.c trunk/gcc/stmt.c trunk/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18859
[Bug middle-end/18859] [4.0/4.1/4.2 Regression] ACATS ICE c37305a at -O0: in tree_low_cst, at tree.c:3839
--- Comment #14 from ebotcazou at gcc dot gnu dot org 2006-03-13 14:23 --- Subject: Bug 18859 Author: ebotcazou Date: Mon Mar 13 14:23:15 2006 New Revision: 112001 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112001 Log: PR middle-end/18859 * stmt.c (expand_case): Discard empty ranges. Added: branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/switch-9.c Modified: branches/gcc-4_1-branch/gcc/ChangeLog branches/gcc-4_1-branch/gcc/stmt.c branches/gcc-4_1-branch/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18859
[Bug middle-end/18859] [4.0/4.1/4.2 Regression] ACATS ICE c37305a at -O0: in tree_low_cst, at tree.c:3839
--- Comment #15 from ebotcazou at gcc dot gnu dot org 2006-03-13 14:26 --- Subject: Bug 18859 Author: ebotcazou Date: Mon Mar 13 14:26:02 2006 New Revision: 112002 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112002 Log: PR middle-end/18859 * stmt.c (expand_case): Discard empty ranges. Added: branches/gcc-4_0-branch/gcc/testsuite/gcc.dg/switch-9.c Modified: branches/gcc-4_0-branch/gcc/ChangeLog branches/gcc-4_0-branch/gcc/stmt.c branches/gcc-4_0-branch/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18859
[Bug middle-end/18859] [4.0/4.1/4.2 Regression] ACATS ICE c37305a at -O0: in tree_low_cst, at tree.c:3839
--- Comment #16 from ebotcazou at gcc dot gnu dot org 2006-03-13 14:30 --- Fixed everywhere at last. -- ebotcazou at gcc dot gnu dot org changed: What|Removed|Added; URL||http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00572.html; Status|ASSIGNED|RESOLVED; Resolution||FIXED; Target Milestone|4.1.1|4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18859
[Bug target/26664] use of rjmp on devices with more than 2kb flash
--- Comment #1 from pinskia at gcc dot gnu dot org 2006-03-13 14:34 --- Why do you think this is a GCC bug and not a binutils one? GCC does not produce __vectors as far as I can tell. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26664
[Bug target/26664] use of rjmp on devices with more than 2kb flash
--- Comment #2 from pinskia at gcc dot gnu dot org 2006-03-13 14:37 --- On second thought this should be closed as I did a grep for __vectors and found nothing in the GCC source. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26664
[Bug rtl-optimization/26663] Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6)
--- Comment #4 from jbeitaharon at intrusic dot com 2006-03-13 14:40 --- Subject: Re: Wrong generated code with gcc-3.4.x -O2 (ok with -O1 or gcc-3.3.6) Please take me off the CC list for this distribution. I don't need the encouragement of knowing that many people experience similar frustrations with gcc's poor (or shall I say, uninformative) warning messages in the case of aliasing rule violations. Thanks, Jonathan pinskia at gcc dot gnu dot org wrote: --- Comment #3 from pinskia at gcc dot gnu dot org 2006-03-13 14:15 --- * (u_32 *) pFloat = 0x7FFF; // NaN That is violating C/C++ aliasing rules. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26663
[Bug c++/26665] New: Gcc segmentation fault
My code is: //--- #include <iostream> template <int I> struct _integer { enum { _value = I }; typedef _integer<_value> _type; }; #define _I(Int) \ typename _integer<Int>::_type template <class type> void hi(type) { std::cout << std::endl; } int main() { hi(_I(1)()); //Bug report here. return 0; } //--- And my gcc command is: g++ -O2 test.cpp -o test //--- The bug report is: test.cpp: In function int main(): test.cpp:72: internal compiler error: Segmentation fault -- Summary: Gcc segmentation fault Product: gcc Version: 4.0.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: babydavid at sjtu dot edu dot cn http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26665
[Bug target/26664] use of rjmp on devices with more than 2kb flash
--- Comment #3 from hochstein at algo dot informatik dot tu-darmstadt dot de 2006-03-13 14:54 --- This is no gcc problem. The vectors are generated by avr-libc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26664
[Bug c++/26665] Gcc segmentation fault
--- Comment #1 from pinskia at gcc dot gnu dot org 2006-03-13 15:02 --- This was fixed in 4.0.3 see PR 23797. *** This bug has been marked as a duplicate of 23797 *** -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26665
[Bug c++/23797] [3.4 Regression] ICE on typename outside template
--- Comment #19 from pinskia at gcc dot gnu dot org 2006-03-13 15:02 --- *** Bug 26665 has been marked as a duplicate of this bug. *** -- pinskia at gcc dot gnu dot org changed: What|Removed |Added CC||babydavid at sjtu dot edu ||dot cn http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23797
[Bug middle-end/26643] [4.0/4.1/4.2 Regression] Linux matroxfb_probe miscompiled
-- pinskia at gcc dot gnu dot org changed: What|Removed|Added; CC||pinskia at gcc dot gnu dot org; Summary|Linux matroxfb_probe miscompiled|[4.0/4.1/4.2 Regression] Linux matroxfb_probe miscompiled; Target Milestone|---|4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26643
[Bug target/26655] [4.0/4.1/4.2 Regression] ICE in ix86_secondary_memory_needed, at config/i386/i386.c:16446
--- Comment #10 from pinskia at gcc dot gnu dot org 2006-03-13 15:34 --- Actually this is a regression from 3.0.4. -- pinskia at gcc dot gnu dot org changed: What|Removed|Added; CC||pinskia at gcc dot gnu dot org; Known to fail|3.3.3 4.1.0 4.0.0 3.4.0|3.3.3 4.1.0 4.0.0 3.4.0 3.2.3; Known to work||3.0.4 2.95.3; Summary|ICE in ix86_secondary_memory_needed, at config/i386/i386.c:16446|[4.0/4.1/4.2 Regression] ICE in ix86_secondary_memory_needed, at config/i386/i386.c:16446; Target Milestone|---|4.0.4 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26655
[Bug tree-optimization/26626] [4.2 Regression] ICE in in add_virtual_operand
--- Comment #10 from mueller at gcc dot gnu dot org 2006-03-13 16:17 --- it looks to me that this commit exposed/introduced the ICE: r111300 | dberlin | 2006-02-20 14:38:01 +0100 (Mon, 20 Feb 2006) | 22 lines Changed paths: M /trunk/gcc/ChangeLog M /trunk/gcc/passes.c M /trunk/gcc/tree-flow.h M /trunk/gcc/tree-pass.h M /trunk/gcc/tree-sra.c M /trunk/gcc/tree-ssa-alias.c M /trunk/gcc/tree-ssa-forwprop.c M /trunk/gcc/tree-ssa-operands.c M /trunk/gcc/tree.h 2006-02-20 Daniel Berlin [EMAIL PROTECTED] * tree.h (struct tree_memory_tag): Add is_used_alone member. (TMT_USED_ALONE): New macro. * tree-pass.h (PROP_tmt_usage): New property. (TODO_update_tmt_usage): New todo. * tree-ssa-alias.c (updating_used_alone): New variable. (recalculate_used_alone): New function. (compute_may_aliases): Set updating_used_alone, call recalculate_used_alone. * tree-sra.c (pass_sra): Note that this pass destroys PROP_tmt_usage, and add TODO_update_tmt_usage. * tree-ssa-forwprop.c (pass_forwprop): Ditto. * tree-flow.h (updating_used_alone): Prototype. (recalculate_used_alone): Ditto. * passes.c (execute_todo): Add code to set updating_used_alone, and call recalculate. * tree-ssa-operands.c (add_virtual_operand): Only append bare def for clobber if used alone, and add assert to verify used_alone status. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26626
[Bug tree-optimization/26667] New: Inlining always_inline functions causes further inlining that reduces function size to fail
Basically, consider the following case (shortened, full testcase will be attached): static __inline __m128 __attribute__((__always_inline__)) _mm_max_ps (__m128 __A, __m128 __B) { return (__m128) __builtin_ia32_maxps ((__v4sf)__A, (__v4sf)__B); } static __m128 mm_max_ps(const __m128 a, const __m128 b) { return _mm_max_ps(a,b); } ... more wrappers ... static bool __attribute__((always_inline)) bloatit(const __m128 a, const __m128 b) { const __m128 v0 = mm_max_ps(a,b), v1 = mm_min_ps(a,b), v2 = mm_mul_ps(a,b), v3 = mm_div_ps(a,b), g0 = mm_or_ps(_mm_or_ps(_mm_or_ps(v0,v1), v2), v3), v4 = mm_min_ps(mm_or_ps(a,b),mm_div_ps(b,a)), v5 = mm_max_ps(mm_min_ps(a,mm_div_ps(b,a)), mm_or_ps(b, mm_div_ps(b,g0))), g1 = mm_or_ps(g0,mm_or_ps(v4,v5)); return mm_movemask_ps(g1); } bool finalblow(const __m128 a, const __m128 b, const __m128 c, const __m128 d, const __m128 e, const __m128 f) { return bloatit(a,b) bloatit(c,d) bloatit(e,f) bloatit(a,c) bloatit(b,d) bloatit(c,e) bloatit(d,f) bloatit(b,a) bloatit(d,c) bloatit(f,e) bloatit(c,a) bloatit(d,b) bloatit(e,c) bloatit(f,d); } what happens is that as a first pass, all always_inline functions are inlined, so bloatit will be inlined into finalblow causing the size of finalblow after inlining to be greater than the max-function-growth limit. After that we now decide to look at the mm_* routines used in bloatit and decide if we can inline them into finalblow - which we do _not_ do because finalblow is already bigger than it may get due to the function-growth limit. Even if we correctly figure out that inlining the mm_* functions will _decrease_ the size of finalblow. Bad. We also incorrectly count the number of calls to mm_* in finalblow, which we count to be zero (0). 
-- Summary: Inlining always_inline functions causes further inlining that reduces function size to fail Product: gcc Version: 4.1.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: rguenth at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26667
[Bug tree-optimization/26667] Inlining always_inline functions causes further inlining that reduces function size to fail
--- Comment #1 from rguenth at gcc dot gnu dot org 2006-03-13 16:22 --- Created an attachment (id=11039) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11039&action=view) testcase

Compile with -O3 -msse2, look at the optimized ipa-inline dump to see

Deciding on inlining. Starting with 873 insns.

Inlining always_inline functions:

Considering bool bloatit(float __vector__, float __vector__) 413 insns (always inline)
 Inlined into bool finalblow(float __vector__, float __vector__, float __vector__, float __vector__, float __vector__, float __vector__) which now has 753 insns.
...
 Inlined into bool finalblow(float __vector__, float __vector__, float __vector__, float __vector__, float __vector__, float __vector__) which now has 5810 insns.
Inlined for a net change of +5033 insns.

Deciding on smaller functions:
Considering inline candidate int mm_movemask_ps(float __vector__).
Considering inline candidate float __vector__ mm_or_ps(float __vector__, float __vector__).
Considering inline candidate float __vector__ mm_div_ps(float __vector__, float __vector__).
Considering inline candidate float __vector__ mm_mul_ps(float __vector__, float __vector__).
Considering inline candidate float __vector__ mm_min_ps(float __vector__, float __vector__).
Considering inline candidate float __vector__ mm_max_ps(float __vector__, float __vector__).

Considering float __vector__ mm_or_ps(float __vector__, float __vector__) with 16 insns to be inlined into bool bloatit(float __vector__, float __vector__)
 Estimated growth after inlined into all callees is -576 insns.
 Estimated badness is -147456.
 Not inlining into bool bloatit(float __vector__, float __vector__):--param large-function-growth limit reached.
... etc. ...

doh!

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26667
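To make the complaint concrete, here is a minimal toy model of a caller-size limit check. It is only a sketch: the struct, the function names, the parameter values, the assumed original size of finalblow (350) and the per-call growth (-8) are all hypothetical, chosen for illustration, and none of this is taken from GCC's sources or from the posted patch. Only the 5810 insns figure comes from the dump above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical knobs modelled on the --param names mentioned in the dump.
// The values are assumptions for this sketch, not read from GCC.
struct Limits {
    int large_function_insns = 2700;   // assumed "large function" threshold
    int large_function_growth = 100;   // assumed allowed growth, in percent
};

// Size ceiling for a caller whose size before inlining was base_insns.
int size_limit(int base_insns, const Limits &p) {
    assert(base_insns >= 0);
    int limit = std::max(base_insns, p.large_function_insns);
    return limit + limit * p.large_function_growth / 100;
}

// Strict check: refuse any inline that leaves the caller above the ceiling,
// even when the inline itself would shrink the caller.
bool allowed_strict(int caller_insns, int growth, int base_insns, const Limits &p) {
    return caller_insns + growth <= size_limit(base_insns, p);
}

// One possible relaxation: an inline that does not grow the caller is never
// rejected by the size ceiling.
bool allowed_relaxed(int caller_insns, int growth, int base_insns, const Limits &p) {
    return growth <= 0 || allowed_strict(caller_insns, growth, base_insns, p);
}
```

With these assumed numbers the ceiling is 5400 insns, so `allowed_strict(5810, -8, 350, Limits{})` is false while `allowed_relaxed(5810, -8, 350, Limits{})` is true, which is the behaviour the report asks for: once always_inline has blown the caller past the ceiling, size-reducing inlines should still go through.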
[Bug tree-optimization/26667] Inlining always_inline functions causes further inlining that reduces function size to fail
--- Comment #2 from rguenth at gcc dot gnu dot org 2006-03-13 16:25 --- A patch for this was posted. It's not really a regression fix, though in 4.0.3 we inlined differently. Still, I'd like to see the patch in 4.1.1.

-- rguenth at gcc dot gnu dot org changed:

                 What|Removed                           |Added
           AssignedTo|unassigned at gcc dot gnu dot org |rguenth at gcc dot gnu dot org
                  URL|                                  |http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00739.html
               Status|UNCONFIRMED                       |ASSIGNED
       Ever Confirmed|0                                 |1
Last reconfirmed date|-00-00 00:00:00                   |2006-03-13 16:25:17

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26667
[Bug libfortran/26661] Sequential formatted read goes too far
--- Comment #1 from jvdelisle at gcc dot gnu dot org 2006-03-13 16:34 --- Created an attachment (id=11040) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11040&action=view) Example test case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26661
[Bug libfortran/26661] Sequential formatted read goes too far
--- Comment #2 from jvdelisle at gcc dot gnu dot org 2006-03-13 16:36 --- Created an attachment (id=11041) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11041&action=view) Example data file -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26661
[Bug testsuite/26344] [4.2 Regression] three testsuite failures in gcc.dg/tree-ssa/
--- Comment #15 from law at redhat dot com 2006-03-13 16:37 --- Subject: Re: New: [4.2 Regression] three testsuite failures in gcc.dg/tree-ssa/ On Fri, 2006-02-17 at 18:21 +, pinskia at gcc dot gnu dot org wrote: FAIL: gcc.dg/tree-ssa/20030730-1.c scan-tree-dump-times if 0 FAIL: gcc.dg/tree-ssa/20030730-2.c scan-tree-dump-times if 0 FAIL: gcc.dg/tree-ssa/20030807-2.c scan-tree-dump-times if 0 Last week's patch fixed 20030730-1.c and 20030730-2.c. This patch fixes 20030807-2.c. The existing VRP code never visits statements with virtual operands. In general, that's a good thing: fewer statements to visit means less work, and assignments with virtual operands are highly unlikely to produce useful ranges. However, there is one exception: when the RHS is a call to a built-in function. In that case we may be able to determine non-null ranges (builtin-alloca) and in some cases we can determine non-negative ranges. This patch allows VRP to visit assignments with virtual operands in this one case (RHS is a call to a built-in function). Once that's done the existing machinery will automatically discover ranges created by calls to these special built-in functions. Bootstrapped and regression tested on i686-pc-linux-gnu. --- Comment #16 from law at redhat dot com 2006-03-13 16:37 --- Created an attachment (id=11042) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11042&action=view) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26344
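As a rough illustration of the non-null case described above, the sketch below shows the kind of source where a range-aware pass could prove that the null test is dead. It is not one of the failing testsuite files; the function name and the size bound are made up for this example, and whether a given compiler version actually folds the test is not something this snippet demonstrates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: the assignment from __builtin_alloca has virtual
// operands, yet its result can be treated as non-null.
int alloca_result_is_nonnull(std::size_t n) {
    assert(n < 4096);      // keep the stack allocation small in this demo
    char *p = static_cast<char *>(__builtin_alloca(n + 1));
    p[0] = '\0';           // touch the buffer so the allocation is really used
    if (p == nullptr)      // a pass that knows p is non-null may fold this test away
        return 0;
    return 1;
}
```

Calling `alloca_result_is_nonnull(8)` returns 1 either way; the point is only that the `p == nullptr` branch is something the optimizer can remove once it visits the call.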
[Bug libfortran/26661] Sequential formatted read goes too far
-- jvdelisle at gcc dot gnu dot org changed:

                 What|Removed         |Added
               Status|UNCONFIRMED     |ASSIGNED
       Ever Confirmed|0               |1
Last reconfirmed date|-00-00 00:00:00 |2006-03-13 16:37:35

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26661
[Bug target/26445] SSE byte-by-byte load instruction fails to compile
--- Comment #7 from gchernis11 at msn dot com 2006-03-13 16:43 --- Subject: RE: SSE byte-by-byte load instruction fails to compile

Please let me know what the status of this bug is. Please reply, Greg Chernis

From: pinskia at gcc dot gnu dot org [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: [Bug target/26445] SSE byte-by-byte load instruction fails to compile Date: 3 Mar 2006 20:54:42 -

--- Comment #5 from pinskia at gcc dot gnu dot org 2006-03-03 20:54 --- Reducing.

-- pinskia at gcc dot gnu dot org changed:

    What|Removed |Added
      CC|        |pinskia at gcc dot gnu dot org
Keywords|        |ice-on-valid-code

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26445 --- You are receiving this mail because: --- You reported the bug, or are watching the reporter. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26445
[Bug tree-optimization/26626] [4.2 Regression] ICE in in add_virtual_operand
--- Comment #11 from dberlin at gcc dot gnu dot org 2006-03-13 16:52 --- Subject: Re: [4.2 Regression] ICE in in add_virtual_operand On Mon, 2006-03-13 at 16:17 +, mueller at gcc dot gnu dot org wrote: --- Comment #10 from mueller at gcc dot gnu dot org 2006-03-13 16:17 --- it looks to me that this commit exposed/introduced the ICE: Yes, we already know that :) Thanks though. This is just another case of us catching more illegal code with this ICE (as we used to with the NMT ICE). The only solution in these cases is to remove the assert and let it generate bad code, but I want to fix other issues first before removing the assert. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26626
[Bug tree-optimization/26626] [4.2 Regression] ICE in in add_virtual_operand
--- Comment #12 from mueller at gcc dot gnu dot org 2006-03-13 16:56 --- Ah, I see. I'm fine with working around the ICE locally and letting you guys figure out how to fix the actual cause :) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26626
[Bug java/26390] Problem dispatching method call when method does not exist in superclass
--- Comment #5 from tromey at gcc dot gnu dot org 2006-03-13 17:10 --- Created an attachment (id=11043) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11043&action=view) reduced test case I'm attaching a reduced test case. If you remove one of the intermediate classes, the output is correct. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26390
[Bug java/26390] Problem dispatching method call when method does not exist in superclass
--- Comment #6 from tromey at gcc dot gnu dot org 2006-03-13 17:12 --- I added a main to the reduced test case:

  public static void main(String[] args) {
    SwingFramePeer s = new SwingFramePeer();
    s.setBounds();
  }

With this I can almost reproduce the original bug:

opsy. gcj --main=pr26390 -o P pr26390.java
/tmp/cczetk2m.o(.text+0xd): In function `void pr26390$SwingFramePeer::setBounds()':
pr26390.java: undefined reference to `void pr26390$WindowPeer::setBounds()'

I say almost since from the original report I would expect the missing method to come from SwingWindowPeer.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26390
[Bug middle-end/25989] gomp ICE with -O2 and schedule(guided)
-- jakub at gcc dot gnu dot org changed:

                 What|Removed                           |Added
           AssignedTo|unassigned at gcc dot gnu dot org |jakub at gcc dot gnu dot org
               Status|NEW                               |ASSIGNED
Last reconfirmed date|2006-01-27 12:50:51               |2006-03-13 17:15:39

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25989
[Bug libfortran/19303] Unformatted record header is 4-bytes on 32-bit targets
--- Comment #20 from tkoenig at gcc dot gnu dot org 2006-03-13 17:53 --- I'll take this, implementing the simplistic approach (generating an error for >2GB record sizes). This should keep the complexity down.

-- tkoenig at gcc dot gnu dot org changed:

                 What|Removed                           |Added
           AssignedTo|unassigned at gcc dot gnu dot org |tkoenig at gcc dot gnu dot org
               Status|NEW                               |ASSIGNED
Last reconfirmed date|2006-02-05 20:01:55               |2006-03-13 17:53:20

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19303
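For readers unfamiliar with the issue, the following is a minimal sketch of framing an unformatted sequential record with a 4-byte length marker before and after the data, and of refusing records that do not fit. It is an illustration only: `frame_record` is a hypothetical helper, the markers are written in host byte order, and nothing here is libgfortran code or the planned patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: lay out one record as  length | payload | length,
// using signed 32-bit markers in host byte order.
std::vector<unsigned char> frame_record(const std::vector<unsigned char> &payload) {
    const std::size_t max_len =
        static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
    if (payload.size() > max_len)
        throw std::length_error("record does not fit a 4-byte record marker");

    const std::int32_t len = static_cast<std::int32_t>(payload.size());
    std::vector<unsigned char> out(2 * sizeof len + payload.size());

    std::memcpy(out.data(), &len, sizeof len);                      // leading marker
    if (!payload.empty())
        std::memcpy(out.data() + sizeof len, payload.data(), payload.size());
    std::memcpy(out.data() + sizeof len + payload.size(), &len, sizeof len);  // trailing marker

    assert(out.size() == payload.size() + 8);  // two 4-byte markers around the data
    return out;
}
```

A 3-byte payload yields an 11-byte record whose first and last four bytes both encode 3. The simplistic approach amounts to the `length_error` branch: report an error instead of trying to split or widen the marker.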
[Bug target/26508] 4.1.0 doesn't build in 64bit on PA-RISC
--- Comment #19 from h dot m dot brand at xs4all dot nl 2006-03-13 18:03 --- As development perl was unable to complete its testsuite with the installed 4.1.0/64, I went back to 4.0.3, where all works well. FYI pack and udp failures. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26508
[Bug target/26508] 4.1.0 doesn't build in 64bit on PA-RISC
--- Comment #20 from dave at hiauly1 dot hia dot nrc dot ca 2006-03-13 18:10 --- Subject: Re: 4.1.0 doesn't build in 64bit on PA-RISC As development perl was unable to complete its testsuite with the installed 4.1.0/64, I went back to 4.0.3, where all works well. FYI pack and udp failures. There may not be another 4.0 release, so it would be useful if you could determine why these tests fail and submit PRs. Dave -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26508
[Bug java/26390] Problem dispatching method call when method does not exist in superclass
--- Comment #7 from tromey at gcc dot gnu dot org 2006-03-13 19:07 --- The bug here is that gcj implements method inheritance incorrectly. In particular it does not consider SwingWindowPeer.setBounds as a candidate method for the super.setBounds() call, because it has no notion that there is a method named SwingWindowPeer.setBounds. Instead it only considers SwingComponentPeer.setBounds -- which is not maximal and thus not selected. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26390
[Bug middle-end/25989] gomp ICE with -O2 and schedule(guided)
--- Comment #3 from jakub at gcc dot gnu dot org 2006-03-13 19:36 --- Subject: Bug 25989 Author: jakub Date: Mon Mar 13 19:36:19 2006 New Revision: 112023 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112023 Log: PR middle-end/25989 * omp-low.c (expand_omp_for_generic): Mark istart0 and iend0 as addressable. * gcc.dg/gomp/pr25989.c: New test. Added: trunk/gcc/testsuite/gcc.dg/gomp/pr25989.c Modified: trunk/gcc/ChangeLog trunk/gcc/omp-low.c trunk/gcc/testsuite/ChangeLog -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25989
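For context, a loop of the following shape is the kind of code that goes through the guided-schedule expansion. This is a hedged sketch and not the contents of pr25989.c, which is not quoted in this thread; the function name is invented. Without -fopenmp the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>

// Hypothetical example: sum 0..n-1 with a guided-schedule work-sharing loop.
long guided_sum(int n) {
    assert(n >= 0);
    long total = 0;
#pragma omp parallel for schedule(guided) reduction(+ : total)
    for (int i = 0; i < n; ++i)
        total += i;
    return total;
}
```

To exercise the configuration from the bug title, such a function would be compiled with -O2 and -fopenmp, for example `guided_sum(1000)` for the sum of 0..999.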
[Bug c++/26669] lost temporary
--- Comment #1 from igodard at pacbell dot net 2006-03-13 20:19 --- Created an attachment (id=11045) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11045&action=view) compiler output -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26669
[Bug c++/26669] New: lost temporary
Compiling and running the attached code produces: ~/ootbc/common/test$ build/wideIntTest 0x12345678:0x8000:0x01234567:0x81234567:0x8000bc50 The first four hex outputs show the expression done one subexpression at a time, and produce the correct result. The last hex output shows the result of the same expression evaluated as a single expression; the result is wrong and the value contains what looks like an address. I guess that a temporary in the expression evaluation got overwritten. Ivan -- Summary: lost temporary Product: gcc Version: 4.0.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: igodard at pacbell dot net http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26669
[Bug target/26662] optimization on hashtable iterator produces bad code
--- Comment #5 from fang at csl dot cornell dot edu 2006-03-13 20:51 --- Filed to Apple. For the record, this is Bug #4476031 on the Radar, for those who might have access. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26662
[Bug rtl-optimization/25569] [4.2 Regression] FAIL: gfortran.dg/g77/20010610.f -O3 -fomit-frame-pointer -funroll-loops
--- Comment #7 from law at redhat dot com 2006-03-13 21:17 --- Subject: Re: [4.2 Regression] FAIL: gfortran.dg/g77/20010610.f -O3 -fomit-frame-pointer -funroll-loops On Wed, 2006-03-08 at 00:07 +, janis at gcc dot gnu dot org wrote: --- Comment #5 from janis at gcc dot gnu dot org 2006-03-08 00:07 --- A regression hunt on powerpc64-linux using the C test case from comment #4 identified this patch: http://gcc.gnu.org/viewcvs?view=rev&rev=110705 r110705 | law | 2006-02-07 18:31:27 + (Tue, 07 Feb 2006) That date doesn't match up with when the Fortran test started failing so I ran another regression hunt using it, which identified this patch: http://gcc.gnu.org/viewcvs?view=rev&rev=108425 r108425 | law | 2005-12-12 19:59:16 + (Mon, 12 Dec 2005) What I suspect is going on here is that we've got a latent bug in the RTL IV code. The patches referenced above merely expose instances of that underlying latent bug. I'm still trying to learn my way around the RTL IV code; if I can't figure it out pretty quickly I'll have to hand this off to Zdenek. jeff -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25569
[Bug libstdc++/19664] libstdc++ headers should have pop/push of the visibility around the declarations
--- Comment #90 from pluto at agmk dot net 2006-03-13 20:17 --- With the 4.1.1 snapshot + patches I get an arts crash today on my x86_64 box.

Starting program: /usr/bin/artsd
Program received signal SIGSEGV, Segmentation fault.
0x2af641d8 in __gnu_cxx::__mt_alloc<Arts::Notification*, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> >::allocate () from /usr/lib64/libartsflow.so.1
(gdb) bt
#0 0x2af641d8 in __gnu_cxx::__mt_alloc<Arts::Notification*, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> >::allocate () from /usr/lib64/libartsflow.so.1
#1 0x2ba4a908 in std::_Deque_base<Arts::Notification, std::allocator<Arts::Notification> >::_M_initialize_map () from /usr/lib64/libmcop.so.1
#2 0x2ba49145 in Arts::NotificationManager::NotificationManager () from /usr/lib64/libmcop.so.1
#3 0x2ba2660a in Arts::Dispatcher::Dispatcher () from /usr/lib64/libmcop.so.1
#4 0x0041c227 in main (argc=1, argv=0x7fe818f8) at artsd.cc:275

arts works fine with the visibility feature disabled.

$ gcc -v
Reading specs from /usr/lib64/gcc/x86_64-pld-linux/4.1.1/specs
Target: x86_64-pld-linux
Configured with: ../configure --prefix=/usr --with-local-prefix=/usr/local --libdir=/usr/lib64 --libexecdir=/usr/lib64 --infodir=/usr/share/info --mandir=/usr/share/man --x-libraries=/usr/lib64 --enable-shared --enable-threads=posix --enable-languages=c,c++,fortran,objc,obj-c++,ada,java --enable-c99 --enable-long-long --enable-multilib --enable-nls --disable-werror --with-gnu-as --with-gnu-ld --with-demangler-in-ld --with-system-zlib --with-slibdir=/lib64 --without-system-libunwind --enable-cmath --with-long-double-128 --with-gxx-include-dir=/usr/include/c++/4.1.1 --disable-libstdcxx-pch --enable-__cxa_atexit --enable-libstdcxx-allocator=mt --with-qt4dir=/usr/lib64/qt4 --disable-libjava-multilib --enable-libgcj --enable-libgcj-multifile --enable-libgcj-database --enable-gtk-cairo --enable-java-awt=qt,gtk,xlib --enable-jni --enable-xmlj --enable-alsa --enable-dssi x86_64-pld-linux
Thread model: posix
gcc version 4.1.1 20060308 (prerelease) (PLD-Linux)

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19664
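The bug title refers to bracketing library declarations with visibility push/pop so that a user's -fvisibility=hidden does not silently apply to them. A minimal sketch of that bracketing follows, using the GCC visibility pragmas; the namespace and function names are hypothetical, and this is neither libstdc++ code nor a diagnosis of the arts crash above.

```cpp
#include <cassert>

// Everything declared between push and pop gets default (exported) visibility,
// regardless of the -fvisibility= option used for the including translation unit.
#pragma GCC visibility push(default)
namespace demo {              // hypothetical library namespace
int *shared_counter();        // one shared object for all users of the library
}
#pragma GCC visibility pop

int *demo::shared_counter() {
    static int counter = 0;
    return &counter;
}

// Hypothetical client code, which could itself be built with hidden visibility.
int bump() {
    int *c = demo::shared_counter();
    assert(c != nullptr);
    return ++*c;
}
```

Calling `bump()` twice returns 1 and then 2, since both calls reach the same counter. The idea is that symbols meant to be shared across shared objects stay exported even when client code is compiled with hidden visibility by default.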
[Bug c++/26669] lost temporary
--- Comment #2 from igodard at pacbell dot net 2006-03-13 20:20 --- Created an attachment (id=11046) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11046&action=view) source code (compressed) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26669
[Bug target/26662] optimization on hashtable iterator produces bad code
--- Comment #4 from fang at csl dot cornell dot edu 2006-03-13 20:21 --- (In reply to comment #3) You should report this to apple, because as 4.0 and 4.1 are not affected and both the 3.3 and the 3.4 branch are now closed, this PR will be just closed as fixed. Ok, I accept, as expected. But can I get an opinion on the validity of my code before I report this to them, especially regarding my use of hash_map::iterator? (perhaps a confirmation of reproducibility too, at least on powerpc-apple-darwin7-g++-3.3, from someone who has it?) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26662
[Bug rtl-optimization/25569] [4.2 Regression] FAIL: gfortran.dg/g77/20010610.f -O3 -fomit-frame-pointer -funroll-loops
--- Comment #8 from dave at hiauly1 dot hia dot nrc dot ca 2006-03-13 21:30 --- Subject: Re: [4.2 Regression] FAIL: gfortran.dg/g77/20010610.f -O3 -fomit-frame-pointer -funroll-loops r108425 | law | 2005-12-12 19:59:16 + (Mon, 12 Dec 2005) What I suspect is going on here is that we've got a latent bug in the RTL IV code. The patches referenced above merely expose instances of that underlying latent bug. We have the following IV bug still open on the PA: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26244 Dave -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25569