[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-29 Thread tg at mirbsd dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Thorsten Glaser tg at mirbsd dot org changed: What|Removed |Added CC||tg at mirbsd dot

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #1 from Julian Andres Klode j...@jak-linux.org 2010-10-26 14:30:24 UTC --- Created attachment 22162 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=22162 Clang's assembler. Attaching the assembler output from clang, it should help

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Paolo Carlini paolo.carlini at oracle dot com changed: What|Removed |Added CC|

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #3 from Julian Andres Klode j...@jak-linux.org 2010-10-26 14:32:27 UTC --- System information: Using built-in specs. Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-5'

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #4 from Jonathan Wakely redi at gcc dot gnu.org 2010-10-26 14:47:12 UTC --- GCC's output is significantly faster at -O3 or without the noinline attribute

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #5 from Julian Andres Klode j...@jak-linux.org 2010-10-26 14:53:24 UTC --- (In reply to comment #4) "GCC's output is significantly faster at -O3 or without the noinline attribute" I just tested and at -O3, gcc-4.4 creates slow code

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #6 from Dominique d'Humieres dominiq at lps dot ens.fr 2010-10-26 14:59:18 UTC --- You get this kind of speedup if the compiler knows that the result of the loop is sum=(b*(b-1)-a*(a-1))/2 In which case the timing is meaningless (it
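
The closed form quoted in comment #6 can be checked against a plain loop. The sketch below assumes the loop under discussion adds every integer from a up to, but not including, b; the original test case is not reproduced in this listing, so that shape and the names `sum_loop`, `sum_closed`, and `sums_agree` are hypothetical. The comparison is only meant for small inputs, where b*(b-1) does not wrap.

```c
#include <assert.h>

/* Assumed shape of the loop: adds a, a+1, ..., b-1. */
unsigned long sum_loop(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}

/* Closed form from comment #6. Requires a <= b, and is only
   reliable while b*(b-1) fits in an unsigned long. */
unsigned long sum_closed(unsigned long a, unsigned long b)
{
    assert(a <= b);
    return (b * (b - 1) - a * (a - 1)) / 2;
}

/* Returns 1 if both versions agree on every pair 0 <= a <= b < limit. */
int sums_agree(unsigned long limit)
{
    for (unsigned long b = 0; b < limit; b++)
        for (unsigned long a = 0; a <= b; a++)
            if (sum_loop(a, b) != sum_closed(a, b))
                return 0;
    return 1;
}
```

The loop costs time proportional to b - a, while the closed form is a handful of instructions, which is why a benchmark of such a function says little about general code quality.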

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #7 from Julian Andres Klode j...@jak-linux.org 2010-10-26 15:00:37 UTC --- (In reply to comment #5) (In reply to comment #4) GCC's output is significantly faster at -O3 or without the noinline attribute I just tested and at

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread j...@jak-linux.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #8 from Julian Andres Klode j...@jak-linux.org 2010-10-26 15:25:56 UTC --- (In reply to comment #6) You get this kind of speedup if the compiler knows that the result of the loop is sum=(b*(b-1)-a*(a-1))/2 In which case the

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #9 from Jonathan Wakely redi at gcc dot gnu.org 2010-10-26 15:28:51 UTC --- (In reply to comment #8) Since the optimization seems to be mostly there in -O3, it's just a matter of enabling it in -O2. Or if you want all

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #11 from Paolo Carlini paolo.carlini at oracle dot com 2010-10-26 15:42:58 UTC --- Can we please stop talking about nano and giga numbers like kids? If an optimization like complete loop unrolling is involved of course very small or

Re: [Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread Andrew Pinski
On Oct 26, 2010, at 7:30 AM, j...@jak-linux.org gcc-bugzi...@gcc.gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #1 from Julian Andres Klode j...@jak-linux.org 2010-10-26 14:30:24 UTC --- Created attachment 22162 --

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread pinskia at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #12 from pinskia at gmail dot com pinskia at gmail dot com 2010-10-26 15:56:20 UTC --- On Oct 26, 2010, at 7:30 AM, j...@jak-linux.org gcc-bugzi...@gcc.gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 ---

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #13 from Dominique d'Humieres dominiq at lps dot ens.fr 2010-10-26 16:36:05 UTC --- "This multiplication transformation is incorrect if the loop wraps (unsigned always wraps; never overflows)." I think this is wrong: wrapping is

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #15 from Dominique d'Humieres dominiq at lps dot ens.fr 2010-10-26 17:15:31 UTC --- "For sum += 2 or sum += b sccp handles this, so I wonder whether it couldn't handle even the sum += a case." 2 and b are constants while a is not.
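
To illustrate the distinction drawn in comment #15: when the addend is loop-invariant, the final value is simply the addend times the trip count, with no division involved. This is a minimal sketch, assuming a loop that runs a counter from a up to b (exclusive); the function names are hypothetical and nothing here is taken from GCC's sccp pass.

```c
/* Adds the loop-invariant value b once per iteration. */
unsigned long sum_b_loop(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (unsigned long i = a; i < b; i++)
        sum += b;
    return sum;
}

/* Final value: addend times trip count. Unsigned multiplication
   wraps the same way the repeated additions do, so this matches
   the loop even when the sum exceeds the type's range. */
unsigned long sum_b_closed(unsigned long a, unsigned long b)
{
    return a < b ? b * (b - a) : 0;
}
```

When the addend is the loop counter itself, the sum is quadratic in the trip count and its closed form needs a division by 2, which is where wrap-around requires extra care.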

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #16 from Jakub Jelinek jakub at gcc dot gnu.org 2010-10-26 18:43:40 UTC --- chrec_apply is called with {a_4(D), +, {a_4(D) + 1, +, 1}_1}_1 chrec and ~a_4(D) + b_5(D) in x. I wonder if this can be fixed just by recognizing such special
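
The chrec in comment #16 can be read as a polynomial in the iteration number. As a sketch of that reading (an illustration of the general chain-of-recurrences idea, not GCC's `chrec_apply` code): a chrec {c0, +, {c1, +, c2}} takes the value c0 + c1*C(x,1) + c2*C(x,2) at iteration x, and ~a + b equals b - a - 1 in unsigned arithmetic. The function names are hypothetical, and the loop shape (summing a, a+1, ..., b-1) is an assumption, since the test case is not shown in this listing.

```c
#include <assert.h>

/* Value at iteration x of the chrec {c0, +, {c1, +, c2}}:
   c0 + c1*C(x,1) + c2*C(x,2). Intended for small inputs only;
   no care is taken about wrap-around in x*(x-1). */
unsigned long chrec2_eval(unsigned long c0, unsigned long c1,
                          unsigned long c2, unsigned long x)
{
    unsigned long binom2 = x * (x - 1) / 2;  /* C(x,2) */
    return c0 + c1 * x + c2 * binom2;
}

/* Applies {a, +, {a + 1, +, 1}} at x = ~a + b, i.e. b - a - 1. */
unsigned long sum_via_chrec(unsigned long a, unsigned long b)
{
    assert(a < b);
    unsigned long x = ~a + b;
    return chrec2_eval(a, a + 1, 1, x);
}

/* Assumed loop for comparison: adds a, a+1, ..., b-1. */
unsigned long sum_loop(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}
```

For a = 3 and b = 10, x is 6 and the chrec evaluates to 3 + 4*6 + 15 = 42, the same as 3 + 4 + ... + 9.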

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #17 from Dominique d'Humieres dominiq at lps dot ens.fr 2010-10-26 18:53:49 UTC --- Note that clang seems to know the general result: \sum_{i=a}^b p(i)=P(b), where p(i) is a given polynomial of degree n and P(x) a polynomial of degree

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org 2010-10-26 19:11:59 UTC --- I guess you mean LLVM instead of clang, I'm pretty sure the FE doesn't perform this optimization. Anyway, given: #define F(n, exp) \ unsigned long

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread joseph at codesourcery dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #19 from joseph at codesourcery dot com joseph at codesourcery dot com 2010-10-26 20:29:56 UTC --- On Tue, 26 Oct 2010, dominiq at lps dot ens.fr wrote: --- Comment #13 from Dominique d'Humieres dominiq at lps dot ens.fr

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #20 from Jakub Jelinek jakub at gcc dot gnu.org 2010-10-26 21:00:11 UTC --- If I translate the assembly back to C, it seems it is performing part of the arithmetic in TImode: unsigned long f (unsigned long a, unsigned long b) { if
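
TImode is GCC's 128-bit integer mode. One way a closed form can stay exact under unsigned wrap-around is to form the product in 128 bits before halving and only then truncate. The sketch below illustrates that idea under stated assumptions; it is not a reconstruction of the truncated code in comment #20. It relies on the `unsigned __int128` extension that GCC and Clang provide on 64-bit targets, and the function names are hypothetical.

```c
#include <stdint.h>

/* Reference: wrapping sum of a, a+1, ..., b-1. */
uint64_t sum_loop64(uint64_t a, uint64_t b)
{
    uint64_t sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}

/* Closed form n*(first + last)/2 with n = b - a. The product is
   taken in 128 bits so the halving still sees bit 64; only the
   low 64 bits of the result are kept. */
uint64_t sum_wide(uint64_t a, uint64_t b)
{
    if (a >= b)
        return 0;
    unsigned __int128 n = b - a;
    unsigned __int128 ends = (unsigned __int128)a + b - 1;
    return (uint64_t)(n * ends / 2);
}

/* Same formula in 64 bits only, for contrast: bit 64 of the
   product is lost before the division, so bit 63 of the result
   can be wrong. */
uint64_t sum_narrow(uint64_t a, uint64_t b)
{
    if (a >= b)
        return 0;
    return (b - a) * (a + b - 1) / 2;
}
```

With a = 2^62 and b = a + 3, the loop and `sum_wide` both return 3*2^62 + 3, while `sum_narrow` returns 2^62 + 3.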

[Bug c/46186] Clang creates code running 1600 times faster than gcc's

2010-10-26 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46186 --- Comment #21 from Dominique d'Humieres dominiq at lps dot ens.fr 2010-10-26 21:06:48 UTC --- "I guess you mean LLVM instead of clang," Yes, if you prefer. I was referring to the command I used. F (6, a * a * a * a * a + 2 * a * a * a + 5 *