------- Comment #1 from rguenth at gcc dot gnu dot org 2007-11-08 21:45 -------
Confirmed.

Also, on 64bit x86_64 we don't see that this computes the modulus, but do

foobar:
.LFB2:
        movl    $1000000000, %esi
        movq    %rdi, %rax
        xorl    %edx, %edx
        divq    %rsi
        imulq   $-1000000000, %rax, %rax
        addq    %rdi, %rax
        ret

for

unsigned long long foobar(unsigned long long ns)
{
  return ns % 1000000000L;
}

we produce instead

foobar2:
.LFB3:
        movl    $1000000000, %edx
        movq    %rdi, %rax
        movq    %rdx, %rcx
        xorl    %edx, %edx
        divq    %rcx
        movq    %rdx, %rax
        ret

which is smaller and faster.  Likewise the 32bit variant:

foobar2:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        pushl   $0
        pushl   $1000000000
        pushl   12(%ebp)
        pushl   8(%ebp)
        call    __umoddi3
        addl    $16, %esp
        leave
        ret

which would make this argument moot (ok, only by cheating ;)).

The problem is supposedly that we don't fold

(chrec_apply
  (varying_loop = 1)
  (chrec = {ns_2(D), +, 0x0ffffffffc4653600}_1)
  (x = ns_2(D) /[fl] 1000000000)
  (res = ns_2(D) + (ns_2(D) /[fl] 1000000000) * 0x0ffffffffc4653600))

which is ns_2 - (ns_2 / 1000000000) * 1000000000.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
  BugsThisDependsOn|                            |32044
             Status|UNCONFIRMED                 |NEW
          Component|c                           |tree-optimization
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2007-11-08 21:45:30
               date|                            |
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34027