[Bug driver/114658] branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658

felix-gcc at fefe dot de changed:

           What    |Removed     |Added
  ------------------------------------------
           Status  |UNCONFIRMED |RESOLVED
       Resolution  |---         |INVALID

--- Comment #3 from felix-gcc at fefe dot de ---
ok it looks like it was my fault (surprise) and I fixed it. Here's what I did:

  $ git checkout master
  Switched to branch 'master'
  Your branch is up to date with 'origin/master'.
  $ git branch -D releases/gcc-13
  Deleted branch releases/gcc-13 (was 32fb04adae9).
  $ git checkout releases/gcc-13
  Updating files: 100% (40334/40334), done.
  branch 'releases/gcc-13' set up to track 'origin/releases/gcc-13'.
  Switched to a new branch 'releases/gcc-13'
  $ cat gcc/BASE-VER
  13.2.1

Sorry again for the noise. Hope this helps the next git noob :)
[Bug driver/114658] branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658

--- Comment #2 from felix-gcc at fefe dot de ---
I'm probably doing something really stupid wrong, sorry for the noise. Here's what I'm doing:

  $ git checkout releases/gcc-13
  Switched to branch 'releases/gcc-13'
  $ git branch
    master
  * releases/gcc-13
  $ cat gcc/BASE-VER
  14.0.1
[Bug driver/114658] New: branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658

Bug ID: 114658
Summary: branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: driver
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

Not sure how and where to file this bug, sorry. I'm trying to build the current stable release branch, i.e. 13.2 with bug fixes, from git. So I do

  git checkout releases/gcc-13

and build gcc, but the result doesn't say it is gcc 13.2.1; it says it's gcc 14.0.1 (experimental). Shouldn't this branch contain the non-experimental version?
[Bug other/107614] build goes through but make install fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107614

felix-gcc at fefe dot de changed:

           What    |Removed     |Added
  ------------------------------------------
           Status  |UNCONFIRMED |RESOLVED
       Resolution  |---         |INVALID

--- Comment #1 from felix-gcc at fefe dot de ---
oops, sorry, my build script was at fault.
[Bug other/107614] New: build goes through but make install fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107614

Bug ID: 107614
Summary: build goes through but make install fails
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

I'm trying to build the current gcc git and install it to /opt/gcc so it doesn't clash with the system gcc. This is on x86_64 Linux. The build goes through, but make install fails in x86_64-pc-linux-gnu/libsanitizer/lsan:

  /usr/bin/mkdir -p '/tmp/fefix/usr/lib64/../lib64'
  /opt/diet/bin/install -c -m 644 liblsan_preinit.o '/tmp/fefix/usr/lib64/../lib64'
  /usr/bin/mkdir -p '/tmp/fefix/usr/lib64/../lib64'
  /bin/sh ../libtool --mode=install /opt/diet/bin/install -c liblsan.la '/tmp/fefix/usr/lib64/../lib64'
  libtool: install: error: cannot install `liblsan.la' to a directory not ending in /opt/gcc/lib64/../lib64

/tmp/fefix is my $DESTDIR for this make install. gcc's make install is trying to install liblsan to /usr/lib64, but libtool refuses because that's not under /opt/gcc/lib64, where the rest of gcc goes.
[Bug ipa/105728] dead store to static var not optimized out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105728

--- Comment #4 from felix-gcc at fefe dot de ---
If you do have a printf that references debug_cnt, it wouldn't be removed, right? If you expect unreferenced variables not to be optimized out, you can always compile without the optimizer. For local variables even that doesn't help with clang already. OTOH we do have the attributes "used" and "unused"; they could be extended to variables.
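The attribute idea floated in this comment can be partly sketched today: gcc already accepts `__attribute__((used))` on a variable with static storage, which forces the object to be emitted even when nothing references it. Whether it should also pin the stores (the dead-store case this bug is about) is the open part. Variable names below are made up for illustration:

```c
// Sketch, assuming hypothetical names: gcc's existing "used" variable
// attribute forces emission of an otherwise-unreferenced static object;
// without it, a write-only static counter is a dead-store candidate.
static int debug_cnt_kept __attribute__((used));
static int debug_cnt_plain;  /* nothing else reads this: fair game at -O2 */

int bump(void)
{
    debug_cnt_kept++;   /* variable must be emitted because of "used" */
    debug_cnt_plain++;
    return 0;
}
```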
[Bug c/105728] New: dead store to static var not optimized out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105728

Bug ID: 105728
Summary: dead store to static var not optimized out
Product: gcc
Version: 11.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

Consider this piece of test code:

  int dummy1() {
    static int removeme = 0;
    if (removeme) { return 0; }
    removeme = 1;
    return 0;
  }

  int dummy2() {
    static int removeme = 0;
    if (!removeme) removeme = 1;
    return 0;
  }

  int dummy3() {
    static int removeme = 0;
    removeme = 1;
    return 0;
  }

To me, all of these do the same thing and should generate the same code. As nobody else can see removeme, and we aren't leaking its address, shouldn't the compiler be able to deduce that all accesses to removeme are inconsequential and can be removed? My gcc 11.3 generates a condition, a store, and a return 0 for dummy1, the same for dummy2, but for dummy3 it understands that it only needs to emit a return 0.
[Bug analyzer/100294] New: need attribute takes_ownership
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100294

Bug ID: 100294
Summary: need attribute takes_ownership
Product: gcc
Version: 11.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: analyzer
Assignee: dmalcolm at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

Now that -fanalyzer is here to track leaking memory, I need a way to tell gcc that a function takes ownership of a pointer. You could call it takes_ownership or maybe free.

Here's my setup: I have a function that does I/O batching for you. You have a batch as a context variable, then you add buffers to it, then you write the whole batch to a descriptor (or callback). The idea is that the descriptor can point to a non-blocking socket and the abstraction takes care of repeatedly writing the next bit in the vector after a partial write.

Anyway, I have a function that adds a buffer to the batch, and I have a function that adds a buffer to the batch plus the batch takes ownership of the pointer, i.e. when you are done with the batch and close it, all those pointers will be freed.

-fanalyzer now (rightly) complains that I'm leaking the memory of the pointer I gave to the function that takes ownership. I need a way to either say "takes ownership" or maybe, even better, a way to say how the free will happen, so the malloc+free matching in gcc 11 can apply.
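The "say how the free will happen" half of this request exists on the allocation side: GCC 11's `malloc` attribute accepts a deallocator argument, which tells -fanalyzer which function releases the returned pointer. It does not express "this parameter takes ownership" for a consumer like the batch-add function described above. A hedged sketch with made-up names (`batch_buffer_new`/`batch_buffer_free`), guarded so it degrades to a plain declaration on compilers without the attribute form:

```c
#include <stdlib.h>

/* Hypothetical release function for this report's batch API. */
void batch_buffer_free(char *p)
{
    free(p);
}

/* On GCC >= 11, pair the allocator with its deallocator so -fanalyzer
   can match allocations against the right free function. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
__attribute__((malloc, malloc (batch_buffer_free)))
#endif
char *batch_buffer_new(size_t n)
{
    return malloc(n);
}
```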
[Bug c/98460] New: _builtin_cpu_supports("sha") missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98460

Bug ID: 98460
Summary: _builtin_cpu_supports("sha") missing
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

gcc offers a useful builtin around the cpuid instruction on x86 and x86_64, which can be used to check for specific instruction set extensions, e.g.

  if (__builtin_cpu_supports("avx2"))

I need to check for the SHA-NI extension, which does not appear to be supported. However, checking for AES-NI is supported with the string "aes".
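Until the builtin grows a "sha" string, the bit can be read directly: SHA extensions are reported in cpuid leaf 7, subleaf 0, EBX bit 29, and gcc ships `<cpuid.h>` with `__get_cpuid_count` on x86 targets. A sketch (the function name is made up; the non-x86 branch just reports "no"):

```c
/* Fallback check for SHA-NI without __builtin_cpu_supports("sha"):
 * CPUID.(EAX=07H,ECX=0):EBX bit 29 per the x86 CPUID documentation. */
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

int have_sha_ni(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 7 not available */
    return (ebx >> 29) & 1;     /* SHA bit */
}
#else
int have_sha_ni(void) { return 0; }  /* SHA-NI is x86-only */
#endif
```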
[Bug analyzer/95000] -fanalyzer confused by switch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95000

--- Comment #2 from felix-gcc at fefe dot de ---
The false positive also happens if you fix that. In fact, my original (much longer) code does not try to write to read-only memory. I put that in my test case in the hope that somebody would mention it, so I can point out that gcc -fanalyzer could warn about it, but doesn't. So thank you for falling for my trap. :-)
[Bug analyzer/95000] New: -fanalyzer confused by switch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95000

Bug ID: 95000
Summary: -fanalyzer confused by switch
Product: gcc
Version: 10.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: analyzer
Assignee: dmalcolm at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

Consider this contrived test code:

  void proof(char* x) {
    char* y=0;
    switch (*x) {
    case 'a':
      y="foo";
    case 'b':
      if (*x=='a') *y='b';
    }
  }

-fanalyzer will warn about the *y='b' statement, that y might be NULL here. However, if *x=='a' then we got here via the case 'a' case, which initialized it. Other than this minor false positive issue, thank you for -fanalyzer! It has already found a few bugs for me!
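A hedged restructuring sketch of the reporter's example: carrying the correlation between `*x` and `y` in an explicit flag gives the analyzer one condition to track instead of re-deriving which switch arm was taken, which may avoid the false positive (whether a given -fanalyzer version still warns is not verified here). The write also goes into a writable buffer rather than a string literal; `proof2` and `buf` are made-up names:

```c
/* Writable target instead of a string literal (writing into "foo"
 * itself would be undefined behavior). */
char buf[] = "foo";

void proof2(char *x)
{
    char *y = 0;
    int have_y = 0;           /* explicit "y is initialized" flag */
    switch (*x) {
    case 'a':
        y = buf;
        have_y = 1;
        /* fall through */
    case 'b':
        if (have_y)
            *y = 'b';
        break;
    }
}
```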
[Bug c/94444] __attribute__((access(...))) ignored for memcpy when compiling with -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94444

--- Comment #4 from felix-gcc at fefe dot de ---
Sure, here's a test case:

  #include <stddef.h>

  __attribute__((access(read_only,2,3), access(write_only,1,3)))
  extern void* memcpy(void* dest, const void* src, size_t len);

  int main() {
    char buf[10];
    memcpy(buf,"fnordfnord",11); // should reject or at least warn
  }

  $ gcc -c t.c
  t.c: In function ‘main’:
  t.c:8:3: warning: ‘memcpy’ writing 11 bytes into a region of size 10 overflows the destination [-Wstringop-overflow=]
      8 |   memcpy(buf,"fnordfnord",11); // should reject or at least warn
        |   ^~~
  t.c:4:14: note: in a call to function ‘memcpy’ declared with attribute ‘write_only (1, 3)’
      4 | extern void* memcpy(void* dest, const void* src, size_t len);
        |              ^~
  $ gcc -c t.c -Os
  $ gcc -v
  [...]
  gcc version 10.0.1 20200401 (experimental) (GCC)
[Bug c/94444] New: __attribute__((access(...))) ignored for memcpy when compiling with -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94444

Bug ID: 94444
Summary: __attribute__((access(...))) ignored for memcpy when compiling with -Os
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

I read about the access attribute and proceeded to add the annotations to the string.h of my little libc. It works. But it stops working when I compile with -Os. I suspect it's because gcc is switching to __builtin_memcpy then?
[Bug c++/93703] global const getting lost in g++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93703

--- Comment #2 from felix-gcc at fefe dot de ---
OK, that answers half of the mystery, but why is foo not mangled?
[Bug c++/93703] New: global const getting lost in g++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93703

Bug ID: 93703
Summary: global const getting lost in g++
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

I suspect this is my mistake, not gcc's, but I'm out of ideas, so sorry for wasting your time if that is true. Here is my situation:

  $ cat a.cc
  const int foo = 23;
  $ cat b.cc
  extern const int foo;
  int main() { return foo; }
  $ g++ -o x a.cc b.cc
  /usr/lib64/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: /tmp/ccXS1J5J.o: in function `main':
  b.cc:(.text+0x6): undefined reference to `foo'
  collect2: error: ld returned 1 exit status
  $

Wait, what?

  $ g++ -c a.cc b.cc && nm a.o b.o
  a.o:
  r _ZL3foo
  b.o:
  U foo
  T main
  $

What the...? Why is foo mangled in a.o but not in b.o? I tried this with clang++ too, and there it's even worse: a.o is empty and does not even export a mangled foo. What am I missing?
[Bug rtl-optimization/93328] New: missed optimization opportunity in deserialization code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93328

Bug ID: 93328
Summary: missed optimization opportunity in deserialization code
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

Deserialization code often deals with endianness and alignment. However, in some cases the protocol endianness is the same as the host endianness, and your platform does not care about alignment or has an unaligned load instruction. Take this code, for example:

  unsigned int foo(const unsigned char* c) {
    return c[0] + c[1]*0x100 + c[2]*0x10000 + c[3]*0x1000000;
  }

On i386 or x86_64, this could just be compiled into a single load. In fact, clang does compile this into a single load. gcc, however, turns it into four loads and three shifts. For some use cases this optimization could be a huge improvement.

In fact, even if the endianness does not match, this could be a huge improvement: the compiler could turn it into a load + bswap. Indeed, clang does compile the big-endian version into a load + bswap.
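The usual way to get the single load portably is to express the byte assembly through `memcpy`, which gcc and clang both collapse into one (possibly unaligned) 32-bit load on targets that allow it, plus a bswap where the byte order differs. A sketch with a made-up helper name:

```c
#include <string.h>
#include <stdint.h>

/* Little-endian 32-bit load: the memcpy is folded into a single mov on
 * x86; on a big-endian host the explicit bswap recovers the LE value,
 * mirroring the load+bswap the report mentions. */
uint32_t load_le32(const unsigned char *c)
{
    uint32_t v;
    memcpy(&v, c, sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```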
[Bug bootstrap/80656] mips64-linux cross build fails: Link tests are not allowed after GCC_NO_EXECUTABLES
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80656

--- Comment #1 from felix-gcc at fefe dot de ---
Turns out my libc was installed incorrectly. Retrying now. I'm still getting this build error in libgomp and libstdc++.
[Bug bootstrap/80656] New: mips64-linux cross build fails: Link tests are not allowed after GCC_NO_EXECUTABLES
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80656

Bug ID: 80656
Summary: mips64-linux cross build fails: Link tests are not allowed after GCC_NO_EXECUTABLES
Product: gcc
Version: 7.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: bootstrap
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

The build fails at least in libquadmath or libssp.

  checking whether the /tmp/build/./gcc/xgcc -B/tmp/build/./gcc/ -B/opt/cross/mips64-linux/bin/ -B/opt/cross/mips64-linux/lib/ -isystem /opt/cross/mips64-linux/include -isystem /opt/cross/mips64-linux/sys-include linker (/tmp/build/./gcc/collect-ld) supports shared libraries... yes
  checking dynamic linker characteristics... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
  make[1]: *** [Makefile:11673: configure-target-libssp] Error 1
  make[1]: Leaving directory '/tmp/build'
  make: *** [Makefile:894: all] Error 2
[Bug c/79459] New: Please add enable_if and diagnose_if attributes (from clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79459

Bug ID: 79459
Summary: Please add enable_if and diagnose_if attributes (from clang)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

clang supports several advanced __attribute__ cases that gcc does not, and two strike me as particularly useful: enable_if and diagnose_if. You can find the documentation on them here:

  https://clang.llvm.org/docs/AttributeReference.html#diagnose-if
  https://clang.llvm.org/docs/AttributeReference.html#enable-if

Basically, diagnose_if lets you add custom warning messages; think of it as a superset of the nonnull attribute. This could be a great tool to improve code quality if applied to a library's API.

enable_if is much more complex and probably a lot harder to implement. It lets you have a special-case version of a library function and use the attribute to tell the compiler to call it when the compiler knows the special case holds. For example, one could have a special memset version that does not need to do alignment handling, used when the compiler can tell that the destination buffer is 16-byte aligned.
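A partial gcc-side approximation of diagnose_if already exists: the `error` and `warning` function attributes fire at compile time when a call to the marked function survives constant folding, which is the mechanism glibc's `_FORTIFY_SOURCE` wrappers use to report provable misuse. A hedged sketch with made-up names (`my_memset`, `my_memset_overflow`):

```c
#include <string.h>

/* Never defined; calling it is a compile-time error via the attribute.
 * The diagnostic only triggers when the call survives folding, i.e.
 * when the overflow is provable at compile time. */
extern void my_memset_overflow(void)
    __attribute__((error("my_memset: write provably exceeds destination")));

static inline void *my_memset(void *dst, int c, size_t n, size_t dstsize)
{
    if (__builtin_constant_p(n) && n > dstsize)
        my_memset_overflow();   /* diagnose_if-style compile-time check */
    return memset(dst, c, n);
}
```

Unlike diagnose_if, this cannot diagnose conditions on non-constant arguments, which is part of why the report asks for the real attribute.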
[Bug c/69960] "initializer element is not constant"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69960

--- Comment #4 from felix-gcc at fefe dot de ---
So which part of it is not constant, would you say? It all looks constant to me. It only operates on constants. If 3+4 is constant, why should this not be constant?
[Bug c/69960] "initializer element is not constant"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69960

--- Comment #2 from felix-gcc at fefe dot de ---
uh, yes, in C.

  $ cat test.c
  #define TOLOWER(x) (x&~0x20)

  #define Word(s) \
    s[1] ? s[2] ? s[3] ? \
      (TOLOWER(s[0]) << 24) + (TOLOWER(s[1]) << 16) + (TOLOWER(s[2]) << 8) + TOLOWER(s[3]) : \
      (TOLOWER(s[0]) << 16) + (TOLOWER(s[1]) << 8) + TOLOWER(s[2]) : \
      (TOLOWER(s[0]) << 8) + TOLOWER(s[1]) : \
    TOLOWER(s[0])

  const unsigned int _the = Word("the");
  $ clang -c test.c
  $ clang --version
  clang version 3.9.0 (trunk 261746)
[Bug c/69960] New: "initializer element is not constant"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69960

Bug ID: 69960
Summary: "initializer element is not constant"
Product: gcc
Version: 5.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

This is the code:

  #define TOLOWER(x) (x&~0x20)

  #define Word(s) \
    s[1] ? s[2] ? s[3] ? \
      (TOLOWER(s[0]) << 24) + (TOLOWER(s[1]) << 16) + (TOLOWER(s[2]) << 8) + TOLOWER(s[3]) : \
      (TOLOWER(s[0]) << 16) + (TOLOWER(s[1]) << 8) + TOLOWER(s[2]) : \
      (TOLOWER(s[0]) << 8) + TOLOWER(s[1]) : \
    TOLOWER(s[0])

  const unsigned int _the = Word("the");

When compiling, this happens:

  test.c:9:32: error: initializer element is not constant
   const unsigned int _the = Word("the");
                                  ^
  test.c:3:3: note: in definition of macro ‘Word’
     s[1] ? s[2] ? s[3] ? \
     ^

How is this not constant? clang thinks it is constant.
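The sticking point is that indexing a string literal (`s[0]`) is not an arithmetic constant expression in standard C, so it cannot initialize an object with static storage duration; clang accepts it under the latitude C gives implementations to accept other forms of constant expressions. Character constants, by contrast, are arithmetic constant expressions everywhere, so spelling the word out character by character sidesteps the issue. `WORD3` is a made-up helper for the three-letter case of the report's macro:

```c
/* Hedged workaround sketch: character constants instead of string-literal
 * indexing, so the initializer is a strict-C constant expression. */
#define TOLOWER(x) ((x) & ~0x20)
#define WORD3(a,b,c) \
    ((TOLOWER(a) << 16) + (TOLOWER(b) << 8) + TOLOWER(c))

const unsigned int _the = WORD3('t', 'h', 'e');
```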
[Bug other/69280] New: Where did -fno-plt go?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69280

Bug ID: 69280
Summary: Where did -fno-plt go?
Product: gcc
Version: 5.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de
Target Milestone: ---

I was looking for a way to create statically linked PIE ELF binaries under Linux. Since these are statically linked and have no shared library, I am looking for a way to tell gcc to not use the GOT or PLT for symbol lookup: just use PC-relative addressing and assume all referenced symbols are local to this shared object.

When googling, I found a page in the official gcc documentation online mentioning -fno-plt:

  https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html

This appears to be from an older gcc version? My local gcc 5.3 does not recognize the option. Am I supposed to use -fvisibility now for this?

Thanks, Felix
[Bug libstdc++/57716] New: std::thread does not compile with vector<int> as argument
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57716

Bug ID: 57716
Summary: std::thread does not compile with vector<int> as argument
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: felix-gcc at fefe dot de

When trying out the std::thread support in g++ 4.8.1, I tried this test program:

  #include <thread>
  #include <vector>
  using namespace std;

  void thethread(vector<int>& b) {
    // do something
  }

  int main() {
    vector<int> x { 1,2,3 };
    thethread(x);             // works
    thread foo(thethread,x);  // compiler error
    foo.join();
  }

Calling the thread function directly works, but doing it via the thread initialization fails with this error message:

  In file included from /usr/include/c++/4.8.1/thread:39:0,
                   from t.cc:1:
  /usr/include/c++/4.8.1/functional: In instantiation of ‘struct std::_Bind_simple<void (*(std::vector<int>))(std::vector<int>&)>’:
  /usr/include/c++/4.8.1/thread:137:47:   required from ‘std::thread::thread(_Callable&&, _Args&& ...) [with _Callable = void (&)(std::vector<int>&); _Args = {std::vector<int, std::allocator<int> >&}]’
  t.cc:13:25:   required from here
  /usr/include/c++/4.8.1/functional:1697:61: error: no type named ‘type’ in ‘class std::result_of<void (*(std::vector<int>))(std::vector<int>&)>’
         typedef typename result_of<_Callable(_Args...)>::type result_type;
                                                               ^
  /usr/include/c++/4.8.1/functional:1727:9: error: no type named ‘type’ in ‘class std::result_of<void (*(std::vector<int>))(std::vector<int>&)>’
           _M_invoke(_Index_tuple<_Indices...>)
           ^

I also tried 4.8.0, same issue. A friend tried this with g++ 4.6.3 and it worked there, so it appears to be a regression. Or maybe Ubuntu fixed something in their branch of g++. I'm using stock gcc; the friend tried the gcc from Ubuntu 12.04.

Note that the issue goes away if I change the function to take a pointer instead of a reference to a vector<int>.
[Bug libstdc++/57716] std::thread does not compile with vector<int> as argument
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57716

felix-gcc at fefe dot de changed:

           What    |Removed     |Added
  ------------------------------------------
           Status  |UNCONFIRMED |RESOLVED
       Resolution  |---         |WONTFIX

--- Comment #1 from felix-gcc at fefe dot de ---
Never mind, this appears to be by design.
http://stackoverflow.com/questions/15235885/invalid-initialization-of-non-const-reference-with-c11-thread
[Bug rtl-optimization/56719] New: missed optimization: i > 0xffff || i*4 > 0xffff
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

Bug #: 56719
Summary: missed optimization: i > 0xffff || i*4 > 0xffff
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: felix-...@fefe.de

This is the test code:

  int foo(unsigned int i) {
    if (i > 0xffff || i*4 > 0xffff)
      baz();
  }

gcc -O2 generates a cmp, a shift, and another cmp. Why does this not generate a single cmp with 0x3fff?
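A quick check of the equivalence the report relies on: for 32-bit unsigned `i`, the two-clause test is the same predicate as `i > 0x3fff`. Wraparound in `i*4` cannot break this, because any `i` large enough to overflow the multiply already satisfies `i > 0xffff`; for `i <= 0xffff`, the multiply is exact and `i*4 > 0xffff` reduces to `i > 0x3fff`. A sketch with made-up helper names:

```c
#include <stdint.h>

/* The report's condition and the proposed single-compare form. */
static int orig_check(uint32_t i) { return i > 0xffff || i * 4 > 0xffff; }
static int single_cmp(uint32_t i) { return i > 0x3fff; }

/* 1 iff both forms agree for this input. */
int equivalent_at(uint32_t i) { return orig_check(i) == single_cmp(i); }
```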
[Bug middle-end/56719] missed optimization: i > 0xffff || i*4 > 0xffff
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

--- Comment #3 from felix-gcc at fefe dot de 2013-03-25 14:41:10 UTC ---
@comment 1: maybe it's me, but that does not make any sense. 3fff is wrong and the correct value is 3fff? Huh?

@comment 2: I extracted this code from a piece of commercial production software compiled with gcc. Not sure where you draw the line, but to me that makes it relevant :-)
[Bug middle-end/56719] missed optimization: i > 0xffff || i*4 > 0xffff
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

--- Comment #5 from felix-gcc at fefe dot de 2013-03-25 15:06:02 UTC ---
Yes. However, I'd hope that fixing this case would mean that gcc also catches the case where it is split into multiple if statements.

I think this statement came about because they had a range check and someone pointed out that checking foo*4 > 0xffff could be circumvented via an integer overflow if foo is untrusted and very large. There are smarter ways to do this, but it's not completely mind-bogglingly incomprehensible why this code would come about.

I have in fact been advocating for a while that programmers should rather spell out their security checks as plainly as possible and let the compiler optimize them and remove superfluous checks. See http://www.fefe.de/source-code-optimization.pdf if you are interested.
[Bug middle-end/56719] missed optimization: i > 0xffff || i*4 > 0xffff
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

--- Comment #7 from felix-gcc at fefe dot de 2013-03-25 16:01:14 UTC ---
I filed this bug because I was under the impression that gcc was already supposed to optimize this out as part of the value range optimizations. You probably know better than me whether the required effort would be disproportionate.

I'd still vote for supporting this case, because then I can go around and tell people to worry about writing readable code instead of worrying about code that the compiler will compile well.
[Bug rtl-optimization/56711] New: spectacularly bad code generated for __uint128_t
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56711

Bug #: 56711
Summary: spectacularly bad code generated for __uint128_t
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: felix-...@fefe.de

Consider this function:

  size_t scan_ulong(const char* src,unsigned long int* dest) {
    register const char *tmp=src;
    register unsigned long int l=0;
    register unsigned char c;
    while ((c=*tmp-'0')<10) {
      __uint128_t x=(__uint128_t)l*10+c;
      if ((unsigned long)x != x) break;
      l=(unsigned long)x;
      ++tmp;
    }
    if (tmp-src) *dest=l;
    return tmp-src;
  }

I'm compiling this with gcc -Os -c test.c on an amd64-linux box. The code gcc generates is 92 bytes long; the one from clang is only 65. What is happening here? What is all that code doing that gcc is generating there?
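The `__uint128_t` widening here only exists to detect overflow of `l*10+c`. A hedged alternative sketch using the overflow-checking builtins gcc grew later (GCC 5's `__builtin_umull_overflow`/`__builtin_uaddl_overflow`), which express the same check without a 128-bit temporary; `scan_ulong2` is a made-up name:

```c
#include <stddef.h>

/* Same contract as the report's scan_ulong: parse decimal digits into
 * *dest, stop on overflow or non-digit, return the number of digits. */
size_t scan_ulong2(const char *src, unsigned long *dest)
{
    const char *tmp = src;
    unsigned long l = 0;
    unsigned char c;
    while ((c = (unsigned char)(*tmp - '0')) < 10) {
        unsigned long next;
        /* overflow-checked l*10 + c */
        if (__builtin_umull_overflow(l, 10, &next) ||
            __builtin_uaddl_overflow(next, c, &next))
            break;
        l = next;
        ++tmp;
    }
    if (tmp - src) *dest = l;
    return (size_t)(tmp - src);
}
```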
[Bug libstdc++/55815] New: switch hash function of libstdc++ hash tables to siphash
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55815

Bug #: 55815
Summary: switch hash function of libstdc++ hash tables to siphash
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: felix-...@fefe.de

Hash functions traditionally used by language runtimes for hash tables do not assume that input values will be chosen maliciously to cause collisions and degrade performance. This has become a published attack vector on internet-facing hash tables as used in, for example, web services or even memory-cache code in front of a database.

libsupc++ implements the Murmur hash, which was specifically targeted in a recent paper attacking hash functions. See https://131002.net/siphash/ for the attack code that produces collisions in Murmur2 and Murmur3. libsupc++ should switch the hash function to siphash, the function proposed by the authors of this attack.

The same bug should be filed against other user-facing hash table implementations in gcc. I can think of Java and Go, but there might be others. It may even make sense to replace the hash code gcc itself uses, as there are now web pages where you can paste code and see which code gcc generates for it, turning this problem into a security issue if someone pastes code with colliding symbols to exploit this problem.
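For reference, the proposed replacement is compact: SipHash-2-4 keeps a keyed 256-bit state, runs two mixing rounds per 8-byte message word and four finalization rounds. A self-contained C sketch following the structure from the siphash page linked above (little-endian host assumed for the byte loads; this is an illustration, not libstdc++ code, and has not been checked against the published test vectors):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static uint64_t rotl64(uint64_t x, int b) { return (x << b) | (x >> (64 - b)); }

#define SIPROUND do {                                              \
    v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32);  \
    v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2;                       \
    v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0;                       \
    v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32);  \
} while (0)

uint64_t siphash24(const void *data, size_t len, uint64_t k0, uint64_t k1)
{
    /* key XORed into the "somepseudorandomlygeneratedbytes" constants */
    uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
    uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
    uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
    uint64_t v3 = k1 ^ 0x7465646279746573ULL;
    const unsigned char *p = data;
    size_t i;
    uint64_t m;

    for (i = 0; len - i >= 8; i += 8) {       /* full 8-byte words */
        memcpy(&m, p + i, 8);                 /* little-endian host assumed */
        v3 ^= m; SIPROUND; SIPROUND; v0 ^= m;
    }
    m = (uint64_t)len << 56;                  /* final word: tail + length */
    {
        unsigned char tail[8] = {0};
        uint64_t t;
        memcpy(tail, p + i, len - i);
        memcpy(&t, tail, 8);
        m |= t;
    }
    v3 ^= m; SIPROUND; SIPROUND; v0 ^= m;

    v2 ^= 0xff;                               /* finalization */
    SIPROUND; SIPROUND; SIPROUND; SIPROUND;
    return v0 ^ v1 ^ v2 ^ v3;
}
```

The key point for the DoS argument is the secret 128-bit key (`k0`, `k1`): a per-process random key makes collisions unpredictable to an attacker, which is exactly what a fixed Murmur seed cannot provide.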
[Bug bootstrap/53240] New: gcc 4.7 cross compiler build fails in libssp; link test not allowed after GCC_NO_EXECUTABLES
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53240

Bug #: 53240
Summary: gcc 4.7 cross compiler build fails in libssp; link test not allowed after GCC_NO_EXECUTABLES
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: bootstrap
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: felix-...@fefe.de

  checking whether -lc should be explicitly linked in... no
  checking dynamic linker characteristics... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
  make[1]: *** [configure-target-libssp] Error 1
  make[1]: Leaving directory `/home/leitner/cross-build/arm/obj/gcc3'
  make: *** [all] Error 2

I ran the original configure with

  --target=arm-linux-gnueabi --prefix=/opt/cross --with-sysroot=/opt/cross/arm-linux-gnueabi --enable-languages=c,c++

The eglibc documentation on how to get a cross toolchain recommends disabling SSP, but I need SSP! That's an integral security feature; it needs to work!
[Bug bootstrap/53240] gcc 4.7 cross compiler build fails in libssp; link test not allowed after GCC_NO_EXECUTABLES
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53240

--- Comment #2 from felix-gcc at fefe dot de 2012-05-04 22:30:45 UTC ---
I was talking about the second gcc. Turns out the steps until then broke something.
[Bug inline-asm/39590] New: inline asm %z on amd64 says ll instead of q
I am trying to write an inline asm statement that atomically adds a number to a memory variable. Here's what I came up with:

  #define atomic_add(mem,val) \
    asm volatile ("lock; add%z0 %1, %0" : "+m" (mem) : "ir" (val))

This appears to work fine on x86, but in 64-bit mode %z returns "ll" instead of "q" for 64-bit values like size_t, and thus the assembler complains like this:

  t.c:53: Error: no such instruction: `addll $3,x.5802(%rip)'

If I understand %z correctly, this is exactly what it is meant for... right? Please make it return "q" in this case.

-- 
Summary: inline asm %z on amd64 says ll instead of q
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39590
[Bug inline-asm/39590] inline asm %z on amd64 says ll instead of q
--- Comment #2 from felix-gcc at fefe dot de 2009-03-30 19:54 ---
Uh, I did. Use the macro like this:

  int foo=2;
  atomic_add(foo,3);

then try size_t as the type of foo and compile on x86_64.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39590
[Bug inline-asm/39590] inline asm %z on amd64 says ll instead of q
--- Comment #4 from felix-gcc at fefe dot de 2009-03-30 20:27 ---

  #include <stddef.h>

  #define atomic_add(mem,val) \
    asm volatile ("lock; add%z0 %1, %0" : "+m" (mem) : "ir" (val))

  int main() {
    size_t foo;
    atomic_add(foo,23);
  }

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39590
[Bug inline-asm/39590] inline asm %z on amd64 says ll instead of q
--- Comment #6 from felix-gcc at fefe dot de 2009-03-30 22:10 ---
> 'z' is for x87 insns.

Uh, what?! Let me quote the relevant documentation (gcc/config/i386/i386.md):

  ;; The special asm out single letter directives following a '%' are:
  ;; 'z' mov%z1 would be movl, movw, or movb depending on the mode of
  ;;     operands[1].

No mention of floating point.

> You have to check size of size_t and use proper suffix in your code.

No. The whole point of %z is that you can write asm statements in a way that does not specify the argument size explicitly in the statement.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39590
[Bug c/38688] New: __sync_lock_test_and_set does not actually lock
I was looking for something like MSVC's InterlockedIncrement in gcc and found the __sync_lock_test_and_set builtin. I wrote a small test program:

  #include <stdio.h>

  int l;
  int main() {
    printf("%d\n", __sync_lock_test_and_set(&l, 1));
  }

and when I look at the disassembly I get

  xchgl l(%rip), %esi

in 64-bit mode and

  xchgl l, %eax

in 32-bit mode. Notably missing is the lock prefix. I was expecting a lock prefix, since the builtin is called __sync_LOCK_test_and_set. Should there not be a lock here?

-- 
Summary: __sync_lock_test_and_set does not actually lock
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38688
[Bug c/38688] __sync_lock_test_and_set does not actually lock
--- Comment #1 from felix-gcc at fefe dot de 2009-01-01 15:58 ---
All I really want is a way to access lock cmpxchg and lock xadd, really :-)

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38688
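The two primitives asked for here were already reachable through the __sync builtin family: `__sync_fetch_and_add` compiles to `lock xadd` on x86, and `__sync_val_compare_and_swap` to `lock cmpxchg`. A sketch with made-up wrapper names:

```c
static int counter;

/* atomic fetch-and-add: returns the old value (lock xadd on x86) */
int locked_xadd(int add)
{
    return __sync_fetch_and_add(&counter, add);
}

/* atomic compare-and-swap: returns the old value; stores desired only
 * if the old value equals expected (lock cmpxchg on x86) */
int locked_cmpxchg(int expected, int desired)
{
    return __sync_val_compare_and_swap(&counter, expected, desired);
}
```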
[Bug c/38688] __sync_lock_test_and_set does not actually lock
--- Comment #2 from felix-gcc at fefe dot de 2009-01-01 16:01 ---
Sorry, I just found out that xchg has an implicit lock. Never mind about this bug.

-- 
felix-gcc at fefe dot de changed:

           What    |Removed     |Added
  ------------------------------------------
           Status  |UNCONFIRMED |RESOLVED
       Resolution  |            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38688
[Bug libgcj/38438] New: build error in x86_64-unknown-linux-gnu/32/libjava
I have had this error for months now. When I try to build the current svn HEAD I get this compile error:

  make[5]: Entering directory `/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava'
  /bin/sh ./libtool --tag=GCJ --mode=link /home/leitner/tmp/gcc/build/gcc/gcj -B/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava/ -B/home/leitner/tmp/gcc/build/gcc/ -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava -ffloat-store -fomit-frame-pointer -g -O2 -m32 -m32 -o grmic --main=gnu.classpath.tools.rmic.Main -rpath /opt/gcc/lib64/../lib -shared-libgcc -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava/.libs libgcj-tools.la
  libtool: link: /home/leitner/tmp/gcc/build/gcc/gcj -B/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava/ -B/home/leitner/tmp/gcc/build/gcc/ -ffloat-store -fomit-frame-pointer -g -O2 -m32 -m32 -o .libs/grmic --main=gnu.classpath.tools.rmic.Main -shared-libgcc -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava/.libs -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava ./.libs/libgcj-tools.so -Wl,-rpath -Wl,/opt/gcc/lib64/../lib
  /tmp/cc6R4nVe.o: In function `main':
  /tmp/ccM10rwb.i:11: undefined reference to `gnu::classpath::tools::rmic::Main::class$'
  collect2: ld returned 1 exit status
  make[5]: *** [grmic] Error 1
  make[5]: Leaving directory `/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/32/libjava'

Is there a way to tell gcc I don't need the -m32 version of gcj? Or, preferably, to not get this error message? Unfortunately make install runs into this error and aborts before it finishes building and installing the 64-bit java runtime.
-- Summary: build error in x86_64-unknown-linux-gnu/32/libjava Product: gcc Version: 4.4.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgcj AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-pc-linux-gnu GCC host triplet: x86_64-pc-linux-gnu GCC target triplet: x86_64-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38438
[Bug libgcj/38438] build error in x86_64-unknown-linux-gnu/32/libjava
--- Comment #1 from felix-gcc at fefe dot de 2008-12-07 21:46 --- I tried building gcc with --disable-multilib, but it fails at the same location in the 64-bit libjava build as well: /bin/sh ./libtool --tag=GCJ --mode=link /home/leitner/tmp/gcc/build/gcc/gcj -B/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava/ -B/home/leitner/tmp/gcc/build/gcc/ -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava -fomit-frame-pointer -g -O2 -o grmiregistry --main=gnu.classpath.tools.rmiregistry.Main -rpath /opt/gcc/lib64/../lib64 -shared-libgcc -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava/.libs libgcj-tools.la libtool: link: /home/leitner/tmp/gcc/build/gcc/gcj -B/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava/ -B/home/leitner/tmp/gcc/build/gcc/ -fomit-frame-pointer -g -O2 -o .libs/grmiregistry --main=gnu.classpath.tools.rmiregistry.Main -shared-libgcc -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava/.libs -L/home/leitner/tmp/gcc/build/x86_64-unknown-linux-gnu/libjava ./.libs/libgcj-tools.so -Wl,-rpath -Wl,/opt/gcc/lib64/../lib64 /tmp/cc4zZz3D.o: In function `main': /tmp/ccs43OKj.i:11: undefined reference to `gnu::classpath::tools::rmic::Main::class$' collect2: ld returned 1 exit status make[3]: *** [grmic] Error 1 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38438
[Bug libgcj/38438] build error in x86_64-unknown-linux-gnu/32/libjava
--- Comment #3 from felix-gcc at fefe dot de 2008-12-07 23:38 --- I have the sources in ~/tmp/gcc, and I build in ~/tmp/gcc/build using ../configure. I can see the .in files you mention in libjava/classpath/tools and I can see some binaries in build/x86_64-unknown-linux-gnu/libjava/classpath/tools: -rw-r--r-- 1 leitner users 64571 Dec 8 00:25 Makefile -rwxr-xr-x 1 leitner users 2067 Dec 8 00:25 gappletviewer -rwxr-xr-x 1 leitner users 2052 Dec 8 00:25 gjar -rwxr-xr-x 1 leitner users 2064 Dec 8 00:25 gjarsigner -rwxr-xr-x 1 leitner users 2056 Dec 8 00:25 gjavah -rwxr-xr-x 1 leitner users 2060 Dec 8 00:25 gkeytool -rwxr-xr-x 1 leitner users 2078 Dec 8 00:25 gnative2ascii -rwxr-xr-x 1 leitner users 2054 Dec 8 00:25 gorbd -rwxr-xr-x 1 leitner users 2054 Dec 8 00:25 grmic -rwxr-xr-x 1 leitner users 2054 Dec 8 00:25 grmid -rwxr-xr-x 1 leitner users 2068 Dec 8 00:25 grmiregistry -rwxr-xr-x 1 leitner users 2069 Dec 8 00:25 gserialver -rwxr-xr-x 1 leitner users 2064 Dec 8 00:25 gtnameserv I am using automake 1.10.2, libtool 2.2.6, autoconf 2.63, make 3.81 and binutils 2.19 in case any of that matters. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38438
[Bug libgcj/38438] build error in x86_64-unknown-linux-gnu/32/libjava
--- Comment #4 from felix-gcc at fefe dot de 2008-12-08 00:17 --- This was apparently caused by a conflict with my userland. When I used vanilla coreutils, the build works. This will be horrible to debug. Sorry for the noise. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38438
[Bug c/37238] New: gcc miscompiles simple memcpy loop
This code: void byte_copy(void* out, size_t len, const void* in) { char* s=out; const char* t=in; const char* u=t+len; for (;;) { if (t==u) break; *s=*t; ++s; ++t; if (t==u) break; *s=*t; ++s; ++t; if (t==u) break; *s=*t; ++s; ++t; if (t==u) break; *s=*t; ++s; ++t; } } gcc produces wrong code with -O1 or higher, but correct code with -O0. Here is some simple test code: #include <assert.h> #include <string.h> char buf[4096]; char text[128]; int main() { memset(buf,0,sizeof(buf)); strcpy(text,"this is a test!\n"); byte_copy(buf,16,text); assert(!memcmp(buf,"this is a test!\n\0",18)); return 0; } -- Summary: gcc miscompiles simple memcpy loop Product: gcc Version: 4.3.1 Status: UNCONFIRMED Severity: critical Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37238
[Bug c/35592] Want attribute to enable precision loss warning
--- Comment #4 from felix-gcc at fefe dot de 2008-04-01 16:09 --- I'm not familiar enough with how gcc works to say whether warning about precision loss that only turns out to be important later on can be done at all. But I think we should not reject an idea because it only handles 60% of the cases. Instead we should be happy we only have to worry about the other 40%, not about 100% from then on. I agree that signed int to size_t conversion for memcpy should also be warned about by some attribute. Could be done by the same attribute. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35592
[Bug c/35592] Want attribute to enable precision loss warning
--- Comment #6 from felix-gcc at fefe dot de 2008-04-01 19:34 --- Sure. For example: char* c=malloc(lseek(somefd,0,SEEK_END)); on a platform where off_t is 64-bit, but where size_t is 32-bit. For example: i686-linux with #define _FILE_OFFSET_BITS 64. Now that I'm thinking about it, would it be possible to have a generic overflow warning in that context? For example, malloc(p-len+1) So that gcc sees I'm adding something there, and if the range is not clamped down before that gives an error? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35592
[Bug c++/35790] New: operator new susceptible to integer overflow
operator new has an implicit *sizeof(type), and during that operation an integer overflow can occur. Example: int* foo() { return new int[0x40000000]; } Compiled for a 32-bit target, this allocates 0 bytes. Most compilers do not detect this either, but the Microsoft compiler instead generates code that, in case of overflow, requests an allocation of 0xffffffff bytes, which will then fail. g++ should also do that. It catches many subtle security bugs, and it costs much less than for example -fstack-protector, which everyone agrees is a great idea. -- Summary: operator new susceptible to integer overflow Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35790
[Bug c++/19351] operator new[] can return heap blocks which are too small
--- Comment #15 from felix-gcc at fefe dot de 2008-04-01 21:24 --- I think we can all agree it does not matter what we call this problem. Real world programs have security problems because of this. -fstack-protector carries a much larger run-time cost and gcc still offers it, and there are even less grounds to argue by any C or C++ standard that it's not the programmer's fault. gcc still offers it. As mentioned in the other bug, Microsoft Visual C++ already does this check. They do it like this: after the multiplication they check if the overflow flag is set, which on x86 indicates the result does not fit in the lower 32 bits. If so, instead of the truncated value it passes (size_t)-1 to the operator new, which causes that operator new to fail (in the default case at least; a user may define their own operator new and that one might still return something). My favorite solution would be for the code to fail immediately. Throw an exception or return NULL, depending on which operator new the program called. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19351
[Bug target/35646] New: gcc is not using the overflow flag
Two simple examples: unsigned int add(unsigned int a,unsigned int b) { if (a+b<a) exit(0); return a+b; } This produces code without an extra cmp, as expected. void addto(unsigned int* a,unsigned int b) { if ((*a+=b)<b) exit(0); } This generates this code: movl %esi, %eax addl (%rdi), %eax cmpl %eax, %esi movl %eax, (%rdi) ja .L5 I would have expected something like: addl %esi, (%rdi) jo .L5 Can we please fix this? It is a common case for integer overflow checking, and if we could get programmers to see that checking for integer overflows is not inefficient and you don't need some inline assembly code to get it to be efficient, that would help a lot. -- Summary: gcc is not using the overflow flag Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35646
[Bug c/35592] New: Want attribute to enable precision loss warning
gcc has a warning if one assigns a pointer to an integer of lower width. I would like to have a way to be warned if someone calls malloc(long long) on 32-bit platforms. A general warning about integer truncation would generate lots of spam, so I suggest adding a way to say "if an integer given to this function is implicitly truncated, that's a bug". The main target would be malloc and malloc-like functions, obviously. -- Summary: Want attribute to enable precision loss warning Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35592
[Bug c/35592] Want attribute to enable precision loss warning
--- Comment #2 from felix-gcc at fefe dot de 2008-03-14 19:58 --- I am aware of -Wconversion, but I am not interested in ALL conversion truncations. Truncation happens to be a security issue in a few cases, in many other cases it would just be a regular bug. My suggestion aims to isolate the security relevant cases, for the rest we have -Wconversion. If the size_t given to memcpy is truncated, that does not overwrite a buffer. But if the size_t given to malloc is truncated, that is a pretty surefire way to find a security issue. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35592
[Bug c++/33715] New: Suggest -Wmemleak warning for C++
I would like to have a warning in C++ that warns about local variables that are assigned via operator new or operator new[] but are then not freed in an exception handling clause in case of an exception. This would probably be very noisy, but would also be very helpful in getting your own code exception safe. In a second step I'd like to have an attribute to denote values that are a handle, for example a FILE* or an int in open() and socket(), so that gcc could warn about resource leaks in general. -- Summary: Suggest -Wmemleak warning for C++ Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-pc-linux-gnu GCC host triplet: x86_64-pc-linux-gnu GCC target triplet: x86_64-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33715
[Bug rtl-optimization/33716] New: gcc generates suboptimal code for long long shifts
Consider this function: unsigned long long x(unsigned long long l) { return l >> 4; } gcc will use the shrd instruction here, which is much slower than doing it by hand on at least Athlon, Pentium 3, VIA C3. On Core 2 shrd appears to be faster. On my Athlon 64, I measured 350 cycles vs 441 for a loop of 100. On my Core 2, I measured 672 cycles vs 624. So, my suggestion is: if -march= is set to Pentium 3 or a non-Intel CPU, don't use shrd and shrl. My benchmark program is on http://dl.fefe.de/shrd.c -- Summary: gcc generates suboptimal code for long long shifts Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: i386-pc-linux-gnu GCC host triplet: i386-pc-linux-gnu GCC target triplet: i386-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33716
[Bug rtl-optimization/33717] New: slow code generated for 64-bit arithmetic
gcc generates very poor code on some bignum code I wrote. I put the sample code at http://dl.fefe.de/bignum-add.c for you to look at. The crucial loop is this (x, y and z are arrays of unsigned int). for (i=0; i<100; ++i) { l += (unsigned long long)x[i] + y[i]; z[i]=l; l>>=32; } gcc code (-O3 -march=athlon64): movl -820(%ebp,%esi,4), %eax movl -420(%ebp,%esi,4), %ecx xorl %edx, %edx xorl %ebx, %ebx addl %ecx, %eax adcl %ebx, %edx addl -1224(%ebp), %eax adcl -1220(%ebp), %edx movl %eax, -4(%edi,%esi,4) incl %esi movl %edx, %eax xorl %edx, %edx cmpl $101, %esi movl %eax, -1224(%ebp) movl %edx, -1220(%ebp) jne .L4 As you can see, gcc keeps the long long accumulator in memory. icc keeps it in registers instead: movl 4(%esp,%edx,4), %eax #25.30 xorl %ebx, %ebx #25.5 addl 404(%esp,%edx,4), %eax #25.5 adcl $0, %ebx #25.5 addl %esi, %eax #25.37 movl %ebx, %esi #25.37 adcl $0, %esi #25.37 movl %eax, 804(%esp,%edx,4) #26.5 addl $1, %edx #24.22 cmpl $100, %edx #24.15 jb ..B1.4 # Prob 99% #24.15 The difference is staggering: 2000 cycles for gcc, 1000 for icc. This only happens on x86, btw. On amd64 there are enough registers, so gcc and icc are closer (840 vs 924, icc still generates better code here). Still: both compilers could generate even better code. I put some inline asm in the file for comparison, which could be improved further by loop unrolling. -- Summary: slow code generated for 64-bit arithmetic Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: i386-pc-linux-gnu GCC host triplet: i386-pc-linux-gnu GCC target triplet: i386-pc-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33717
[Bug c/32856] New: Invalid optimization in the face of aliasing
This contrived example is miscompiled by gcc: struct node { struct node* next, *prev; }; void foo(struct node* n) { n->next->next->prev=n; n->next->prev->next=n; } This is not from real code, but I wrote it to demonstrate aliasing issues for a talk I'm preparing now. gcc -O2 -fno-strict-aliasing generates this code: movq (%rdi), %rdx movq (%rdx), %rax movq %rdi, 8(%rax) movq 8(%rdx), %rax movq %rdi, (%rax) Note how rdx is used to cache the value of n->next. Since we write through some pointer that might alias other memory, gcc can not assume n still points to the same value and n->next is unchanged after the first assignment. Interestingly enough, changing the assignments to n->next->next->next=n; n->next->prev->next=n; properly reloads n->next. I'm guessing that's because ->next has offset 0 relative to the pointer. -- Summary: Invalid optimization in the face of aliasing Product: gcc Version: 4.2.1 Status: UNCONFIRMED Severity: major Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: felix-gcc at fefe dot de GCC build triplet: x86_64-unknown-linux-gnu GCC host triplet: x86_64-unknown-linux-gnu GCC target triplet: x86_64-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32856
[Bug c/32856] Invalid optimization in the face of aliasing
--- Comment #1 from felix-gcc at fefe dot de 2007-07-22 17:09 --- Well, since n is passed in a register, it can assume that n is still the same here. :-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32856
[Bug c/32856] Invalid optimization in the face of aliasing
--- Comment #2 from felix-gcc at fefe dot de 2007-07-22 17:12 --- FWIW, the C compilers from Intel and Sun do reload n->next. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32856
[Bug middle-end/32856] Invalid optimization in the face of aliasing
--- Comment #4 from felix-gcc at fefe dot de 2007-07-22 23:08 --- Falk: union { struct { void* unused; struct node n; } a; struct node b; } u; Then &u.a.n.next == &u.b.prev; Artificial? Sure. But legal. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32856
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #43 from felix-gcc at fefe dot de 2007-01-22 13:02 --- No, it WAS about the security. Now that you made me check and I saw that the optimization also doesn't give any actual speed increase, I want it removed on those grounds, too. And I want it removed for reasons of sanity. The compiler must never give me a negative value but then not take the if (a<0) branch. That is utterly unacceptable. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #48 from felix-gcc at fefe dot de 2007-01-22 19:50 --- Oh wow, another wisecracking newbie who comments without actually understanding the issue. I AM NOT RELYING ON UNDEFINED BEHAVIOR. On the contrary. gcc is fine to assign 23 instead of a negative number. But if it does assign a negative number (as it does), I want if (a<0) to trigger. That part is not undefined. But never mind the security issue here, which is apparently too complicated for you guys to understand. This optimization actually makes code SLOWER. AND it makes people mad when they find out about it. So, uh, which part of that don't you understand? There is an optimization that makes the code slower, not faster. Turn it off already. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #33 from felix-gcc at fefe dot de 2007-01-21 13:53 --- so now you give us... a straw man? The range analysis has nothing to do with just assuming integers can't wrap. But more to the point: the Intel compiler does not assume signed integers can't wrap, and IT STILL PRODUCES MUCH FASTER CODE THAN GCC. So all your hand-waving about how this optimization is good for performance is utterly destroyed by the Intel compiler. And please let me express my shock how you tell me to my face that the only example where this optimization has measurable impact (I didn't actually try it, but I will) is when it optimizes away range checks in C++ vectors. Which, you know, exist solely because THERE ARE NO RANGE CHECKS IN C ARRAYS and, uh, C++ is much better and people are told to use C++ vectors instead BECAUSE THEY HAVE RANGE CHECKS and now you tell me that your optimization removes those. Whoa, what an improvement, man. Now you convinced me. Not that the optimization is useful, mind you, but that you are a marauding saboteur sent by the evil minions at Microsoft on a mission to make open source software look bad. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #36 from felix-gcc at fefe dot de 2007-01-21 17:47 --- I think the actual root issue here is that the gcc argumentation is fundamentally wrong. I am complaining that gcc removes my checks, not that signed integer overflow is undefined. Also, note that it is everything BUT undefined. Adding 5 to INT_MAX will create a negative number. It behaves EXACTLY as expected by basically everyone. And if gcc decided that it is undefined and it creates INT_MAX again, then I would be happy, too. No problem with that whatsoever. All I want is gcc to be consistent. gcc DOES create a negative number. And then it removes an if statement asking whether the number is negative. That can not be explained away with signed int overflow being undefined. Let's write it in separate statements, so even the biggest idiot in the world can understand the issue. int a,b,c; a=INT_MAX; /* statement ONE */ b=a+2; /* statement TWO */ c=(b<a); /* statement THREE */ My issue with gcc is that it removes statement THREE. Your argument about undefinedness is about statement TWO. Following your argumentation, gcc is allowed to return 23 in statement TWO. But it is still not allowed to generate no code for statement THREE. In my opinion, people who don't understand this most basic of logic should not be let NEAR the code of a compiler, let alone a compiler millions of people are depending on. Now, to summarize. We destroyed your complete argument, including wild assertions you happened to make during its course. We gave evidence that the current behavior is bad for a lot of people. What more do you need to see reason? Do you want a bribe? Some more crack? Hookers, maybe? I don't see what else could be discussed about the matter. I will refrain from wasting time on trying to find new ways to explain the obvious to you. From now on, I'll just auto-reopen the bug and say "see previous arguments".
-- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #41 from felix-gcc at fefe dot de 2007-01-22 02:18 --- So I tested some C++ vector code using at(), in a desperate attempt to find ANY case where this so-called optimization actually produces faster code. http://ptrace.fefe.de/vector2.C $ gcc -O3 -o vector2 vector2.C $ ./vector2 69859 cycles $ gcc -O3 -o vector2 vector2.C -fwrapv $ ./vector2 69606 cycles $ so, not only is the difference negligible, it also turns out that the optimization made the code SLOWER. Now, let's see what the Intel compiler does (I'm using 9.1.042): $ icc64 -O3 -o vector2 vector2.C $ ./vector2 50063 cycles $ So, all this fuss you are making is about an optimization that actually makes code slower, and the competition does not need foul language lawyer games like this to still beat you by 28%. 28%! You should be ashamed of yourself. Why don't you get over the fact that this was a really bad decision, undo it, and we will all live happily ever after. Oh, and: If it really does not matter whether I keep reopening this bug, why do you keep closing it? I will keep this bug open, so the world can see how you broke gcc and are unable to let even facts as clear as these convince you to see the error of your ways. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #27 from felix-gcc at fefe dot de 2007-01-18 15:20 --- Oh, so C++ code using vector gets faster, you say? Let's see... This is vector.C from http://ptrace.fefe.de/vector.C It does some assignments to a vector. It runs the loop 10 times and takes the minimum cycle counter. $ g++ -O2 -o vector vector.C $ ./vector 20724 cycles $ g++ -O2 -o vector vector.C -fwrapv $ ./vector 20724 cycles $ And AGAIN you are proven wrong by the facts. Do you have some more proof where this came from? Man, this is ridiculous. Just back out that optimization and I'll shut up. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #28 from felix-gcc at fefe dot de 2007-01-18 15:23 --- Mhh, so I wondered, how can it be that the code is exactly as fast. So I disassembled the binary. You know what? IT'S THE EXACT SAME CODE. So, my conjecture is, again: your optimization does exactly NO good whatsoever, it just breaks code that tries to defend against integer overflows. Please remove it. Now. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #9 from felix-gcc at fefe dot de 2007-01-17 13:55 --- Hey Andrew, do you really think this issue goes away if you keep closing the bugs fast enough? Let me tell you something: that INT_MAX way to do it is bogus. These checks are there so that it is obvious the int overflow is caught and handled. If you use INT_MAX, then the auditor still has to check if it was an int and not an unsigned int, for example. If that doesn't convince you, let's say it's not int, but it's ptrdiff_t. Or off_t. Or beancount_t, which the application defined somewhere. Then limits.h won't have a _MAX definition for it. What if the size of the type depends on the context as well? There are multiple definitions of it depending on some -Dwhatever on the command line? All these cases were covered just fine by the if (a+100 < a) check. There is no context needed about the type of a, it works for pointers, unsigned integers, and signed integers. Well, you broke the pointer bit once, too, but that was reverted. The guy who reverted it back then should come back, we need someone with his vision and good judgement here now. No, let's face it. You fucked this up royally, and now you are trying to close all the bugs as fast as you can, so nobody sees just how much damage you have done. You, sir, are unprofessional and a disgrace to the gcc development team. And this bug stays open until you revert the change or make it opt-in instead of opt-out. As long as you just destroy programs where the author foolishly opted in, I don't care. But I will not let you make my work environment less secure because you don't have the professionalism to let your pet optimization go, after it was shown to do more damage than good. How much more proof do you need? For god's sake, autoconf considers turning your optimization off globally! Do you even notice all the explosions around you? PS: Mr Simon, that link to a how-to that says "btw, this doesn't work for this special input", is that supposed to impress anyone?
It certainly does not impress me very much, really. It's better to keep your mouth shut and appear stupid than to open it and remove all doubt. --Mark Twain (1835 - 1910) -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|DUPLICATE | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #12 from felix-gcc at fefe dot de 2007-01-17 15:21 --- (In reply to comment #11) Btw. your testcase fails with gcc 2.95.3 for me as well, so no news here. Bullshit. $ ./gcc2 -v Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/2.95.3/specs gcc version 2.95.3 20010315 (release) $ ./gcc2 -o int int.c $ ./int 200 100 int: int.c:4: foo: Assertion `a+100 > a' failed. $ Why don't you get your facts straight. My gcc is the stock gcc 2.95.3, yours is apparently some butchered distro version. You know, the more apologists for the current behavior come forward, the less credible your position looks. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #14 from felix-gcc at fefe dot de 2007-01-17 16:37 --- 1. apologist, in contrast to asshole, is not a cuss word. Apparently you are as ignorant about English as you are about the issue at hand. 2. I showed my gcc -v, why don't you? Maybe it's platform dependent? For all we know you could be running this on an old Alpha where sizeof(int)==8. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #17 from felix-gcc at fefe dot de 2007-01-17 17:02 --- You misunderstand. We don't want you to say anything. We want to you make your optimization off by default, or remove it altogether. You could also try to convince us that there is any actual tangible performance gain, then you would at least have a leg to stand on. We would still laugh in your face because you broke security code in real world apps and refuse to fix it, though, but it would be a start. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #21 from felix-gcc at fefe dot de 2007-01-17 17:20 --- I DID NOT WRITE THE BROKEN CODE. Trying to trivialize the issue or insult me will not make it go away. So, please tell me, which part of the argument in comment #9 were you unable to follow? I could try using less complicated words so you actually understand it this time around. Guys, your obligation is not just to implement the C standard. Your obligation is also not to break apps that depend on you. And A LOT of apps are depending on you. When you broke the floating point accuracy, you made it opt-in (-ffast-math). When you added the aliasing breakage, you made it opt-in (-fstrict-aliasing). IIRC for that you also quoted some legalese from the standard at first, until people with more grounding in reality overruled you. And I'm going to keep this bug open until the same thing happens again for this issue. You can't just potentially break half of the free software in the world because you changed your mind about what liberty the C standard gives you. Grow up or move out of the way and let more responsible people handle our infrastructure. You know that the Ariane 5 rocket crashed (and could have killed people!) because of an int overflow? What if people die because you decided the C standard allows you to optimize away other people's security checks? Again: IT DOES NOT MATTER WHAT THE C STANDARD SAYS. You broke code, people are suffering damage. Now revert it. The least you can do is make -fwrapv on by default. You would still have to make it actually work (I hear it's broken in some corner cases?), but that's another story. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #23 from felix-gcc at fefe dot de 2007-01-17 18:23 --- In earlier gcc versions this only happened if the optimizer was on. So your argument might hold some water there, if I squint my eyes enough. But gcc 4.1 removes that code even with the optimizer turned off. There goes your argument. Please make -fwrapv default again and I'll shut up. -- felix-gcc at fefe dot de changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|WONTFIX | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #25 from felix-gcc at fefe dot de 2007-01-17 19:04 --- Well, duh. You removed the security checks. Hey, I have one for you, too. Optimize away all calls to pthread_mutex_lock, and lo and behold, multithreaded code will be much faster! It will also be broken, but apparently, that's not much of a concern around here. The most time critical code I have, I just benchmarked. $ gcc -O3 -fomit-frame-pointer -funroll-loops -march=athlon64 -o t t.c misc.c add.c mul.c write.c read.c comba.c $ ./t adding two bignums: 84 cycles multiply two bignums: 1414 cycles writing a bignum as radix 10: 207488 cycles comba: 1467 cycles $ gcc -O3 -fomit-frame-pointer -funroll-loops -march=athlon64 -o t t.c misc.c add.c mul.c write.c read.c comba.c -fwrapv adding two bignums: 82 cycles multiply two bignums: 1414 cycles writing a bignum as radix 10: 202761 cycles comba: 1465 cycles $ So, uh, where does the optimization part about your optimization come in? This is code that has no integer overflow checks. So my conjecture is: your optimization makes code faster in exactly those cases where it removes security checks from it, endangering people on the way. So, again, please make -fwrapv the default, and I'll leave you alone. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] New: assert(int+100 > int) optimized away
The small test program at http://ptrace.fefe.de/int.c illustrates the problem. The assert is there to prevent integer overflow, which would not happen in my test program, but you get the idea.

There appears to be something wrong with integer promotion here. The same code with int changed to unsigned int (http://ptrace.fefe.de/unsignedint.c) correctly fails the assertion at that point.

My understanding of C integer promotion is that the 100 is an int unless anything else is said, so int+100 should still be an int, and so both sides should still be int. Is that not correct?

-- 
Summary: assert(int+100 > int) optimized away
Product: gcc
Version: 4.1.1
Status: UNCONFIRMED
Severity: critical
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #1 from felix-gcc at fefe dot de 2007-01-15 19:46 --- Mhh, if I change int+100 to (int)(int+100), the assert still gets optimized away. So it's not an integer promotion issue. Both sides are definitely int. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #3 from felix-gcc at fefe dot de 2007-01-15 19:50 ---

Even stranger: if I assert((int)(a+100) > 0) then it STILL gets optimized away. WTF!?

-- felix-gcc at fefe dot de changed:

What    |Removed |Added
Severity|normal  |critical

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug c/30475] assert(int+100 > int) optimized away
--- Comment #4 from felix-gcc at fefe dot de 2007-01-15 19:57 ---

(In reply to comment #2)
> signed type overflow is undefined by the C standard, use unsigned int for
> the addition or use -fwrapv.

You have GOT to be kidding? All kinds of security issues are caused by integer wraps, and you are just telling me that with gcc 4.1 and up I cannot test for them for signed data types any more?!

You are missing the point here. There HAS to be a way to get around this. Existing software uses signed int and I can't just change it to unsigned int, but I still must be able to check for a wrap! There does not appear to be a workaround I could do in the source code either! Do you expect me to cast it to unsigned, shift right by one, and then add or what?!

PLEASE REVERT THIS CHANGE. This will create MAJOR SECURITY ISSUES in ALL MANNER OF CODE. I don't care if your language lawyers tell you gcc is right. THIS WILL CAUSE PEOPLE TO GET HACKED. I found this because one check to prevent people from getting hacked failed. THIS IS NOT A JOKE. FIX THIS! NOW!

-- felix-gcc at fefe dot de changed:

What      |Removed  |Added
Status    |RESOLVED |UNCONFIRMED
Resolution|INVALID  |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475
[Bug bootstrap/29102] New: mudflap's configure tries to link a binary and fails because I don
-- 
Summary: mudflap's configure tries to link a binary and fails because I don
Product: gcc
Version: 4.1.1
Status: UNCONFIRMED
Severity: critical
Priority: P3
Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: arm-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29102
[Bug bootstrap/29102] mudflap's configure tries to link a binary and fails because I don
--- Comment #1 from felix-gcc at fefe dot de 2006-09-15 22:14 --- Turns out you can still do make install and get a working cross compiler after that, so I concur it's normal and not critical. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29102
[Bug c/27180] New: pointer arithmetic overflow handling broken
I have this function:

static inline int range_ptrinbuf(const void* buf,unsigned long len,const void* ptr) {
  register const char* c=(const char*)buf;
  return (c && c+len>c && (const char*)ptr-c<len);
}

I tested it with this test:

assert(range_ptrinbuf(buf,(unsigned long)-1,buf+1)==0);

With gcc 3.4.5, this passes (with and without optimization). With gcc 4.1.0, this fails. I put in a printf to see if any of the values is incorrectly calculated -- it's c+len>c that incorrectly returns 0. This is with and without optimizer.

This is very bad because this kind of check is used to do security checks when validating data from incoming network packets. I was planning to use this function to check data in incoming SMB packets. This bug causes all kinds of well-meaning security checks to silently fail.

I also compiled Samba and my Linux kernel with gcc 4.1. I'm feeling very uncomfortable now. Please release a fixed gcc version ASAP!

-- 
Summary: pointer arithmetic overflow handling broken
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: blocker
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27180
[Bug c/22072] New: bizarre code for int*int/2
int triangle(int a,int b) {
  int c;
  c=a*b/2;
  return c;
}

emits this very bizarre code (at -O, -O2):

        mov     a,%edx
        mov     b,%eax
        imull   %edx,%eax
        movl    %eax,%edx
        shrl    $31,%edx
        addl    %edx,%eax
        sarl    %eax
        ret

Why are the two instructions after the imull emitted? Shouldn't this become simply imull and sarl? This code extracts the most significant bit of a*b and adds it to a*b, then shifts the result right. It almost looks as if it is trying to round or something? There is probably something obvious I'm overlooking here. Analogous code is generated for ppc, x86_64, sparc and mips. Please explain.

I also tried with -Os, and the code becomes cltd (sign-extend 32 to 64 bits) plus idivl with 2. Could it be that the peephole optimizer converts the idivl to a shift but forgets to remove the sign-extend code?

-- 
Summary: bizarre code for int*int/2
Product: gcc
Version: 3.4.4
Status: UNCONFIRMED
Severity: minor
Priority: P2
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: felix-gcc at fefe dot de
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22072
[Bug c/22072] bizarre code for int*int/2
--- Additional Comments From felix-gcc at fefe dot de 2005-06-15 06:12 ---

By the way, -Os generates an unnecessary register move:

        pushl   %ebp
        movl    $2, %edx
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    %edx, %ecx
        imull   8(%ebp), %eax
        popl    %ebp
        cltd
        idivl   %ecx
        ret

gcc should put the $2 directly in %ecx here. Or it should note that mov $2,%ecx; idiv %ecx is 7 bytes, while sar %eax is only two bytes, and emit the latter.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22072
[Bug middle-end/22072] bizarre code for int*int/2
--- Additional Comments From felix-gcc at fefe dot de 2005-06-15 21:05 ---

(In reply to comment #5)
> (In reply to comment #4)
> > what about the *arithmetic shift* instruction (e.g. SAR on ix86)?
> Nope, try that with a negative number and you will notice that it will not
> work.

Actually it does work. 5 SAR 1 == 2; -5 SAR 1 == -3. That's exactly what SAR is for, after all. See the Intel manuals. Or look here: http://faydoc.tripod.com/cpu/sar.htm

SAR r/m32, 1: Signed divide* r/m32 by 2, once

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22072