[Bug middle-end/24929] long long shift/mask operations should be better optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24929

Andrew Pinski changed:
           What            |Removed     |Added
           Target Milestone|---         |4.3.0
           Resolution      |---         |FIXED
           Status          |NEW         |RESOLVED
           Keywords        |            |missed-optimization

--- Comment #8 from Andrew Pinski ---
Fixed a long time ago.
--- Comment #6 from steven at gcc dot gnu dot org 2006-09-20 22:19 ---
*** Bug 28405 has been marked as a duplicate of this bug. ***

steven at gcc dot gnu dot org changed:
           What            |Removed     |Added
           CC              |            |vda dot linux at googlemail dot com
--- Comment #4 from ian at airs dot com 2006-06-27 06:05 ---
With my current version of the lower-subreg patch, I get this with
-O2 -momit-leaf-frame-pointer:

f:
	movl	16(%esp), %eax
	movl	4(%esp), %ecx
	movl	8(%esp), %edx
	shrl	$16, %eax
	andl	$255, %eax
	shldl	$8, %ecx, %edx
	sall	$8, %ecx
	orl	%ecx, %eax
	ret

which may be optimal.
--- Comment #5 from uros at kss-loka dot si 2006-06-27 10:12 ---
(In reply to comment #4)
> which may be optimal.

	movzbl	18(%esp), %eax

could be used in this particular case.
--- Comment #2 from ian at airs dot com 2006-02-02 18:14 ---
With an updated version of RTH's subreg lowering pass, I get this
instruction sequence:

f:
	movl	16(%esp), %eax
	movl	4(%esp), %edx
	movl	8(%esp), %ecx
	shrl	$16, %eax
	andl	$255, %eax
	shldl	$8, %edx, %ecx
	sall	$8, %edx
	orl	%edx, %eax
	movl	%ecx, %edx
	ret

This is one instruction shorter than the icc sequence, due to the use of
shldl. It could be improved by switching the roles of %ecx and %edx to
avoid the final move, although that is complex to implement given the way
the register allocator currently handles pseudo-registers larger than
word mode.
--- Comment #3 from pinskia at gcc dot gnu dot org 2006-02-02 18:16 ---
Confirmed.

pinskia at gcc dot gnu dot org changed:
           What            |Removed             |Added
           Status          |UNCONFIRMED         |NEW
           Ever Confirmed  |0                   |1
           Last reconfirmed|0000-00-00 00:00:00 |2006-02-02 18:16:13
--- Comment #1 from tkho at ucla dot edu 2005-11-18 02:35 ---
Created an attachment (id=10273)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10273&action=view)
shift/mask long long testcase

Here is a rough instruction-count comparison for f() compiled at -O2,
-march=pentiumpro between icc9 and gcc head 20051108 with the patch in
PR 17886, comment #16:

icc: 11
gcc: 23

`icc -O2 -march=pentiumpro -S test3.c` gives:

	movl	4(%esp), %eax
	movl	8(%esp), %ecx
	movl	%eax, %edx
	shrl	$24, %edx
	shll	$8, %eax
	shll	$8, %ecx
	orl	%ecx, %edx
	movzwl	18(%esp), %ecx
	movzbl	%cl, %ecx
	orl	%ecx, %eax
	ret

`gcc -c test3.c -save-temps -O2 -march=pentiumpro -momit-leaf-frame-pointer`
gives:

	subl	$12, %esp
	movl	%edi, 8(%esp)
	movl	28(%esp), %edi
	movl	16(%esp), %eax
	movl	20(%esp), %edx
	movl	%esi, 4(%esp)
	movl	24(%esp), %esi
	movl	%edi, %esi
	xorl	%edi, %edi
	movl	8(%esp), %edi
	movl	%ebx, (%esp)
	shrl	$16, %esi
	xorl	%ebx, %ebx
	shldl	$8, %eax, %edx
	movl	%esi, %ecx
	movl	4(%esp), %esi
	orl	%ebx, %edx
	movl	(%esp), %ebx
	andl	$255, %ecx
	sall	$8, %eax
	addl	$12, %esp
	orl	%ecx, %eax
	ret

For comparison, here's the code from gcc 2.95.3. It generates the same 18
instructions for both -march=i386 and -march=pentiumpro.
`gcc -c test3.c -save-temps -O2 -momit-leaf-frame-pointer -march=pentiumpro`:

	pushl	%ebx
	movl	8(%esp), %ecx
	movl	12(%esp), %ebx
	movl	16(%esp), %eax
	movl	20(%esp), %edx
	shldl	$8, %ecx, %ebx
	sall	$8, %ecx
	movl	%edx, %eax
	xorl	%edx, %edx
	shrl	$16, %eax
	andl	$255, %eax
	andl	$0, %edx
	orl	%eax, %ecx
	orl	%edx, %ebx
	movl	%ecx, %eax
	movl	%ebx, %edx
	popl	%ebx
	ret