http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589
             Bug #: 54589
           Summary: [missed-optimization] struct offset add should be folded
                    into address calculation
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: sgunder...@bigfoot.com

Hi,

I found this in 4.4 (Ubuntu 10.04), and have confirmed it's still there in
gcc (Debian 20120820-1) 4.8.0 20120820 (experimental) [trunk revision 190537].

This code:

  #include <emmintrin.h>

  struct param {
          int a, b, c, d;
          __m128i array[256];
  };

  void func(struct param *p, unsigned char *src, int *dst)
  {
          __m128i x = p->array[*src];
          *dst = _mm_cvtsi128_si32(x);
  }

compiles with -O2 on x86-64 to this assembler:

  0000000000000000 <func>:
     0:   0f b6 06                movzbl (%rsi),%eax
     3:   48 83 c0 01             add    $0x1,%rax
     7:   48 c1 e0 04             shl    $0x4,%rax
     b:   8b 04 07                mov    (%rdi,%rax,1),%eax
     e:   89 02                   mov    %eax,(%rdx)
    10:   c3                      retq

The add should be folded into the address calculation here as a constant
displacement. (The shl can't be, because a scale factor of 16 is larger than
the maximum of 8 that x86 addressing modes allow.)

Curiously enough, if I misalign the struct element by removing c and d, and
declaring the struct __attribute__((packed)), GCC will do that; the mov will
then be from 8(%rdi,%rax,1) into %eax, and there is no redundant add.
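To make the arithmetic behind the request explicit, here is a minimal sketch
of the two equivalent offset computations. It is only an illustration, not
anything GCC emits or uses internally: vec128 is a hypothetical stand-in for
__m128i (so the file builds without <emmintrin.h>), the function names are
made up for this example, and a 4-byte int is assumed, which puts array at
offset 16.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __m128i: 16 bytes, 16-byte aligned. */
typedef struct { _Alignas(16) unsigned char bytes[16]; } vec128;

struct param {
    int a, b, c, d;          /* assumed 4 bytes each, 16 bytes total */
    vec128 array[256];       /* therefore starts at offset 16 */
};

/* The layout assumptions this sketch depends on. */
_Static_assert(sizeof(vec128) == 16, "element must be 16 bytes");
_Static_assert(offsetof(struct param, array) == 16, "array must sit at 16");

/* What the reported code computes: add 1, then shift left by 4. */
size_t offset_add_then_shift(unsigned char index)
{
    return ((size_t)index + 1) << 4;
}

/* What the report asks for: shift only, with the struct offset left as a
   constant displacement, as in 16(%rdi,%rax,1). */
size_t offset_shift_then_disp(unsigned char index)
{
    return ((size_t)index << 4) + offsetof(struct param, array);
}

/* Checks that both forms agree for every possible unsigned char index. */
void check_offsets(void)
{
    for (int i = 0; i < 256; i++) {
        unsigned char index = (unsigned char)i;
        assert(offset_add_then_shift(index) == offset_shift_then_disp(index));
    }
}
```

Both functions return the same byte offset for every index, so moving the
constant 16 into the displacement field of the mov would make the separate
add instruction unnecessary.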