http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46524

           Summary: Code size regression due to not reusing immediate operands of moves
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: hubi...@gcc.gnu.org

Created attachment 22433
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22433
preprocessed testcase

This is another CSiBE module. The testcase stores a lot of zeros and ones into
many places. GCC 4.3 keeps 0 in ebp and does:

     f29:  f3 0f 5c 84 24 e8 00   subss  0xe8(%rsp),%xmm0
     f30:  00 00
     f32:  8b 84 24 0c 01 00 00   mov    0x10c(%rsp),%eax
     f39:  c7 84 24 80 00 00 00   movl   $0x3f800000,0x80(%rsp)
     f40:  00 00 80 3f
     f44:  89 ac 24 84 00 00 00   mov    %ebp,0x84(%rsp)
     f4b:  89 ac 24 88 00 00 00   mov    %ebp,0x88(%rsp)
     f52:  89 ac 24 8c 00 00 00   mov    %ebp,0x8c(%rsp)
     f59:  89 44 24 40            mov    %eax,0x40(%rsp)
     f5d:  89 44 24 54            mov    %eax,0x54(%rsp)
     f61:  89 ac 24 90 00 00 00   mov    %ebp,0x90(%rsp)
     f68:  c7 84 24 94 00 00 00   movl   $0x3f800000,0x94(%rsp)
     f6f:  00 00 80 3f
     f73:  89 ac 24 98 00 00 00   mov    %ebp,0x98(%rsp)
     f7a:  89 ac 24 9c 00 00 00   mov    %ebp,0x9c(%rsp)
     f81:  f3 0f 59 05 00 00 00   mulss  0x0(%rip),%xmm0        # f89 <main+0xf89>
     f88:  00
     f89:  89 ac 24 a0 00 00 00   mov    %ebp,0xa0(%rsp)
     f90:  89 ac 24 a4 00 00 00   mov    %ebp,0xa4(%rsp)
     f97:  c7 84 24 a8 00 00 00   movl   $0x3f800000,0xa8(%rsp)
     f9e:  00 00 80 3f
     fa2:  89 ac 24 ac 00 00 00   mov    %ebp,0xac(%rsp)
     fa9:  89 ac 24 b0 00 00 00   mov    %ebp,0xb0(%rsp)
     fb0:  89 ac 24 b4 00 00 00   mov    %ebp,0xb4(%rsp)
     fb7:  89 ac 24 b8 00 00 00   mov    %ebp,0xb8(%rsp)
     fbe:  c7 84 24 bc 00 00 00   movl   $0x3f800000,0xbc(%rsp)
     fc5:  00 00 80 3f
     fc9:  89 6c 24 44            mov    %ebp,0x44(%rsp)
     fcd:  89 6c 24 48            mov    %ebp,0x48(%rsp)
     fd1:  89 6c 24 4c            mov    %ebp,0x4c(%rsp)
     fd5:  89 6c 24 50            mov    %ebp,0x50(%rsp)
     fd9:  89 6c 24 58            mov    %ebp,0x58(%rsp)
     fdd:  89 6c 24 5c            mov    %ebp,0x5c(%rsp)
     fe1:  89 6c 24 60            mov    %ebp,0x60(%rsp)
     fe5:  89 6c 24 64            mov    %ebp,0x64(%rsp)
     fe9:  f3 0f 11 44 24 68      movss  %xmm0,0x68(%rsp)
     fef:  89 6c 24 6c            mov    %ebp,0x6c(%rsp)
     ff3:  89 6c 24 70            mov    %ebp,0x70(%rsp)
     ff7:  89 6c 24 74            mov    %ebp,0x74(%rsp)
     ffb:  89 6c 24 78            mov    %ebp,0x78(%rsp)
     fff:  c7 44 24 7c 00 00 80   movl   $0x3f800000,0x7c(%rsp)
    1006:  3f

Mainline uses immediate stores instead:

    13a1:  f3 0f 5c 8c 24 dc 00   subss  0xdc(%rsp),%xmm1
    13a8:  00 00
    13aa:  49 03 44 24 40         add    0x40(%r12),%rax
    13af:  f3 0f 11 40 2c         movss  %xmm0,0x2c(%rax)
    13b4:  c7 40 20 00 00 00 00   movl   $0x0,0x20(%rax)
    13bb:  c7 40 24 00 00 00 00   movl   $0x0,0x24(%rax)
    13c2:  c7 40 28 00 00 00 00   movl   $0x0,0x28(%rax)
    13c9:  8b 84 24 f8 00 00 00   mov    0xf8(%rsp),%eax
    13d0:  f3 0f 11 44 24 40      movss  %xmm0,0x40(%rsp)
    13d6:  f3 0f 59 0d 00 00 00   mulss  0x0(%rip),%xmm1        # 13de <main+0x13de>
    13dd:  00
    13de:  f3 0f 11 44 24 54      movss  %xmm0,0x54(%rsp)
    13e4:  f3 0f 11 44 24 68      movss  %xmm0,0x68(%rsp)
    13ea:  c7 44 24 44 00 00 00   movl   $0x0,0x44(%rsp)
    13f1:  00
    13f2:  89 84 24 80 00 00 00   mov    %eax,0x80(%rsp)
    13f9:  f3 0f 11 44 24 7c      movss  %xmm0,0x7c(%rsp)
    13ff:  89 84 24 94 00 00 00   mov    %eax,0x94(%rsp)
    1406:  c7 44 24 48 00 00 00   movl   $0x0,0x48(%rsp)
    140d:  00
    140e:  c7 44 24 4c 00 00 00   movl   $0x0,0x4c(%rsp)
    1415:  00
    1416:  c7 44 24 50 00 00 00   movl   $0x0,0x50(%rsp)
    141d:  00
    141e:  c7 44 24 58 00 00 00   movl   $0x0,0x58(%rsp)
    1425:  00
    1426:  c7 44 24 5c 00 00 00   movl   $0x0,0x5c(%rsp)
    142d:  00
    142e:  c7 44 24 60 00 00 00   movl   $0x0,0x60(%rsp)
    1435:  00
    1436:  c7 44 24 64 00 00 00   movl   $0x0,0x64(%rsp)
    143d:  00
    143e:  c7 44 24 6c 00 00 00   movl   $0x0,0x6c(%rsp)
    1445:  00
    1446:  c7 44 24 70 00 00 00   movl   $0x0,0x70(%rsp)
    144d:  00
    144e:  c7 44 24 74 00 00 00   movl   $0x0,0x74(%rsp)
    1455:  00
    1456:  c7 44 24 78 00 00 00   movl   $0x0,0x78(%rsp)
    145d:  00
    145e:  c7 84 24 84 00 00 00   movl   $0x0,0x84(%rsp)
    1465:  00 00 00 00
    1469:  c7 84 24 88 00 00 00   movl   $0x0,0x88(%rsp)
    1470:  00 00 00 00
    1474:  c7 84 24 8c 00 00 00   movl   $0x0,0x8c(%rsp)
    147b:  00 00 00 00
    147f:  c7 84 24 90 00 00 00   movl   $0x0,0x90(%rsp)
    1486:  00 00 00 00
    148a:  c7 84 24 98 00 00 00   movl   $0x0,0x98(%rsp)
    1491:  00 00 00 00
    1495:  c7 84 24 9c 00 00 00   movl   $0x0,0x9c(%rsp)
    149c:  00 00 00 00
    14a0:  c7 84 24 a0 00 00 00   movl   $0x0,0xa0(%rsp)
    14a7:  00 00 00 00

The RTL cprop1 pass manages to propagate the constants everywhere. -fno-gcse
leads to proper codegen here, but the text section is still about 7% bigger
than with 4.3.
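
For readers without the attachment, here is a rough sketch of the kind of source that produces such store sequences, together with the per-store size difference read off the two listings. It is not the attached testcase: `set_identity`, `float_bits` and `store_len` are hypothetical names, the 4x4 identity matrix is only a guess based on the 1.0f stores at 0x80, 0x94, 0xa8 and 0xbc in the 4.3 output, and `float_bits` assumes IEEE 754 single precision. Whether a given compiler unrolls the loop into individual stores depends on version and flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the testcase pattern: fill a 4x4 float
   matrix with the identity, i.e. many 0.0f stores and a few 1.0f stores. */
void set_identity(float m[16])
{
    for (size_t i = 0; i < 16; i++)
        m[i] = (i % 5 == 0) ? 1.0f : 0.0f;  /* diagonal at 0, 5, 10, 15 */
}

/* Bit pattern of a float, to relate 1.0f to the $0x3f800000 immediate
   (assumes float is IEEE 754 single precision). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Encoded length in bytes of a 32-bit store to disp(%rsp), as seen in
   the listings: opcode + ModRM + SIB, then a 1- or 4-byte displacement,
   then an optional 4-byte immediate. */
size_t store_len(int disp32, int has_imm32)
{
    return 3 + (disp32 ? 4 : 1) + (has_imm32 ? 4 : 0);
}
```

By this count, each zero stored from a register rather than as an immediate saves 4 bytes: 4 instead of 8 bytes with an 8-bit displacement, 7 instead of 11 with a 32-bit one. That matches the encodings at fc9 and 13ea, and at f44 and 145e.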