https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594
--- Comment #21 from Roger Sayle <roger at nextmovesoftware dot com> ---
I completely agree that Richard Sandiford's patch is a much better solution,
but I'd like to counter the claims that the change originally proposed in
comment #8 is obviously universally bad. Segher has proposed that object code
size correlates with the quality of "combine", so I thought that I'd point out
that the original patch reduces code size on the CSiBE benchmark on x86_64
when compiled with -Os.

#bench,file,before,after,delta,cumsum
bzip2-1.0.2,bzlib,11750,11756,6,6
cg_compiler_opensrc,memory,818,825,7,13
jpeg-6b,jcsample,2814,2804,-10,3
jpeg-6b,jcphuff,3606,3609,3,6
libpng-1.2.5,pngget,3798,3792,-6,0
linux-2.4.23-pre3-testplatform,fs/nfs/nfs2xdr,6070,6069,-1,-1
linux-2.4.23-pre3-testplatform,fs/nfs/nfs3xdr,8229,8225,-4,-5
linux-2.4.23-pre3-testplatform,fs/ext3/balloc,6769,6773,4,-1
linux-2.4.23-pre3-testplatform,fs/ext3/ialloc,6063,6064,1,0
linux-2.4.23-pre3-testplatform,fs/nfsd/nfsfh,5896,5889,-7,-7
linux-2.4.23-pre3-testplatform,fs/nfsd/nfs3xdr,6145,6143,-2,-9
linux-2.4.23-pre3-testplatform,fs/lockd/xdr,5924,5916,-8,-17
linux-2.4.23-pre3-testplatform,fs/lockd/xdr4,5883,5875,-8,-25
linux-2.4.23-pre3-testplatform,fs/inode,8227,8229,2,-23
linux-2.4.23-pre3-testplatform,mm/memory,8877,8880,3,-20
linux-2.4.23-pre3-testplatform,lib/zlib_deflate/deflate,6319,6315,-4,-24
linux-2.4.23-pre3-testplatform,net/ipv4/tcp_ipv4,17587,17588,1,-23
linux-2.4.23-pre3-testplatform,net/sunrpc/auth_unix,2155,2148,-7,-30
linux-2.4.23-pre3-testplatform,net/sunrpc/svcauth,948,947,-1,-31
linux-2.4.23-pre3-testplatform,net/sunrpc/xdr,4033,4043,10,-21
linux-2.4.23-pre3-testplatform,kernel/timer,5106,5108,2,-19

The story with -O2 is more complicated. It does indeed increase code size, but
the effects are greatly inflated by jump alignment (notice that the majority
of deltas in the report below are multiples of 16).
If the single pathological OpenTCP/ip is excluded, the size is reduced over
all of the other tests.

OpenTCP-1.0.4,dns/dns,1793,1825,32,32
OpenTCP-1.0.4,http/http_server,2691,2676,-15,17
OpenTCP-1.0.4,ip,3152,3312,160,177
OpenTCP-1.0.4,tcp,8823,8839,16,193
OpenTCP-1.0.4,udp,2147,2163,16,209
bzip2-1.0.2,compress,17779,17763,-16,193
jikespg-1.3,src/mkfirst,20023,20007,-16,177
jikespg-1.3,src/mkred,8993,16,193
jikespg-1.3,src/produce,14897,14961,-16,177
jikespg-1.3,src/remsp,10678,10694,16,193
jikespg-1.3,src/resolve,17542,17558,16,209
jpeg-6b,jchuff,6439,6423,-16,193
jpeg-6b,jcphuff,7921,7905,-16,177
jpeg-6b,jdmarker,9845,9893,48,225
jpeg-6b,jquant2,6785,6769,-16,209
jpeg-6b,wrtarga,1353,1369,16,225
jpeg-6b,wrbmp,2551,2567,16,241
libmspack,test/cabextract_md5,32951,32935,-16,225
libpng-1.2.5,pngget,4583,4567,-16,209
libpng-1.2.5,pngwutil,21647,21615,-32,177
libpng-1.2.5,pngrtran,26045,26109,64,241
libpng-1.2.5,pngwtran,2539,2555,16,257
linux-2.4.23-pre3-testplatform,fs/nfs/nfs3xdr,14038,14006,-32,225
linux-2.4.23-pre3-testplatform,fs/nfsd/nfs3xdr,13584,13568,-16,209
linux-2.4.23-pre3-testplatform,fs/lockd/xdr,9554,9538,-16,193
linux-2.4.23-pre3-testplatform,fs/lockd/xdr4,7903,7919,16,209
linux-2.4.23-pre3-testplatform,fs/buffer,22824,22840,16,225
linux-2.4.23-pre3-testplatform,mm/filemap,23872,23888,16,241
linux-2.4.23-pre3-testplatform,net/ipv4/ip_input,4189,4173,-16,225
linux-2.4.23-pre3-testplatform,net/ipv4/ip_fragment,7242,7226,-16,209
linux-2.4.23-pre3-testplatform,net/ipv4/ip_options,7664,7680,16,225
linux-2.4.23-pre3-testplatform,net/ipv4/ip_output,10956,10924,-32,193
linux-2.4.23-pre3-testplatform,net/ipv4/tcp_ipv4,22663,22679,16,209
linux-2.4.23-pre3-testplatform,net/ipv4/udp,10365,10349,-16,193
linux-2.4.23-pre3-testplatform,net/ipv4/icmp,8589,8573,-16,177
linux-2.4.23-pre3-testplatform,net/sunrpc/auth_unix,2782,2766,-16,161
linux-2.4.23-pre3-testplatform,net/sunrpc/svcauth,1172,1156,-16,145
linux-2.4.23-pre3-testplatform,drivers/char/raw,4860,4876,16,161
linux-2.4.23-pre3-testplatform,kernel/exit,5485,5469,-16,145
linux-2.4.23-pre3-testplatform,kernel/timer,7257,7273,16,161
lwip-0.5.3.preproc,src/core/ipv4/ip,1883,1899,16,177
lwip-0.5.3.preproc,src/core/tcp_input,5513,5497,-16,161
lwip-0.5.3.preproc,src/core/tcp_output,3290,3354,64,225
teem-1.6.0-src,src/gage/st,5248,5264,16,241
teem-1.6.0-src,src/nrrd/apply1D,10837,10789,-48,193
unrarlib-0.4.0,unrarlib/unrarlib,16682,16666,-16,177
zlib-1.1.4,deflate,8721,8689,-32,145

Picking "lwip-0.5.3.preproc,src/core/tcp_output" as an example size
regression, the first difference in code is:

Before:
  83 c2 05                add    $0x5,%edx
  89 d3                   mov    %edx,%ebx
  c1 e3 0c                shl    $0xc,%ebx

After:
  c1 e2 0c                shl    $0xc,%edx
  8d 9a 00 50 00 00       lea    0x5000(%rdx),%ebx

Notice that the size has increased by a byte, but the new sequence is actually
now only two instructions compared to the original three.

Let's just say the situation is complicated (comparing code size when not
optimizing for code size may be misleading), but importantly it is possible to
do better than the current expand_compound_operation/make_compound_operation.
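
As a sanity check on the lwip example above, the two sequences compute the
same 32-bit value, because (x + 5) << 12 equals (x << 12) + 0x5000 modulo
2^32. The following C sketch is only illustrative: the function names are
hypothetical (they are not taken from GCC or lwip), and the byte counts are
simply read off the encodings shown in the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the "Before" sequence: add, (mov,) shl. */
static inline uint32_t shift_after_add(uint32_t x)
{
    return (x + 5u) << 12;
}

/* Hypothetical helper mirroring the "After" sequence: shl, then lea 0x5000. */
static inline uint32_t add_after_shift(uint32_t x)
{
    return (x << 12) + 0x5000u;
}

/* Encoded sizes in bytes, counted from the disassembly in the comment. */
enum {
    BEFORE_BYTES = 3 + 2 + 3, /* add imm8, mov reg-reg, shl imm8 */
    AFTER_BYTES  = 3 + 6      /* shl imm8, lea with a 32-bit displacement */
};

/* Assert that both forms agree on a few inputs, including wraparound cases. */
static inline void verify_equivalence(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 5u, 0xfffu, 0x000ffffbu, 0x7fffffffu, 0xfffffffbu, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(shift_after_add(samples[i]) == add_after_shift(samples[i]));
}
```

Calling verify_equivalence() from a test harness exercises the identity. The
enum records the one-byte growth (AFTER_BYTES - BEFORE_BYTES == 1), which comes
from the lea's 32-bit displacement even though one instruction is saved.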