------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-13 18:51 -------
This is what we get on the mainline:
.L4:
        movl    (%ecx), %eax
        addl    $4, %ecx
        movl    %eax, (%edi,%edx,4)
        movl    (%ebp,%edx,4), %eax
        movl    %eax, (%esi,%edx,4)
        incl    %edx
        cmpl    %edx, %ebx
        jne     .L4

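For anyone reading along, here is a rough C rendering of what the .L4 loop above does, derived only from the assembly. It is not the test case from this bug: the function name, parameter names and types are made up for illustration, and the register each one stands for is noted in the comments.

```c
#include <stddef.h>

/* Hypothetical reading of the .L4 loop; all names are illustrative. */
void copy_loop(int *dst1,        /* %edi */
               int *dst2,        /* %esi */
               const int *src1,  /* %ecx, advanced by 4 bytes each iteration */
               const int *src2,  /* %ebp */
               size_t i,         /* %edx, the shared index */
               size_t end)       /* %ebx, the loop bound */
{
    do {
        dst1[i] = *src1++;  /* movl (%ecx),%eax; addl $4,%ecx; movl %eax,(%edi,%edx,4) */
        dst2[i] = src2[i];  /* movl (%ebp,%edx,4),%eax; movl %eax,(%esi,%edx,4) */
        i++;                /* incl %edx */
    } while (i != end);     /* cmpl %edx,%ebx; jne .L4 */
}
```

Three of the four memory accesses use the base + index*4 form, while the first load walks its own pointer in %ecx, which costs the extra addl and a register.
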
Note that for the code in comment #4, I have a target patch which improves this a little further:

Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.858
diff -u -p -r1.858 i386.c
--- i386.c      6 Sep 2005 19:57:46 -0000       1.858
+++ i386.c      13 Sep 2005 18:49:44 -0000
@@ -5273,6 +5273,10 @@ ix86_address_cost (rtx x)
   /* More complex memory references are better.  */
   if (parts.disp && parts.disp != const0_rtx)
     cost--;
+
+  if (parts.scale != 1)
+    cost--;
+
   if (parts.seg != SEG_DEFAULT)
     cost--;

But since I don't have SPEC, I have not submitted the patch.
Steven, could you test this patch and submit it for me?

ChangeLog (please make a better changelog):
        (ix86_address_cost): More complex is cheaper than anything else.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.0/4.1 Regression]        |[4.0 Regression] suboptimal
                   |suboptimal use of fancy x86 |use of fancy x86 addressing
                   |addressing modes            |modes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18463