------- Additional Comments From uros at kss-loka dot si 2004-10-25 14:35 -------

The problem here is triggered in the reload() function around line 950, in this part (the #ifdef'd part was added by me):

    for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
      if (reg_renumber[i] < 0 && reg_equiv_memory_loc[i])
        {
          rtx x = eliminate_regs (reg_equiv_memory_loc[i], 0, NULL_RTX);
    #ifdef DEBUG
          debug_rtx(reg_equiv_memory_loc[i]);
          debug_rtx(x);
    #endif
          if (strict_memory_address_p (GET_MODE (regno_reg_rtx[i]),
                                       XEXP (x, 0)))
            reg_equiv_mem[i] = x, reg_equiv_address[i] = 0;
          ...

For the testcase from comment #9 (converted to plain C), 'gcc -O2 -msse' produces the following relevant debug information:

    IN:
    (mem:V4SF (plus:SI (reg/f:SI 20 frame)
            (const_int -32 [0xffffffe0])) [6 S16 A8])

    OUT:
    (mem:V4SF (plus:SI (reg/f:SI 6 bp)
            (const_int -56 [0xffffffc8])) [6 S16 A8])

So the problem is inside the eliminate_regs() function, which unaligns an otherwise aligned address. This unaligned address is passed down, and somewhere around line 1214 the following code is triggered:

    for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
      {
        rtx addr = 0;

        if (reg_equiv_mem[i])
          addr = XEXP (reg_equiv_mem[i], 0);
        ...

and this addr is used as the new (unaligned) address on the stack.

To analyze this issue further: the following RTX is passed to eliminate_regs():

    (mem:V4SF (plus:SI (reg/f:SI 20 frame)
            (const_int -32 [0xffffffe0])) [6 S16 A8])

After getting through the MEM: case, the function recurses into the PLUS: case, where the following RTX is processed:

    (plus:SI (reg/f:SI 20 frame)
            (const_int -32 [0xffffffe0]))

and this code is triggered:

    ...
    else
      return gen_rtx_PLUS (Pmode, ep->to_rtx,
                           plus_constant (XEXP (x, 1),
                                          ep->previous_offset));

where ep->previous_offset (when substituting the frame pointer with ebp) equals -24, so the resulting sum is -32 + (-24) = -56.

I'm a little lost here as to what the previous_offset field represents; perhaps someone with more knowledge could find out whether the magic number (-24) is OK [it is not!].

BTW: when -fomit-frame-pointer is added to the compilation flags, the testcase from comment #9 produces a correctly aligned address, because ep->previous_offset, when substituting the frame pointer with esp, equals 64. -32 + 64 = 32 in this case.

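
To make the offset arithmetic above easy to re-check, here is a minimal sketch in plain C. It is not GCC code: eliminated_offset(), offset_is_aligned() and check_reported_offsets() are hypothetical helpers introduced only for illustration, and the sketch assumes that a V4SF slot needs 16-byte alignment and that the base register itself is 16-byte aligned, so that only the constant offset decides alignment. Neither assumption is verified here.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: models the PLUS case of
   eliminate_regs() as plain addition of the original constant
   offset and ep->previous_offset. */
static long eliminated_offset(long offset, long previous_offset)
{
    return offset + previous_offset;
}

/* Hypothetical helper: nonzero if `offset` is a multiple of `align`.
   This only describes the final address under the assumption that
   the base register is itself aligned to `align`. */
static int offset_is_aligned(long offset, long align)
{
    return offset % align == 0;
}

/* Re-checks the numbers quoted in this comment. */
static void check_reported_offsets(void)
{
    const long v4sf_align = 16; /* assumed alignment requirement */

    /* Offset relative to the soft frame pointer: aligned. */
    assert(offset_is_aligned(-32, v4sf_align));

    /* frame -> ebp, previous_offset == -24: -56, not a multiple of 16. */
    assert(eliminated_offset(-32, -24) == -56);
    assert(!offset_is_aligned(-56, v4sf_align));

    /* frame -> esp (-fomit-frame-pointer), previous_offset == 64: 32. */
    assert(eliminated_offset(-32, 64) == 32);
    assert(offset_is_aligned(32, v4sf_align));
}
```

Calling check_reported_offsets() from a small main() should pass silently if the figures quoted above are transcribed correctly. It says nothing about why previous_offset is -24 in the ebp case.
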
Regarding comment #17: Perhaps the original testcase still uses ebp, even with '-fomit-frame-pointer'.

Uros.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17990