------- Additional Comments From andrewhutchinson at cox dot net 2005-03-27 14:33 -------
The problem here is that gcc is using a DImode register to handle a 6-byte (int + long) structure. Why, I have no idea!
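For reference, here is a minimal sketch of the kind of function under discussion. It is not the test case from the report. The names `struct pair`, `make_pair` and `pair_payload_size` are made up for illustration, and the fixed-width types stand in for a target where int is 16 bits and long is 32 bits, which is what the 6-byte figure implies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the (int + long) structure in the report.
 * int16_t/int32_t are an assumption: they model a target with a 16-bit
 * int and a 32-bit long, giving 6 bytes of payload. */
struct pair {
    int16_t a;   /* plays the role of 'int' on the target  */
    int32_t b;   /* plays the role of 'long' on the target */
};

/* Returning the structure by value is what exercises the
 * return-register handling described in this comment. */
struct pair make_pair(int16_t a, int32_t b)
{
    struct pair p;
    p.a = a;
    p.b = b;
    return p;
}

/* Sum of the member sizes, ignoring any padding the host ABI adds.
 * sizeof does not evaluate its operand, so no pointer is dereferenced. */
size_t pair_payload_size(void)
{
    return sizeof(((struct pair *)0)->a) + sizeof(((struct pair *)0)->b);
}
```

On a host with stricter alignment, sizeof(struct pair) may be 8 rather than 6 because of padding, so only the payload size is meaningful when trying this off-target.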
Since the target has no insn for a DImode move, gcc turns this into individual QImode byte moves (subregs all over the place!). The 'stacked' 6-byte structure is 'popped' into a DI register (6 bytes). Two other byte registers are explicitly cleared (making up our 8-byte DI register).

What then follows is a large amount of shuffling, i.e. moving from the intermediate virtual DI register (8 bytes) into the correct place for a 6-byte return. This seems to surpass the abilities of the register allocator (the DI and return registers overlap).

Smaller structures (<=4 bytes) are handled optimally. Larger structures (>8 bytes) are also handled much better, since they are returned in memory.

So in summary, it would appear that the root cause is the allocation of a DImode register for structures >4 and <=8 bytes. A secondary factor is the use of QImode moves (when SImode and HImode moves are available and more efficient).

The problem can be partially alleviated by defining DImode moves (that's a hell of a change, though). Poor code still remains, for example clearing of unused padding bytes and extra register usage.

PS: -fpack-struct does not change this bug.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11180