This issue was uncovered while porting our existing software to the GNU tool-chain. We have a number of structures that contain 3 individual bytes of data. When the GNU tool-chain compiles code that uses them, it emits a load/store byte instruction followed by a load/store half-word instruction at an odd (1, 3, 5, 7, 9, 11, etc.) memory offset. This causes a data alignment exception to occur.
We have tried all combinations of the compiler flags for structure packing, alignment (natural, power), and anything else that we have been able to uncover in the GCC documentation. This behavior exists at optimization levels 0, 1 and 2. We haven't tried any other levels as of yet.

There should be a means of telling the compiler not to generate mis-aligned address references. Misaligned accesses can be supported at a software level (if the authors of the OS or software handle this exception). However, the authors of this component did not, and implementing it would cause far too much of a runtime hit.

A code sample that re-creates the problem is included below.

---- File test.cc -----
struct foo
{
    char bar1;
    char bar2;
    char bar3;
};

int foobarStruct(foo fubarStruct)
{
    if(fubarStruct.bar1 == 'A' &&
       fubarStruct.bar2 == 'B' &&
       fubarStruct.bar3 == 'C')
    {
        return 1;
    }
    else
    {
        return 0;
    }
}

int main(int argc, char **argv)
{
    int rVal1;
    int rVal2;
    foo barStruct;

    barStruct.bar1 = 'A';
    barStruct.bar2 = 'B';
    barStruct.bar3 = 'C';

    rVal1 = foobarStruct(barStruct);

    barStruct.bar1 = 'A';
    barStruct.bar2 = 'C';
    barStruct.bar3 = 'B';

    rVal1 = foobarStruct(barStruct);

    return (rVal1 || rVal2);
}
------------- End of file ---------------------

Again, no combination of the compiler flags -malign-natural, -malign-power, -fpack-struct=2, -fno-pack-struct, etc. has given us the desired behavior.
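To make the layout issue concrete, here is a minimal sketch (not part of the original test case) of why a copy of foo that is split into one byte plus one half-word can land the half-word on an odd offset. The helper name halfword_part_offset is hypothetical, and the size and alignment assertions assume a typical ABI where a struct of three chars has size 3 and alignment 1.

```cpp
#include <cstddef>

// Same layout as the struct in test.cc.
struct foo { char bar1; char bar2; char bar3; };

// Assumption: typical ABI with no tail padding for a struct of chars.
static_assert(sizeof(foo) == 3, "three chars, no padding");
static_assert(alignof(foo) == 1, "char members only need byte alignment");
static_assert(offsetof(foo, bar2) == 1, "bar2 sits one byte into the struct");

// Hypothetical helper: if a copy of foo is split into a byte access
// (bar1) followed by a half-word access (bar2 and bar3 together), this
// is the offset at which the half-word access happens.
constexpr std::size_t halfword_part_offset(std::size_t struct_offset)
{
    return struct_offset + offsetof(foo, bar2);
}

// A struct placed at an even offset puts the half-word part at an odd one.
static_assert(halfword_part_offset(22) % 2 == 1, "even base, odd half-word offset");
// A struct placed at an odd offset happens to make the half-word part even.
static_assert(halfword_part_offset(19) % 2 == 0, "odd base, even half-word offset");
```

Because foo only requires byte alignment, the compiler is free to place it at any offset, so the half-word part of a split copy is aligned only when the struct itself starts on an odd offset.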
Here's the assembly output from the command(s):

(1)> g++ test.cc -o test
(2)> ppcobjdump -C -S test.o

test.o:     file format elf32-powerpc

Disassembly of section .text:

00000000 <foobarStruct(foo)>:
   0:   94 21 ff e8     stwu    r1,-24(r1)
   4:   93 e1 00 14     stw     r31,20(r1)
   8:   7c 3f 0b 78     mr      r31,r1
   c:   90 7f 00 0c     stw     r3,12(r31)
  10:   81 3f 00 0c     lwz     r9,12(r31)
  14:   88 09 00 00     lbz     r0,0(r9)
  18:   54 00 06 3e     clrlwi  r0,r0,24
  1c:   2f 80 00 41     cmpwi   cr7,r0,65
  20:   40 9e 00 38     bne-    cr7,58 <foobarStruct(foo)+0x58>
  24:   81 3f 00 0c     lwz     r9,12(r31)
  28:   88 09 00 01     lbz     r0,1(r9)
  2c:   54 00 06 3e     clrlwi  r0,r0,24
  30:   2f 80 00 42     cmpwi   cr7,r0,66
  34:   40 9e 00 24     bne-    cr7,58 <foobarStruct(foo)+0x58>
  38:   81 3f 00 0c     lwz     r9,12(r31)
  3c:   88 09 00 02     lbz     r0,2(r9)
  40:   54 00 06 3e     clrlwi  r0,r0,24
  44:   2f 80 00 43     cmpwi   cr7,r0,67
  48:   40 9e 00 10     bne-    cr7,58 <foobarStruct(foo)+0x58>
  4c:   38 00 00 01     li      r0,1
  50:   90 1f 00 08     stw     r0,8(r31)
  54:   48 00 00 0c     b       60 <foobarStruct(foo)+0x60>
  58:   39 20 00 00     li      r9,0
  5c:   91 3f 00 08     stw     r9,8(r31)
  60:   80 1f 00 08     lwz     r0,8(r31)
  64:   7c 03 03 78     mr      r3,r0
  68:   81 61 00 00     lwz     r11,0(r1)
  6c:   83 eb ff fc     lwz     r31,-4(r11)
  70:   7d 61 5b 78     mr      r1,r11
  74:   4e 80 00 20     blr

00000078 <main>:
  78:   94 21 ff a8     stwu    r1,-88(r1)
  7c:   7c 08 02 a6     mflr    r0
  80:   93 e1 00 54     stw     r31,84(r1)
  84:   90 01 00 5c     stw     r0,92(r1)
  88:   7c 3f 0b 78     mr      r31,r1
  8c:   90 7f 00 28     stw     r3,40(r31)
  90:   90 9f 00 2c     stw     r4,44(r31)
  94:   48 00 00 01     bl      94 <main+0x1c>
  98:   38 00 00 41     li      r0,65
  9c:   98 1f 00 16     stb     r0,22(r31)
  a0:   38 00 00 42     li      r0,66
  a4:   98 1f 00 17     stb     r0,23(r31)
  a8:   38 00 00 43     li      r0,67
  ac:   98 1f 00 18     stb     r0,24(r31)
  b0:   88 1f 00 16     lbz     r0,22(r31)
  b4:   a1 3f 00 17     lhz     r9,23(r31)      <-- Notice the odd offset
  b8:   98 1f 00 13     stb     r0,19(r31)
  bc:   b1 3f 00 14     sth     r9,20(r31)
  c0:   88 1f 00 13     lbz     r0,19(r31)
  c4:   a1 3f 00 14     lhz     r9,20(r31)
  c8:   98 1f 00 30     stb     r0,48(r31)
  cc:   b1 3f 00 31     sth     r9,49(r31)      <-- Notice the odd offset
  d0:   38 1f 00 30     addi    r0,r31,48
  d4:   7c 03 03 78     mr      r3,r0
  d8:   48 00 00 01     bl      d8 <main+0x60>
  dc:   7c 60 1b 78     mr      r0,r3
  e0:   90 1f 00 0c     stw     r0,12(r31)
  e4:   38 00 00 41     li      r0,65
  e8:   98 1f 00 16     stb     r0,22(r31)
  ec:   38 00 00 43     li      r0,67
  f0:   98 1f 00 17     stb     r0,23(r31)
  f4:   38 00 00 42     li      r0,66
  f8:   98 1f 00 18     stb     r0,24(r31)
  fc:   88 1f 00 16     lbz     r0,22(r31)
 100:   a1 3f 00 17     lhz     r9,23(r31)      <-- Again odd offsets
 104:   98 1f 00 10     stb     r0,16(r31)
 108:   b1 3f 00 11     sth     r9,17(r31)      <-- Again odd offsets
 10c:   88 1f 00 10     lbz     r0,16(r31)
 110:   a1 3f 00 11     lhz     r9,17(r31)
 114:   98 1f 00 30     stb     r0,48(r31)
 118:   b1 3f 00 31     sth     r9,49(r31)
 11c:   38 1f 00 30     addi    r0,r31,48
 120:   7c 03 03 78     mr      r3,r0
 124:   48 00 00 01     bl      124 <main+0xac>
 128:   7c 60 1b 78     mr      r0,r3
 12c:   90 1f 00 0c     stw     r0,12(r31)
 130:   80 1f 00 0c     lwz     r0,12(r31)
 134:   2f 80 00 00     cmpwi   cr7,r0,0
 138:   40 9e 00 10     bne-    cr7,148 <main+0xd0>
 13c:   80 1f 00 08     lwz     r0,8(r31)
 140:   2f 80 00 00     cmpwi   cr7,r0,0
 144:   41 9e 00 10     beq-    cr7,154 <main+0xdc>
 148:   38 00 00 01     li      r0,1
 14c:   90 1f 00 40     stw     r0,64(r31)
 150:   48 00 00 0c     b       15c <main+0xe4>
 154:   38 00 00 00     li      r0,0
 158:   90 1f 00 40     stw     r0,64(r31)
 15c:   80 1f 00 40     lwz     r0,64(r31)
 160:   7c 03 03 78     mr      r3,r0
 164:   81 61 00 00     lwz     r11,0(r1)
 168:   80 0b 00 04     lwz     r0,4(r11)
 16c:   7c 08 03 a6     mtlr    r0
 170:   83 eb ff fc     lwz     r31,-4(r11)
 174:   7d 61 5b 78     mr      r1,r11
 178:   4e 80 00 20     blr

From Section 3.3.1, Alignment and Misaligned Accesses:

    The operand of a single-register memory access instruction has a
    natural alignment boundary equal to the operand length. The "natural"
    address of an operand is an integral multiple of the operand
    length, ......

I can understand what the compiler is trying to achieve here, in the sense of doing two loads/stores instead of three; however, it is performing misaligned loads/stores as a result. This optimization actually becomes a performance hit if the underlying system is forced to handle the exception and piece the parts together. Perhaps it's not a "real" bug, but not being able to override this behavior probably is.
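The natural-alignment rule quoted from Section 3.3.1 can be checked against the half-word offsets in the dump. The following is only a sketch: the function name is_naturally_aligned is hypothetical, and it treats the displacements from r31 as addresses, which assumes that r31 (a copy of the stack pointer here) is itself at least 2-byte aligned.

```cpp
#include <cstddef>

// An access is naturally aligned when its address is an integral
// multiple of the operand length (1 for lbz/stb, 2 for lhz/sth,
// 4 for lwz/stw).
constexpr bool is_naturally_aligned(std::size_t address,
                                    std::size_t operand_length)
{
    return operand_length != 0 && address % operand_length == 0;
}

// Half-word accesses taken from the disassembly of main.
// Assumption: r31 is at least 2-byte aligned, so the displacement
// alone decides whether the half-word access is aligned.
static_assert(!is_naturally_aligned(23, 2), "lhz r9,23(r31) is misaligned");
static_assert( is_naturally_aligned(20, 2), "sth r9,20(r31) is aligned");
static_assert(!is_naturally_aligned(49, 2), "sth r9,49(r31) is misaligned");
static_assert(!is_naturally_aligned(17, 2), "sth r9,17(r31) is misaligned");

// Byte accesses are always naturally aligned.
static_assert(is_naturally_aligned(22, 1), "lbz r0,22(r31) is aligned");
```

Under that assumption, only the copy through offsets 19/20 uses an aligned half-word access; the others would raise the alignment exception on a system that does not handle misaligned accesses.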
This behavior has been observed in 4.0.1, 4.0.0, 3.4.4 and 3.3.1.

-- 
Summary: C & C++ compiler generating misaligned references regardless of compiler flags
Product: gcc
Version: 4.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mcvick_e at iname dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: powerpc-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23539