https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49444

bin.cheng <amker.cheng at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com

--- Comment #8 from bin.cheng <amker.cheng at gmail dot com> ---
This should be fixed on trunk now, at least for r211210 and r214864.

For Andrew's test, the generated MIPS assembly for the kernel loop is as below.

$L3:
        lwl     $5,1($16)
        lwl     $4,5($16)
        lwl     $3,9($16)
        lwr     $5,4($16)
        lwr     $4,8($16)
        lwr     $3,12($16)
        lw      $2,%gp_rel(ss)($28)
        addiu   $16,$16,13
        sw      $5,0($2)
        sw      $4,4($2)
        jal     g
        sw      $3,8($2)
        bne     $16,$17,$L3
        move    $2,$0

For Richard's case (with an explicit conversion when calling foo), the
generated MIPS assembly is as below.

foo:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        lwl     $2,0($4)
        nop
        lwr     $2,3($4)
        j       $31
        nop

        .set    macro
        .set    reorder
        .end    foo
        .size   foo, .-foo

Apparently, lwl/lwr are generated for the unaligned memory accesses.

Thanks,
bin
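
For context, the kind of source-level access that needs an lwl/lwr pair can be sketched in portable C as below. This is only an illustration: load_u32_unaligned is a hypothetical helper, not Andrew's or Richard's test case from this bug. On MIPS targets that have lwl/lwr, a compiler may lower such a load to that instruction pair; the exact code depends on the target and options.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the bug report): read a 32-bit value
   from an address that is not guaranteed to be 4-byte aligned.
   memcpy makes no alignment assumption, so the compiler must not
   emit a plain aligned word load unless it can prove alignment. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy in the abstract machine */
    return v;
}
```

Dereferencing a cast pointer such as *(const uint32_t *)p at an odd address would instead be undefined behaviour in C, which is why the sketch goes through memcpy.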