http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53823
--- Comment #21 from John David Anglin <danglin at gcc dot gnu.org> 2012-08-01 18:44:04 UTC ---

The issue is with the handling of negative constants. In this bit of code,

      max_cost = (set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed)
                  - neg_cost(speed, mode));
      if (max_cost > 0
          && choose_mult_variant (mode, -coeff, &algorithm,
                                  &variant, max_cost))

max_cost is computed to be 24 and choose_mult_variant returns 0. This causes
the following hunk to be executed:

      max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, op1), speed);
      if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
        return expand_mult_const (mode, op0, coeff, target,
                                  &algorithm, variant);

max_cost is now 32 and choose_mult_variant succeeds. However,
expand_mult_const generates incorrect code when coeff is negative.

I hacked the max_cost in the negative case to be 32, and this produces the
correct result and code similar to your cross-compiled output.

If I remember correctly, the xmpyu instruction was introduced in PA 1.1 and
this results in different costs for PA 1.0 and 1.1.
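
The two-step budget check described above can be modelled with a small toy in C. This is only a sketch, not GCC code: `toy_choose`, `toy_select` and the `path` enum are hypothetical names, the `<=` comparison is an assumption about how a budget test might behave, and the synthesized-sequence cost of 28 used in the usage note is invented purely so that it falls between the two budgets reported (24 and 32). The negation cost of 8 is inferred from 32 - 24.

```c
#include <stdbool.h>

/* Which of the three outcomes the toy selection logic ends up with. */
enum path {
    PATH_NEGATE_VARIANT,  /* synthesize -coeff, then negate the result */
    PATH_DIRECT_VARIANT,  /* synthesize coeff directly */
    PATH_HARDWARE_MULT    /* fall back to a real multiply */
};

/* Hypothetical stand-in for choose_mult_variant: the synthesized
   sequence is accepted only if its cost fits within the budget.
   The "<=" is an assumption made for this sketch. */
static bool toy_choose(int synth_cost, int max_cost)
{
    return synth_cost <= max_cost;
}

/* Mirrors the order of the two checks quoted in the comment:
   first try the negated coefficient with a budget reduced by the
   cost of a negation, then try the coefficient itself with the
   full cost of a multiply as the budget. */
static enum path toy_select(int mult_cost, int neg_cost,
                            int synth_cost_negated, int synth_cost_direct)
{
    int max_cost = mult_cost - neg_cost;

    if (max_cost > 0 && toy_choose(synth_cost_negated, max_cost))
        return PATH_NEGATE_VARIANT;

    max_cost = mult_cost;
    if (toy_choose(synth_cost_direct, max_cost))
        return PATH_DIRECT_VARIANT;

    return PATH_HARDWARE_MULT;
}
```

With these assumed numbers, `toy_select(32, 8, 28, 28)` rejects the negated variant (28 does not fit in 24) and falls through to the direct variant, which is the path reported to miscompile for a negative coeff. Raising the first budget to 32, as in the hack, corresponds to `toy_select(32, 0, 28, 28)`, which selects the negated variant instead.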