On Wed, Nov 11, 2020 at 11:17:32AM +0100, Philipp Tomsich wrote:
> From: Philipp Tomsich <p...@gnu.org>
>
> The function
>     long f(long a)
>     {
>       return (a & 0xFFFFFFFFull) << 3;
>     }
> is folded into
>     _1 = a_2(D) << 3;
>     _3 = _1 & 34359738360;
> whereas the construction
>     return (a & 0xFFFFFFFFull) * 8;
> results in
>     _1 = a_2(D) & 4294967295;
>     _3 = _1 * 8;
>
> This leads to suboptimal code-generation for RISC-V (march=rv64g), as
> the shifted constant needs to be expanded into 3 RTX and 2 RTX (one
> each for the LSHIFT_EXPR and the BIT_AND_EXPR) which will overwhelm
> the combine pass (a sequence of 5 RTX are not considered):
>     li    a5,1       # tmp78,          # 23 [c=4 l=4]  *movdi_64bit/1
>     slli  a5,a5,35   #, tmp79, tmp78   # 24 [c=4 l=4]  ashldi3
>     addi  a5,a5,-8   #, tmp77, tmp79   # 9  [c=4 l=4]  adddi3/1
>     slli  a0,a0,3    #, tmp76, tmp80   # 6  [c=4 l=4]  ashldi3
>     and   a0,a0,a5   # tmp77,, tmp76   # 15 [c=4 l=4]  anddi3/0
>     ret                                # 28 [c=0 l=4]  simple_return
> instead of:
>     slli  a0,a0,32   #, tmp76, tmp79   # 26 [c=4 l=4]  ashldi3
>     srli  a0,a0,29   #,, tmp76         # 27 [c=4 l=4]  lshrdi3
>     ret                                # 24 [c=0 l=4]  simple_return
>
> We address this by adding a simplification for
>    (a << s) & M, where ((M >> s) << s) == M
> to
>    (a & M_unshifted) << s, where M_unshifted := (M >> s)
> which undistributes the LSHIFT.

This is problematic, as we already have another rule that goes against this:

/* Fold (X {&,^,|} C2) << C1 into (X << C1) {&,^,|} (C2 << C1)
   (X {&,^,|} C2) >> C1 into (X >> C1) & (C2 >> C1).  */
(for shift (lshift rshift)
 (for bit_op (bit_and bit_xor bit_ior)
  (simplify
   (shift (convert?:s (bit_op:s @0 INTEGER_CST@2)) INTEGER_CST@1)
   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
    (with { tree mask = int_const_binop (shift, fold_convert (type, @2), @1); }
     (bit_op (shift (convert @0) @1) { mask; }))))))

and we don't want the two rules to keep fighting against each other.

It is better to keep one form canonical and to decide only right before
expansion (isel pass) or during expansion, e.g. based on target costs,
whether (X << C1) & (C2 << C1) is better expanded as is, as
(X & C2) << C1, or as (X << C3) >> C4.

	Jakub
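
For reference, the three expansion candidates can be checked against each other in plain C. This is only a sketch of the arithmetic, not GCC code: the function names are made up for illustration, the constants are the ones from the quoted example (C1 = 3, C2 = 0xFFFFFFFF, C3 = 32, C4 = 29), and uint64_t is used instead of long on the assumption that we want well-defined shifts and a logical right shift.

```c
#include <stdint.h>

/* Form 1: (X << C1) & (C2 << C1), the form the existing fold produces.
   34359738360 is 0xFFFFFFFF << 3.  */
static uint64_t shift_then_mask(uint64_t a)
{
    return (a << 3) & 34359738360ull;
}

/* Form 2: (X & C2) << C1, the undistributed form from the patch.  */
static uint64_t mask_then_shift(uint64_t a)
{
    return (a & 0xFFFFFFFFull) << 3;
}

/* Form 3: (X << C3) >> C4, the slli/srli pair.  The left shift drops
   the upper 32 bits, the logical right shift leaves the low 32 bits
   of the input shifted left by 3.  */
static uint64_t shift_pair(uint64_t a)
{
    return (a << 32) >> 29;
}

/* Precondition quoted from the patch: ((M >> s) << s) == M, i.e. the
   low s bits of the mask are already zero.  */
static int mask_can_be_unshifted(uint64_t m, unsigned s)
{
    return ((m >> s) << s) == m;
}

/* Returns 1 if all three forms agree for the given input.  */
static int forms_agree(uint64_t a)
{
    uint64_t r = shift_then_mask(a);
    return r == mask_then_shift(a) && r == shift_pair(a);
}
```

All three compute the same value for every input, so which one is emitted is purely a cost question, which is why it fits a late, target-aware decision better than a match.pd canonicalization.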