On 4/16/19 10:29 AM, Steve Ellcey wrote: > Re-ping. I know there are discussions about bigger changes to fix the > various failures listed in PR rtl-optimization/87763 but this patch > at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c). > > Steve Ellcey > sell...@marvell.com So here's my take on fixing the lsl_asr_sbfix.c regression.
As I mentioned earlier the problem is the aarch64 is using the old way of describing bitfield extractions (extv, extzv). Specifically it defined a single expander that operated on the natural word mode (DImode). This forces the input operand to be used in DImode as well. So we get those annoying subregs in the expressions that combine generates. But there's a better way :-) The new way to handle this stuff is to define expanders for supported modes (SI and DI for aarch64). Interestingly enough the aarch64 port already does this for bitfield insertions via insv. So I made the obvious changes to the extv/extzv expander. That fixes the lsl_asr_sbfiz regression. But doesn't bootstrap. The availability of the new expander makes extract_bit_field_using_extv want to extract a field from an HFmode object and shove it into an SImode. That runs afoul of checks in validate_subreg (as it should). We shouldn't be using subregs to change the size of a floating point object, so that needs to be filtered out in extract_bit_field_using_extv. That fixes the bootstrap issue. But regression testing trips over ashtidisi.c. That can be easily fixed by allowing zero/sign extractions which change size. ie: (set (reg:DI) (sign_extract:DI (reg:SI) ...))) Which seems like a reasonable thing to handle. So here's what I've got. I've bootstrapped and regression tested on aarch64. It's also bootstrapped on aarch64_be for good measure. OK (from aarch64 maintainers) for the gcc-9 branch and trunk? Or we could just address on the trunk for gcc-10. I don't have strong preferences either way. Jeff ps. Time to return insv regressions and address Segher's comments :-)
* aarch64.md (extv, extzv expander): Generalize so that it works for both SImode and DImode. (extv_3264, extzv_3264): New pattern to handle extractions with size change between the input and output operand. * expmed.c (extract_bitfield_using_extv): Do not allow changing sizes of floating point objects using SUBREGs. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 5a1894063a1..13e2bca05a1 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -5390,16 +5390,16 @@ ;; Bitfields ;; ------------------------------------------------------------------- -(define_expand "<optab>" - [(set (match_operand:DI 0 "register_operand" "=r") - (ANY_EXTRACT:DI (match_operand:DI 1 "register_operand") +(define_expand "<optab><mode>" + [(set (match_operand:GPI 0 "register_operand" "=r") + (ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand") (match_operand 2 - "aarch64_simd_shift_imm_offset_di") - (match_operand 3 "aarch64_simd_shift_imm_di")))] + "aarch64_simd_shift_imm_offset_<mode>") + (match_operand 3 "aarch64_simd_shift_imm_<mode>")))] "" { if (!IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), - 1, GET_MODE_BITSIZE (DImode) - 1)) + 1, GET_MODE_BITSIZE (<MODE>mode) - 1)) FAIL; } ) @@ -5418,6 +5418,21 @@ [(set_attr "type" "bfx")] ) +;; Similar to the previous pattern, but 32->64 extraction +(define_insn "*<optab>_3264" + [(set (match_operand:DI 0 "register_operand" "=r") + (ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r") + (match_operand 2 + "aarch64_simd_shift_imm_offset_si" "n") + (match_operand 3 + "aarch64_simd_shift_imm_si" "n")))] + "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), + 1, GET_MODE_BITSIZE (DImode) - 1)" + "<su>bfx\\t%x0, %x1, %3, %2" + [(set_attr "type" "bfx")] +) + + ;; When the bit position and width add up to 32 we can use a W-reg LSR ;; instruction taking advantage of the implicit zero-extension of the X-reg. (define_split diff --git a/gcc/expmed.c b/gcc/expmed.c index d7f8e9a5d76..ce8f9595b9a 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -1544,7 +1544,14 @@ extract_bit_field_using_extv (const extraction_insn *extv, rtx op0, mode. Instead, create a temporary and use convert_move to set the target. */ if (REG_P (target) - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)) + && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode) + /* We can't change the size of float mode subregs (see + validate_subreg). So if either mode is a floating point + mode and the sizes are not equal, then reject doing this + via gen_lowpart. */ + && !((FLOAT_MODE_P (GET_MODE (target)) || FLOAT_MODE_P (ext_mode)) + && maybe_ne (GET_MODE_BITSIZE (GET_MODE (target)), + GET_MODE_BITSIZE (ext_mode)))) { target = gen_lowpart (ext_mode, target); if (partial_subreg_p (GET_MODE (spec_target), ext_mode))