On 4/26/19 8:52 AM, Richard Earnshaw (lists) wrote:
> On 26/04/2019 15:13, Jeff Law wrote:
>> On 4/16/19 10:29 AM, Steve Ellcey wrote:
>>> Re-ping.  I know there are discussions about bigger changes to fix the
>>> various failures listed in PR rtl-optimization/87763 but this patch
>>> at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c).
>>>
>>> Steve Ellcey
>>> sell...@marvell.com
>> So here's my take on fixing the lsl_asr_sbfix.c regression.
>>
>> As I mentioned earlier the problem is the aarch64 is using the old way
>> of describing bitfield extractions (extv, extzv).  Specifically it
>> defined a single expander that operated on the natural word mode (DImode).
>>
>> This forces the input operand to be used in DImode as well.  So we get
>> those annoying subregs in the expressions that combine generates.  But
>> there's a better way :-)
>>
>> The new way to handle this stuff is to define expanders for supported
>> modes (SI and DI for aarch64).  Interestingly enough the aarch64 port
>> already does this for bitfield insertions via insv.
>>
>> So I made the obvious changes to the extv/extzv expander.  That fixes
>> the lsl_asr_sbfiz regression.  But doesn't bootstrap.  The availability
>> of the new expander makes extract_bit_field_using_extv want to extract a
>> field from an HFmode object and shove it into an SImode.  That runs
>> afoul of checks in validate_subreg (as it should).  We shouldn't be
>> using subregs to change the size of a floating point object, so that
>> needs to be filtered out in extract_bit_field_using_extv.
>>
>> That fixes the bootstrap issue.  But regression testing trips over
>> ashtidisi.c.  That can be easily fixed by allowing zero/sign extractions
>> which change size.  ie:
>>
>> (set (reg:DI) (sign_extract:DI (reg:SI) ...)))
>>
>> Which seems like a reasonable thing to handle.
>>
>> So here's what I've got.  I've bootstrapped and regression tested on
>> aarch64.  It's also bootstrapped on aarch64_be for good measure.
>>
>> OK (from aarch64 maintainers) for the gcc-9 branch and trunk?  Or we
>> could just address on the trunk for gcc-10.  I don't have strong
>> preferences either way.
>>
>> Jeff
>>
>> ps.  Time to return insv regressions and address Segher's comments :-)
>>
>>
>>
>> P
>>
>>      * aarch64.md (extv, extzv expander): Generalize so that it works
>>      for both SImode and DImode.
>>      (extv_3264, extzv_3264): New pattern to handle extractions with
>>      size change between the input and output operand.
>>      * expmed.c (extract_bitfield_using_extv): Do not allow changing
>>      sizes of floating point objects using SUBREGs.
>>
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index 5a1894063a1..13e2bca05a1 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -5390,16 +5390,16 @@
>>  ;; Bitfields
>>  ;; -------------------------------------------------------------------
>>  
>> -(define_expand "<optab>"
>> -  [(set (match_operand:DI 0 "register_operand" "=r")
>> -    (ANY_EXTRACT:DI (match_operand:DI 1 "register_operand")
>> +(define_expand "<optab><mode>"
>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>> +    (ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand")
>>                      (match_operand 2
>> -                      "aarch64_simd_shift_imm_offset_di")
>> -                    (match_operand 3 "aarch64_simd_shift_imm_di")))]
>> +                      "aarch64_simd_shift_imm_offset_<mode>")
>> +                    (match_operand 3 "aarch64_simd_shift_imm_<mode>")))]
>>    ""
>>    {
>>      if (!IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
>> -               1, GET_MODE_BITSIZE (DImode) - 1))
>> +               1, GET_MODE_BITSIZE (<MODE>mode) - 1))
>>       FAIL;
>>    }
>>  )
>> @@ -5418,6 +5418,21 @@
>>    [(set_attr "type" "bfx")]
>>  )
>>  
>> +;; Similar to the previous pattern, but 32->64 extraction
>> +(define_insn "*<optab>_3264"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +    (ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r")
> 
> So is that valid RTL (DI extract of SI value)?  Why wouldn't that just
> use a paradoxical subreg to widen the register being extracted?
It wasn't clear to me -- I couldn't find anything which indicated it
wasn't valid.  Given that the RTL code explicitly sign/zero extends the
value we should have the freedom to have the destination be wider.

No idea why combine didn't wrap it in a subreg when it didn't match the
first time around.  I suspect it's never been worth doing until we added
the multi-mode extraction expander support and the targets where we're
using multi-mode extraction expanders didn't have a testcase where it'd
matter.

Jeff

Reply via email to