RE: [PATCH][AARCH64] Fix for PR86901

2020-04-30 Thread Modi Mo via Gcc-patches
Hey all, apologies for the long delay. I haven't had time until recently to
look into this further.
>>> The zero extract now matching against other modes would generate a
>>> test + branch rather than the combined instruction which led to the
>>> code size regression. I've updated the patch so that tbnz etc. matches
>>> GPI and that brings code size down to <0.2% in spec2017 and <0.4% in
>>> spec2006.
>>
>> That's looking better indeed. I notice there are still differences, e.g.
>> tbz/tbnz counts are significantly different in perlbench, with ~350
>> missed cases overall (mostly tbz reg, #7).
>>
>> There are also more uses of uxtw, ubfiz, sbfiz - for example I see cases
>> like this in namd:
>>
>>  42c7dc:       13007400        sbfx    w0, w0, #0, #30
>>  42c7e0:       937c7c00        sbfiz   x0, x0, #4, #32
>>
>> So it would be a good idea to check any benchmarks where there is still
>> a non-trivial codesize difference. You can get a quick idea what is
>> happening by grepping for instructions like this:
>>
>>grep -c sbfiz out1.txt out2.txt
>>out1.txt:872
>>out2.txt:934
>>
>>grep -c tbnz out1.txt out2.txt
>>out1.txt:5189
>>out2.txt:4989

That's really good insight, Wilco! I took a look at the tbnz/tbz case in
perlbench, and we lose the match because allowing SImode on extv/extzv causes
subst in combine.c to generate:

(lshiftrt:SI (reg:SI 107 [ _16 ])
    (const_int 7 [0x7]))
(nil)

Instead of:

(and:DI (lshiftrt:DI (subreg:DI (reg:SI 107 [ _16 ]) 0)
            (const_int 7 [0x7]))
    (const_int 1 [0x1]))

The latter case is picked up in make_compound_operation_int and transformed
into a zero_extract, while the new form is left alone. An lshiftrt generally
can't be reduced to a bit test, but in this case it can because we have
nonzero_bits information on the operand. Given that, and looking around
try_combine, the best place to detect this pattern seems to be the
second-chance code after the first failure of recog_for_combine, which is
what this patch does. I think this is the right place for the fix, since
changing subst/make_compound_operation_int leads to significantly more diffs.
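
To make the missed case concrete, here is a minimal C sketch (my hypothetical
reduction, not the actual perlbench source) of the kind of single-bit test
involved; with the fix, the shift-based form combines back into a single
tbz/tbnz:

/* Hypothetical reduction: test bit 7 of a 32-bit value and branch.
   This should become one tbnz w0, #7 rather than a separate shift/test
   plus branch, which is the missed combination described above.  */
extern void g (void);

void
f (unsigned int x)
{
  if (x & (1u << 7))
    g ();
}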

After this change the total number of tbnz/tbz instructions lines up nearly
identically with the baseline, which is good, and overall size is within 0.1%
on SPEC 2017 and SPEC 2006. However, looking further at ubfiz there's a pretty
large increase in certain benchmarks. I looked into SPEC 2017 blender, and we
fail to combine this pattern:

Trying 653 -> 654:
  653: r512:SI=r94:SI 0>>0x8
  REG_DEAD r94:SI
  654: r513:DI=zero_extend(r512:SI)
  REG_DEAD r512:SI
Failed to match this instruction:
(set (reg:DI 513)
    (zero_extract:DI (reg:SI 94 [ bswapdst_4 ])
        (const_int 8 [0x8])
        (const_int 8 [0x8])))

Where previously we combined it like this:

Trying 653 -> 654:
  653: r512:SI=r94:SI 0>>0x8
  REG_DEAD r94:SI
  654: r513:DI=zero_extend(r512:SI)
  REG_DEAD r512:SI
Successfully matched this instruction:
(set (reg:DI 513)
    (zero_extract:DI (subreg:DI (reg:SI 94 [ bswapdst_4 ]) 0) // subreg used
        (const_int 8 [0x8])
        (const_int 8 [0x8])))

Here's where I'm at an impasse. Now that extzvsi is valid, the code that picks
the modes in get_best_reg_extraction_insn looks at the SImode inner mode and
generates a non-subreg use. However, the MD pattern wants all modes to be DI
or all SI, not a mix. A fix could canonicalize these extracts to the same
mode, but I'm unsure whether a mode-mismatched extract RTX is valid in
general, which would make this a fairly large change.
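
For reference, here is a hedged C sketch (a hypothetical reduction, not the
actual blender source) of the shape that hits this: a byte extracted from a
32-bit intermediate and widened to 64 bits, which combine wants to express as
a DImode zero_extract:

/* Hypothetical reduction: bits [8,15] of a 32-bit value, zero-extended
   to 64 bits.  combine forms a zero_extract:DI for this; with an SImode
   inner operand it no longer matches the aarch64 ubfx pattern.  */
unsigned long long
f (unsigned int x)
{
  return (x >> 8) & 0xff;
}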

The latest patch with the tbnz/tbz fix is attached, along with the SPEC size
numbers and the SPEC 2017 instruction counts for reference.

>>> Can you send me the necessary documents to make that happen? Thanks!
>>
>> That's something you need to sort out with the FSF. There is a mailing
>> list for this: ass...@gnu.org.

I haven't had any response from my previous mail there. Should I add one of you 
to the CC or mail someone specifically to get traction?

Best,
Modi


pr86901.patch
Description: pr86901.patch
base vs. diff:

   text    data    bss    total   filename                                                             % increase
1038264  243802  12472  1294538   base/benchspec/CPU2006/400.perlbench/exe/perlbench_base.gcc9-base
1038344  243714  12472  1294530   diff/benchspec/CPU2006/400.perlbench/exe/perlbench_base.gcc9-diff     0.01%
  72024   16030   4328    92382   base/benchspec/CPU2006/401.bzip2/exe/bzip2_base.gcc9-base
  72016   16030   4328    92374   diff/benchspec/CPU2006/401.bzip2/exe/bzip2_base.gcc9-diff            -0.01%
2651976  816398  749792 4218166   base/benchspec/CPU2006/403.gcc/exe/gcc_base.gcc9-base
2652096  816558  749792 4218446   diff/benchspec/CPU2006/403.gcc/exe/gcc_base.gcc9-diff

Re: [PATCH][AARCH64] Fix for PR86901

2020-03-03 Thread Wilco Dijkstra
Hi Modi,

> The zero extract now matching against other modes would generate a test +
> branch rather than the combined instruction which led to the code size
> regression. I've updated the patch so that tbnz etc. matches GPI and that
> brings code size down to <0.2% in spec2017 and <0.4% in spec2006.

That's looking better indeed. I notice there are still differences, e.g.
tbz/tbnz counts are significantly different in perlbench, with ~350 missed
cases overall (mostly tbz reg, #7).

There are also more uses of uxtw, ubfiz, sbfiz - for example I see cases like 
this in namd:

  42c7dc:       13007400        sbfx    w0, w0, #0, #30
  42c7e0:       937c7c00        sbfiz   x0, x0, #4, #32

So it would be a good idea to check any benchmarks where there is still a
non-trivial codesize difference. You can get a quick idea what is happening
by grepping for instructions like this:

grep -c sbfiz out1.txt out2.txt
out1.txt:872
out2.txt:934

grep -c tbnz out1.txt out2.txt
out1.txt:5189
out2.txt:4989

> Can you send me the necessary documents to make that happen? Thanks!

That's something you need to sort out with the FSF. There is a mailing list
for this: ass...@gnu.org.

Cheers,
Wilco




RE: [PATCH][AARCH64] Fix for PR86901

2020-02-21 Thread Modi Mo via gcc-patches
> > Sounds good. I'll get those set up and running and will report back on
> > findings. What's the preferred way to measure codesize? I'm assuming
> > by default the code pages are aligned so smaller differences would
> > need to trip over the boundary to actually show up.
> 
> You can use the size command on the binaries:
> 
> >size /bin/ls
>    text    data   bss     dec     hex filename
>  107271    2024  3472  112767   1b87f /bin/ls
> 
> As you can see it shows the text size in bytes. It is not rounded up to a
> page, so it is an accurate measure of the codesize. Generally -O2 size is
> most useful to check (since that is what most applications build with),
> but -Ofast -flto can be useful as well (the global inlining means you get
> instruction combinations which appear less often with -O2).
> 
> Cheers,
> Wilco
Alrighty, I've got SPEC 2017 and SPEC 2006 set up and building, using the
default configurations, so -O2 in SPEC 2006 and -O3 in SPEC 2017. Testing the
patch as last sent showed a 1% code size regression in SPEC 2017 perlbench,
which turned out to be a missing pattern for tbnz and all its variants:

(define_insn "*tb<optab><mode>1"
  [(set (pc) (if_then_else
              (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")  <--- only matches against zero_extract:DI
                                    (const_int 1)
                                    (match_operand 1
                                      "aarch64_simd_shift_imm_<mode>" "n"))

With the zero extract now matching other modes, we would generate a test +
branch rather than the combined instruction, which led to the code size
regression. I've updated the patch so that tbnz etc. match GPI, and that
brings the code size delta down to <0.2% on SPEC 2017 and <0.4% on SPEC 2006.
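
As an illustration (a hypothetical example, not from the patch's testsuite) of
why the pattern needs to cover both GPI modes, single-bit tests on 32-bit and
64-bit values should each become a single tbnz:

/* Hypothetical illustration: both of these should compile to one tbnz.  */
extern void g (void);

void
f32 (unsigned int x)
{
  if (x & (1u << 7))            /* expect: tbnz w0, #7  */
    g ();
}

void
f64 (unsigned long long x)
{
  if (x & (1ull << 40))         /* expect: tbnz x0, #40  */
    g ();
}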

Spec results are attached for reference.

@Wilco I've gotten instructions on my side to set up an individual
contributor's license for the time being. Can you send me the necessary
documents to make that happen? Thanks!

ChangeLog:
2020-02-21  Di Mo  

gcc/
* config/aarch64/aarch64.md: Add GPI modes to extv/extzv patterns. 
Allow tb<optab><mode>1 pattern to match against zero_extract:GPI.
* expmed.c (extract_bit_field_using_extv): Change gen_lowpart to 
gen_lowpart_if_possible to avoid compiler assert building libgcc.
testsuite/
* gcc.target/aarch64/pr86901.c: Add new test.

Modi


pr86901.patch
Description: pr86901.patch


spec2006
Description: spec2006


spec2017
Description: spec2017


Re: [PATCH][AARCH64] Fix for PR86901

2020-02-07 Thread Wilco Dijkstra
Hi,

Richard wrote:
> However, inside the compiler we really want to represent this as a
> shift.
...
> Ideally this would be handled inside the mid-end expansion of an 
> extract, but in the absence of that I think this is best done inside the 
> extv expansion so that we never end up with a real extract in that case.

Yes, the mid-end could be improved; it turns out this is due to the expansion
of bitfields: all variations of (x & mask) >> N are optimized into shifts
early on.

However, it turns out combine can already transform these zero/sign_extends
into shifts, so we do end up with good code. With the latest patch I get:

typedef struct { int x : 6, y : 6, z : 20; } X;

int f (int x, X *p) { return x + p->z; }

        ldr     w1, [x1]
        add     w0, w0, w1, asr 12
        ret

So this case looks alright.

> Sounds good. I'll get those set up and running and will report back on
> findings. What's the preferred way to measure codesize? I'm assuming by
> default the code pages are aligned so smaller differences would need to
> trip over the boundary to actually show up.

You can use the size command on the binaries:

>size /bin/ls
   text    data   bss     dec     hex filename
 107271    2024  3472  112767   1b87f /bin/ls

As you can see it shows the text size in bytes. It is not rounded up to a
page, so it is an accurate measure of the codesize. Generally -O2 size is
most useful to check (since that is what most applications build with), but
-Ofast -flto can be useful as well (the global inlining means you get
instruction combinations which appear less often with -O2).

Cheers,
Wilco

Re: [PATCH][AARCH64] Fix for PR86901

2020-02-07 Thread Richard Earnshaw (lists)

On 07/02/2020 02:12, Modi Mo via gcc-patches wrote:

I did a quick bootstrap, this shows several failures like:

gcc/builtins.c:9427:1: error: unrecognizable insn:
  9427 | }
   | ^
(insn 212 211 213 24 (set (reg:SI 207)
        (zero_extract:SI (reg:SI 206)
            (const_int 26 [0x1a])
            (const_int 6 [0x6]))) "/gcc/builtins.c":9413:44 -1
     (nil))

The issue here is that 26+6 = 32 and that's not a valid ubfx encoding.
Currently cases like this are split into a right shift in aarch64.md around line
5569:


Appreciate you taking a look and the validation. I've gotten access to an 
aarch64 server and the bootstrap demonstrated the issue you saw. This was 
caused by my re-definition of the pattern to:
+  if (width == 0 || (pos + width) > GET_MODE_BITSIZE (mode))
+    FAIL;

This meant that for SImode only a sum of >32 bits actually triggers the FAIL
condition in the define_expand, whereas the existing define_insn fails on
>=32 bits. I looked into the Architecture Reference Manual and the bits are
available in the ubfx/sbfx encoding for this case; the documentation allows
[lsb, 32-lsb] for SImode as a legal pair. Checking with the GNU assembler,
it does accept a sum of 32 but transforms it into an LSR:

Assembly file:
        ubfx    w0, w0, 24, 8

Disassembly of section .text:

0000000000000000 <...>:
   0:   53187c00        lsr     w0, w0, #24


This is how LSR is implemented in AArch64, as an extract.  So the 
disassembler is showing the canonical representation.


However, inside the compiler we really want to represent this as a 
shift.  Why?  Because there are cases where we can then merge a shift 
into other operations when we can't merge the more general extract 
operation.  For example, we can merge an LSR into a subsequent 'AND' 
operation, but can't merge a more general extract into an AND. 
Essentially, we want the compiler to describe this in the canonical 
shift form rather than the extract form.


Ideally this would be handled inside the mid-end expansion of an 
extract, but in the absence of that I think this is best done inside the 
extv expansion so that we never end up with a real extract in that case.





Similarly, the 64-bit version becomes a 64-bit LSR. Since other assemblers
could certainly trip over this, I've attached a new patch that allows this
encoding; bootstrap plus testing the C/C++ testsuites looks good. I'll defer
to you on whether it's better to do the transformation explicitly in GCC.


;; When the bit position and width add up to 32 we can use a W-reg LSR
;; instruction taking advantage of the implicit zero-extension of the X-reg.
(define_split
  [(set (match_operand:DI 0 "register_operand")
        (zero_extract:DI (match_operand:DI 1 "register_operand")
                         (match_operand 2
                           "aarch64_simd_shift_imm_offset_di")
                         (match_operand 3
                           "aarch64_simd_shift_imm_di")))]
  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1,
             GET_MODE_BITSIZE (DImode) - 1)
   && (INTVAL (operands[2]) + INTVAL (operands[3]))
      == GET_MODE_BITSIZE (SImode)"
  [(set (match_dup 0)
        (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
  {
    operands[4] = gen_lowpart (SImode, operands[1]);
  }

However that only supports DImode, not SImode, so it needs to be changed
to be more general using GPI.

Your new extv patterns should replace the magic patterns above it:


With the previous discovery that a sum of 32/64 triggers an LSR in the
assembler, I was curious what would happen if I removed this pattern. It turns
out you end up with a UBFX x0, x0, 24, 8 rather than an LSR w0, w0, 24 in the
test case associated with this change (gcc.target/aarch64/ubfx_lsr_1.c), which
doesn't get transformed into an LSR by the assembler since it's in 64-bit
mode. So this pattern still has value, but I don't think it's necessary to
extend it to DI since that will automatically be turned into an LSR by the
assembler, as I previously mentioned.



;; ---
;; Bitfields
;; ---

(define_expand "<optab>"

These are the current extv/extzv patterns, but without a mode. They are no
longer used when we start using the new ones.

Note you can write <optab> to combine the extzv and extv patterns.
But please add a comment mentioning the pattern names so they are easy to
find!


Good call here; I made this change in the new patch. I did see the define_insn
of these below it but somehow missed that they were expanded just above. From
my understanding of GCC, the matching pattern below the first line is what
constrains <optab> to just extv/extzv out of the long list of iterators it
belongs to. Still, I see that there are constrained iterators elsewhere, like:

;; Optab prefix for sign/zero-extending operations
(define_code_attr su_optab [(sign_extend 

RE: [PATCH][AARCH64] Fix for PR86901

2020-02-06 Thread Modi Mo via gcc-patches
> I did a quick bootstrap, this shows several failures like:
> 
> gcc/builtins.c:9427:1: error: unrecognizable insn:
>  9427 | }
>   | ^
> (insn 212 211 213 24 (set (reg:SI 207)
>         (zero_extract:SI (reg:SI 206)
>             (const_int 26 [0x1a])
>             (const_int 6 [0x6]))) "/gcc/builtins.c":9413:44 -1
>      (nil))
> 
> The issue here is that 26+6 = 32 and that's not a valid ubfx encoding.
> Currently cases like this are split into a right shift in aarch64.md
> around line 5569:

Appreciate you taking a look and the validation. I've gotten access to an 
aarch64 server and the bootstrap demonstrated the issue you saw. This was 
caused by my re-definition of the pattern to:
+  if (width == 0 || (pos + width) > GET_MODE_BITSIZE (mode))
+    FAIL;

This meant that for SImode only a sum of >32 bits actually triggers the FAIL
condition in the define_expand, whereas the existing define_insn fails on
>=32 bits. I looked into the Architecture Reference Manual and the bits are
available in the ubfx/sbfx encoding for this case; the documentation allows
[lsb, 32-lsb] for SImode as a legal pair. Checking with the GNU assembler,
it does accept a sum of 32 but transforms it into an LSR:

Assembly file:
        ubfx    w0, w0, 24, 8

Disassembly of section .text:

0000000000000000 <...>:
   0:   53187c00        lsr     w0, w0, #24

Similarly, the 64-bit version becomes a 64-bit LSR. Since other assemblers
could certainly trip over this, I've attached a new patch that allows this
encoding; bootstrap plus testing the C/C++ testsuites looks good. I'll defer
to you on whether it's better to do the transformation explicitly in GCC.

> ;; When the bit position and width add up to 32 we can use a W-reg LSR
> ;; instruction taking advantage of the implicit zero-extension of the X-reg.
> (define_split
>   [(set (match_operand:DI 0 "register_operand")
>         (zero_extract:DI (match_operand:DI 1 "register_operand")
>                          (match_operand 2
>                            "aarch64_simd_shift_imm_offset_di")
>                          (match_operand 3
>                            "aarch64_simd_shift_imm_di")))]
>   "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1,
>              GET_MODE_BITSIZE (DImode) - 1)
>    && (INTVAL (operands[2]) + INTVAL (operands[3]))
>       == GET_MODE_BITSIZE (SImode)"
>   [(set (match_dup 0)
>         (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
>   {
>     operands[4] = gen_lowpart (SImode, operands[1]);
>   }
> 
> However that only supports DImode, not SImode, so it needs to be changed
> to be more general using GPI.
> 
> Your new extv patterns should replace the magic patterns above it:

With the previous discovery that a sum of 32/64 triggers an LSR in the
assembler, I was curious what would happen if I removed this pattern. It turns
out you end up with a UBFX x0, x0, 24, 8 rather than an LSR w0, w0, 24 in the
test case associated with this change (gcc.target/aarch64/ubfx_lsr_1.c), which
doesn't get transformed into an LSR by the assembler since it's in 64-bit
mode. So this pattern still has value, but I don't think it's necessary to
extend it to DI since that will automatically be turned into an LSR by the
assembler, as I previously mentioned.
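
For context, here is a sketch in the spirit of that test (my paraphrase; I
have not reproduced the exact contents of ubfx_lsr_1.c):

/* Sketch: bits [24,31] of a 32-bit value.  pos + width == 32, so with the
   split above this should be a single lsr w0, w0, 24 instead of a 64-bit
   ubfx.  */
unsigned int
f (unsigned int x)
{
  return (x >> 24) & 255;
}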


> ;; ---
> ;; Bitfields
> ;; ---
> 
> (define_expand "<optab>"
> 
> These are the current extv/extzv patterns, but without a mode. They are no
> longer used when we start using the new ones.
> 
> Note you can write <optab> to combine the extzv and extv patterns.
> But please add a comment mentioning the pattern names so they are easy to
> find!

Good call here; I made this change in the new patch. I did see the define_insn
of these below it but somehow missed that they were expanded just above. From
my understanding of GCC, the matching pattern below the first line is what
constrains <optab> to just extv/extzv out of the long list of iterators it
belongs to. Still, I see that there are constrained iterators elsewhere,
like:

;; Optab prefix for sign/zero-extending operations
(define_code_attr su_optab [(sign_extend "") (zero_extend "u")

I added a comment before the pattern in this patch. Thoughts on defining
another constrained version to make it clearer (in addition to or in lieu of
the comment)?

> Besides a bootstrap it is always useful to compile a large body of code with
> your change (say SPEC2006/2017) and check for differences in at least
> codesize. If there is an increase in instruction count then there may be more
> issues that need to be resolved.

Sounds good. I'll get those set up and running and will report back on
findings. What's the preferred way to measure codesize? I'm assuming by
default the code pages are aligned, so smaller differences would need to trip
over the boundary to actually show up.
 
> I find it easiest to develop on a many-core AArch64 

Re: [PATCH][AARCH64] Fix for PR86901

2020-02-05 Thread Wilco Dijkstra
Hi Modi,

Thanks for your patch! 

> Adding support for extv and extzv on aarch64 as described in 
> PR86901. I also changed
> extract_bit_field_using_extv to use gen_lowpart_if_possible instead of 
> gen_lowpart directly. Using
> gen_lowpart directly will fail with an ICE in building libgcc when the 
> compiler fails to successfully do so 
> whereas gen_lowpart_if_possible will bail out of matching this pattern 
> gracefully.

I did a quick bootstrap, this shows several failures like:

gcc/builtins.c:9427:1: error: unrecognizable insn:
 9427 | }
  | ^
(insn 212 211 213 24 (set (reg:SI 207)
        (zero_extract:SI (reg:SI 206)
            (const_int 26 [0x1a])
            (const_int 6 [0x6]))) "/gcc/builtins.c":9413:44 -1
     (nil))

The issue here is that 26+6 = 32 and that's not a valid ubfx encoding. Currently
cases like this are split into a right shift in aarch64.md around line 5569:

;; When the bit position and width add up to 32 we can use a W-reg LSR
;; instruction taking advantage of the implicit zero-extension of the X-reg.
(define_split
  [(set (match_operand:DI 0 "register_operand")
        (zero_extract:DI (match_operand:DI 1 "register_operand")
                         (match_operand 2
                           "aarch64_simd_shift_imm_offset_di")
                         (match_operand 3
                           "aarch64_simd_shift_imm_di")))]
  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1,
             GET_MODE_BITSIZE (DImode) - 1)
   && (INTVAL (operands[2]) + INTVAL (operands[3]))
      == GET_MODE_BITSIZE (SImode)"
  [(set (match_dup 0)
        (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
  {
    operands[4] = gen_lowpart (SImode, operands[1]);
  }

However that only supports DImode, not SImode, so it needs to be changed to
be more general using GPI.

Your new extv patterns should replace the magic patterns above it:

;; ---
;; Bitfields
;; ---

(define_expand "<optab>"

These are the current extv/extzv patterns, but without a mode. They are no
longer used when we start using the new ones.

Note you can write <optab> to combine the extzv and extv patterns.
But please add a comment mentioning the pattern names so they are easy to find!

Besides a bootstrap it is always useful to compile a large body of code with
your change (say SPEC2006/2017) and check for differences in at least
codesize. If there is an increase in instruction count then there may be more
issues that need to be resolved.

> I'm looking through past mails and https://gcc.gnu.org/contribute.html which 
> details testing bootstrap.
> I'm building a cross-compiler (x64_aarch64) and the instructions don't 
> address that scenario. The GCC 
> cross-build is green and there's no regressions on the C/C++ tests (The 
> go/fortran etc. look like they 
> need additional infrastructure built on my side to work). Is there a workflow 
> for cross-builds or should I 
> aim to get an aarch64 native machine for full validation?

I find it easiest to develop on a many-core AArch64 server so you get much
faster builds, bootstraps and regression tests. Cross compilers are mostly
useful if you want to test big-endian or new architecture features which are
not yet supported in hardware. You don't normally need to test Go/Fortran/Ada
etc. unless your patch does something that would directly affect them.

Finally do you have a copyright assignment with the FSF?

Cheers,
Wilco

[PATCH][AARCH64] Fix for PR86901

2020-02-04 Thread Modi Mo via gcc-patches
Hi,

This adds support for extv and extzv on aarch64 as described in PR86901. I
also changed extract_bit_field_using_extv to use gen_lowpart_if_possible
instead of gen_lowpart directly. Using gen_lowpart directly fails with an ICE
when building libgcc if the compiler cannot form the lowpart, whereas
gen_lowpart_if_possible bails out of matching this pattern gracefully.
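
To illustrate the shape of that expmed.c change, here is a minimal sketch (my
paraphrase, not the actual patch hunk; ext_mode and target stand in for the
variables in the surrounding extract_bit_field_using_extv code):

/* Sketch: gen_lowpart ICEs when no lowpart can be formed, while
   gen_lowpart_if_possible returns NULL_RTX, letting us bail out of this
   extv/extzv pattern gracefully so the caller can fall back to another
   strategy.  */
rtx low = gen_lowpart_if_possible (ext_mode, target);
if (low == NULL_RTX)
  return NULL_RTX;
target = low;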

I'm looking through past mails and https://gcc.gnu.org/contribute.html, which
details testing via bootstrap. I'm building a cross-compiler (x64_aarch64) and
the instructions don't address that scenario. The GCC cross-build is green and
there are no regressions on the C/C++ tests (Go/Fortran etc. look like they
need additional infrastructure built on my side to work). Is there a workflow
for cross-builds, or should I aim to get an aarch64 native machine for full
validation?


ChangeLog:
2020-02-03  Di Mo  

gcc/
* config/aarch64/aarch64.md: Add define_expand for extv and 
extzv.
* expmed.c (extract_bit_field_using_extv): Change gen_lowpart to 
gen_lowpart_if_possible to avoid compiler assert building libgcc.
testsuite/
* gcc.target/aarch64/pr86901.c: Add new test.


Best,
Modi


pr86901.patch
Description: pr86901.patch