[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #19 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6b1f841ce0ccf30eda7896ba5ab0aa94c72307b2 commit r11-7799-g6b1f841ce0ccf30eda7896ba5ab0aa94c72307b2 Author: Jakub Jelinek Date: Tue Mar 23 16:29:47 2021 +0100 Add forgotten attribution on PR target/99593 testcase.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #18 from Christophe Lyon --- Not a big deal if you forget, that's a detail :-)
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #17 from Jakub Jelinek --- Christophe, sorry for forgetting to add Co-authored-by, will fix it up in the ChangeLog tomorrow by hand.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #16 from Jakub Jelinek --- Fixed.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #15 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:009528d61c796608affd1eaa18ae31a3679eb46d commit r11-7733-g009528d61c796608affd1eaa18ae31a3679eb46d Author: Jakub Jelinek Date: Fri Mar 19 13:48:44 2021 +0100 arm: Fix mve_vshlq* [PR99593] As mentioned in the PR, before the r11-6708-gbfab355012ca0f5219da8beb04f2fdaf757d34b7 change v[al]shr3 expanders were expanding the shifts by register to gen_ashl3_{,un}signed which don't support immediate CONST_VECTOR shift amounts, but now expand to mve_vshlq_ which does. The testcase ICEs, because the constraint doesn't match the predicate and because LRA works solely with the constraints, so it can e.g. from REG_EQUAL propagate there a CONST_VECTOR which matches the constraint but fails the predicate and only later on other passes will notice the predicate fails and ICE. Fixed by adding a constraint that matches the immediate part of the predicate. PR target/99593 * config/arm/constraints.md (Ds): New constraint. * config/arm/vec-common.md (mve_vshlq_): Use w,Ds constraint instead of w,Dm. * g++.target/arm/pr99593.C: New test.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #14 from Christophe Lyon --- I can confirm that the new test (as described in comment #12) works: - alone it results in ICE in the relevant configurations (skipped otherwise) - with the patch, it passes in the relevant configurations (skipped otherwise)
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #13 from Jakub Jelinek --- Oops, thanks for catching that. Made those changes and restarted testing.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #12 from Christophe Lyon --- I have tests in progress too (with and without the fix), except that I have typedef uint32x4_t i; instead of typedef uint32x2_t i; and I replaced the (__builtin_neon_hi *) cast with (int16_t*)
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #11 from Jakub Jelinek --- Created attachment 50415 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50415&action=edit gcc11-pr99593.patch Ok, trying to test this overnight.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #10 from ktkachov at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #9) > Comment on attachment 50412 [details] > proposed testcase > > Any reason not to replace > __simd128_int32_t with int32x4_t , > __simd128_float32_t with float32x4_t and > __simd128_uint32_t with uint32x2_t ? > Drop the commented out __builtin_* names etc.? Drop the (__builtin_neon_hi > *) cast? > Otherwise LGTM if it still FAILs without the above patch and PASSes with it, > but the final call is Kyrill's (or other ARM maintainers'). Indeed. Let's have a consolidated patch on gcc-patches for review.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #9 from Jakub Jelinek --- Comment on attachment 50412 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50412 proposed testcase Any reason not to replace __simd128_int32_t with int32x4_t , __simd128_float32_t with float32x4_t and __simd128_uint32_t with uint32x2_t ? Drop the commented out __builtin_* names etc.? Drop the (__builtin_neon_hi *) cast? Otherwise LGTM if it still FAILs without the above patch and PASSes with it, but the final call is Kyrill's (or other ARM maintainers').
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #8 from ktkachov at gcc dot gnu.org --- (In reply to Christophe Lyon from comment #7) > Created attachment 50412 [details] > proposed testcase > > Here is a proposal for a testcase derived from the initial description: > - added relevant dg-* directives > - replaced builtin calls with intrinsics > > Jakub, Kyrill, is that OK with you? Thanks, that looks ok except: +typedef __simd128_int32_t g; +typedef __simd128_float32_t h; +typedef __simd128_uint32_t i; Can we replace them with the right ACLE types as well?
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #7 from Christophe Lyon --- Created attachment 50412 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50412&action=edit proposed testcase Here is a proposal for a testcase derived from the initial description: - added relevant dg-* directives - replaced builtin calls with intrinsics Jakub, Kyrill, is that OK with you?
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #6 from Jakub Jelinek --- Before r11-6708 it was using different patterns - ashl3_signed - and those don't accept immediate CONST_VECTOR at all, while mve_vshlq_ does.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #5 from Jakub Jelinek --- (In reply to Christophe Lyon from comment #4) > I have updated the test in the initial description to call intrinsics, added > relevant dg-* directives. Testing with several configurations under progress. > > > Is it really r11-6708 that introduced the problem? It doesn't modify vshq > unlike it's predecessor r11-6707-g7432f255b70811dafaf325d94036ac580891de69. I think so, r11-6707 doesn't ICE. In *.expand dump, the only r11-6707 to r11-6708 differences are 4 times UNSPEC_ASHIFT_SIGNED -> VSHLQ_S changes, and that is true up to ira dump too. *.lra dump has: -(insn 256 174 176 4 (set (reg:V4SI 52 d18 [331]) -(const_vector:V4SI [ -(const_int -3 [0xfffd]) repeated x4 -])) "ccLSOUAW.ii":66:45 1061 {*neon_movv4si} - (nil)) -(insn 176 256 177 4 (set (reg:V4SI 48 d16 [297]) +(insn 176 174 177 4 (set (reg:V4SI 48 d16 [297]) (unspec:V4SI [ (reg:V4SI 48 d16 [296]) -(reg:V4SI 52 d18 [331]) -] UNSPEC_ASHIFT_SIGNED)) "ccLSOUAW.ii":66:45 1350 {ashlv4si3_signed} +(const_vector:V4SI [ +(const_int -3 [0xfffd]) repeated x4 +]) +] VSHLQ_S)) "ccLSOUAW.ii":66:45 510 {mve_vshlq_sv4si} (expr_list:REG_EQUAL (unspec:V4SI [ (reg:V4SI 48 d16 [296]) (const_vector:V4SI [ (const_int -3 [0xfffd]) repeated x4 ]) -] UNSPEC_ASHIFT_SIGNED) +] VSHLQ_S) (nil))) difference.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 Christophe Lyon changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |clyon at gcc dot gnu.org --- Comment #4 from Christophe Lyon --- I have updated the test in the initial description to call intrinsics, added relevant dg-* directives. Testing with several configurations under progress. Is it really r11-6708 that introduced the problem? It doesn't modify vshq unlike it's predecessor r11-6707-g7432f255b70811dafaf325d94036ac580891de69. And that one only moves the offending pattern from neon.md to vec-common.md, so the bug was probably already present? As of -mtune=generic-armv7-a, I can reproduce the ICE with -mcpu=generic-armv7-a, but I haven't found a -march equivalent (-march=armv7-a+fp does not trigger the ICE)
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 --- Comment #3 from Jakub Jelinek --- I've tried to reproduce it with typedef __simd128_int32_t int32x4_t; void foo (int32x4_t c) { int32x4_t b = __builtin_neon_vdup_nv4si (3); register int32x4_t a __asm ("d16"); asm volatile ("" : "=w" (a)); a = a >> b; a |= c; asm volatile ("" : : "w" (a)); } but it doesn't, for some reason the ICE needs a REG_EQUAL attribute on the right shift into which we propagate the constant and LRA picks the constant from there, but haven't managed to force that.
[Bug target/99593] [11 Regression] arm Neon ICE when compiling firefox (skia) since r11-6708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99593 ktkachov at gcc dot gnu.org changed: What|Removed |Added Summary|[11 Regression] arm MVE ICE |[11 Regression] arm Neon |when compiling firefox |ICE when compiling firefox |(skia) since r11-6708 |(skia) since r11-6708 Keywords||ice-on-valid-code --- Comment #2 from ktkachov at gcc dot gnu.org --- (Note it's a Neon ICE, not MVE). Yeah, that fix looks ok. Chritophe, could you help here to write a testcase using arm_neon.h intrinsics (rather than the builtins they decompose to)?