[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2022-03-01 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #23 from H.J. Lu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591093.html

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2022-03-02 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #24 from H.J. Lu --- Another testcase:
[hjl@gnu-tgl-2 pr103393]$ cat x.c
struct TestData {
  float arr[8];
};
void cpy(struct TestData *s1, struct TestData *s2 )
{
  for(int i=0; i<16; ++i) {
    s1->arr[i] = s2->arr[i];
  }
}
[hjl@

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2022-03-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 Richard Biener changed:
           What    |Removed    |Added
             Status|NEW        |RESOLVED
         Resolution|---

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 Richard Biener changed:
           What    |Removed    |Added
           Priority|P3         |P1
          Component|target

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #9 from Richard Biener --- In particular MOVE_RATIO only looks applicable if the target (or RTL expansion?) would split the bigger GIMPLE move into pieces honoring MOVE_MAX. Though technically even MOVE_MAX only guarantees: "The ma
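As an illustration of the splitting that comment #9 refers to, here is a minimal C sketch of a copy performed in pieces no wider than a maximum move size, and only when the number of pieces stays below a ratio limit. It is not GCC code: SKETCH_MOVE_MAX, SKETCH_MOVE_RATIO, and both function names are hypothetical stand-ins, the constant values are arbitrary, and the exact comparison GCC applies to MOVE_RATIO is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the target macros under discussion; the
   values are illustrative and not taken from any GCC port.  */
#define SKETCH_MOVE_MAX   16  /* widest single move, in bytes */
#define SKETCH_MOVE_RATIO 4   /* inline only when fewer moves than this */

/* Number of moves of at most SKETCH_MOVE_MAX bytes needed for LEN bytes.  */
static size_t
sketch_pieces_needed (size_t len)
{
  return (len + SKETCH_MOVE_MAX - 1) / SKETCH_MOVE_MAX;
}

/* Copy LEN bytes from SRC to DST in pieces no wider than SKETCH_MOVE_MAX.
   Return the number of pieces used, or -1 without copying when the copy
   would need SKETCH_MOVE_RATIO pieces or more (the point at which a
   compiler might prefer a library call instead).  */
static int
sketch_copy_by_pieces (void *dst, const void *src, size_t len)
{
  assert (dst != NULL && src != NULL);

  size_t pieces = sketch_pieces_needed (len);
  if (pieces >= SKETCH_MOVE_RATIO)
    return -1;

  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t done = 0;
  while (done < len)
    {
      /* Each piece is capped at the maximum move size.  */
      size_t chunk = len - done;
      if (chunk > SKETCH_MOVE_MAX)
        chunk = SKETCH_MOVE_MAX;
      memcpy (d + done, s + done, chunk);
      done += chunk;
    }
  return (int) pieces;
}
```

With these illustrative values, a 32-byte copy (the size of the struct in the testcase) is done as two 16-byte pieces instead of one 32-byte move, while a 64-byte copy is rejected because it would need four pieces.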

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 Jakub Jelinek changed:
           What    |Removed    |Added
                 CC|           |jakub at gcc dot gnu.org
--- Comment #1

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #11 from Jakub Jelinek --- Actually no, GET_MODE_SIZE in that case is the size of the whole operation. To me the previous change looks extremely ARM specific with load lines in mind which no other target has. If we want to support m

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #12 from Richard Earnshaw --- (In reply to Jakub Jelinek from comment #10) > Alternatively, couldn't we check next to that new > && have_insn_for (SET, mode) > also that > && known_le (GET_MODE_SIZE

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #13 from Richard Earnshaw --- Also, note that the comment in gimple-fold.c prior to this change read: /* If we can perform the copy efficiently with first doing all loads and then all stores inline it that way. Curre

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #14 from H.J. Lu --- (In reply to Richard Earnshaw from comment #13) > Also, note that the comment in gimple-fold.c prior to this change read: > > /* If we can perform the copy efficiently with first doing all loads >

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #15 from Richard Earnshaw --- It seems perverse to me that you have a standard named pattern in the x86 backend that is enabled, but then you somehow expect the generic parts of the compiler to know that it shouldn't be used. Eith

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #16 from Jakub Jelinek --- (In reply to Richard Earnshaw from comment #15) > It seems perverse to me that you have a standard named pattern in the x86 > backend that is enabled, but then you somehow expect the generic parts of > the

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #17 from Richard Earnshaw --- (In reply to Jakub Jelinek from comment #16) > (In reply to Richard Earnshaw from comment #15) > > It seems perverse to me that you have a standard named pattern in the x86 > > backend that is enabled, b

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #18 from Jakub Jelinek --- No. Generic vectors need to work too. And those always do use the standard optabs.

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #19 from Richard Earnshaw --- It sounds to me like you're trying to keep your cake and eat it.

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #20 from Jakub Jelinek --- The aarch64 MOVE_MAX definition of (UNITS_PER_WORD * 2) clearly doesn't match the documentation, because with Neon/SVE around, you can move quickly much more bytes by a single instruction than that. And th

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #21 from Richard Biener --- Note using MOVE_RATIO in gimple-fold but then always emitting just a single stmt and not honoring MOVE_MAX on that is fishy - you seem to be expecting RTL expansion to fix up but that's clearly not happeni

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-26 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #22 from Richard Earnshaw --- Looking at the different port definitions for MOVE_MAX, it would appear that only the i386 port seems to be using a value that is not the size of a general-purpose register.