[Bug target/104458] [11/12 Regression] ICE: SIGSEGV in gen_udivmodhiqi3 -> copy_rtx with -O1 -m8bit-idiv

2022-02-09 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com Last reconfirmed||2022-02-09 Ever confirmed|0 |1 --- Comment #1 from Uroš Bizjak --- The testcase is quite creative with casts, creating: (gdb) p debug_rtx ( operands[3]) (subreg:DI

[Bug tree-optimization/104445] [12 Regression] ICE in vect_create_partial_epilog, at tree-vect-loop.cc:5098

2022-02-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104445 --- Comment #7 from Uroš Bizjak --- (In reply to Richard Biener from comment #6) > We are missing vec_extractv2sisi or vec_extractv8qiv4qi, with -mno-mmx -mavx. > It seems we have addv2si3 available though. vec_extractv2sisi is available in

[Bug tree-optimization/104445] [12 Regression] ICE in vect_create_partial_epilog, at tree-vect-loop.cc:5098

2022-02-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104445 --- Comment #5 from Uroš Bizjak --- We do have: (define_expand "vec_extractv4qiqi" [(match_operand:QI 0 "register_operand") (match_operand:V4QI 1 "register_operand") (match_operand 2 "const_int_operand")] "TARGET_SSE4_1" {

[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b

2022-02-03 Thread ubizjak at gmail dot com via Gcc-bugs
|--- |FIXED Target Milestone|12.0|11.4 Host|x86_64-linux-gnu|i386-linux-gnu Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #7 from Uroš Bizjak --- Fixed for gcc-11.4+

[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b

2022-02-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362 --- Comment #4 from Uroš Bizjak --- Or simply: --cut here-- diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index ad5a5caa413..dd5584fb8ed 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -7400,7 +7400,8 @@

[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b

2022-02-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362 --- Comment #3 from Uroš Bizjak --- diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index ad5a5caa413..a61a5390127 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -7403,6 +7403,10 @@ find_drap_reg (void)

[Bug middle-end/104151] [9/10/11/12 Regression] x86: excessive code generated for 128-bit byteswap

2022-01-31 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151 --- Comment #12 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #10) > (In reply to Hongtao.liu from comment #4) > > Also there's separate issue, codegen for below is not optimal > > gimple: > > _11 = VIEW_CONVERT_EXPR(a_3(D)) > >

[Bug rtl-optimization/104306] New: Use secondary_reload for optimized interunit reg-reg moves

2022-01-31 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Created attachment 52318 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52318&action=edit Prototype patch I was looking

[Bug middle-end/104151] [9/10/11/12 Regression] x86: excessive code generated for 128-bit byteswap

2022-01-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151 --- Comment #10 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #4) > Also there's separate issue, codegen for below is not optimal > gimple: > _11 = VIEW_CONVERT_EXPR(a_3(D)) > asm: > mov QWORD PTR [rsp-24], rdi >

[Bug rtl-optimization/104054] [9/10/11/12 Regression] '-fcompare-debug' failure (length) w/ -Os -funroll-loops since r8-3946-gfe94440235cfaa57

2022-01-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054 --- Comment #8 from Uroš Bizjak --- Without debug instructions, the compiler is able to rename insns to: 65: di:DI=si:DI 66: dx:DI=r11:DI 74: cx:QI=0x1 REG_EQUAL 0x1 41: L41: 42: NOTE_INSN_BASIC_BLOCK 6 43:

[Bug rtl-optimization/104054] [9/10/11/12 Regression] '-fcompare-debug' failure (length) w/ -Os -funroll-loops since r8-3946-gfe94440235cfaa57

2022-01-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054 --- Comment #7 from Uroš Bizjak --- For some reason the pass does not detect usage of Register si in (insn 55): (debug_insn 55 54 56 6 (var_location:TI b (reg/v:TI 4 si [orig:86 b ] [86])) -1 (nil)) Register ax (1): Register dx (1):

[Bug rtl-optimization/104054] [9/10/11/12 Regression] '-fcompare-debug' failure (length) w/ -Os -funroll-loops since r8-3946-gfe94440235cfaa57

2022-01-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054 --- Comment #5 from Uroš Bizjak --- Could be a red herring, but in _.rnreg dump: Register r9 (1): 75 [GENERAL_REGS] 18 [ALL_REGS] 97 [GENERAL_REGS] Register r10 (1): 76 [GENERAL_REGS] 18 [ALL_REGS] 23 [GENERAL_REGS] ... Register di (1): 55

[Bug rtl-optimization/104054] [9/10/11/12 Regression] '-fcompare-debug' failure (length) w/ -Os -funroll-loops since r8-3946-gfe94440235cfaa57

2022-01-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054 Uroš Bizjak changed: What|Removed |Added Keywords|wrong-code | --- Comment #4 from Uroš Bizjak ---

[Bug rtl-optimization/104054] [9/10/11/12 Regression] '-fcompare-debug' failure (length) w/ -Os -funroll-loops since r8-3946-gfe94440235cfaa57

2022-01-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104054 --- Comment #3 from Uroš Bizjak --- The first difference is in rnreg pass, w/o -g: 28: L28: 29: NOTE_INSN_BASIC_BLOCK 4 30: [`i']=0 63: di:DI=r9:DI <--- here 64: dx:DI=r10:DI 9: r8:HI=0x5 REG_EQUAL 0x5 98:

[Bug target/104003] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-6488-g820ac79e8448ad6c

2022-01-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/104001] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-6538-g5f19303ada7db92c155332e7ba317233ca05946b

2022-01-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104001 --- Comment #4 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #2) > I'm testing > > 1 file changed, 3 insertions(+), 3 deletions(-) > gcc/config/i386/i386.md | 6 +++--- > > modified gcc/config/i386/i386.md > @@ -10455,7 +10455,7

[Bug target/104003] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-6488-g820ac79e8448ad6c

2022-01-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003 --- Comment #2 from Uroš Bizjak --- (define_insn "*xop_pcmov_" - [(set (match_operand:VI_32 0 "register_operand" "=x") -(if_then_else:VI_32 - (match_operand:VI_32 3 "register_operand" "x") - (match_operand:VI_32 1

[Bug target/104003] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-6488-g820ac79e8448ad6c

2022-01-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104003 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug testsuite/103935] [12 regression] g++.dg/vect/slp-pr98855.cc fails after r12-6273

2022-01-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug regression/103997] gcc.target/i386/pr88531-??.c scan-assembler-times FAILs

2022-01-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103997 Uroš Bizjak changed: What|Removed |Added Target||x86 Keywords|

[Bug regression/103997] New: gcc.target/i386/pr88531-??.c scan-assembler-times FAILs

2022-01-12 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: regression Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- A recent patch introduced the following testsuite FAILs: FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vgatherqpd 4 FAIL: gcc.target

[Bug target/100637] [i386] Vectorize 4-byte vectors

2022-01-12 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100637 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug middle-end/88670] [meta-bug] generic vector extension issues

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670 Bug 88670 depends on bug 103948, which changed state. Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 What|Removed |Added

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 103948, which changed state. Bug 103948 Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 What|Removed |Added

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 Uroš Bizjak changed: What|Removed |Added Target Milestone|--- |12.0 Status|UNCONFIRMED

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #7 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #6) > I'll try your proposed patch from Comment #5 later today and report here. Yes, the patch works for me.

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #6 from Uroš Bizjak --- (In reply to Richard Biener from comment #5) > I guess that tree-vect-generic.c is not up-to-date with gimple-isel.cc. We > should probably somehow factor out relevant pieces. > > Note vector lowering will

[Bug testsuite/103935] [12 regression] g++.dg/vect/slp-pr98855.cc fails after r12-6273

2022-01-10 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935 --- Comment #3 from Uroš Bizjak --- (In reply to Richard Biener from comment #2) > no longer xfailed. I suggest to re-add the { xfail *-*-* } to the > profitability check. You mean xfail for non-x86 targets?

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #4 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #3) > diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c > index 78e388d82f6..871366f3b7e 100644 > --- a/gcc/optabs-tree.c > +++ b/gcc/optabs-tree.c > @@ -502,6 +502,9 @@

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #3 from Uroš Bizjak --- diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c index 78e388d82f6..871366f3b7e 100644 --- a/gcc/optabs-tree.c +++ b/gcc/optabs-tree.c @@ -502,6 +502,9 @@ expand_vec_cond_expr_p (tree value_type, tree

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #2 from Uroš Bizjak --- Created attachment 52146 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52146&action=edit The complete testcase

[Bug tree-optimization/103948] Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948 --- Comment #1 from Uroš Bizjak --- Created attachment 52145 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52145&action=edit Patch that illustrates the problem on x86 target This patch should vectorize all integer relational operations with

[Bug tree-optimization/103948] New: Vectorizer does not use vec_cmpMN without vcondMN pattern

2022-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- I was trying to add v2qi vec_cmpv2qiv2qi pattern to x86: (define_expand "vec_cmpv2qiv2qi" [(set (match_oper

[Bug tree-optimization/103941] New: uavgv2qi3_ceil is not used

2022-01-07 Thread ubizjak at gmail dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: unsigned char ur[16], ua[16], ub[16]; void avgu_v2qi (void) { int i; for (i = 0; i < 2; i++) ur[i] = (ua[i] + ub[i] + 1) >> 1; } does not
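A minimal sketch of what the quoted testcase computes, for readers who want to try it locally. The function `avgu_v2qi` and the arrays are taken from the snippet above; `avg_ceil_u8` is a hypothetical helper that is not part of the report. It only illustrates the rounding-up unsigned average (the operation a `uavg..._ceil` pattern is meant to cover) without widening to `int`.

```c
/* Testcase as quoted in the report: rounding-up unsigned average
   over the first two byte elements.  */
unsigned char ur[16], ua[16], ub[16];

void avgu_v2qi (void)
{
  int i;

  for (i = 0; i < 2; i++)
    ur[i] = (ua[i] + ub[i] + 1) >> 1;
}

/* Hypothetical helper, not from the report: the same rounding-up
   average computed without a wider intermediate type, using the
   identity ceil((a + b) / 2) == (a | b) - ((a ^ b) >> 1).  */
unsigned char avg_ceil_u8 (unsigned char a, unsigned char b)
{
  return (unsigned char) ((a | b) - ((a ^ b) >> 1));
}
```

Comparing `ur[i]` against `avg_ceil_u8 (ua[i], ub[i])` is one way to sanity-check generated code for this loop, whatever instructions the compiler ends up selecting.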

[Bug testsuite/103935] [12 regression] g++.dg/vect/slp-pr98855.cc fails after r12-6273

2022-01-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103935 --- Comment #1 from Uroš Bizjak --- As said in the patch submission: I have changed scan-tree-dump patterns in g++.dg/vect/slp-pr98855.cc to check that no SLP vectorization was performed. The existing scan-tree-dump-times was too fragile,

[Bug target/103928] [12 Regression] ICE in get_insn_template, at final.c:2050

2022-01-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928 --- Comment #12 from Uroš Bizjak --- (In reply to Manuel Lauss from comment #10) > So it was either fixed in trunk in the last 20 hours, or pgo build broke > gcc, or "-mno-xop" fixed it. The fix for PR103905 was pushed to the master in the

[Bug target/103928] [12 Regression] ICE in get_insn_template, at final.c:2050

2022-01-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928 --- Comment #11 from Uroš Bizjak --- (In reply to Martin Liška from comment #8) > > No, bdver4 does not include XOP. > > Ohh, didn't know that... Sorry, I was wrong: {"bdver4", PROCESSOR_BDVER4, CPU_BDVER4, PTA_64BIT | PTA_MMX |

[Bug target/103928] [12 Regression] ICE in get_insn_template, at final.c:2050

2022-01-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103928 --- Comment #7 from Uroš Bizjak --- (In reply to Martin Liška from comment #6) > Then you may be affected by PR103905 which is fixed on the current master. > Please pull to tip of master branch. No, bdver4 does not include XOP.

[Bug rtl-optimization/94440] [9 Regression] ICE in check_bool_attrs, at recog.c:2168 since r7-5324-gb8cab8a5492e9639

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94440 --- Comment #21 from Uroš Bizjak --- Fixed?

[Bug target/103915] [12 Regression] ICE: SIGSEGV in memory_operand with -flive-range-shrinkage since r12-6215-g708b87dcb6e48cb4

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103915 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/92860] [9/10/11/12 regression] Global flags affected by -O settings are clobbered by optimize attribute

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92860 Bug 92860 depends on bug 103905, which changed state. Bug 103905 Summary: [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 What

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/103915] [12 Regression] ICE: SIGSEGV in memory_operand with -flive-range-shrinkage since r12-6215-g708b87dcb6e48cb4

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #2 from Uroš Bizjak --- diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index fc8ec5e4d49..96d85a54e10 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -2752,7 +2752,7 @@ "" "#"

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 Uroš Bizjak changed: What|Removed |Added Attachment #52120|0 |1 is obsolete|

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-05 Thread ubizjak at gmail dot com via Gcc-bugs
|| Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #8 from Uroš Bizjak --- Created attachment 52127 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52127&action=edit Proposed pa

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 --- Comment #6 from Uroš Bizjak --- @Jakub: It looks like the problem is in expand_vec_perm_pshufb, where the permutation vector is recalculated for partial vectors: if (vmode == V4QImode || vmode == V8QImode) { rtx m128 = GEN_INT

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 --- Comment #4 from Uroš Bizjak --- Created attachment 52123 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52123&action=edit Patch that disables XOP permute with partial vectors Please try this patch.

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 --- Comment #3 from Uroš Bizjak --- (In reply to Martin Liška from comment #1) > Created attachment 52120 [details] > Isolated test-case > > Isolated test-case where only the miscompiled function > ix86_expand_vec_extract_even_odd uses -O3. >

[Bug target/103905] [12 Regression] Miscompiled i386-expand.c with -march=bdver1 and -O3 since r12-1789-g836328b2c99f5b8d

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905 --- Comment #2 from Uroš Bizjak --- The referred patch adds: +;; Pack/unpack vector modes +(define_mode_attr mmxpackmode + [(V4HI "V8QI") (V2SI "V4HI")]) + +(define_expand "vec_pack_trunc_" + [(match_operand: 0 "register_operand") +

[Bug target/103900] [12 Regression] ICE: in expand_expr_real_2, at expr.c:9771 with -O -fno-tree-dce -fno-tree-dse since r12-6173-g9ff206d3865df5cb

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/103861] [i386] vectorize v2qi vectors

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861 --- Comment #7 from Uroš Bizjak --- (In reply to Richard Biener from comment #6) > Not fully fixed I guess? Not yet. I have a bunch of follow-up patches for various operations.

[Bug target/103900] [12 Regression] ICE: in expand_expr_real_2, at expr.c:9771 with -O -fno-tree-dce -fno-tree-dse since r12-6173-g9ff206d3865df5cb

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/103900] [12 Regression] ICE: in expand_expr_real_2, at expr.c:9771 with -O -fno-tree-dce -fno-tree-dse since r12-6173-g9ff206d3865df5cb

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900 --- Comment #6 from Uroš Bizjak --- (In reply to Martin Liška from comment #5) > No, it still crashes with the current master (g:fbb592407c9): Ah, the compiler is blindly trying to generate V2QI XOR due to missing one_cmplv2qi2 pattern. I have

[Bug target/103900] [12 Regression] ICE: in expand_expr_real_2, at expr.c:9771 with -O -fno-tree-dce -fno-tree-dse since r12-6173-g9ff206d3865df5cb

2022-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103900 --- Comment #2 from Uroš Bizjak --- Looks fixed, does not ICE for me with: GNU C17 (GCC) version 12.0.0 20220104 (experimental) [master r12-6200-g62c8b21d48a] (x86_64-pc-linux-gnu)

[Bug target/103894] [12 Regression] ICE: in emit_move_multi_word, at expr.c:3870 with -mno-sse2

2022-01-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103894 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/103894] [12 Regression] ICE: in emit_move_multi_word, at expr.c:3870 with -mno-sse2

2022-01-03 Thread ubizjak at gmail dot com via Gcc-bugs
at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #2 from Uroš Bizjak --- Created attachment 52111 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52111&action=edit Proposed patch Patch in testing.

[Bug target/103861] [i386] vectorize v2qi vectors

2021-12-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861 --- Comment #3 from Uroš Bizjak --- The patched compiler compiles the testcase from Comment #0 on x86_64 with -O2 to: plus: movl %edi, %edx movl %esi, %eax addb %sil, %dl addb %ah, %dh movl

[Bug target/103861] [i386] vectorize v2qi vectors

2021-12-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861 --- Comment #2 from Uroš Bizjak --- Created attachment 52087 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52087&action=edit Prototype patch to vectorize with v2qi vectors Patch that implements V2QI moves, logic and basic arithmetic

[Bug target/103861] [i386] vectorize v2qi vectors

2021-12-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103861 --- Comment #1 from Uroš Bizjak --- Also: char r[2], a[2], b[2]; void foo (void) { int i; for (i = 0; i < 2; i++) r[i] = a[i] + b[i]; }

[Bug target/103861] New: [i386] vectorize v2qi vectors

2021-12-29 Thread ubizjak at gmail dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: typedef char __v2qi __attribute__ ((__vector_size__ (2))); __v2qi plus (__v2qi a, __v2qi b) { return a + b; }; should be vectorized.
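A compilable sketch of the two testcases quoted for this bug: the vector-extension form from the original report and the loop form from Comment #1. It assumes a compiler that supports the GNU `vector_size` attribute (GCC or Clang). The typedef is renamed from `__v2qi` to `v2qi` here only to avoid a reserved identifier; that rename is not from the report.

```c
/* Two-element byte vector, as in the report (typedef renamed).  */
typedef char v2qi __attribute__ ((__vector_size__ (2)));

/* Element-wise addition of two 2-byte vectors.  */
v2qi plus (v2qi a, v2qi b)
{
  return a + b;
}

/* Loop form from Comment #1: the same operation written as a
   two-iteration loop over global arrays.  */
char r[2], a[2], b[2];

void foo (void)
{
  int i;

  for (i = 0; i < 2; i++)
    r[i] = a[i] + b[i];
}
```

Both functions should produce identical element values; which instructions are emitted depends on the compiler version and on whether V2QI patterns are available for the target.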

[Bug target/103842] [12 regression] ICE on ilmbase-2.5.7: during RTL pass: expand

2021-12-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103842 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #5) > Created attachment 52068 [details] > gcc12-pr103842.patch > > Untested fix. The patch is OK. Thanks, Uros.

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #17 from Uroš Bizjak --- (In reply to hubicka from comment #16) > > > > > > It could be done, but I was under impression that the sequence to load > > > 1.0f > > > into topmost elements nullifies the benefit of operation to divide

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #14 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #13) > Created attachment 52051 [details] > Patch that implements v2sf division This patch also enables vectorization of the testcase from Comment #7. Using

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #13 from Uroš Bizjak --- Created attachment 52051 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52051&action=edit Patch that implements v2sf division Please try the attached patch, for the following testcase: --cut here-- float

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #12 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #10) > At least on your short testcase clang doesn't use divps either. > We do support mulv2sf3, addv2sf3 etc. but not divv2sf3 I bet because with >

[Bug target/103772] [12 Regression] wrong code with -mavx512fp16 since r12-5700-g7eb961d83b0eda53

2021-12-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/103772] [12 Regression] wrong code with -mavx512fp16 since r12-5700-g7eb961d83b0eda53

2021-12-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #9 from Uroš Bizjak --- (In reply to Thiago Macieira from comment #0) > Testcase: ... > The assembly for this produces: > > vmovdqu16 (%rdi), %ymm1 > vmovdqu16 32(%rdi), %ymm2 > vpcmpuw $0,

[Bug target/103753] New: Unoptimal avx2 V16HF vector insert to element 0

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- (Cloned from PR103571#18) Following testcase: --cut here-- typedef _Float16 __v16hf __attribute__ ((__vector_size__ (32))); __v16hf foo (_Float16 x) { return (__v16hf

[Bug target/102812] Unoptimal (and wrong) code for _Float16 insert

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #28 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #18) > codegen for foo1/foo2 is suboptimal under -mavx2, i guess we can have > vec_setv16hf_0 and with vpblendw. True, some opportunities are missing from

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #27 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #17) > (In reply to Hongtao.liu from comment #16) > > There're already testcases for vec_extract/vec_set/vec_duplicate, but those > > testcases are written under

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #25 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #22) > Yes, besides TARGET_VECTOR_MODE_SUPPORTED_P, other part in the attached > patch looks fine, the condition should be binded to real instructions but > not mode.

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51950|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51948|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #13 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #12) > Hongtao, can you please review the patch and perhaps test it a bit more? This part is missing from ix86_expand_vector_set_var: --cut here @@ -15912,7 +15921,8

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #12 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #10) > Sure. Please find attached the complete patch that enables HF vector modes in Comment #11. The patch survives bootstrap and regression test and works OK for the

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51941|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #9 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #8) > (In reply to Uroš Bizjak from comment #6) > > (In reply to Hongtao.liu from comment #5) > > > > > There're several places in i386-expand.c which assume

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #7 from Uroš Bizjak --- Created attachment 51941 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51941&action=edit Proposed patch The patch puts V2HF+V4HF+V8HF/V16HF/V32HF into VALID_SSE2/AVX256/AVX512F_REG_MODE. Also, introduces

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #6 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #5) > There're several places in i386-expand.c which assume TARGET_AVX512FP16 for > case V8HF/V16HF/V32HF, if we want to put V8HF/V16HF/V32HF in >

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #4 from Uroš Bizjak --- (In reply to Hongyu Wang from comment #3) > So we may need to support V8HFmode in VALID_SSE2_REG_MODE if we don't want > to modify those function_args and function_value stuff. We have V8HFmode moves for

[Bug target/103571] New: ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread ubizjak at gmail dot com via Gcc-bugs
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- typedef _Float16 v2hf __attribute__((vector_size(4))); typedef _Float16 v4hf __attribute__((vector_size(8))); typedef _Float16 v8hf

[Bug c++/102881] gcc totally broken when trailing return type combine with decltype lambda

2021-12-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102881 --- Comment #4 from Uroš Bizjak --- > The master branch has been updated by Uros Bizjak : Oops, wrong PR number...

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-12-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #26 from Uroš Bizjak --- The testcase now compiles with -O2 -mf16c to: vpxor %xmm2, %xmm2, %xmm2 vpblendw $1, %xmm0, %xmm2, %xmm0 vpblendw $1, %xmm1, %xmm2, %xmm1 vcvtph2ps

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #21 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #20) > (In reply to Uroš Bizjak from comment #18) > > (In reply to Uroš Bizjak from comment #17) > > > (In reply to Hongtao.liu from comment #16) > > > > > > >

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #18 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #17) > (In reply to Hongtao.liu from comment #16) > > > ix86_expand_vector_set is mainly used by vec_set_optab which exactly takes > > target as both input and output,

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #17 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #16) > ix86_expand_vector_set is mainly used by vec_set_optab which exactly takes > target as both input and output, it seems we can't create a new target for > that.

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #15 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #14) > (In reply to Uroš Bizjak from comment #13) > > (In reply to Hongtao.liu from comment #12) > > > > > > > > Just noticed that for some reason two VPXORs are

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #13 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #12) > > > > Just noticed that for some reason two VPXORs are emitted. One should be > > enough for both VPINSRW insns. > > With new alternative in your attached

[Bug middle-end/103439] genemit emits dead code

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 --- Comment #3 from Uroš Bizjak --- (In reply to rguent...@suse.de from comment #2) > On Fri, 26 Nov 2021, ubizjak at gmail dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 > > > > --- Com

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #10 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #7) > compiles with unpatched gcc -O2 -mf16c to: > > vmovss %xmm0, %xmm0, %xmm2 # 27[c=4 l=4] *movhf_internal/3 > pextrw $0, %xmm1, -4(%rsp)

[Bug middle-end/103439] genemit emits dead code

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 --- Comment #1 from Uroš Bizjak --- (In reply to Richard Biener from comment #0) > I'm not sure if there are valid cases where we have a mix of a direct > RTL pattern and manual expansion, so where the { } part falls thru. Yes, we have quite

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #8 from Uroš Bizjak --- diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 68606e57e60..a2ebaa5ac63 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2528,12 +2528,12 @@ case TYPE_SSELOG:

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #7 from Uroš Bizjak --- The improvement with patch from comment #6: The testcase: _Float16 test (_Float16 a, _Float16 b) { return a + b; } compiles with unpatched gcc -O2 -mf16c to: vmovss %xmm0, %xmm0, %xmm2 # 27

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #6 from Uroš Bizjak --- Created attachment 51879 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51879&action=edit Improve HI/HFmode scalar insert The attached patch further improves HFmode -> SFmode conversion. HFmode values are

[Bug c/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above

2021-11-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406 --- Comment #2 from Uroš Bizjak --- gcc/libgcc/config/i386/sfp-machine.h says: /* Here is something Intel misdesigned: the specs don't define the case where we have two NaNs with same mantissas, but different sign. Different operations

[Bug target/103074] [11/12 Regression] ICE in lra_assign, at lra-assigns.c:1649 since r11-5066-gbe39636d9f68c437

2021-11-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103074 --- Comment #4 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #3) > Ah, actually what I see is that sched1 swaps the order of: > (insn 22 21 23 4 (parallel [ > (set (reg:SI 88) > (ashiftrt:SI (reg/v:SI
