[Bug target/103861] New: [i386] vectorize v2qi vectors

2021-12-29 Thread ubizjak at gmail dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: typedef char __v2qi __attribute__ ((__vector_size__ (2))); __v2qi plus (__v2qi a, __v2qi b) { return a + b; }; should be vectorized.
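
For readers who want to try the testcase, here is a minimal sketch that wraps the reported function so its result can be inspected. It assumes a compiler with the GCC vector extension. The helper `plus_bytes` is hypothetical and not part of the report; it only copies bytes in and out of the 2-byte vector type.

```c
#include <string.h>

/* Two-element char vector, as declared in the report (GCC vector extension). */
typedef char __v2qi __attribute__ ((__vector_size__ (2)));

/* The function from the report: element-wise addition of two v2qi vectors. */
__v2qi plus (__v2qi a, __v2qi b)
{
  return a + b;
}

/* Hypothetical helper (an assumption, not from the report): adds two
   2-element char arrays through plus() so the result can be checked
   without relying on vector subscripting. */
void plus_bytes (const char a[2], const char b[2], char out[2])
{
  __v2qi va, vb, vr;

  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);
  vr = plus (va, vb);
  memcpy (out, &vr, sizeof vr);
}
```

Whether the addition is emitted as a single vector instruction depends on the compiler version and target flags; the sketch only checks the element-wise semantics.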

[Bug target/103842] [12 regression] ICE on ilmbase-2.5.7: during RTL pass: expand

2021-12-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103842 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #5) > Created attachment 52068 [details] > gcc12-pr103842.patch > > Untested fix. The patch is OK. Thanks, Uros.

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #17 from Uroš Bizjak --- (In reply to hubicka from comment #16) > > > > > > It could be done, but I was under impression that the sequence to load > > > 1.0f > > > into topmost elements nullifies the benefit of operation to divide

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #14 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #13) > Created attachment 52051 [details] > Patch that implements v2sf division This patch also enables vectorization of the testcase from Comment #7. Using -ffast-math,

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #13 from Uroš Bizjak --- Created attachment 52051 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52051&action=edit Patch that implements v2sf division Please try the attached patch, for the following testcase: --cut here-- fl

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #12 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #10) > At least on your short testcase clang doesn't use divps either. > We do support mulv2sf3, addv2sf3 etc. but not divv2sf3 I bet because with > TARGET_MMX_WITH_SSE

[Bug target/103772] [12 Regression] wrong code with -mavx512fp16 since r12-5700-g7eb961d83b0eda53

2021-12-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/103772] [12 Regression] wrong code with -mavx512fp16 since r12-5700-g7eb961d83b0eda53

2021-12-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103772 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-17 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #9 from Uroš Bizjak --- (In reply to Thiago Macieira from comment #0) > Testcase: ... > The assembly for this produces: > > vmovdqu16 (%rdi), %ymm1 > vmovdqu16 32(%rdi), %ymm2 > vpcmpuw $0, %ymm0,

[Bug target/103753] New: Unoptimal avx2 V16HF vector insert to element 0

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- (Cloned from PR103571#18) Following testcase: --cut here-- typedef _Float16 __v16hf __attribute__ ((__vector_size__ (32))); __v16hf foo (_Float16 x) { return (__v16hf

[Bug target/102812] Unoptimal (and wrong) code for _Float16 insert

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #28 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #18) > codegen for foo1/foo2 is suboptimal under -mavx2, i guess we can have > vec_setv16hf_0 and with vpblendw. True, some opportunities are missing from expand_vec_per

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #27 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #17) > (In reply to Hongtao.liu from comment #16) > > There're already testcases for vec_extract/vec_set/vec_duplicate, but those > > testcases are written under TARGET_A

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #25 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #22) > Yes, besides TARGET_VECTOR_MODE_SUPPORTED_P, other part in the attached > patch looks fine, the condition should be binded to real instructions but > not mode. OK

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51950|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51948|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #13 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #12) > Hongtao, can you please review the patch and perhaps test it a bit more? This part is missing from ix86_expand_vector_set_var: --cut here @@ -15912,7 +15921,8 @@

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #12 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #10) > Sure. Please find attached the complete patch that enables HF vector modes in Comment #11. The patch survives bootstrap and regression test and works OK for the f

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Uroš Bizjak changed: What|Removed |Added Attachment #51941|0 |1 is obsolete|

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #9 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #8) > (In reply to Uroš Bizjak from comment #6) > > (In reply to Hongtao.liu from comment #5) > > > > > There're several places in i386-expand.c which assume TARGET_AVX51

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #7 from Uroš Bizjak --- Created attachment 51941 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51941&action=edit Proposed patch The patch moves V2HF+V4HF+V8HF/V16HF/V32HF to VALID_SSE2/AVX256/AVX512F_REG_MODE. Also, intr

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #6 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #5) > There're several places in i386-expand.c which assume TARGET_AVX512FP16 for > case V8HF/V16HF/V32HF, if we want to put V8HF/V16HF/V32HF in > VALID_SSE2/AVX256/AVX51

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #4 from Uroš Bizjak --- (In reply to Hongyu Wang from comment #3) > So we may need to support V8HFmode in VALID_SSE2_REG_MODE if we don't want > to modify those function_args and function_value stuff. We have V8HFmode moves for TAR

[Bug target/103571] New: ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread ubizjak at gmail dot com via Gcc-bugs
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- typedef _Float16 v2hf __attribute__((vector_size(4))); typedef _Float16 v4hf __attribute__((vector_size(8))); typedef _Float16 v8hf

[Bug c++/102881] gcc totally broken when trailing return type combine with decltype lambda

2021-12-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102881 --- Comment #4 from Uroš Bizjak --- > The master branch has been updated by Uros Bizjak : Oops, wrong PR number...

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-12-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #26 from Uroš Bizjak --- The testcase now compiles with -O2 -mf16c to: vpxor %xmm2, %xmm2, %xmm2 vpblendw $1, %xmm0, %xmm2, %xmm0 vpblendw $1, %xmm1, %xmm2, %xmm1 vcvtph2ps %xmm1

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #21 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #20) > (In reply to Uroš Bizjak from comment #18) > > (In reply to Uroš Bizjak from comment #17) > > > (In reply to Hongtao.liu from comment #16) > > > > > > > ix86_expa

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #18 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #17) > (In reply to Hongtao.liu from comment #16) > > > ix86_expand_vector_set is mainly used by vec_set_optab which exactly takes > > target as both input and output, i

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #17 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #16) > ix86_expand_vector_set is mainly used by vec_set_optab which exactly takes > target as both input and output, it seems we can't create a new target for > that. O

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #15 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #14) > (In reply to Uroš Bizjak from comment #13) > > (In reply to Hongtao.liu from comment #12) > > > > > > > > Just noticed that for some reason two VPXORs are emitted

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #13 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #12) > > > > Just noticed that for some reason two VPXORs are emitted. One should be > > enough for both VPINSRW insns. > > With new alternative in your attached match(

[Bug middle-end/103439] genemit emits dead code

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 --- Comment #3 from Uroš Bizjak --- (In reply to rguent...@suse.de from comment #2) > On Fri, 26 Nov 2021, ubizjak at gmail dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 > > > > --- Com

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #10 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #7) > compiles with unpatched gcc -O2 -mf16c to: > > vmovss %xmm0, %xmm0, %xmm2 # 27[c=4 l=4] *movhf_internal/3 > pextrw $0, %xmm1, -4(%rsp)

[Bug middle-end/103439] genemit emits dead code

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103439 --- Comment #1 from Uroš Bizjak --- (In reply to Richard Biener from comment #0) > I'm not sure if there are valid cases where we have a mix of a direct > RTL pattern and manual expansion, so where the { } part falls thru. Yes, we have quite so

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #8 from Uroš Bizjak --- diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 68606e57e60..a2ebaa5ac63 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2528,12 +2528,12 @@ case TYPE_SSELOG:

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #7 from Uroš Bizjak --- The improvement with patch from comment #6: The testcase: _Float16 test (_Float16 a, _Float16 b) { return a + b; } compiles with unpatched gcc -O2 -mf16c to: vmovss %xmm0, %xmm0, %xmm2 # 27

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-11-26 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #6 from Uroš Bizjak --- Created attachment 51879 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51879&action=edit Improve HI/HFmode scalar insert The attached patch further improves HFmode -> SFmode conversion. HFmode values a

[Bug c/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above

2021-11-24 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406 --- Comment #2 from Uroš Bizjak --- gcc/libgcc/config/i386/sfp-machine.h says: /* Here is something Intel misdesigned: the specs don't define the case where we have two NaNs with same mantissas, but different sign. Different operations pi

[Bug target/103074] [11/12 Regression] ICE in lra_assign, at lra-assigns.c:1649 since r11-5066-gbe39636d9f68c437

2021-11-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103074 --- Comment #4 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #3) > Ah, actually what I see is that sched1 swaps the order of: > (insn 22 21 23 4 (parallel [ > (set (reg:SI 88) > (ashiftrt:SI (reg/v:SI 8

[Bug target/103193] gcc for x86_64: wrong code generation: ucomiss instead of comiss

2021-11-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103193 --- Comment #2 from Uroš Bizjak --- Here is compilable testcase: _Bool a (void) { #pragma STDC FENV_ACCESS ON float f; _Bool b; f = __builtin_nan (""); b = f >= f; // ucomiss (wrong), comiss (correct) return b; } Tree optimizers (_.o

[Bug target/103020] New: ICE with V1TImode on x86_32

2021-11-01 Thread ubizjak at gmail dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- typedef int __v1ti __attribute__((mode (V1TI))); __v1ti foo (__v1ti a) { return a + a; } --cut here-- ICEs with gcc -O2 -msse2 -m32: v1ti.c:1:1: warning

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-11-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 --- Comment #7 from Uroš Bizjak --- IMO, inlined fmod (and drem) should eventually be expanded in a generic way in the middle-end as: fmod (a, p) = a - trunc (a/p) * p drem (a, p) = a - roundeven (a/p) * p so division can be later simplified t

[Bug target/103008] poor inlined builtin_fmod on x86_64

2021-10-31 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103008 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug target/100119] [x86] Conversion unsigned int -> double produces -0 (-m32 -msse2 -mfpmath=sse)

2021-10-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100119 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|---

[Bug target/102761] [10/11/12 Regression] ICE with -O1 and above: in ix86_print_operand_address_as due to %a0 and if_then_else and X constraint

2021-10-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102761 --- Comment #7 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #5) > Not a regression, so let's not backport this too far. > > Fixed for gcc-11.3+. Actually, gcc-10.4+.

[Bug target/102761] [10/11/12 Regression] ICE with -O1 and above: in ix86_print_operand_address_as due to %a0 and if_then_else and X constraint

2021-10-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102761 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug rtl-optimization/102840] [12 Regression] gcc.target/i386/pr22076.c by r12-4475

2021-10-21 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102840 Uroš Bizjak changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/102812] Unoptimal (and wrong) code for _Float16 insert

2021-10-20 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102812 --- Comment #2 from Uroš Bizjak --- Please note that the code above should compile via ix86_expand_vector_set, similar to: --cut here-- typedef short v8hi __attribute__((__vector_size__(16))); v8hi foo (short a) { return (v8hi) {a, 0, 0, 0,

[Bug target/102812] New: Unoptimal (and wrong) code for _Float16 insert

2021-10-18 Thread ubizjak at gmail dot com via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following code: --cut here-- typedef _Float16 v8hf __attribute__((__vector_size__ (16))); v8hf t (_Float16 a) { return (v8hf){a, 0, 0, 0, 0, 0, 0, 0}; } --cut here
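
To make the "insert to element 0" pattern concrete without depending on `_Float16` support, here is a sketch of the integer analog, using the `v8hi` type that a follow-up comment on this PR compares against. The names `t_hi` and `t_hi_store` are hypothetical, and the sketch assumes the GCC vector extension.

```c
#include <string.h>

/* Eight-element short vector (GCC vector extension). */
typedef short v8hi __attribute__((__vector_size__ (16)));

/* Integer analog of t() from the report: the scalar goes into
   element 0 and the remaining seven elements are zero. */
v8hi t_hi (short a)
{
  return (v8hi){a, 0, 0, 0, 0, 0, 0, 0};
}

/* Hypothetical helper (an assumption, not from the report): stores
   the vector into a plain array so each element can be inspected. */
void t_hi_store (short a, short out[8])
{
  v8hi v = t_hi (a);

  memcpy (out, &v, sizeof v);
}
```

The sketch only demonstrates the expected element layout; the instruction sequence the compiler chooses for it is what the report is about.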

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-10-18 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #2 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #1) > vxorps %xmm1, %xmm1, %xmm1 > vblendps %xmm1, %xmm1, %xmm0, $1 > vcvtps2ph %xmm0, %xmm1 vmovss %xmm1, %xmm1, %xmm0 instead of vblendps would also do the trick.

[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-10-18 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #1 from Uroš Bizjak --- Something like (argument and result in %xmm0): vpxor %xmm1, %xmm1, %xmm1 vpblendw %xmm1, %xmm1, %xmm0, $1 vcvtph2ps %xmm0, %xmm1 instead of __extendhfsf2 and: vxorps %xmm1, %xmm1, %xmm1 vblendps %xmm1, %xmm

[Bug target/102811] New: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c

2021-10-18 Thread ubizjak at gmail dot com via Gcc-bugs
Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- The following testcase: _Float16 test (_Float16 a, _Float16 b) { return a + b; } compiles with -O2 -mf16c to

[Bug target/102761] [10/11/12 Regression] ICE with -O1 and above: in ix86_print_operand_address_as due to %a0 and if_then_else and X constraint

2021-10-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102761 --- Comment #2 from Uroš Bizjak --- diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index fb656094e9e..5d019f8e058 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -13921,7 +13921,10 @@ ix86_print_operand_address_

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #25 from Uroš Bizjak --- (In reply to jos...@codesourcery.com from comment #24) > This is a fundamentally different test, because it involves (in the > abstract machine) lvalue-to-rvalue conversion of a sNaN representation. > That

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #21 from Uroš Bizjak --- (In reply to Zoltan Vajda from comment #19) > The problem does not only apply for conditional moves! I can turn on sse, > for example. > > https://gcc.godbolt.org/z/jP3Kne8T5 > > Then the problematic code wi

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #20 from Uroš Bizjak --- (In reply to jos...@codesourcery.com from comment #16) > I don't think this bug is anything to do with -fsignaling-nans, for the > same reason as applies to bug 58416 and bug 71460. The situation is hopeless

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-14 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #18 from Uroš Bizjak --- The following patch fixes the PR, see the comment inline: --cut here-- diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 6e2b7920d2b..b87490fe544 100644 --- a/gcc/config/i386/i3

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #13 from Uroš Bizjak --- (In reply to Zoltan Vajda from comment #12) > Using -mfpmath=sse here does not help on a 32 bit platform. > https://gcc.godbolt.org/z/hs1Ef6aj4 > At line 31 the assembly code performs the speculative load. Ye

[Bug target/93934] Unnecessary fld of uninitialized float stack variable results in ub of valid C++ code

2021-10-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93934 --- Comment #11 from Uroš Bizjak --- (In reply to Zoltan Vajda from comment #9) > As I understand it, it is acknowledged, that this is a bug. However, the > issue is in state NEW for a quite long time. The issue is still present in > GCC 11.2. Do

[Bug target/15533] Missed move to partial register

2021-10-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15533 --- Comment #6 from Uroš Bizjak --- (In reply to Cesar Eduardo Barros from comment #0) > When compiling: > #include > #define regparm __attribute__((regparm(3))) > uint8_t a; > uint16_t regparm fn(uint16_t b) > { return (b & ~0xFF) | a; } This

[Bug target/85730] complex code for modifying lowest byte in a 4-byte vector

2021-10-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85730 --- Comment #9 from Uroš Bizjak --- An interesting observation with the following testcase: --cut here-- typedef char V __attribute__((vector_size(4))); struct S { char val; char pad1; char pad2; char pad3; }; V foo (V v) { v[0] <<=

[Bug target/85730] complex code for modifying lowest byte in a 4-byte vector

2021-10-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85730 --- Comment #8 from Uroš Bizjak --- Created attachment 51564 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51564&action=edit Prototype patch Attached patch works around reload problem and creates: movl %edi, %eax movb

[Bug target/85730] complex code for modifying lowest byte in a 4-byte vector

2021-10-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85730 --- Comment #6 from Uroš Bizjak --- (In reply to Richard Biener from comment #5) > The GIMPLE IL is now using BIT_INSERT_EXPRs consistently for all cases and > combine does > > Trying 8 -> 11: > 8: {r90:SI=r89:SI<<0x1;clobber flags:CC;} >

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Target

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954 --- Comment #10 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #9) > (In reply to Hongtao.liu from comment #8) > > (In reply to Uroš Bizjak from comment #7) > > > Created attachment 51496 [details] > > > Prototype patch > > > > +;; c

[Bug target/60778] shift not folded into shift on x86-64

2021-09-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60778 --- Comment #5 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #4) > > > We have 2->2 combine now but it looks like we don't try to split inside a > > > mem ... > > > > This address looks similar to how Alpha implemented unaligned l

[Bug target/60778] shift not folded into shift on x86-64

2021-09-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60778 --- Comment #3 from Uroš Bizjak --- (In reply to Andrew Pinski from comment #2) > Trying 7 -> 8: > 7: {r87:DI=r89:DI>>0x3;clobber flags:CC;} > REG_DEAD r89:DI > REG_UNUSED flags:CC > 8: r88:DF=[r87:DI*0x8+`mem'] > REG_DE

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-22 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954 --- Comment #7 from Uroš Bizjak --- Created attachment 51496 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51496&action=edit Prototype patch

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-11 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954 --- Comment #5 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #4) > It looks like there's splitter in aarch64 which combines > load+xor+zero_extend to zero_extend(mem) + xor, x86 doesn't have. The simple > way is to add corresponding

[Bug target/102143] ABI incompatibility with clang when passing 32bit vectors on 32bit i686

2021-09-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102143 --- Comment #5 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #3) > gcc has also some problems in this area. With -mregparm=3, one would expect > arguments obeying integer ABI to be passed in registers, but regparm flag > has no effe

[Bug target/102143] ABI incompatibility with clang when passing 32bit vectors on 32bit i686

2021-09-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102143 --- Comment #3 from Uroš Bizjak --- gcc has also some problems in this area. With -mregparm=3, one would expect arguments obeying integer ABI to be passed in registers, but regparm flag has no effect and the same code is produced: foo:

[Bug target/102143] ABI incompatibility with clang when passing 32bit vectors on 32bit i686

2021-09-01 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102143 --- Comment #2 from Uroš Bizjak --- (In reply to H.J. Lu from comment #1) > 16-bit and 32-bit vector pass and return are not specified in i386 psABI. > 64-bit vector is specified, not really usable. Any suggestions? With -mno-sse, clang create

[Bug target/102143] New: ABI incompatibility with clang when passing 32bit vectors on 32bit i686

2021-08-31 Thread ubizjak at gmail dot com via Gcc-bugs
: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- Following testcase: --cut here-- typedef short __v2hi __attribute__((__vector_size__ (4))); __v2hi foo (__v2hi x, __v2hi y

[Bug target/51838] Inefficient add of 128 bit quantity represented as 64 bit tuple to 128 bit integer.

2021-08-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51838 --- Comment #3 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #2) > The interest thing is when i remove addti3 and ashlti3 from i386.md, GCC > generates optimal code. Yes, we had this situation with _doubleword instructions, and it

[Bug target/102057] [12 Regression] ICE at -O2 in extract_constrain_insn, at recog.c:2670

2021-08-26 Thread ubizjak at gmail dot com via Gcc-bugs
|--- |FIXED Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Target||x86_64 --- Comment #6 from Uroš Bizjak --- Fixed.

[Bug target/102057] [12 Regression] ICE at -O2 in extract_constrain_insn, at recog.c:2670

2021-08-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102057 --- Comment #4 from Uroš Bizjak --- Oooh, default argument! --cut here-- diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 41d85623ad6..528116dfe2d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -19475,

[Bug target/102057] [12 Regression] ICE at -O2 in extract_constrain_insn, at recog.c:2670

2021-08-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102057 Uroš Bizjak changed: What|Removed |Added CC||sayle at gcc dot gnu.org --- Comment #3 f

[Bug target/102057] ICE at -O2 in extract_constrain_insn, at recog.c:2670

2021-08-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102057 --- Comment #1 from Uroš Bizjak --- A recently added peephole is converting: (insn 229 108 280 22 (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (ashiftrt:SI (reg:SI 0 ax [orig:90 _12 ] [90])

[Bug target/102027] [11/12 Regression] ABI break when using vector type in function arg/return value

2021-08-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102027 --- Comment #2 from Uroš Bizjak --- gcc-11 does: : 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # b

[Bug target/102027] [11/12 Regression] ABI break when using vector type in function arg/return value

2021-08-23 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102027 --- Comment #1 from Uroš Bizjak --- (In reply to Marek Polacek from comment #0) > We have an ABI break. Discovered by > Running > /root/rpmbuild/BUILD/gcc-11.2.1-20210728/gcc/testsuite/gcc.dg/compat/struct- > layout-1.exp ... > FAIL: tmpdir-gcc

[Bug target/101930] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-2888-g8c8df06e46493f6c

2021-08-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101930 --- Comment #2 from Uroš Bizjak --- (In reply to Hongtao.liu from comment #1) > (> Maybe similar to PR101860. > > No, it's a different issue. > > Should be fixed by > > { > rtx op2 = gen_reg_rtx (mode); > > - if (!nonimmediat

[Bug target/101812] [12 Regression] ICE: Segmentation fault (in ix86_expand_sse_movcc) since r12-731-gb1f7fd8a2a5558da

2021-08-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101812 Uroš Bizjak changed: What|Removed |Added CC|uros at gcc dot gnu.org| Resolution|---

[Bug fortran/101660] [12 Regression] FAIL: gfortran.dg/bind_c_array_params_3.f90

2021-08-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101660 Uroš Bizjak changed: What|Removed |Added CC||ubizjak at gmail dot com --- Comment #7

[Bug libfortran/101820] fatal error: ISO_Fortran_binding.h: No such file or directory

2021-08-09 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101820 Uroš Bizjak changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW

[Bug libfortran/101820] New: fatal error: ISO_Fortran_binding.h: No such file or directory

2021-08-08 Thread ubizjak at gmail dot com via Gcc-bugs
Priority: P3 Component: libfortran Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: --- There are quite a few FAILs in the gfortran testsuite, e.g.: FAIL: gfortran.dg/ISO_Fortran_binding_1.f90 -O0 (test for excess

[Bug target/101812] [12 Regression] ICE: Segmentation fault (in ix86_expand_sse_movcc) since r12-731-gb1f7fd8a2a5558da

2021-08-08 Thread ubizjak at gmail dot com via Gcc-bugs
|ASSIGNED Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Version|unknown |12.0 --- Comment #2 from Uroš Bizjak --- Patch in testing: --cut here-- diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index

[Bug target/101797] ICE on valid code at -O2 and -O3: in extract_constrain_insn, at recog.c:2670

2021-08-06 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101797 Uroš Bizjak changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/101797] ICE on valid code at -O2 and -O3: in extract_constrain_insn, at recog.c:2670

2021-08-06 Thread ubizjak at gmail dot com via Gcc-bugs
gcc dot gnu.org |ubizjak at gmail dot com Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2021-08-06 --- Comment #1 from Uroš Bizjak --- Patch in testing: --cut here-- diff --git a/gcc

[Bug c++/100977] [C++23] Implement C++ Identifier Syntax using Unicode Standard Annex 31

2021-08-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100977 --- Comment #8 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #7) > True, but is it worth changing on a tool that is one twice in a decade? Well, the question is self-answering ;)

[Bug c++/100977] [C++23] Implement C++ Identifier Syntax using Unicode Standard Annex 31

2021-08-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100977 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #3) > - printf ("{ %s|%s|%s|%s|%s|%s|%s|%s|%s, %3d, %#06x },\n", > + printf ("{ %s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s, %3d, %#06x },\n", BTW: You can also use width

[Bug target/101761] Random hang with 29_atomics/atomic_ref/wait_notify.cc

2021-08-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101761 --- Comment #2 from Uroš Bizjak --- (In reply to H.J. Lu from comment #0) > 29_atomics/atomic_ref/wait_notify.cc in 64-bit on Skylake server: > > It happens about once a few weeks. while true ; do ./a.out ; done will hang immediately.

[Bug target/101761] Random hang with 29_atomics/atomic_ref/wait_notify.cc

2021-08-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101761 --- Comment #1 from Uroš Bizjak --- Probably related to PR97936.

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-08-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #39 from Uroš Bizjak --- Please open a new bug report; the failures in Comment 37 and Comment 38 have nothing to do with r7-1112-gbeed3701c796842abbfb27d7484b35bd82818740, which was fully reverted. (FTR, fixed peepholes were ineffective, so the

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-07-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 Uroš Bizjak changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-07-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #31 from Uroš Bizjak --- (In reply to H.J. Lu from comment #30) > (In reply to Uroš Bizjak from comment #29) > > (In reply to H.J. Lu from comment #28) > > > 29_atomics/atomic_ref/wait_notify.c has the same issue on Linux/x86-64 > >

[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686

2021-07-19 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #29 from Uroš Bizjak --- (In reply to H.J. Lu from comment #28) > 29_atomics/atomic_ref/wait_notify.c has the same issue on Linux/x86-64 with > -m32: Are you sure? The mentioned peephole2 patterns now emit only x87 or SSE DFmode loa

[Bug target/101346] ICE: maximum number of generated reload insns per insn achieved (90)

2021-07-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101346 Uroš Bizjak changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com Ever

[Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal

2021-07-13 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434 --- Comment #3 from Uroš Bizjak --- (In reply to Richard Biener from comment #1) > Probably low priority if not doable nicely w/o XOP. -mxop can be substituted with -mavx512bw -mavx512vl for the same effect.
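A minimal sketch of a vector-by-vector left shift on byte elements, written with GCC's generic vector extension, may help when comparing the two option sets mentioned above. The type name `v16qu` and the function name `shl_v16qu` are illustrative assumptions and are not taken from the PR.

```c
/* 16 unsigned bytes in one vector (GCC generic vector extension).  */
typedef unsigned char v16qu __attribute__ ((__vector_size__ (16)));

/* Hypothetical example function: shift each byte of A left by the
   count held in the matching byte of B.  Counts must stay below 8,
   the element width, for the result to be defined.  */
v16qu
shl_v16qu (v16qu a, v16qu b)
{
  return a << b;
}
```

Compiling this once with `-O2 -mxop` and once with `-O2 -mavx512bw -mavx512vl`, then comparing the generated assembly, is one way to check the claim on a given compiler version.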
