[Bug target/26915] missed sized opt returning -1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915

--- Comment #8 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:247c407c83f0015f4b92d5f71e45b63192f6757e

commit r12-4475-g247c407c83f0015f4b92d5f71e45b63192f6757e
Author: Roger Sayle
Date:   Mon Oct 18 12:15:40 2021 +0100

Try placing RTL folded constants in the constant pool.

My recent attempts to come up with a testcase for my patch to evaluate
ss_plus in simplify-rtx.c, identified a missed optimization opportunity
(that's potentially a long-time regression): The RTL optimizers no longer
place constants in the constant pool.

The motivating x86_64 example is the simple program:

    typedef char v8qi __attribute__ ((vector_size (8)));

    v8qi foo()
    {
      v8qi tx = { 1, 0, 0, 0, 0, 0, 0, 0 };
      v8qi ty = { 2, 0, 0, 0, 0, 0, 0, 0 };
      v8qi t = __builtin_ia32_paddsb(tx, ty);
      return t;
    }

which (with my previous patch) currently results in:

    foo:
            movq    .LC0(%rip), %xmm0
            movq    .LC1(%rip), %xmm1
            paddsb  %xmm1, %xmm0
            ret

even though the RTL contains the result in a REG_EQUAL note:

    (insn 7 6 12 2 (set (reg:V8QI 83)
            (ss_plus:V8QI (reg:V8QI 84)
                (reg:V8QI 85))) "ssaddqi3.c":7:12 1419 {*mmx_ssaddv8qi3}
         (expr_list:REG_DEAD (reg:V8QI 85)
            (expr_list:REG_DEAD (reg:V8QI 84)
                (expr_list:REG_EQUAL (const_vector:V8QI [
                            (const_int 3 [0x3])
                            (const_int 0 [0]) repeated x7
                        ])
                    (nil)))))

Together with the patch below, GCC will now generate the much more
sensible:

    foo:
            movq    .LC2(%rip), %xmm0
            ret

My first approach was to look in cse.c (where the REG_EQUAL note gets
added) and notice that the constant pool handling functionality has been
unreachable for a while.  A quick search for constant_pool_entries_cost
shows that it's initialized to zero, but never set to a non-zero value,
meaning that force_const_mem is never called.
This functionality used to work way back in 2003, but has been lost over
time:
https://gcc.gnu.org/pipermail/gcc-patches/2003-October/116435.html

The changes to cse.c below restore this functionality (placing suitable
constants in the constant pool) with two significant refinements:
(i) it only attempts to do this if the function already uses a constant
pool (thanks to the availability of crtl->uses_constant_pool since 2003).
(ii) it allows different constants (i.e. modes) to have different costs,
so that floating point "doubles" and 64-bit, 128-bit, 256-bit and 512-bit
vectors don't all have to share the same cost.  Back in 2003, the
assumption was that everything in a constant pool had the same cost,
hence the global variable constant_pool_entries_cost.

Although this is a useful CSE fix, it turns out that it doesn't cure my
motivating problem above.  CSE only considers a single instruction, so
determines that it's cheaper to perform the ss_plus (COSTS_N_INSNS(1))
than read the result from the constant pool (COSTS_N_INSNS(2)).  It's
only when the other reads from the constant pool are also eliminated,
that this transformation is a win.

Hence a better place to perform this transformation is in combine, where
after failing to "recog" the load of a suitable constant, it can retry
after calling force_const_mem.  This achieves the desired transformation
and allows the backend insn_cost call-back to control whether or not
using the constant pool is preferable.

Alas, it's rare to change code generation without affecting something in
GCC's testsuite.  On x86_64-pc-linux-gnu there were two families of new
failures (and I'd predict similar benign fallout on other platforms).
One failure was gcc.target/i386/387-12.c (aka PR target/26915), where the
test is missing an explicit -m32 flag.
On i686, it's very reasonable to materialize -1.0 using "fld1; fchs", but
on x86_64-pc-linux-gnu we currently generate the awkward:

    testm1:
            fld1
            fchs
            fstpl   -8(%rsp)
            movsd   -8(%rsp), %xmm0
            ret

which combine now very reasonably simplifies to just:

    testm1:
            movsd   .LC3(%rip), %xmm0
            ret

The other class of x86_64-pc-linux-gnu failure was from materialization
of vector constants using vpbroadcast (e.g. gcc.target/i386/pr90773-17.c)
where the decision is finely balanced; the load of an integer register
with an immediate constant, followed by a vpbroadcast is deemed to be
COSTS_N_INSNS(2), whereas a load from the constant pool is also reported
as COSTS_N_INSNS(2).  My solution is to tweak the i386.c's rtx_costs so
that all other things being equal, an instruction (sequence) that
accesses memory is
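
To make the CSE-versus-combine cost argument from the commit message concrete, here is a small arithmetic sketch in C. It is not GCC code: the function names (cse_prefers_pool, cost_before, cost_after) are hypothetical, and the local COSTS_N_INSNS macro is only assumed to mirror the one in GCC's rtl.h (the scale factor does not matter, only the ratios). The per-insn costs are the ones quoted in the message for the v8qi example.

```c
/* Assumed to mirror the macro in GCC's rtl.h; only ratios matter here. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Per-insn costs quoted in the commit message for the v8qi example. */
enum {
    COST_SS_PLUS   = COSTS_N_INSNS (1),  /* the ss_plus (paddsb) itself */
    COST_POOL_LOAD = COSTS_N_INSNS (2)   /* one read from the constant pool */
};

/* Hypothetical model of CSE's view: a single insn considered in
   isolation.  Nonzero would mean "replace ss_plus by a pool load".  */
static int cse_prefers_pool (void)
{
    return COST_POOL_LOAD < COST_SS_PLUS;
}

/* Whole sequence before folding: two pool loads feeding the ss_plus. */
static int cost_before (void)
{
    return 2 * COST_POOL_LOAD + COST_SS_PLUS;
}

/* Whole sequence after folding: one pool load of the folded result. */
static int cost_after (void)
{
    return COST_POOL_LOAD;
}
```

Under these assumptions cse_prefers_pool() is 0, because a pool load looks more expensive than the single ss_plus, while cost_after() is well below cost_before() once the two input loads are counted as well. That is the sense in which the transformation only wins when the other constant pool reads are eliminated too.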
[Bug target/26915] missed sized opt returning -1.0
--- Comment #8 from chaoyingfu at gcc dot gnu dot org  2006-12-01 00:07 ---
Subject: Bug 26915

Author: chaoyingfu
Date: Fri Dec  1 00:05:26 2006
New Revision: 119383

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119383
Log:
Merged revisions 118455-118543 via svnmerge from
svn+ssh://[EMAIL PROTECTED]/svn/gcc/trunk

r118455 | fxcoudert | 2006-11-03 03:51:09 -0800 (Fri, 03 Nov 2006) | 20 lines

        PR libfortran/27895
        * intrinsics/reshape_generic.c (reshape_internal): Fix so that it
        works correctly for zero-sized arrays.
        * m4/reshape.m4: Likewise.
        * generated/reshape_r16.c: Regenerate.
        * generated/reshape_c4.c: Regenerate.
        * generated/reshape_i4.c: Regenerate.
        * generated/reshape_c16.c: Regenerate.
        * generated/reshape_r10.c: Regenerate.
        * generated/reshape_r8.c: Regenerate.
        * generated/reshape_c10.c: Regenerate.
        * generated/reshape_c8.c: Regenerate.
        * generated/reshape_i8.c: Regenerate.
        * generated/reshape_i16.c: Regenerate.
        * generated/reshape_r4.c: Regenerate.

        * gcc/testsuite/gfortran.dg/zero_sized_1.f90: Uncomment checks
        for RESHAPE.

r118458 | amylaar | 2006-11-03 06:52:19 -0800 (Fri, 03 Nov 2006) | 97 lines

gcc:

2006-11-03  Jorn Rennecke  [EMAIL PROTECTED]

        * config/sh/crt1.asm: Fix #ifdef indent.

2006-11-03  Jorn Rennecke  [EMAIL PROTECTED]

        Merged from STMicroelectronics sources:

        2006-10-06  Andrew Stubbs  [EMAIL PROTECTED]

        * config/sh/crt1.asm (vbr_600): Add missing #if.

        2006-08-03  Jorn Rennecke  [EMAIL PROTECTED]

        * sh.opt (mfused-madd): New option.
        * sh.md (mac_media, macsf3): Make conditional on TARGET_FMAC.

        2006-07-04  Andrew Stubbs  [EMAIL PROTECTED]

        * config/sh/crt1.asm (vbr_start): Move to new section .test.vbr.
        Remove pointless handler at VBR+0.
        (vbr_200, vbr_300, vbr_500): Remove pointless handler.
        (vbr_600): Save and restore mach and macl, fpul and fpscr and
        fr0 to fr7.  Make sure the timer handler is called with the
        correct FPU precision setting, according to the ABI.

        2006-06-14  Jorn Rennecke  [EMAIL PROTECTED]

        * config/sh/sh.opt (m2a-single, m2a-single-only): Fix Condition.
        * config/sh/sh.h (SUPPORT_SH2A_NOFPU): Fix condition.
        (SUPPORT_SH2A_SINGLE_ONLY, SUPPORT_SH2A_SINGLE_ONLY): Likewise.

        2006-06-09  Jorn Rennecke  [EMAIL PROTECTED]

        * sh.md (cmpgeusi_t): Change into define_insn_and_split.  Accept
        zero as second operand.

        2006-04-28  Jorn Rennecke  [EMAIL PROTECTED]

        * config/sh/divtab-sh4-300.c, config/sh/lib1funcs-4-300.asm:
        Fixed some bugs related to negative values, in particular -0
        and overflow at -0x8000.
        * config/sh/divcost-analysis: Added sh4-300 figures.

        2006-04-27  Jorn Rennecke  [EMAIL PROTECTED]

        * config/sh/t-sh (MULTILIB_MATCHES): Add -m4-300* / -m4-340
        options.

        2006-04-26  Jorn Rennecke  [EMAIL PROTECTED]

        * config/sh/t-sh (OPT_EXTRA_PARTS): Add libgcc-4-300.a.
        ($(T)div_table-4-300.o, $(T)libgcc-4-300.a): New rules.
        * config/sh/divtab-sh4-300.c, config/sh/lib1funcs-4-300.asm:
        New files.
        * config/sh/embed-elf.h (LIBGCC_SPEC): Use -lgcc-4-300 for
        -m4-300* / -m4-340.

        2006-04-24  Jorn Rennecke  [EMAIL PROTECTED]

        SH4-300 scheduling description fixes to SH4-[12]00 description:
        * sh.md: New instruction types: fstore, movi8, fpscr_toggle,
        gp_mac, mac_mem, mem_mac, dfp_mul, fp_cmp.
        (insn_class, dfp_comp, any_fp_comp): Update.
        (push_fpul, movsf_ie, fpu_switch, toggle_sz, toggle_pr): Update
        type.
        (cmpgtsf_t, cmpeqsf_t, cmpgtsf_t_i4, cmpeqsf_t_i4): Likewise.
        (muldf3_i): Likewise.
        (movsi_i): Split rI08 alternative into two separate alternatives.
        Update type.
        (movsi_ie, movsi_i_lowpart): Likewise.
        (movqi_i): Split ri alternative into two separate alternatives.
        Update type.
        * sh1.md (sh1_load_store, sh1_fp): Update.
        * sh4.md (sh4_store, sh4_mac_gp, fp_arith, fp_double_arith):
        Update.
        (mac_mem, sh4_fpscr_toggle): New insn_reservations.
        * sh4a.md (sh4a_mov, sh4a_load, sh4a_store, sh4a_fp_arith):
        Update.
        (sh4a_fp_double_arith): Likewise.
        * sh4-300.md: New file.
        * sh.c (sh_handle_option): Handle m4-300* options.
        (sh_adjust_cost): Fix latency of auto-increments.
        Handle SH4-300 differently than other SH4s.  Check for new insn
        types.
        * sh.h (OVERRIDE_OPTIONS): Initialize sh_branch_cost if it has
        not been set by an option.
        * sh.opt (m4-300, m4-100-nofpu, m4-200-nofpu): New options.
        (m4-300-nofpu, -m4-340, m4-300-single, m4-300-single-only):
[Bug target/26915] missed sized opt returning -1.0
--- Comment #4 from pluto at agmk dot net  2006-11-04 09:10 ---
Created an attachment (id=12545)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12545&action=view)
patch from Uros Bizjak.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915
[Bug target/26915] missed sized opt returning -1.0
--- Comment #5 from pluto at agmk dot net  2006-11-04 09:11 ---
with attached patch gcc42 produces:

$ ./xgcc -B. PR26915.c -m32 -S -Os -fomit-frame-pointer

        .file   "PR26915.c"
        .text
.globl minus1
        .type   minus1, @function
minus1:
        fld1
        fchs
        ret
        .size   minus1, .-minus1
        .ident  "GCC: (GNU) 4.2.0 20061030 (prerelease) (PLD-Linux)"
        .section        .note.GNU-stack,"",@progbits

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915
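
For readers without the attachment: the contents of PR26915.c are not shown in this thread, so the following is only a guess at a minimal reproducer, based on the function name minus1 in the assembly output and on the bug title. The return type in particular is an assumption.

```c
/* Assumed reproducer, not the actual PR26915.c: a function whose only
   job is to return the constant -1.0.  */
double minus1 (void)
{
    return -1.0;
}
```

Compiled with the options from the comment (-m32 -S -Os -fomit-frame-pointer), a function of this shape is the kind of input for which the patched compiler is reported to emit fld1 followed by fchs.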
[Bug target/26915] missed sized opt returning -1.0
--- Comment #6 from uros at gcc dot gnu dot org  2006-11-04 23:12 ---
Subject: Bug 26915

Author: uros
Date: Sat Nov  4 23:12:16 2006
New Revision: 118484

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118484
Log:
        PR target/26915
        * config/i386/i386.c (standard_80387_constant_p): Treat
        -0.0 and -1.0 as a valid 80387 constant.
        (standard_80387_constant_opcode): Return # for -0.0 and -1.0.
        * config/i386/i386.md (unnamed splitter): Split the load of
        constant -0.0 or -1.0 into the load of 0.0 or 1.0, followed
        by negation.

testsuite/ChangeLog:

        PR target/26915
        * gcc.target/i386/387-12.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/387-12.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.md
    trunk/gcc/testsuite/ChangeLog

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915
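
The idea in the ChangeLog entry above (load 0.0 or 1.0, then negate) can be modelled in a few lines of C. This is a hypothetical illustration only, not the code in i386.c or i386.md: the names x87_load, x87_plan and plan_constant are invented here, and the sketch covers only the zero/one family of constants mentioned in the log.

```c
#include <math.h>

/* Hypothetical model, not GCC code: which x87 load would supply the
   base constant for a given value.  */
enum x87_load { X87_NONE, X87_FLDZ, X87_FLD1 };

struct x87_plan {
    enum x87_load load;  /* base constant to load: 0.0 or 1.0 */
    int negate;          /* nonzero if the load is followed by fchs */
};

/* Decide how a constant could be materialized as "load, then
   optionally negate", following the split described in the log.  */
static struct x87_plan plan_constant (double c)
{
    struct x87_plan p = { X87_NONE, 0 };

    if (c == 0.0)                    /* true for both 0.0 and -0.0 */
        p.load = X87_FLDZ;
    else if (c == 1.0 || c == -1.0)
        p.load = X87_FLD1;

    if (p.load != X87_NONE)
        p.negate = signbit (c) != 0; /* the sign bit tells -0.0 from 0.0 */

    return p;
}
```

For example, plan_constant(-1.0) yields X87_FLD1 with negate set, matching the fld1; fchs sequence in comment #5, and plan_constant(-0.0) yields X87_FLDZ with negate set. Note the use of signbit rather than a comparison, since -0.0 == 0.0 compares equal.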
[Bug target/26915] missed sized opt returning -1.0
--- Comment #7 from ubizjak at gmail dot com  2006-11-04 23:15 ---
Fixed.

-- 

ubizjak at gmail dot com changed:

            What|Removed    |Added
----------------------------------------------------------------------------
             URL|           |http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00220.html
          Status|NEW        |RESOLVED
        Keywords|           |patch
   Known to fail|4.1.2 4.2.0|4.1.2
   Known to work|           |4.2.0
      Resolution|           |FIXED
Target Milestone|---        |4.3.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915
[Bug target/26915] missed sized opt returning -1.0
--- Comment #3 from pinskia at gcc dot gnu dot org  2006-05-02 08:08 ---
Confirmed.

-- 

pinskia at gcc dot gnu dot org changed:

                    What|Removed                             |Added
----------------------------------------------------------------------------
OtherBugsDependingOnThis|                                    |16996
                  Status|UNCONFIRMED                         |NEW
          Ever Confirmed|0                                   |1
       GCC build triplet|i686-pld-linux                      |
        GCC host triplet|i686-pld-linux                      |
      GCC target triplet|i686-pld-linux                      |
                Keywords|                                    |missed-optimization
   Last reconfirmed date|-00-00 00:00:00                     |2006-05-02 08:08:49
                 Summary|missed optimization / returning -1.0|missed sized opt returning -1.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915