[Bug target/26915] missed sized opt returning -1.0

2021-10-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:247c407c83f0015f4b92d5f71e45b63192f6757e

commit r12-4475-g247c407c83f0015f4b92d5f71e45b63192f6757e
Author: Roger Sayle 
Date:   Mon Oct 18 12:15:40 2021 +0100

Try placing RTL folded constants in the constant pool.

My recent attempts to come up with a testcase for my patch to evaluate
ss_plus in simplify-rtx.c identified a missed optimization opportunity
(that's potentially a long-time regression): The RTL optimizers no longer
place constants in the constant pool.

The motivating x86_64 example is the simple program:

typedef char v8qi __attribute__ ((vector_size (8)));

v8qi foo()
{
  v8qi tx = { 1, 0, 0, 0, 0, 0, 0, 0 };
  v8qi ty = { 2, 0, 0, 0, 0, 0, 0, 0 };
  v8qi t = __builtin_ia32_paddsb(tx, ty);
  return t;
}

which (with my previous patch) currently results in:
foo:    movq    .LC0(%rip), %xmm0
        movq    .LC1(%rip), %xmm1
        paddsb  %xmm1, %xmm0
        ret

even though the RTL contains the result in a REG_EQUAL note:

(insn 7 6 12 2 (set (reg:V8QI 83)
        (ss_plus:V8QI (reg:V8QI 84)
            (reg:V8QI 85))) "ssaddqi3.c":7:12 1419 {*mmx_ssaddv8qi3}
     (expr_list:REG_DEAD (reg:V8QI 85)
        (expr_list:REG_DEAD (reg:V8QI 84)
            (expr_list:REG_EQUAL (const_vector:V8QI [
                        (const_int 3 [0x3])
                        (const_int 0 [0]) repeated x7
                    ])
                (nil)))))
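
The constant in the REG_EQUAL note can be sanity-checked with a scalar model
of ss_plus.  The sketch below is only an illustration in portable C, not GCC
code: the helper names ss_plus_qi and ss_plus_v8qi are made up here, and it
assumes ss_plus on V8QI means a lane-wise signed saturating add of 8-bit
elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed saturating add of one 8-bit lane (hypothetical helper).
   The sum is clamped to the range [INT8_MIN, INT8_MAX].  */
int8_t ss_plus_qi(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;   /* widen so the addition cannot overflow */
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return (int8_t)sum;
}

/* Lane-wise model of an 8-lane saturating add (hypothetical helper).  */
void ss_plus_v8qi(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < 8; i++)
        out[i] = ss_plus_qi(a[i], b[i]);
}
```

Feeding it the tx and ty vectors from foo() gives 3 in the first lane and 0
in the other seven, which matches the const_vector shown in the note.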

Together with the patch below, GCC will now generate the much
more sensible:
foo:    movq    .LC2(%rip), %xmm0
        ret

My first approach was to look in cse.c (where the REG_EQUAL note gets
added) and notice that the constant pool handling functionality has been
unreachable for a while.  A quick search for constant_pool_entries_cost
shows that it's initialized to zero, but never set to a non-zero value,
meaning that force_const_mem is never called.  This functionality used
to work way back in 2003, but has been lost over time:
https://gcc.gnu.org/pipermail/gcc-patches/2003-October/116435.html

The changes to cse.c below restore this functionality (placing suitable
constants in the constant pool) with two significant refinements:
(i) it only attempts to do this if the function already uses a constant
pool (thanks to the availability of crtl->uses_constant_pool since 2003).
(ii) it allows different constants (i.e. modes) to have different costs,
so that floating point "doubles" and 64-bit, 128-bit, 256-bit and 512-bit
vectors don't all have to share the same cost.  Back in 2003, the
assumption was that everything in a constant pool had the same
cost, hence the global variable constant_pool_entries_cost.
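
The two refinements can be pictured with a toy decision function.  This is
a sketch under stated assumptions, not the code in cse.c: the toy_mode enum,
the cost table and toy_use_constant_pool are all hypothetical names, and the
cost numbers are invented purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for machine modes; not GCC's machine_mode.  */
enum toy_mode { TOY_DF, TOY_V8QI, TOY_V16QI, TOY_V32QI, TOY_V64QI,
                TOY_NUM_MODES };

/* Assumed per-mode cost of loading a constant from the pool.
   The values are illustrative only.  */
static const int toy_pool_cost[TOY_NUM_MODES] = { 8, 8, 8, 12, 16 };

/* Return true if the constant should be placed in the constant pool.
   (i)  Only consider the pool if the function already uses one.
   (ii) Compare against a cost that depends on the constant's mode.  */
bool toy_use_constant_pool(bool function_uses_pool, enum toy_mode mode,
                           int cost_of_computing)
{
    assert((int)mode >= 0 && (int)mode < TOY_NUM_MODES);
    if (!function_uses_pool)
        return false;
    return toy_pool_cost[mode] < cost_of_computing;
}
```

With a single global cost, the last comparison could not distinguish a
double from a 512-bit vector; indexing the cost by mode is what lets the
two be treated differently.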

Although this is a useful CSE fix, it turns out that it doesn't cure
my motivating problem above.  CSE only considers a single instruction,
so determines that it's cheaper to perform the ss_plus (COSTS_N_INSNS(1))
than read the result from the constant pool (COSTS_N_INSNS(2)).  It's
only when the other reads from the constant pool are also eliminated
that this transformation is a win.  Hence a better place to perform
this transformation is in combine, where after failing to "recog" the
load of a suitable constant, it can retry after calling force_const_mem.
This achieves the desired transformation and allows the backend insn_cost
call-back to control whether or not using the constant pool is preferable.
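
The arithmetic behind this can be spelled out in a few lines of C.  This is
a sketch, not GCC code: COSTS_N_INSNS is defined with the same scaling GCC
uses in rtl.h, the two cost constants are the figures quoted above, and the
function names are hypothetical.

```c
#include <assert.h>

/* Same scaling as GCC's rtl.h: one fast instruction costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Costs quoted in the text above, taken here as fixed assumptions.  */
enum {
    COST_SS_PLUS   = COSTS_N_INSNS(1),  /* register-register ss_plus */
    COST_POOL_LOAD = COSTS_N_INSNS(2)   /* load from the constant pool */
};

/* Single-instruction view: compare the ss_plus alone with one pool
   load.  Returns nonzero if the pool load looks cheaper.  */
int single_insn_prefers_pool(void)
{
    return COST_POOL_LOAD < COST_SS_PLUS;
}

/* Sequence view: the operand loads feeding the ss_plus also count.  */
int sequence_cost_before(int n_operand_loads)
{
    assert(n_operand_loads >= 0);
    return n_operand_loads * COST_POOL_LOAD + COST_SS_PLUS;
}

/* After the transformation only one pool load remains.  */
int sequence_cost_after(void)
{
    return COST_POOL_LOAD;
}
```

Looking at one instruction, 8 units is not cheaper than 4, so the pool load
is rejected.  Looking at the whole sequence for foo(), two operand loads
plus the ss_plus cost 20 units, while the single replacement load costs 8.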

Alas, it's rare to change code generation without affecting something in
GCC's testsuite.  On x86_64-pc-linux-gnu there were two families of new
failures (and I'd predict similar benign fallout on other platforms).
One failure was gcc.target/i386/387-12.c (aka PR target/26915), where
the test is missing an explicit -m32 flag.  On i686, it's very reasonable
to materialize -1.0 using "fld1; fchs", but on x86_64-pc-linux-gnu we
currently generate the awkward:
testm1: fld1
        fchs
        fstpl   -8(%rsp)
        movsd   -8(%rsp), %xmm0
        ret

which combine now very reasonably simplifies to just:
testm1: movsd   .LC3(%rip), %xmm0
        ret

The other class of x86_64-pc-linux-gnu failure was from materialization
of vector constants using vpbroadcast (e.g. gcc.target/i386/pr90773-17.c)
where the decision is finely balanced; the load of an integer register
with an immediate constant, followed by a vpbroadcast is deemed to be
COSTS_N_INSNS(2), whereas a load from the constant pool is also reported
as COSTS_N_INSNS(2).  My solution is to tweak i386.c's rtx_costs
so that all other things being equal, an instruction (sequence) that
accesses memory is 

[Bug target/26915] missed sized opt returning -1.0

2006-11-30 Thread chaoyingfu at gcc dot gnu dot org


--- Comment #8 from chaoyingfu at gcc dot gnu dot org  2006-12-01 00:07 ---
Subject: Bug 26915

Author: chaoyingfu
Date: Fri Dec  1 00:05:26 2006
New Revision: 119383

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119383
Log:
Merged revisions 118455-118543 via svnmerge from 
svn+ssh://[EMAIL PROTECTED]/svn/gcc/trunk


  r118455 | fxcoudert | 2006-11-03 03:51:09 -0800 (Fri, 03 Nov 2006) | 20 lines

PR libfortran/27895

* intrinsics/reshape_generic.c (reshape_internal): Fix so that it
works correctly for zero-sized arrays.
* m4/reshape.m4: Likewise.
* generated/reshape_r16.c: Regenerate.
* generated/reshape_c4.c: Regenerate.
* generated/reshape_i4.c: Regenerate.
* generated/reshape_c16.c: Regenerate.
* generated/reshape_r10.c: Regenerate.
* generated/reshape_r8.c: Regenerate.
* generated/reshape_c10.c: Regenerate.
* generated/reshape_c8.c: Regenerate.
* generated/reshape_i8.c: Regenerate.
* generated/reshape_i16.c: Regenerate.
* generated/reshape_r4.c: Regenerate.

* gcc/testsuite/gfortran.dg/zero_sized_1.f90: Uncomment checks
for RESHAPE.

  r118458 | amylaar | 2006-11-03 06:52:19 -0800 (Fri, 03 Nov 2006) | 97 lines

  gcc:

  2006-11-03  Jorn Rennecke  [EMAIL PROTECTED]

* config/sh/crt1.asm: Fix #ifdef indent.

  2006-11-03  Jorn Rennecke  [EMAIL PROTECTED]
Merged from STMicroelectronics sources:
2006-10-06  Andrew Stubbs  [EMAIL PROTECTED]
  * config/sh/crt1.asm (vbr_600): Add missing #if.
2006-08-03  Jorn Rennecke  [EMAIL PROTECTED]
  * sh.opt (mfused-madd): New option.
  * sh.md (mac_media, macsf3): Make conditional on TARGET_FMAC.
2006-07-04  Andrew Stubbs  [EMAIL PROTECTED]
  * config/sh/crt1.asm (vbr_start): Move to new section .test.vbr.
  Remove pointless handler at VBR+0.
  (vbr_200, vbr_300, vbr_500): Remove pointless handler.
  (vbr_600): Save and restore mach and macl, fpul and fpscr and fr0 to
  fr7. Make sure the timer handler is called with the correct FPU
  precision setting, according to the ABI.
2006-06-14  Jorn Rennecke [EMAIL PROTECTED]
  * config/sh/sh.opt (m2a-single, m2a-single-only): Fix Condition.
  * config/sh/sh.h (SUPPORT_SH2A_NOFPU): Fix condition.
  (SUPPORT_SH2A_SINGLE_ONLY, SUPPORT_SH2A_SINGLE_ONLY): Likewise.
2006-06-09  Jorn Rennecke [EMAIL PROTECTED]
  * sh.md (cmpgeusi_t): Change into define_insn_and_split.  Accept
  zero as second operand.
2006-04-28  Jorn Rennecke [EMAIL PROTECTED]
  * config/sh/divtab-sh4-300.c, config/sh/lib1funcs-4-300.asm:
  Fixed some bugs related to negative values, in particular -0
  and overflow at -0x8000.
  * config/sh/divcost-analysis: Added sh4-300 figures.
2006-04-27  Jorn Rennecke [EMAIL PROTECTED]
  * config/sh/t-sh (MULTILIB_MATCHES): Add -m4-300* / -m4-340 options.
2006-04-26  Jorn Rennecke [EMAIL PROTECTED]
  * config/sh/t-sh (OPT_EXTRA_PARTS): Add libgcc-4-300.a.
  ($(T)div_table-4-300.o, $(T)libgcc-4-300.a): New rules.
  * config/sh/divtab-sh4-300.c, config/sh/lib1funcs-4-300.asm:
  New files.
  * config/sh/embed-elf.h (LIBGCC_SPEC): Use -lgcc-4-300 for -m4-300* /
  -m4-340.
2006-04-24  Jorn Rennecke [EMAIL PROTECTED]
  SH4-300 scheduling description & fixes to SH4-[12]00 description:
  * sh.md: New instruction types: fstore, movi8, fpscr_toggle, gp_mac,
  mac_mem, mem_mac, dfp_mul, fp_cmp.
  (insn_class, dfp_comp, any_fp_comp): Update.
  (push_fpul, movsf_ie, fpu_switch, toggle_sz, toggle_pr): Update type.
  (cmpgtsf_t, cmpeqsf_t, cmpgtsf_t_i4, cmpeqsf_t_i4): Likewise.
  (muldf3_i): Likewise.
  (movsi_i): Split rI08 alternative into two separate alternatives.
  Update type.
  (movsi_ie, movsi_i_lowpart): Likewise.
  (movqi_i): Split ri alternative into two separate alternatives.
  Update type.
  * sh1.md (sh1_load_store, sh1_fp): Update.
  * sh4.md (sh4_store, sh4_mac_gp, fp_arith, fp_double_arith): Update.
  (mac_mem, sh4_fpscr_toggle): New insn_reservations.
  * sh4a.md (sh4a_mov, sh4a_load, sh4a_store, sh4a_fp_arith): Update.
  (sh4a_fp_double_arith): Likewise.
  * sh4-300.md: New file.
  * sh.c (sh_handle_option): Handle m4-300* options.
  (sh_adjust_cost): Fix latency of auto-increments.
  Handle SH4-300 differently than other SH4s.  Check for new insn
  types.
  * sh.h (OVERRIDE_OPTIONS): Initialize sh_branch_cost if it has not
  been set by an option.
  * sh.opt (m4-300, m4-100-nofpu, m4-200-nofpu): New options.
  (m4-300-nofpu, -m4-340, m4-300-single, m4-300-single-only): 

[Bug target/26915] missed sized opt returning -1.0

2006-11-04 Thread pluto at agmk dot net


--- Comment #4 from pluto at agmk dot net  2006-11-04 09:10 ---
Created an attachment (id=12545)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12545&action=view)
patch from Uros Bizjak.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915



[Bug target/26915] missed sized opt returning -1.0

2006-11-04 Thread pluto at agmk dot net


--- Comment #5 from pluto at agmk dot net  2006-11-04 09:11 ---
with attached patch gcc42 produces:

$ ./xgcc -B. PR26915.c -m32 -S -Os -fomit-frame-pointer

        .file   "PR26915.c"
        .text
.globl minus1
        .type   minus1, @function
minus1:
        fld1
        fchs
        ret
        .size   minus1, .-minus1
        .ident  "GCC: (GNU) 4.2.0 20061030 (prerelease) (PLD-Linux)"
        .section        .note.GNU-stack,"",@progbits
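
For reference, a source file along the following lines would be expected to
give output of this shape.  This is an assumed reconstruction: the actual
PR26915.c is not quoted in this thread, and only the function name minus1
is taken from the assembly above.  The helpers minus1_by_negation and
check_minus1 are hypothetical and only mirror the "fld1; fchs" idea at the
source level.

```c
#include <assert.h>

/* Assumed shape of the test function; the real PR26915.c is not shown.  */
double minus1(void)
{
    return -1.0;
}

/* Source-level picture of "fld1; fchs": start from 1.0, then negate.  */
double minus1_by_negation(void)
{
    double one = 1.0;
    return -one;
}

/* Both routes must yield the same value.  */
void check_minus1(void)
{
    assert(minus1() == minus1_by_negation());
}
```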


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915



[Bug target/26915] missed sized opt returning -1.0

2006-11-04 Thread uros at gcc dot gnu dot org


--- Comment #6 from uros at gcc dot gnu dot org  2006-11-04 23:12 ---
Subject: Bug 26915

Author: uros
Date: Sat Nov  4 23:12:16 2006
New Revision: 118484

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118484
Log:
PR target/26915
* config/i386/i386.c (standard_80387_constant_p): Treat -0.0 and -1.0
as a valid 80387 constant.
(standard_80387_constant_opcode): Return # for -0.0 and -1.0.
* config/i386/i386.md (unnamed splitter): Split the load of
constant -0.0 or -1.0  into the load of 0.0 or 1.0, followed
by negation.

testsuite/ChangeLog:

PR target/26915
* gcc.target/i386/387-12.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/387-12.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915



[Bug target/26915] missed sized opt returning -1.0

2006-11-04 Thread ubizjak at gmail dot com


--- Comment #7 from ubizjak at gmail dot com  2006-11-04 23:15 ---
Fixed.


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00220.html
             Status|NEW                         |RESOLVED
           Keywords|                            |patch
      Known to fail|4.1.2 4.2.0                 |4.1.2
      Known to work|                            |4.2.0
         Resolution|                            |FIXED
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915



[Bug target/26915] missed sized opt returning -1.0

2006-05-02 Thread pinskia at gcc dot gnu dot org


--- Comment #3 from pinskia at gcc dot gnu dot org  2006-05-02 08:08 ---
Confirmed.


-- 

pinskia at gcc dot gnu dot org changed:

                    What|Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingOnThis|                            |16996
                  Status|UNCONFIRMED                 |NEW
          Ever Confirmed|0                           |1
       GCC build triplet|i686-pld-linux              |
        GCC host triplet|i686-pld-linux              |
      GCC target triplet|i686-pld-linux              |
                Keywords|                            |missed-optimization
   Last reconfirmed date|-00-00 00:00:00             |2006-05-02 08:08:49
                 Summary|missed optimization /       |missed sized opt returning
                        |returning -1.0              |-1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26915