On Fri, Aug 20, 2021 at 9:55 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Richard,
>
> Benchmarking this patch using CSiBE on x86_64-pc-linux-gnu with -Os -m32
> saves 2432 bytes.
> Of the 893 tests, 34 have size differences, 30 are improvements, 4 are
> regressions (of a few bytes).
>
> > Also I'm missing a 'else' - in the default case there's no cost/benefit of
> > using SSE vs. GPR regs?
> > For SSE it would be a constant pool load.
>
> The code size regression I primarily wanted to tackle was the zero vs.
> non-zero case when dealing with immediate operands, which was the piece
> affected by my and Jakub's xor improvements.
>
> Alas my first attempt to specify a non-zero gain in the default (doesn't fit
> in SImode) case, increased the code size slightly.  The use of the constant
> pool complicates things, as the number of times the same value is used
> becomes an issue.  If the constant being loaded is unique, then clearly the
> increase in constant pool size should (ideally) be taken into account.  But
> if the same constant is used multiple times in a chain (or is already in the
> constant pool), the observed cost is much cheaper.  Empirically, a value of
> zero isn't a poor choice, so the decision on whether to use vector
> instructions is shifted to the gains from operations being performed, rather
> than the loading of integer constants.  No doubt, like rtx_costs, these are
> free parameters that future generations will continue to tweak and refine.
>
> Given that this patch reduces code size with -Os, both with and without -m32,
> ok for mainline?

OK if you add a comment for the missing 'else'.

Thanks,
Richard.

> Thanks in advance,
> Roger
> --
>
> -----Original Message-----
> From: Richard Biener <richard.guent...@gmail.com>
> Sent: 20 August 2021 08:29
> To: Roger Sayle <ro...@nextmovesoftware.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
>
> On Thu, Aug 19, 2021 at 6:01 PM Roger Sayle <ro...@nextmovesoftware.com>
> wrote:
> >
> >
> > Doh!  ENOPATCH.
> >
> > -----Original Message-----
> > From: Roger Sayle <ro...@nextmovesoftware.com>
> > Sent: 19 August 2021 16:59
> > To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> > Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
> >
> >
> > Back in June I briefly mentioned in one of my gcc-patches posts that a
> > change that should have always reduced code size, would mysteriously
> > occasionally result in slightly larger code (according to CSiBE):
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
> >
> > Investigating further, the cause turns out to be that x86_64's
> > scalar-to-vector (stv) pass is relying on poor estimates of the size
> > costs/benefits.  This patch tweaks the backend's compute_convert_gain
> > method to provide slightly more accurate values when compiling with -Os.
> > Compilation without -Os is (should be) unaffected.  And for
> > completeness, I'll mention that the stv pass is a net win for code
> > size so it's much better to improve its heuristics than simply gate
> > the pass on !optimize_for_size.
> >
> > The net effect of this change is to save 1399 bytes on the CSiBE code
> > size benchmark when compiling with -Os.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> > and "make -k check" with no new failures.
> >
> > Ok for mainline?
>
> +  /* xor (2 bytes) vs. xorps (3 bytes).  */
> +  if (src == const0_rtx)
> +    igain -= COSTS_N_BYTES (1);
> +  /* movdi_internal vs. movv2di_internal.  */
> +  /* => mov (5 bytes) vs. movaps (7 bytes).  */
> +  else if (x86_64_immediate_operand (src, SImode))
> +    igain -= COSTS_N_BYTES (2);
>
> doesn't it need two GPR xor for 32bit DImode and two mov?  Thus the non-SSE
> cost should be times 'm'?  For const0_rtx we may eventually re-use the zero
> reg for the high part so that is eventually correct.
>
> Also I'm missing a 'else' - in the default case there's no cost/benefit of
> using SSE vs. GPR regs?  For SSE it would be a constant pool load.
>
> I also wonder, since I now see COSTS_N_BYTES for the first time (heh),
> whether with -Os we'd need to replace all COSTS_N_INSNS (1) scaling with
> COSTS_N_BYTES scaling?  OTOH costs_add_n_insns uses COSTS_N_INSNS for the
> size part as well.
>
> That said, it looks like we're eventually mixing apples and oranges now or
> even previously?
>
> Thanks,
> Richard.
>
> >
> >
> > 2021-08-19  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * config/i386/i386-features.c (compute_convert_gain): Provide
> >         more accurate values for CONST_INT, when optimizing for size.
> >         * config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
> >         * config/i386/i386.h (COSTS_N_BYTES): to here.
> >
> > Roger
> > --
> >
>
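
For readers following the byte counts in the quoted hunk, the arithmetic
can be sketched in standalone C.  This is only an illustration, not GCC
code: stv_size_gain and its parameters are hypothetical names, and the
COSTS_N_BYTES definition below is assumed to mirror the i386 backend's
(two cost units per byte).  The parameter m models the question raised
in the review about whether the scalar cost should be multiplied for
32-bit DImode; the quoted patch corresponds to m = 1.

```c
#include <assert.h>

/* Assumed to match the i386 backend's scaling: one byte of code
   counts as 2 cost units.  */
#define COSTS_N_BYTES(N) ((N) * 2)

/* Hypothetical helper, not part of GCC.  Returns the size gain, in
   cost units, of loading a constant with one vector instruction of
   VECTOR_BYTES bytes instead of M scalar instructions of SCALAR_BYTES
   bytes each.  A negative result means the vector form is larger.  */
int
stv_size_gain (int scalar_bytes, int vector_bytes, int m)
{
  /* Only the two cases discussed in the thread are modelled:
     one scalar instruction, or two for 32-bit DImode.  */
  assert (m == 1 || m == 2);
  return COSTS_N_BYTES (m * scalar_bytes - vector_bytes);
}
```

With m = 1, stv_size_gain (2, 3, 1) gives the xor vs. xorps adjustment
of -COSTS_N_BYTES (1), and stv_size_gain (5, 7, 1) gives the mov vs.
movaps adjustment of -COSTS_N_BYTES (2).  Passing m = 2 shows how the
sign would flip if both halves of a 32-bit DImode constant were charged
to the scalar side, which is the scenario the review asks about rather
than what the patch does.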