On Tue, Nov 19, 2019 at 10:04 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> xchg instruction is smaller, in some cases much smaller than 3 moves,
> (e.g. in the testcase 2 bytes vs. 8 bytes), and is not a performance
> disaster, but from Agner Fog tables and
> https://stackoverflow.com/questions/45766444/why-is-xchg-reg-reg-a-3-micro-op-instruction-on-modern-intel-architectures
> it doesn't seem to be something we'd want to use when optimizing for speed,
> at least not on Intel.
>
> While we have *swap<mode> patterns, those are very unlikely to be triggered
> during combine, usually we have different pseudos in there and the actual
> need for swapping is only materialized during RA.
>
> The following patch does it when optimizing the insn for size only.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I wonder if IRA/LRA should be made aware of an xchg instruction (and its
cost?) so it knows it doesn't actually need a temporary register?

Richard.

> 2019-11-19  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/92549
>         * config/i386/i386.md (peephole2 for *swap<mode>): New peephole2.
>
>         * gcc.target/i386/pr92549.c: New test.
>
> --- gcc/config/i386/i386.md.jj  2019-10-28 22:16:14.583008121 +0100
> +++ gcc/config/i386/i386.md     2019-11-18 17:06:36.050742545 +0100
> @@ -2787,6 +2787,17 @@
>     (set_attr "amdfam10_decode" "double")
>     (set_attr "bdver1_decode" "double")])
>
> +(define_peephole2
> +  [(set (match_operand:SWI 0 "register_operand")
> +        (match_operand:SWI 1 "register_operand"))
> +   (set (match_dup 1)
> +        (match_operand:SWI 2 "register_operand"))
> +   (set (match_dup 2) (match_dup 0))]
> +  "peep2_reg_dead_p (3, operands[0])
> +   && optimize_insn_for_size_p ()"
> +  [(parallel [(set (match_dup 1) (match_dup 2))
> +              (set (match_dup 2) (match_dup 1))])])
> +
>  (define_expand "movstrict<mode>"
>    [(set (strict_low_part (match_operand:SWI12 0 "register_operand"))
>          (match_operand:SWI12 1 "general_operand"))]
> --- gcc/testsuite/gcc.target/i386/pr92549.c.jj  2019-11-18 17:48:35.533177377 +0100
> +++ gcc/testsuite/gcc.target/i386/pr92549.c     2019-11-18 17:49:31.888336444 +0100
> @@ -0,0 +1,28 @@
> +/* PR target/92549 */
> +/* { dg-do compile } */
> +/* { dg-options "-Os -masm=att" } */
> +/* { dg-final { scan-assembler "xchgl" } } */
> +
> +#ifdef __i386__
> +#define R , regparm (2)
> +#else
> +#define R
> +#endif
> +
> +__attribute__((noipa R)) int
> +bar (int a, int b)
> +{
> +  return b - a + 5;
> +}
> +
> +__attribute__((noipa R)) int
> +foo (int a, int b)
> +{
> +  return 1 + bar (b, a);
> +}
> +
> +int
> +main ()
> +{
> +  return foo (39, 3);
> +}
>
>         Jakub
>
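
For anyone following the thread without the RTL in their head, here is a rough toy model in C of what the quoted peephole2 matches. It is only a sketch: `struct insn`, `match_swap` and `run` are hypothetical names made up for illustration and have nothing to do with GCC's real data structures. The `tmp_dead` flag stands in for the `peep2_reg_dead_p (3, operands[0])` check, and the requirement that the three registers be distinct is a conservative assumption of the sketch, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction set (hypothetical, not GCC RTL):
   OP_MOVE copies regs[src] into regs[dst],
   OP_XCHG swaps regs[dst] and regs[src].  */
enum op { OP_MOVE, OP_XCHG };

struct insn
{
  enum op op;
  int dst;
  int src;
};

/* Try to match the three-move sequence
       r0 = r1;  r1 = r2;  r2 = r0;
   starting at insns[i].  TMP_DEAD says whether r0 is unused after the
   third move; without that the rewrite would lose the value left in r0.
   On success store a single exchange of r1 and r2 in *OUT.  */
bool
match_swap (const struct insn *insns, size_t n, size_t i, bool tmp_dead,
            struct insn *out)
{
  assert (insns != NULL && out != NULL);
  if (!tmp_dead || n < 3 || i > n - 3)
    return false;

  const struct insn *a = &insns[i];
  const struct insn *b = &insns[i + 1];
  const struct insn *c = &insns[i + 2];
  if (a->op != OP_MOVE || b->op != OP_MOVE || c->op != OP_MOVE)
    return false;

  int r0 = a->dst, r1 = a->src, r2 = b->src;
  /* The second and third moves must reuse the same registers.  */
  if (b->dst != r1 || c->dst != r2 || c->src != r0)
    return false;
  /* Assumption of this sketch: only handle three distinct registers.  */
  if (r0 == r1 || r0 == r2 || r1 == r2)
    return false;

  out->op = OP_XCHG;
  out->dst = r1;
  out->src = r2;
  return true;
}

/* Execute N toy instructions on the register file REGS.  */
void
run (const struct insn *insns, size_t n, int *regs)
{
  for (size_t k = 0; k < n; k++)
    {
      int d = insns[k].dst, s = insns[k].src;
      if (insns[k].op == OP_MOVE)
        regs[d] = regs[s];
      else
        {
          int t = regs[d];
          regs[d] = regs[s];
          regs[s] = t;
        }
    }
}
```

Running the original three moves and the single exchange on the same starting registers leaves r1 and r2 identical in both cases; only the temporary r0 differs, which is why the deadness check matters.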