> On Sat, Sep 21, 2013 at 3:51 PM, Xinliang David Li <davi...@google.com> wrote:
> > On Sat, Sep 21, 2013 at 12:54 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
> >> Hi,
> >> this is an updated version of the patch discussed at
> >> http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00841.html
> >>
> >> It makes CORE tuning follow the optimization guidelines more closely.
> >> In particular it removes some tuning flags for features I implemented years
> >> back specifically for K7/K8 chips that ended up in Core tuning because
> >> it was based on generic.  Incrementally I plan to drop some of these from
> >> generic, too.
> >>
> >> Compared to the previous version of the patch I left out the INC_DEC change,
> >> even though Core I7+ should resolve dependencies on partial flags correctly.
> >> The optimization manual still seems to suggest not using this:
> >>
> >> Assembly/Compiler Coding Rule 33. (M impact, H generality)
> >> INC and DEC instructions should be replaced with ADD or SUB instructions,
> >> because ADD and SUB overwrite all flags, whereas INC and DEC do not,
> >> therefore creating false dependencies on earlier instructions that set
> >> the flags.
> >>
> >> The other change dropped is use_vector_fp_converts, which seems to improve
> >> Core performance.
> >
> > I did not see this in your patch, but Wei has this tuning in this patch:
> >
>
> Sorry, I meant to ask why dropping this part?

Because I wanted to go with the obvious changes first.
>
> David
>
> > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00884.html

This patch seems reasonable (in fact I have pretty much the same in my tree).
use_vector_fp_converts is actually trying to solve the same problem on AMD
hardware: you need to type the whole register when converting.
So it may work well for AMD chips too, or maybe the difference is that Intel
chips somehow handle "cvtpd2ps %xmm0, %xmm0" well even though the upper half
of xmm0 is ill defined, while AMD chips don't.

The patch seems OK.  I do not see the reason for the
  && peep2_reg_dead_p (0, operands[0])
test.  The reg has to be dead since it is the full destination of the
operation.

Let's wait a few days before committing so we know the effect of the
individual changes.  I will test it on AMD hardware and we can decide on
generic tuning then.

Honza
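
P.S. For anyone following the use_vector_fp_converts discussion, here is a toy C model of the two conversion forms. It is only a sketch of the instruction semantics, not GCC code and not taken from either patch; the types xmm_ps/xmm_pd and the model_* functions are made-up names for illustration. The scalar form keeps the upper lanes of the old destination, so its result depends on whatever was in the register before. The packed form writes the whole register, but it also converts the upper source lane, which is the "ill defined" half mentioned above.

```c
/* Toy model of a 128-bit XMM register (hypothetical types, illustration only). */
typedef struct { float lane[4]; } xmm_ps;   /* viewed as four floats  */
typedef struct { double lane[2]; } xmm_pd;  /* viewed as two doubles  */

/* Scalar convert, modelled on cvtsd2ss: only lane 0 is written.
   Lanes 1..3 are carried over from the old destination value, so the
   result depends on the previous contents of the destination. */
static xmm_ps model_cvtsd2ss(xmm_ps old_dst, xmm_pd src)
{
    xmm_ps r = old_dst;
    r.lane[0] = (float)src.lane[0];
    return r;
}

/* Packed convert, modelled on cvtpd2ps: both source doubles are
   converted and the upper two float lanes are zeroed.  Nothing from the
   old destination survives, but src.lane[1] is converted as well, even
   if the compiler never put a meaningful value there. */
static xmm_ps model_cvtpd2ps(xmm_pd src)
{
    xmm_ps r;
    r.lane[0] = (float)src.lane[0];
    r.lane[1] = (float)src.lane[1];
    r.lane[2] = 0.0f;
    r.lane[3] = 0.0f;
    return r;
}
```

In this model the packed variant takes no old_dst argument at all, which is the dependency the tuning flag is trying to avoid; whether the conversion of the garbage upper lane costs anything is exactly what needs measuring on real AMD and Intel hardware.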