On Mon, Mar 22, 2021 at 4:57 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Mon, Mar 22, 2021 at 7:10 AM Jan Hubicka <hubi...@ucw.cz> wrote:
> >
> > >
> > > gcc/
> > >
> > >       * config/i386/i386-expand.c (expand_set_or_cpymem_via_rep):
> > >       For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, don't convert QImode
> > >       to SImode.
> > >       (decide_alg): For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, use
> > >       "rep movsb/stosb" only for known sizes.
> > >       * config/i386/i386-options.c (processor_cost_table): Use Ice
> > >       Lake cost for Cannon Lake, Ice Lake, Tiger Lake, Sapphire
> > >       Rapids and Alder Lake.
> > >       * config/i386/i386.h (TARGET_PREFER_KNOWN_REP_MOVSB_STOSB): New.
> > >       * config/i386/x86-tune-costs.h (icelake_memcpy): New.
> > >       (icelake_memset): Likewise.
> > >       (icelake_cost): Likewise.
> > >       * config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
> > >       New.
> >
> > It looks like X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is quite obviously
> > beneficial and independent of the rest of the changes.  I think we will
> > need to discuss the move ratio and the code size/uop cache pollution
> > issues a bit more - one option would be to use increased limits for -O3 only.
>
> My change only increases CLEAR_RATIO, not MOVE_RATIO.   We are
> checking code size impacts on SPEC CPU 2017 and eembc.
>
> > Can you break this out into an independent patch?  I also wonder if it would
>
> X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB improves performance
> only when memcpy/memset costs and MOVE_RATIO are updated at the same
> time, like:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567096.html
>
> Making it standalone means moving it from the Ice Lake patch to the
> Skylake patch.
>
> > not be more readable to special-case this just at the beginning of
> > decide_alg.
> > > @@ -6890,6 +6891,7 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT expected_size,
> > >    const struct processor_costs *cost;
> > >    int i;
> > >    bool any_alg_usable_p = false;
> > > +  bool known_size_p = expected_size != -1;
> >
> > expected_size is not -1 if we have profile feedback and we detected the
> > average size of a block from the histogram.  It seems to me from the
> > description that you want the size to be an actual compile-time
> > constant, which would be min_size == max_size, I guess.
> >
>
> You are right.  Here is the v2 patch, which uses a min_size != max_size
> check to detect an unknown size.
>

Hi Honza,

This patch only impacts Ice Lake.   Do you have any comments for the v2
patch?
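
For reference, the difference between the v1 and v2 checks discussed in
the quoted review can be sketched as below.  This is only an illustrative
sketch, not the code from the patch: the helper names known_size_v1 and
known_size_v2 and the size_arg_t typedef are hypothetical stand-ins, and
the real logic lives inside decide_alg.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's HOST_WIDE_INT; an assumption made
   only so that this sketch compiles on its own.  */
typedef long long size_arg_t;

/* v1 check: true whenever an expected size is available.  As noted in
   the review, expected_size can also come from profile feedback, so
   this does not imply a compile-time constant size.  */
static bool
known_size_v1 (size_arg_t expected_size)
{
  return expected_size != -1;
}

/* v2 check: true only when the lower and upper bounds coincide, i.e.
   the block size is an actual compile-time constant.  */
static bool
known_size_v2 (size_arg_t min_size, size_arg_t max_size)
{
  return min_size == max_size;
}
```

With a profile-derived expected_size of 128 and bounds of 0 and 4096, the
v1 check reports a known size while the v2 check does not, which is the
case the review comment points out.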

Thanks.

--
H.J.
