Thanks Boris!
So the patch is helpful and no impact for other/older machines,
I will re-send new version according to comments.
Any further comments are appreciated!

Regards
Ling

> -----Original Message-----
> From: Borislav Petkov [mailto:[email protected]]
> Sent: Sunday, October 14, 2012 6:58 PM
> To: Ma, Ling
> Cc: Konrad Rzeszutek Wilk; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> George Spelvin
> Subject: Re: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging
> instruction sequence and saving register
> 
> On Fri, Oct 12, 2012 at 08:04:11PM +0200, Borislav Petkov wrote:
> > Right, so benchmark shows around 20% speedup on Bulldozer but this is
> > a microbenchmark and before pursue this further, we need to verify
> > whether this brings any palpable speedup with a real benchmark, I
> > don't know, kernbench, netbench, whatever. Even something as boring
> as
> > kernel build. And probably check for perf regressions on the rest of
> > the uarches.
> 
> Ok, so to summarize, on AMD we're using REP MOVSQ which is even faster
> than the unrolled version. I've added the REP MOVSQ version to the
> µbenchmark. It nicely validates that we're correctly setting
> X86_FEATURE_REP_GOOD on everything >= F10h and some K8s.
> 
> So, to answer Konrad's question: those patches don't concern AMD
> machines.
> 
> Thanks.
> 
> --
> Regards/Gruss,
>     Boris.

Reply via email to