Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86

2021-04-13 Thread Kemeng Shi
on 2021/4/13 22:53, Borislav Petkov wrote:
> I thought "should be better" too last time when I measured rep; movs vs
> NT stores but actual measurements showed no real difference.
Maybe the NT stores make a difference when storing to slow DIMMs, like the persistent memory I just tested. Also, it lik

Re: Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86

2021-04-13 Thread Borislav Petkov
On Tue, Apr 13, 2021 at 08:54:55PM +0800, Kemeng Shi wrote:
> Yes. And NT stores should be better for copy_page, especially when copying a lot
> of pages, as only part of the memory of a copied page will be accessed soon afterwards.
I thought "should be better" too last time when I measured rep; movs vs NT stores but a

Re: Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86

2021-04-13 Thread Kemeng Shi
on 2021/4/13 19:01, Borislav Petkov wrote:
> + linux-nvdimm
>
> Original mail at
> https://lkml.kernel.org/r/3f28adee-8214-fa8e-b368-eaf8b1934...@huawei.com
>
> On Tue, Apr 13, 2021 at 02:25:58PM +0800, Kemeng Shi wrote:
>> I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node

Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86

2021-04-13 Thread Borislav Petkov
+ linux-nvdimm
Original mail at https://lkml.kernel.org/r/3f28adee-8214-fa8e-b368-eaf8b1934...@huawei.com
On Tue, Apr 13, 2021 at 02:25:58PM +0800, Kemeng Shi wrote:
> I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node in
What is AEP?
> my system. I will move cold pages from

[PATCH] x86: Accelerate copy_page with non-temporal in X86

2021-04-12 Thread Kemeng Shi
I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node in my system. I will move cold pages from the DRAM node to the AEP node with the move_pages system call. With the old "rep movsq", it costs 2030 ms to move 1 GB of pages. With "movnti", it only costs about 890 ms to move 1 GB of pages. I also tested moving 1 GB
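For readers unfamiliar with the instruction under discussion, here is a minimal userspace sketch of a page copy built on non-temporal stores. It is not the patch from this thread, and the kernel's copy_page is written in assembly. The function name copy_page_nt and the fixed 4096-byte PAGE_SIZE are assumptions made for illustration. The _mm_stream_si64 intrinsic is the x86-64 compiler intrinsic that normally compiles to movnti.

```c
#include <assert.h>
#include <emmintrin.h>  /* _mm_stream_si64 (SSE2); also provides _mm_sfence */
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096  /* assumed page size for this sketch */

/*
 * Hypothetical helper: copy one page using 64-bit non-temporal stores.
 * The stores bypass the cache hierarchy, so the destination page does
 * not evict other cached data. Both pointers must be 8-byte aligned.
 */
static void copy_page_nt(void *dst, const void *src)
{
    long long *d = dst;
    const long long *s = src;

    assert(((uintptr_t)dst & 7) == 0 && ((uintptr_t)src & 7) == 0);

    for (size_t i = 0; i < PAGE_SIZE / sizeof(long long); i++)
        _mm_stream_si64(&d[i], s[i]);

    /* Non-temporal stores are weakly ordered; fence before the data is used. */
    _mm_sfence();
}

/* Convenience check: returns 1 when both pages hold identical bytes. */
static int pages_equal(const void *a, const void *b)
{
    return memcmp(a, b, PAGE_SIZE) == 0;
}
```

Whether this beats "rep movsq" depends on the destination memory and the CPU, which is the open question in this thread; the sketch only shows the mechanism and makes no performance claim.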