Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-24 Thread Robin Murphy
On 2021-03-24 16:38, David Laight wrote: From: Robin Murphy Sent: 23 March 2021 12:09 On 2021-03-23 07:34, Yang Yingliang wrote: When copy over 128 bytes, src/dst is added after each ldp/stp instruction, it will cost more time. To improve this, we only add src/dst after load or store 64

RE: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-24 Thread David Laight
From: Robin Murphy > Sent: 23 March 2021 12:09 > > On 2021-03-23 07:34, Yang Yingliang wrote: > > When copy over 128 bytes, src/dst is added after > > each ldp/stp instruction, it will cost more time. > > To improve this, we only add src/dst after load > > or store 64 bytes. > > This breaks the

Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-23 Thread Catalin Marinas
On Tue, Mar 23, 2021 at 01:32:18PM +0000, Will Deacon wrote: > On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote: > > On 2021-03-23 07:34, Yang Yingliang wrote: > > > When copy over 128 bytes, src/dst is added after > > > each ldp/stp instruction, it will cost more time. > > > To

Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-23 Thread Robin Murphy
On 2021-03-23 13:32, Will Deacon wrote: On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote: On 2021-03-23 07:34, Yang Yingliang wrote: When copy over 128 bytes, src/dst is added after each ldp/stp instruction, it will cost more time. To improve this, we only add src/dst after load or

Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-23 Thread Will Deacon
On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote: > On 2021-03-23 07:34, Yang Yingliang wrote: > > When copy over 128 bytes, src/dst is added after > > each ldp/stp instruction, it will cost more time. > > To improve this, we only add src/dst after load > > or store 64 bytes. > > This

Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-23 Thread Robin Murphy
On 2021-03-23 07:34, Yang Yingliang wrote: When copy over 128 bytes, src/dst is added after each ldp/stp instruction, it will cost more time. To improve this, we only add src/dst after load or store 64 bytes. This breaks the required behaviour for copy_*_user(), since the fault handler

[PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

2021-03-23 Thread Yang Yingliang
When copy over 128 bytes, src/dst is added after each ldp/stp instruction, it will cost more time. To improve this, we only add src/dst after load or store 64 bytes. Copy 4096 bytes cost on Kunpeng920 (ms): Without this patch: memcpy: 143.85 copy_from_user: 172.69 copy_to_user: 199.23 With this
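
For readers skimming the thread, the addressing change described in the patch can be sketched in portable C. This is only an illustration and not the patch itself: the real code is arm64 assembly, and the names copy16, copy_bump_each and copy_bump_per_64 are hypothetical. Each copy16() call stands in for one ldp/stp pair moving 16 bytes. A C compiler may well emit identical code for both variants, so the sketch shows the pointer-update pattern only and says nothing about the timings quoted above. It also ignores fault handling, which the replies raise for copy_*_user().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one ldp/stp pair: moves 16 bytes. */
static void copy16(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 16);
}

/* Variant A: src/dst are advanced after every 16-byte chunk
 * (analogous to updating the pointers on each ldp/stp). */
static void copy_bump_each(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n >= 64) {
        copy16(dst, src); src += 16; dst += 16;
        copy16(dst, src); src += 16; dst += 16;
        copy16(dst, src); src += 16; dst += 16;
        copy16(dst, src); src += 16; dst += 16;
        n -= 64;
    }
    memcpy(dst, src, n); /* tail of fewer than 64 bytes */
}

/* Variant B: fixed offsets inside the 64-byte block, with src/dst
 * advanced once per block (the approach the patch describes). */
static void copy_bump_per_64(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n >= 64) {
        copy16(dst,      src);
        copy16(dst + 16, src + 16);
        copy16(dst + 32, src + 32);
        copy16(dst + 48, src + 48);
        src += 64;
        dst += 64;
        n -= 64;
    }
    memcpy(dst, src, n); /* tail of fewer than 64 bytes */
}
```

Both functions copy the same bytes for any n. The only difference is how often the pointers are updated inside the 64-byte block.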