Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-23 Thread xhao
On 7/20/22 7:18 PM, Barry Song wrote:
On Tue, Jul 19, 2022 at 1:28 AM Yicong Yang wrote:
On 2022/7/14 12:51, Barry Song wrote:
On Thu, Jul 14, 2022 at 3:29 PM Xin Hao wrote:
Hi barry. I do some test on Kunpeng arm64 machine use Unixbench. The test result as below. One core, we can see

Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-23 Thread xhao
On 7/18/22 9:28 PM, Yicong Yang wrote:
On 2022/7/14 12:51, Barry Song wrote:
On Thu, Jul 14, 2022 at 3:29 PM Xin Hao wrote:
Hi barry. I do some test on Kunpeng arm64 machine use Unixbench. The test result as below. One core, we can see the performance improvement above +30%.
I am really

Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-20 Thread Barry Song
On Tue, Jul 19, 2022 at 1:28 AM Yicong Yang wrote:
>
> On 2022/7/14 12:51, Barry Song wrote:
> > On Thu, Jul 14, 2022 at 3:29 PM Xin Hao wrote:
> >>
> >> Hi barry.
> >>
> >> I do some test on Kunpeng arm64 machine use Unixbench.
> >>
> >> The test result as below.
> >>
> >> One core, we can see

Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-13 Thread Barry Song
On Thu, Jul 14, 2022 at 3:29 PM Xin Hao wrote:
>
> Hi barry.
>
> I do some test on Kunpeng arm64 machine use Unixbench.
>
> The test result as below.
>
> One core, we can see the performance improvement above +30%.
I am really pleased to see the 30%+ improvement on UnixBench on a single core.
> .

Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-13 Thread Xin Hao
Hi Barry.
I ran some tests on a Kunpeng arm64 machine using UnixBench. The test results are below.
On one core, we can see a performance improvement of more than 30%.
./Run -c 1 -i 1 shell1
w/o
System Benchmarks Partial Index  BASELINE RESULT INDEX
Shell Scripts (1 concurrent)

[PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-10 Thread Barry Song
Though ARM64 has the hardware to do TLB shootdown, the hardware broadcasting is not free. The simplest micro benchmark shows that even on a Snapdragon 888 with only 8 cores, the overhead of ptep_clear_flush is huge even for paging out one page mapped by only one process:
5.36% a.out [kernel.kallsyms]