Re: [PATCH v8 0/9] rwsem performance optimizations

2013-11-04 Thread Tim Chen
Ingo, sorry for the late response. My old 4-socket Westmere test machine went down and I had to find a new one, a 4-socket Ivy Bridge machine with 15 cores per socket. I've updated the workload as a perf benchmark (see attached patch). The workload will mmap, then access memory in the

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-17 Thread Ingo Molnar
* Tim Chen wrote: > > > > > It would be _really_ nice to stick this into tools/perf/bench/ as: > > > > perf bench mem pagefaults > > > > or so, with a number of parallelism and workload patterns. See > > tools/perf/bench/numa.c for a couple of workload generators - although > > those a

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-16 Thread Tim Chen
> > It would be _really_ nice to stick this into tools/perf/bench/ as: > > perf bench mem pagefaults > > or so, with a number of parallelism and workload patterns. See > tools/perf/bench/numa.c for a couple of workload generators - although > those are not page fault intense. > > So th

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-16 Thread Tim Chen
On Wed, 2013-10-16 at 08:55 +0200, Ingo Molnar wrote: > * Tim Chen wrote: > > > On Thu, 2013-10-10 at 09:54 +0200, Ingo Molnar wrote: > > > * Tim Chen wrote: > > > > > > > The throughput of mmap with pthread-mutex vs pure mmap is below: > > > > > > > > % change in performance of the mmap

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-15 Thread Ingo Molnar
* Tim Chen wrote: > On Thu, 2013-10-10 at 09:54 +0200, Ingo Molnar wrote: > > * Tim Chen wrote: > > > > > The throughput of mmap with pthread-mutex vs pure mmap is below: > > > > > > % change in performance of the mmap with pthread-mutex vs pure mmap > > > #threads vanilla all

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-15 Thread Tim Chen
On Thu, 2013-10-10 at 09:54 +0200, Ingo Molnar wrote: > * Tim Chen wrote: > > > The throughput of mmap with pthread-mutex vs pure mmap is below: > > > > % change in performance of the mmap with pthread-mutex vs pure mmap > > #threads vanilla all rwsem without optspin > >

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-10 Thread Ingo Molnar
* Tim Chen wrote: > The throughput of mmap with pthread-mutex vs pure mmap is below: > > % change in performance of the mmap with pthread-mutex vs pure mmap > #threads vanilla all rwsem without optspin > patches > 1 3.0%

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-09 Thread Davidlohr Bueso
On Wed, 2013-10-09 at 20:14 -0700, Linus Torvalds wrote: > On Wed, Oct 9, 2013 at 12:28 AM, Peter Zijlstra wrote: > > > > The workload that I got the report from was a virus scanner, it would > > spawn nr_cpus threads and {mmap file, scan content, munmap} through your > > filesystem. > > So I sus

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-09 Thread Linus Torvalds
On Wed, Oct 9, 2013 at 12:28 AM, Peter Zijlstra wrote: > > The workload that I got the report from was a virus scanner, it would > spawn nr_cpus threads and {mmap file, scan content, munmap} through your > filesystem. So I suspect we could make the mmap_sem write area *much* smaller for the norma
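
For readers who want to picture the workload Peter Zijlstra describes (nr_cpus threads, each doing {mmap file, scan content, munmap}), a rough user-space sketch follows. It is not the actual scanner and not code from this thread: struct scan_job, scan_worker() and run_scan_workload() are hypothetical names, and the "scan" is just a byte sum so that every page gets touched.

```c
/* Sketch only: every name below is hypothetical, not taken from the thread. */
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct scan_job {
    const char *path;        /* file to map and scan */
    int iterations;          /* number of map/scan/unmap rounds */
    unsigned long checksum;  /* sum of all bytes read by this thread */
    int failed;              /* set to 1 on any error */
};

static void *scan_worker(void *arg)
{
    struct scan_job *job = arg;

    for (int i = 0; i < job->iterations; i++) {
        struct stat st;
        int fd = open(job->path, O_RDONLY);

        if (fd < 0) {
            job->failed = 1;
            return NULL;
        }
        if (fstat(fd, &st) < 0 || st.st_size == 0) {
            close(fd);
            job->failed = 1;
            return NULL;
        }

        /* mmap() takes mmap_sem for writing in the kernel. */
        unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);
        close(fd); /* the mapping stays valid after the fd is closed */
        if (p == MAP_FAILED) {
            job->failed = 1;
            return NULL;
        }

        /* The scan faults pages in; page faults take mmap_sem for reading. */
        for (off_t off = 0; off < st.st_size; off++)
            job->checksum += p[off];

        /* munmap() takes mmap_sem for writing again. */
        munmap(p, (size_t)st.st_size);
    }
    return NULL;
}

/* Runs nr_threads workers over the same file; returns 0 on success. */
int run_scan_workload(const char *path, int nr_threads, int iterations,
                      unsigned long *total)
{
    pthread_t *tids = calloc((size_t)nr_threads, sizeof(*tids));
    struct scan_job *jobs = calloc((size_t)nr_threads, sizeof(*jobs));
    int started = 0, rc = 0;

    if (!tids || !jobs) {
        free(tids);
        free(jobs);
        return -1;
    }

    for (; started < nr_threads; started++) {
        jobs[started].path = path;
        jobs[started].iterations = iterations;
        if (pthread_create(&tids[started], NULL, scan_worker,
                           &jobs[started]) != 0) {
            rc = -1;
            break;
        }
    }

    *total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (jobs[i].failed)
            rc = -1;
        *total += jobs[i].checksum;
    }

    free(tids);
    free(jobs);
    return rc;
}
```

Passing sysconf(_SC_NPROCESSORS_ONLN) as nr_threads would approximate the "nr_cpus threads" pattern. All threads share one address space, so each mmap/munmap pair contends on the same mmap_sem as the other threads' page faults.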

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-09 Thread Tim Chen
On Wed, 2013-10-09 at 08:15 +0200, Ingo Molnar wrote: > * Tim Chen wrote: > > > Ingo, > > > > I ran the vanilla kernel, the kernel with all rwsem patches and the > > kernel with all patches except the optimistic spin one. I am listing > > two presentations of the data. Please note that there

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-09 Thread Peter Zijlstra
On Wed, Oct 09, 2013 at 08:15:51AM +0200, Ingo Molnar wrote: > So I'd expect this to be a rather sensitive workload and you'd have to > actively engineer it to hit the effect PeterZ mentioned. I could imagine > MPI workloads to run into such patterns - but not deterministically. The workload tha

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-08 Thread Ingo Molnar
* Tim Chen wrote: > Ingo, > > I ran the vanilla kernel, the kernel with all rwsem patches and the > kernel with all patches except the optimistic spin one. I am listing > two presentations of the data. Please note that there is about 5% > run-run variation. > > % change in performance vs

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-07 Thread Tim Chen
On Thu, 2013-10-03 at 09:32 +0200, Ingo Molnar wrote: > * Tim Chen wrote: > > > For version 8 of the patchset, we included the patch from Waiman to > > streamline wakeup operations and also optimize the MCS lock used in > > rwsem and mutex. > > I'd be feeling a lot easier about this patch seri

Re: [PATCH v8 0/9] rwsem performance optimizations

2013-10-03 Thread Ingo Molnar
* Tim Chen wrote: > For version 8 of the patchset, we included the patch from Waiman to > streamline wakeup operations and also optimize the MCS lock used in > rwsem and mutex. I'd be feeling a lot easier about this patch series if you also had performance figures that show how mmap_sem is a