Ingo,

Sorry for the late response. My old 4 socket Westmere test machine
went down and I had to find a new one, which is a 4 socket Ivybridge
machine with 15 cores per socket. I've updated the workload as a perf
benchmark (see attached patch). The workload will mmap, then access
memory in the [...]
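
A minimal sketch of the kind of multi-threaded page-fault workload
being described (illustrative only, not the attached patch; the region
size, iteration count, and use of anonymous mappings are assumptions):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_SIZE	(128UL * 1024 * 1024)	/* illustrative region size */
#define ITERATIONS	10

/* Each thread repeatedly maps a region, writes one byte per page so
 * that every access takes a page fault, then unmaps. mmap/munmap take
 * mmap_sem for writing; the faults take it for reading. */
static void *fault_worker(void *arg)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *buf, *p;
	int i;

	for (i = 0; i < ITERATIONS; i++) {
		buf = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (buf == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}
		for (p = buf; p < buf + MAP_SIZE; p += page_size)
			*p = 1;
		munmap(buf, MAP_SIZE);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int i, nr_threads = argc > 1 ? atoi(argv[1]) : 1;

	if (nr_threads < 1)
		nr_threads = 1;

	pthread_t threads[nr_threads];

	for (i = 0; i < nr_threads; i++)
		pthread_create(&threads[i], NULL, fault_worker, NULL);
	for (i = 0; i < nr_threads; i++)
		pthread_join(threads[i], NULL);
	return 0;
}
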
* Tim Chen wrote:
>
> > It would be _really_ nice to stick this into tools/perf/bench/ as:
> >
> > 	perf bench mem pagefaults
> >
> > or so, with a number of parallelism and workload patterns. See
> > tools/perf/bench/numa.c for a couple of workload generators - although
> > those are not page fault intense.
> >
> > So th[...]
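
For reference, a new benchmark would be wired into perf roughly as
below: declare an entry point in tools/perf/bench/bench.h and add it
to the "mem" suite table in tools/perf/builtin-bench.c. This is only
a sketch; the bench_mem_pagefaults name and summary string are made
up here, and the exact table layout and entry-point signature depend
on the perf version:

/* tools/perf/bench/bench.h */
extern int bench_mem_pagefaults(int argc, const char **argv,
				const char *prefix);

/* tools/perf/builtin-bench.c: add an entry to the "mem" suites */
static struct bench_suite mem_suites[] = {
	{ "memcpy",	"Simple memory copy in various ways",
			bench_mem_memcpy },
	{ "memset",	"Simple memory set in various ways",
			bench_mem_memset },
	{ "pagefaults",	"Parallel mmap + touch + munmap fault benchmark",
			bench_mem_pagefaults },
	{ NULL, NULL, NULL }
};

After which something like "perf bench mem pagefaults -t <threads>"
(options illustrative) would run the workload.
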
On Thu, 2013-10-10 at 09:54 +0200, Ingo Molnar wrote:
> * Tim Chen wrote:
>
> > The throughput of mmap with pthread-mutex vs pure mmap is below:
> >
> > % change in performance of the mmap with pthread-mutex vs pure mmap
> >
> > #threads	vanilla		all rwsem	without optspin
> > 				patches
> > 1		3.0%		[...]
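
Presumably "mmap with pthread-mutex" means the benchmark serializes
the map/unmap calls in userspace, so at most one thread at a time
enters the kernel's mmap path; something like the following sketch
(names illustrative):

#include <pthread.h>
#include <sys/mman.h>

static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Userspace serialization: many threads calling this still produce
 * only one writer at a time on the kernel side (mmap_sem). */
static void *serialized_mmap(size_t len)
{
	void *p;

	pthread_mutex_lock(&mmap_mutex);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	pthread_mutex_unlock(&mmap_mutex);
	return p;
}

static void serialized_munmap(void *p, size_t len)
{
	pthread_mutex_lock(&mmap_mutex);
	munmap(p, len);
	pthread_mutex_unlock(&mmap_mutex);
}

Comparing this against unserialized ("pure") mmap isolates how much
of the cost is contention on mmap_sem itself versus the work done
under it.
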
On Wed, 2013-10-09 at 20:14 -0700, Linus Torvalds wrote:
> On Wed, Oct 9, 2013 at 12:28 AM, Peter Zijlstra wrote:
> >
> > The workload that I got the report from was a virus scanner, it would
> > spawn nr_cpus threads and {mmap file, scan content, munmap} through your
> > filesystem.
>
> So I suspect we could make the mmap_sem write area *much* smaller for
> the normal [...]
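
A sketch of the access pattern Peter describes, per scanning thread
(file iteration and the actual scan logic are elided; names are
illustrative). The relevance here is that every mmap and munmap takes
mmap_sem for writing, while the faults during the scan take it for
reading:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* One iteration of the scanner loop: mmap file, scan content, munmap. */
static void scan_one_file(const char *path)
{
	struct stat st;
	char *buf;
	off_t i;
	volatile unsigned char sum = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return;
	if (fstat(fd, &st) < 0 || st.st_size == 0)
		goto out;

	buf = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (buf == MAP_FAILED)			/* mmap: mmap_sem write */
		goto out;

	for (i = 0; i < st.st_size; i++)	/* faults: mmap_sem read */
		sum += buf[i];			/* stands in for "scan content" */

	munmap(buf, st.st_size);		/* munmap: mmap_sem write again */
out:
	close(fd);
}
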
On Wed, Oct 09, 2013 at 08:15:51AM +0200, Ingo Molnar wrote:
> So I'd expect this to be a rather sensitive workload and you'd have to
> actively engineer it to hit the effect PeterZ mentioned. I could imagine
> MPI workloads to run into such patterns - but not deterministically.

The workload that I got the report from was a virus scanner, it would
spawn nr_cpus threads and {mmap file, scan content, munmap} through your
filesystem.

* Tim Chen wrote:
> Ingo,
>
> I ran the vanilla kernel, the kernel with all rwsem patches and the
> kernel with all patches except the optimistic spin one. I am listing
> two presentations of the data. Please note that there is about 5%
> run-run variation.
>
> % change in performance vs [...]
On Thu, 2013-10-03 at 09:32 +0200, Ingo Molnar wrote:
> * Tim Chen wrote:
>
> > For version 8 of the patchset, we included the patch from Waiman to
> > streamline wakeup operations and also optimize the MCS lock used in
> > rwsem and mutex.
>
> I'd be feeling a lot easier about this patch series if you also had
> performance figures that show how mmap_sem is a[...]
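
For background: an MCS lock is a queue-based spinlock in which each
waiter spins on a flag in its own queue node rather than on a shared
lock word, avoiding the cache-line ping-pong of a plain spinlock. A
generic userspace sketch in C11 atomics (illustrative only; the
kernel's variant used in rwsem and mutex differs in detail):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mcs_node {
	_Atomic(struct mcs_node *) next;
	atomic_bool locked;		/* true while this waiter must spin */
};

struct mcs_lock {
	_Atomic(struct mcs_node *) tail;
};

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *me)
{
	struct mcs_node *prev;

	atomic_store(&me->next, NULL);
	atomic_store(&me->locked, true);

	/* Swap ourselves in as the new tail; the old tail (if any) is
	 * our predecessor in the queue. */
	prev = atomic_exchange(&lock->tail, me);
	if (prev) {
		/* Queue was not empty: link in behind the predecessor
		 * and spin on our own node, not on a shared line. */
		atomic_store(&prev->next, me);
		while (atomic_load(&me->locked))
			;
	}
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *me)
{
	struct mcs_node *next = atomic_load(&me->next);

	if (!next) {
		/* No visible successor: if we are still the tail, the
		 * queue is empty and we can simply reset it. */
		struct mcs_node *expected = me;

		if (atomic_compare_exchange_strong(&lock->tail,
						   &expected, NULL))
			return;
		/* A successor is concurrently linking in; wait for it. */
		while (!(next = atomic_load(&me->next)))
			;
	}
	/* Hand the lock directly to the next waiter. */
	atomic_store(&next->locked, false);
}
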