Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-27 Thread Namhyung Kim
Hi Peter,

On Fri, 24 Oct 2014 15:14:40 +0200, Peter Zijlstra wrote:
> On Fri, Oct 24, 2014 at 09:54:23AM +0200, Ingo Molnar wrote:
>> 
>> * Peter Zijlstra  wrote:
>> > It's what I thought initially. I tried doing perf record with and
>> > without, but then I ran into perf diff not quite working for me, and I've
>> > yet to find time to kick that thing into shape.
>> 
>> Might be the 'perf diff' regression fixed by this:
>> 
>>   9ab1f50876db perf diff: Add missing hists__init() call at tool start
>> 
>> I just pushed it out into tip:master.
>
> I was on tip/master, so it's unlikely to be that, as I likely already
> had it.
>
> perf-report was affected too, for some reason my CONFIG_DEBUG_INFO=y
> vmlinux wasn't showing symbols (and I double checked that KASLR crap was
> disabled, so that wasn't confusing stuff either).
>
> When I forced perf-report to use kallsyms it worked; perf-diff,
> however, doesn't have that option.
>
> So there are two issues there: 1) perf-report failing to generate useful
> output, and 2) perf-diff lacking an option to force it to behave.

Did perf-report fail to show any (kernel) symbols, or did it show the
wrong symbols?  Maybe it's related to this:

https://lkml.org/lkml/2014/9/22/78

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-24 Thread Peter Zijlstra
On Fri, Oct 24, 2014 at 09:54:23AM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
> > On Thu, Oct 23, 2014 at 06:40:05PM +0800, Lai Jiangshan wrote:
> > > On 10/22/2014 01:56 AM, Peter Zijlstra wrote:
> > > > On Tue, Oct 21, 2014 at 08:09:48PM +0300, Kirill A. Shutemov wrote:
> > > >> It would be interesting to see if the patchset affects the
> > > >> non-contended case. Like a one-threaded workload.
> > > > 
> > > > It does, and not in a good way, I'll have to look at that... :/
> > > 
> > > Maybe it's because find_vma_srcu() doesn't take advantage of
> > > vmacache_find() and causes more cache misses.
> > 
> > It's what I thought initially. I tried doing perf record with and
> > without, but then I ran into perf diff not quite working for me, and I've
> > yet to find time to kick that thing into shape.
> 
> Might be the 'perf diff' regression fixed by this:
> 
>   9ab1f50876db perf diff: Add missing hists__init() call at tool start
> 
> I just pushed it out into tip:master.

I was on tip/master, so it's unlikely to be that, as I likely already
had it.

perf-report was affected too, for some reason my CONFIG_DEBUG_INFO=y
vmlinux wasn't showing symbols (and I double checked that KASLR crap was
disabled, so that wasn't confusing stuff either).

When I forced perf-report to use kallsyms it worked; perf-diff,
however, doesn't have that option.

So there are two issues there: 1) perf-report failing to generate useful
output, and 2) perf-diff lacking an option to force it to behave.


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-24 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Thu, Oct 23, 2014 at 06:40:05PM +0800, Lai Jiangshan wrote:
> > On 10/22/2014 01:56 AM, Peter Zijlstra wrote:
> > > On Tue, Oct 21, 2014 at 08:09:48PM +0300, Kirill A. Shutemov wrote:
> > >> It would be interesting to see if the patchset affects the
> > >> non-contended case. Like a one-threaded workload.
> > > 
> > > It does, and not in a good way, I'll have to look at that... :/
> > 
> > Maybe it's because find_vma_srcu() doesn't take advantage of
> > vmacache_find() and causes more cache misses.
> 
> It's what I thought initially. I tried doing perf record with and
> without, but then I ran into perf diff not quite working for me, and I've
> yet to find time to kick that thing into shape.

Might be the 'perf diff' regression fixed by this:

  9ab1f50876db perf diff: Add missing hists__init() call at tool start

I just pushed it out into tip:master.

Thanks,

Ingo


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-23 Thread Peter Zijlstra
On Thu, Oct 23, 2014 at 06:40:05PM +0800, Lai Jiangshan wrote:
> On 10/22/2014 01:56 AM, Peter Zijlstra wrote:
> > On Tue, Oct 21, 2014 at 08:09:48PM +0300, Kirill A. Shutemov wrote:
> >> It would be interesting to see if the patchset affects the
> >> non-contended case. Like a one-threaded workload.
> > 
> > It does, and not in a good way, I'll have to look at that... :/
> 
> Maybe it's because find_vma_srcu() doesn't take advantage of
> vmacache_find() and causes more cache misses.

It's what I thought initially. I tried doing perf record with and
without, but then I ran into perf diff not quite working for me, and I've
yet to find time to kick that thing into shape.

> Is it hard to use the vmacache in find_vma_srcu()?

I've not had time to look at it.


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-23 Thread Lai Jiangshan
On 10/22/2014 01:56 AM, Peter Zijlstra wrote:
> On Tue, Oct 21, 2014 at 08:09:48PM +0300, Kirill A. Shutemov wrote:
>> It would be interesting to see if the patchset affects the non-contended case.
>> Like a one-threaded workload.
> 
> It does, and not in a good way, I'll have to look at that... :/

Maybe it's because find_vma_srcu() doesn't take advantage of
vmacache_find() and causes more cache misses.


Is it hard to use the vmacache in find_vma_srcu()?

> 
>  Performance counter stats for './multi-fault 1' (5 runs):
> 
>     73,860,251  page-faults                  ( +-  0.28% )
>         40,914  cache-misses                 ( +- 41.26% )
> 
>   60.001484913 seconds time elapsed          ( +-  0.00% )
> 
> 
>  Performance counter stats for './multi-fault 1' (5 runs):
> 
>     70,700,838  page-faults                  ( +-  0.03% )
>         31,466  cache-misses                 ( +-  8.62% )
> 
>   60.001753906 seconds time elapsed          ( +-  0.00% )
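The vmacache idea Lai is suggesting can be sketched in userspace. This is purely illustrative: the names mirror the kernel's vmacache_find()/find_vma(), but the code below is a toy analogue under that assumption, not the kernel implementation. A tiny per-address cache sits in front of the expensive VMA lookup; a hit skips the full search (the kernel's rbtree walk) entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VMACACHE_SIZE 4                 /* same slot count as the kernel cache */

struct vma {
    uintptr_t start, end;               /* covers [start, end) */
};

static struct vma vmas[] = {            /* stand-in for the mm's vma tree */
    { 0x1000, 0x2000 },
    { 0x4000, 0x8000 },
    { 0x9000, 0xa000 },
};

static struct vma *cache[VMACACHE_SIZE];
static unsigned long slow_lookups;      /* counts cache misses */

static unsigned int cache_hash(uintptr_t addr)
{
    return (addr >> 12) & (VMACACHE_SIZE - 1);  /* low bits of page number */
}

static struct vma *vmacache_find(uintptr_t addr)
{
    struct vma *v = cache[cache_hash(addr)];

    if (v && v->start <= addr && addr < v->end)
        return v;
    return NULL;
}

static struct vma *find_vma(uintptr_t addr)
{
    struct vma *v = vmacache_find(addr);

    if (v)
        return v;                       /* fast path: no full search */

    slow_lookups++;                     /* slow path: scan every vma */
    for (size_t i = 0; i < sizeof(vmas) / sizeof(vmas[0]); i++) {
        if (vmas[i].start <= addr && addr < vmas[i].end) {
            cache[cache_hash(addr)] = &vmas[i];
            return &vmas[i];
        }
    }
    return NULL;
}
```

The open question in the thread is whether the SRCU-side lookup can safely consult (and update) such a cache without the mmap_sem serialization the regular find_vma() relies on.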



Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-22 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Tue, Oct 21, 2014 at 06:23:40PM +0200, Ingo Molnar wrote:
> > 
> > * Peter Zijlstra  wrote:
> > 
> > > My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> > > 
> > > PRE:
> > >149,441,555  page-faults  ( +-  1.25% )
> > >
> > > POST:
> > >236,442,626  page-faults  ( +-  0.08% )
> > 
> > > My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> > > 
> > > PRE:
> > >105,789,078  page-faults ( +-  2.24% )
> > >
> > > POST:
> > >187,751,767  page-faults ( +-  2.24% )
> > 
> > I guess the 'PRE' and 'POST' numbers should be flipped around?
> 
> Nope, it's the number of page-faults serviced in a fixed amount of time
> (60 seconds), therefore higher is better.

Ah, okay!

Thanks,

Ingo


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-22 Thread Kirill A. Shutemov
On Wed, Oct 22, 2014 at 01:45:58PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 22, 2014 at 02:29:25PM +0300, Kirill A. Shutemov wrote:
> > On Wed, Oct 22, 2014 at 12:34:49AM -0700, Davidlohr Bueso wrote:
> > > On Mon, 2014-10-20 at 23:56 +0200, Peter Zijlstra wrote:
> > > > Hi,
> > > > 
> > > > I figured I'd give my 2010 speculative fault series another spin:
> > > > 
> > > >   https://lkml.org/lkml/2010/1/4/257
> > > > 
> > > > Since then I think many of the outstanding issues have changed
> > > > sufficiently to warrant another go. In particular Al Viro's delayed
> > > > fput seems to have made it entirely 'normal' to delay fput(). Lai
> > > > Jiangshan's SRCU rewrite provided us with call_srcu(), and my
> > > > preemptible mmu_gather removed the TLB flushes from under the PTL.
> > > > 
> > > > The code needs way more attention but builds a kernel and runs the
> > > > micro-benchmark, so I figured I'd post it before sinking more time
> > > > into it.
> > > > 
> > > > I realize the micro-bench is about as good as it gets for this series
> > > > and not very realistic otherwise, but I think it does show the
> > > > potential benefit the approach has.
> > > > 
> > > > (patches go against .18-rc1+)
> > > 
> > > I think patch 2/6 is broken:
> > > 
> > > error: patch failed: mm/memory.c:2025
> > > error: mm/memory.c: patch does not apply
> > > 
> > > and related, as you mention, I would very much welcome having the
> > > introduction of 'struct fault_env' as a separate cleanup patch. May I
> > > suggest renaming it to fault_cxt?
> > 
> > What about extending this to start using 'struct vm_fault' earlier,
> > passed on the stack?
> 
> I'm not sure we should mix the environment for vm_ops::fault, which
> acquires the page, with the fault path, which deals with changing the
> PTE. Ideally we should not expose the page-table information to file
> ops; it's a layering violation if nothing else. Drivers should not have
> access to the page tables.

We already have this for ->map_pages() :-P
I asked whether it's considered a layering violation, and it seems
nobody cares...

-- 
 Kirill A. Shutemov


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-22 Thread Peter Zijlstra
On Wed, Oct 22, 2014 at 02:29:25PM +0300, Kirill A. Shutemov wrote:
> On Wed, Oct 22, 2014 at 12:34:49AM -0700, Davidlohr Bueso wrote:
> > On Mon, 2014-10-20 at 23:56 +0200, Peter Zijlstra wrote:
> > > Hi,
> > > 
> > > I figured I'd give my 2010 speculative fault series another spin:
> > > 
> > >   https://lkml.org/lkml/2010/1/4/257
> > > 
> > > Since then I think many of the outstanding issues have changed
> > > sufficiently to warrant another go. In particular Al Viro's delayed
> > > fput seems to have made it entirely 'normal' to delay fput(). Lai
> > > Jiangshan's SRCU rewrite provided us with call_srcu(), and my
> > > preemptible mmu_gather removed the TLB flushes from under the PTL.
> > > 
> > > The code needs way more attention but builds a kernel and runs the
> > > micro-benchmark, so I figured I'd post it before sinking more time
> > > into it.
> > > 
> > > I realize the micro-bench is about as good as it gets for this series
> > > and not very realistic otherwise, but I think it does show the
> > > potential benefit the approach has.
> > > 
> > > (patches go against .18-rc1+)
> > 
> > I think patch 2/6 is broken:
> > 
> > error: patch failed: mm/memory.c:2025
> > error: mm/memory.c: patch does not apply
> > 
> > and related, as you mention, I would very much welcome having the
> > introduction of 'struct fault_env' as a separate cleanup patch. May I
> > suggest renaming it to fault_cxt?
> 
> What about extending this to start using 'struct vm_fault' earlier,
> passed on the stack?

I'm not sure we should mix the environment for vm_ops::fault, which
acquires the page, with the fault path, which deals with changing the
PTE. Ideally we should not expose the page-table information to file
ops; it's a layering violation if nothing else. Drivers should not have
access to the page tables.
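The layering Peter is arguing for can be sketched as two structs: a driver-visible fault descriptor handed to vm_ops::fault, and a core-mm-only environment holding the page-table state that must never cross the vm_ops boundary. This is an illustrative sketch only; these are not the kernel's actual definitions of 'struct vm_fault' or the proposed 'struct fault_env'.

```c
#include <assert.h>
#include <stddef.h>

struct page;                    /* opaque to this sketch */

struct vm_fault {               /* what a driver's ->fault() may see */
    unsigned long address;
    unsigned int flags;
    struct page *page;          /* filled in by the fault handler */
};

struct fault_env {              /* core mm only: page-table context */
    struct vm_fault vmf;        /* embeds the driver-visible part */
    void *pmd;                  /* page-table pointers, PTE lock, ... */
    void *pte;
    void *ptl;
};

/* The core hands drivers only the embedded descriptor, so the
 * page-table fields never leak through the vm_ops interface. */
static struct vm_fault *driver_view(struct fault_env *fe)
{
    return &fe->vmf;
}
```

Kirill's counterpoint is that ->map_pages() already crosses this boundary, so the separation is not consistently enforced today.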


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-22 Thread Kirill A. Shutemov
On Wed, Oct 22, 2014 at 12:34:49AM -0700, Davidlohr Bueso wrote:
> On Mon, 2014-10-20 at 23:56 +0200, Peter Zijlstra wrote:
> > Hi,
> > 
> > I figured I'd give my 2010 speculative fault series another spin:
> > 
> >   https://lkml.org/lkml/2010/1/4/257
> > 
> > Since then I think many of the outstanding issues have changed
> > sufficiently to warrant another go. In particular Al Viro's delayed
> > fput seems to have made it entirely 'normal' to delay fput(). Lai
> > Jiangshan's SRCU rewrite provided us with call_srcu(), and my
> > preemptible mmu_gather removed the TLB flushes from under the PTL.
> > 
> > The code needs way more attention but builds a kernel and runs the
> > micro-benchmark, so I figured I'd post it before sinking more time
> > into it.
> > 
> > I realize the micro-bench is about as good as it gets for this series
> > and not very realistic otherwise, but I think it does show the
> > potential benefit the approach has.
> > 
> > (patches go against .18-rc1+)
> 
> I think patch 2/6 is broken:
> 
> error: patch failed: mm/memory.c:2025
> error: mm/memory.c: patch does not apply
> 
> and related, as you mention, I would very much welcome having the
> introduction of 'struct fault_env' as a separate cleanup patch. May I
> suggest renaming it to fault_cxt?

What about extending this to start using 'struct vm_fault' earlier,
passed on the stack?

-- 
 Kirill A. Shutemov


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-22 Thread Davidlohr Bueso
On Mon, 2014-10-20 at 23:56 +0200, Peter Zijlstra wrote:
> Hi,
> 
> I figured I'd give my 2010 speculative fault series another spin:
> 
>   https://lkml.org/lkml/2010/1/4/257
> 
> Since then I think many of the outstanding issues have changed sufficiently to
> warrant another go. In particular Al Viro's delayed fput seems to have made it
> entirely 'normal' to delay fput(). Lai Jiangshan's SRCU rewrite provided us
> with call_srcu() and my preemptible mmu_gather removed the TLB flushes from
> under the PTL.
> 
> The code needs way more attention but builds a kernel and runs the
> micro-benchmark so I figured I'd post it before sinking more time into it.
> 
> I realize the micro-bench is about as good as it gets for this series and not
> very realistic otherwise, but I think it does show the potential benefit the
> approach has.
> 
> (patches go against .18-rc1+)

I think patch 2/6 is broken:

error: patch failed: mm/memory.c:2025
error: mm/memory.c: patch does not apply

and related, as you mention, I would very much welcome having the
introduction of 'struct fault_env' as a separate cleanup patch. May I
suggest renaming it to fault_cxt?

Thanks,
Davidlohr



Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-21 Thread Peter Zijlstra
On Tue, Oct 21, 2014 at 08:09:48PM +0300, Kirill A. Shutemov wrote:
> It would be interesting to see if the patchset affects the non-contended case.
> Like a one-threaded workload.

It does, and not in a good way, I'll have to look at that... :/

 Performance counter stats for './multi-fault 1' (5 runs):

     73,860,251  page-faults                  ( +-  0.28% )
         40,914  cache-misses                 ( +- 41.26% )

   60.001484913 seconds time elapsed          ( +-  0.00% )


 Performance counter stats for './multi-fault 1' (5 runs):

     70,700,838  page-faults                  ( +-  0.03% )
         31,466  cache-misses                 ( +-  8.62% )

   60.001753906 seconds time elapsed          ( +-  0.00% )


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-21 Thread Peter Zijlstra
On Tue, Oct 21, 2014 at 06:23:40PM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
> > My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> > 
> > PRE:
> >149,441,555  page-faults  ( +-  1.25% )
> >
> > POST:
> >236,442,626  page-faults  ( +-  0.08% )
> 
> > My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> > 
> > PRE:
> >105,789,078  page-faults ( +-  2.24% )
> >
> > POST:
> >187,751,767  page-faults ( +-  2.24% )
> 
> I guess the 'PRE' and 'POST' numbers should be flipped around?

Nope, it's the number of page-faults serviced in a fixed amount of time
(60 seconds), therefore higher is better.


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-21 Thread Kirill A. Shutemov
On Tue, Oct 21, 2014 at 06:23:40PM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
> > My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> > 
> > PRE:
> >149,441,555  page-faults  ( +-  1.25% )
> >
> > POST:
> >236,442,626  page-faults  ( +-  0.08% )
> 
> > My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> > 
> > PRE:
> >105,789,078  page-faults ( +-  2.24% )
> >
> > POST:
> >187,751,767  page-faults ( +-  2.24% )
> 
> I guess the 'PRE' and 'POST' numbers should be flipped around?

I think it's faults per second.

It would be interesting to see if the patchset affects the non-contended case.
Like a single-threaded workload.

-- 
 Kirill A. Shutemov


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-21 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> 
> PRE:
>149,441,555  page-faults  ( +-  1.25% )
>
> POST:
>236,442,626  page-faults  ( +-  0.08% )

> My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> 
> PRE:
>105,789,078  page-faults ( +-  2.24% )
>
> POST:
>187,751,767  page-faults ( +-  2.24% )

I guess the 'PRE' and 'POST' numbers should be flipped around?

Thanks,

Ingo


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-21 Thread Peter Zijlstra
On Mon, Oct 20, 2014 at 05:07:02PM -0700, Andy Lutomirski wrote:
> On 10/20/2014 02:56 PM, Peter Zijlstra wrote:
> > Hi,
> > 
> > I figured I'd give my 2010 speculative fault series another spin:
> > 
> >   https://lkml.org/lkml/2010/1/4/257
> > 
> > Since then I think many of the outstanding issues have changed sufficiently to
> > warrant another go. In particular Al Viro's delayed fput seems to have made it
> > entirely 'normal' to delay fput(). Lai Jiangshan's SRCU rewrite provided us
> > with call_srcu() and my preemptible mmu_gather removed the TLB flushes from
> > under the PTL.
> > 
> > The code needs way more attention but builds a kernel and runs the
> > micro-benchmark so I figured I'd post it before sinking more time into it.
> > 
> > I realize the micro-bench is about as good as it gets for this series and not
> > very realistic otherwise, but I think it does show the potential benefit the
> > approach has.
> 
> Does this mean that an entire fault can complete without ever taking
> mmap_sem at all?  If so, that's a *huge* win.

Yep.

> I'm a bit concerned about drivers that assume that the vma is unchanged
> during .fault processing.  In particular, is there a race between .close
> and .fault?  Would it make sense to add a per-vma rw lock and hold it
> during vma modification and .fault calls?

VMA granularity contention would be about as bad as mmap_sem for many
workloads. But yes, that is one of the things we need to look at; I was
_hoping_ that holding the file open would sort most of these problems, but
I'm sure there's plenty of 'interesting' cruft left.


Re: [RFC][PATCH 0/6] Another go at speculative page faults

2014-10-20 Thread Andy Lutomirski
On 10/20/2014 02:56 PM, Peter Zijlstra wrote:
> Hi,
> 
> I figured I'd give my 2010 speculative fault series another spin:
> 
>   https://lkml.org/lkml/2010/1/4/257
> 
> Since then I think many of the outstanding issues have changed sufficiently to
> warrant another go. In particular Al Viro's delayed fput seems to have made it
> entirely 'normal' to delay fput(). Lai Jiangshan's SRCU rewrite provided us
> with call_srcu() and my preemptible mmu_gather removed the TLB flushes from
> under the PTL.
> 
> The code needs way more attention but builds a kernel and runs the
> micro-benchmark so I figured I'd post it before sinking more time into it.
> 
> I realize the micro-bench is about as good as it gets for this series and not
> very realistic otherwise, but I think it does show the potential benefit the
> approach has.

Does this mean that an entire fault can complete without ever taking
mmap_sem at all?  If so, that's a *huge* win.

I'm a bit concerned about drivers that assume that the vma is unchanged
during .fault processing.  In particular, is there a race between .close
and .fault?  Would it make sense to add a per-vma rw lock and hold it
during vma modification and .fault calls?

--Andy

> 
> (patches go against .18-rc1+)
> 
> ---
> 
> Using Kamezawa's multi-fault micro-bench from: https://lkml.org/lkml/2010/1/6/28
> 
> My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> 
> PRE:
> 
> root@ivb-ep:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 20
> 
>  Performance counter stats for './multi-fault 20' (5 runs):
> 
>149,441,555  page-faults  ( +-  1.25% )
>  2,153,651,828  cache-misses ( +-  1.09% )
> 
>   60.003082014 seconds time elapsed  ( +-  0.00% )
> 
> POST:
> 
> root@ivb-ep:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 20
> 
>  Performance counter stats for './multi-fault 20' (5 runs):
> 
>236,442,626  page-faults  ( +-  0.08% )
>  2,796,353,939  cache-misses ( +-  1.01% )
> 
>   60.002792431 seconds time elapsed  ( +-  0.00% )
> 
> 
> My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> 
> PRE:
> 
> root@ivb-ex:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 60
> 
>  Performance counter stats for './multi-fault 60' (5 runs):
> 
>105,789,078  page-faults ( +-  2.24% )
>  1,314,072,090  cache-misses( +-  1.17% )
> 
>   60.009243533 seconds time elapsed ( +-  0.00% )
> 
> POST:
> 
> root@ivb-ex:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 60
> 
>  Performance counter stats for './multi-fault 60' (5 runs):
> 
>187,751,767  page-faults ( +-  2.24% )
>  1,792,758,664  cache-misses( +-  2.30% )
> 
>   60.011611579 seconds time elapsed ( +-  0.00% )
> 
> (I've not yet looked at why the EX sucks chunks compared to the EP box, I
>  suspect we contend on other locks, but it could be anything.)
> 
> ---
> 
>  arch/x86/mm/fault.c  |  35 ++-
>  include/linux/mm.h   |  19 +-
>  include/linux/mm_types.h |   5 +
>  kernel/fork.c|   1 +
>  mm/init-mm.c |   1 +
>  mm/internal.h|  18 ++
>  mm/memory.c  | 672 ---
>  mm/mmap.c| 101 +--
>  8 files changed, 544 insertions(+), 308 deletions(-)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org
> 


