Re: [PATCH v2 0/7] x86/mm/tlb: make lazy TLB mode even lazier
On Tue, 2018-10-02 at 09:44 +0200, Peter Zijlstra wrote:
> On Tue, Sep 25, 2018 at 11:58:37PM -0400, Rik van Riel wrote:
> > This v2 is "identical" to the version I posted yesterday,
> > except this one is actually against current -tip (not sure
> > what went wrong before), with a number of relevant patches
> > on top:
> > - tip x86/core
> >   012e77a903d ("x86/nmi: Fix NMI uaccess race against CR3 switching")
> > - arm64 tlb/asm-generic (entire branch)
> > - peterz queue mm/tlb
> >   12b2b80ec6f4 ("x86/mm: Page size aware flush_tlb_mm_range()")
>
> If you could please double check:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/mm
>
> If 0day comes up green on that, I'll attempt to push it to tip.

That looks good to me. Thank you!

-- 
All Rights Reversed.
Re: [PATCH v2 0/7] x86/mm/tlb: make lazy TLB mode even lazier
On Tue, Sep 25, 2018 at 11:58:37PM -0400, Rik van Riel wrote:
> This v2 is "identical" to the version I posted yesterday,
> except this one is actually against current -tip (not sure
> what went wrong before), with a number of relevant patches
> on top:
> - tip x86/core
>   012e77a903d ("x86/nmi: Fix NMI uaccess race against CR3 switching")
> - arm64 tlb/asm-generic (entire branch)
> - peterz queue mm/tlb
>   12b2b80ec6f4 ("x86/mm: Page size aware flush_tlb_mm_range()")

If you could please double check:

  https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/mm

If 0day comes up green on that, I'll attempt to push it to tip.
Re: [PATCH v2 0/7] x86/mm/tlb: make lazy TLB mode even lazier
Looks good to me,

Acked-by: Peter Zijlstra (Intel)
[PATCH v2 0/7] x86/mm/tlb: make lazy TLB mode even lazier
Linus asked me to come up with a smaller patch set to get the benefits
of lazy TLB mode, so I spent some time trying out various permutations
of the code, with a few workloads that do lots of context switches and
also happen to have a fair number of TLB flushes a second.

Both of the workloads tested are memcache-style workloads, running on
two-socket systems. One of the workloads has around 300,000 context
switches a second, and around 19,000 TLB flushes a second.

The first patch in the series, always using lazy TLB mode, reduces CPU
use by around 1% on both Haswell and Broadwell systems.

The rest of the series reduces the number of TLB flush IPIs by about
1,500 a second, resulting in a 0.2% reduction in CPU use, on top of
the 1% seen by just enabling lazy TLB mode.

These are the low-hanging fruit in the context switch code. The big
thing remaining is the reference count overhead of the lazy TLB
mm_struct, but getting rid of that is rather a lot of code for a
small performance gain. Not quite what Linus asked for :)

This v2 is "identical" to the version I posted yesterday, except this
one is actually against current -tip (not sure what went wrong
before), with a number of relevant patches on top:
- tip x86/core
  012e77a903d ("x86/nmi: Fix NMI uaccess race against CR3 switching")
- arm64 tlb/asm-generic (entire branch)
- peterz queue mm/tlb
  12b2b80ec6f4 ("x86/mm: Page size aware flush_tlb_mm_range()")