Re: [tip:locking/core] locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

2015-05-31 Thread Waiman Long

On 05/30/2015 12:09 AM, Sasha Levin wrote:

On 05/08/2015 09:27 AM, tip-bot for Peter Zijlstra (Intel) wrote:

Commit-ID:  f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
Gitweb: http://git.kernel.org/tip/f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
Author: Peter Zijlstra (Intel)
AuthorDate: Fri, 24 Apr 2015 14:56:38 -0400
Committer:  Ingo Molnar
CommitDate: Fri, 8 May 2015 12:37:09 +0200

locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

We use the regular paravirt call patching to switch between:

   native_queued_spin_lock_slowpath()   __pv_queued_spin_lock_slowpath()
   native_queued_spin_unlock()  __pv_queued_spin_unlock()

We use a callee saved call for the unlock function which reduces the
i-cache footprint and allows 'inlining' of SPIN_UNLOCK functions
again.

We further optimize the unlock path by patching the direct call with a
"movb $0,%arg1" if we are indeed using the native unlock code. This
makes the unlock code almost as fast as the !PARAVIRT case.

This significantly lowers the overhead of having
CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualizat...@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-10-git-send-email-waiman.l...@hp.com
Signed-off-by: Ingo Molnar

Hey Peter,

I'm seeing this on the latest -next kernel:

[ 8693.503262] BUG: KASan: out of bounds access in __pv_queued_spin_lock_slowpath+0x84e/0x8c0 at addr b9495950
[ 8693.503271] Read of size 8 by task swapper/9/0
[ 8693.503289] Address belongs to variable pv_lock_ops+0x10/0x240


I would like to clarify what the message means. pv_lock_ops + 0x10 should be
the pv_wait function pointer. Also, the structure should be just 32 bytes in
size, so what does the "/0x240" mean?
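
For reference, with the CONFIG_QUEUED_SPINLOCK layout this patch adds to
paravirt_types.h (the hunk appears at the end of this thread), the offsets
work out as below. This is a sketch; it assumes only x86-64 (8-byte
pointers, no padding) and the fact that struct paravirt_callee_save holds a
single function pointer:

    struct pv_lock_ops {                                             /* x86-64 */
        void (*queued_spin_lock_slowpath)(struct qspinlock *, u32);  /* +0x00 */
        struct paravirt_callee_save queued_spin_unlock;              /* +0x08 */
        void (*wait)(u8 *ptr, u8 val);                               /* +0x10 <- reported offset */
        void (*kick)(int cpu);                                       /* +0x18 */
    };  /* sizeof == 0x20 (32 bytes), so pv_wait at +0x10 checks out */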


Cheers,
Longman


Re: [tip:locking/core] locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

2015-05-29 Thread Sasha Levin
On 05/08/2015 09:27 AM, tip-bot for Peter Zijlstra (Intel) wrote:
> Commit-ID:  f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
> Gitweb: http://git.kernel.org/tip/f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
> Author: Peter Zijlstra (Intel) 
> AuthorDate: Fri, 24 Apr 2015 14:56:38 -0400
> Committer:  Ingo Molnar 
> CommitDate: Fri, 8 May 2015 12:37:09 +0200
> 
> locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
> 
> We use the regular paravirt call patching to switch between:
> 
>   native_queued_spin_lock_slowpath()  __pv_queued_spin_lock_slowpath()
>   native_queued_spin_unlock() __pv_queued_spin_unlock()
> 
> We use a callee saved call for the unlock function which reduces the
> i-cache footprint and allows 'inlining' of SPIN_UNLOCK functions
> again.
> 
> We further optimize the unlock path by patching the direct call with a
> "movb $0,%arg1" if we are indeed using the native unlock code. This
> makes the unlock code almost as fast as the !PARAVIRT case.
> 
> This significantly lowers the overhead of having
> CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> Signed-off-by: Waiman Long 
> Signed-off-by: Peter Zijlstra (Intel) 
> Cc: Andrew Morton 
> Cc: Boris Ostrovsky 
> Cc: Borislav Petkov 
> Cc: Daniel J Blueman 
> Cc: David Vrabel 
> Cc: Douglas Hatch 
> Cc: H. Peter Anvin 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Linus Torvalds 
> Cc: Oleg Nesterov 
> Cc: Paolo Bonzini 
> Cc: Paul E. McKenney 
> Cc: Peter Zijlstra 
> Cc: Raghavendra K T 
> Cc: Rik van Riel 
> Cc: Scott J Norton 
> Cc: Thomas Gleixner 
> Cc: virtualizat...@lists.linux-foundation.org
> Cc: xen-de...@lists.xenproject.org
> Link: http://lkml.kernel.org/r/1429901803-29771-10-git-send-email-waiman.l...@hp.com
> Signed-off-by: Ingo Molnar 

Hey Peter,

I'm seeing this on the latest -next kernel:

[ 8693.503262] BUG: KASan: out of bounds access in __pv_queued_spin_lock_slowpath+0x84e/0x8c0 at addr b9495950
[ 8693.503271] Read of size 8 by task swapper/9/0
[ 8693.503289] Address belongs to variable pv_lock_ops+0x10/0x240
[ 8693.503301] CPU: 9 PID: 0 Comm: swapper/9 Tainted: G  D 
4.1.0-rc5-next-20150529-sasha-00039-g7fd455d-dirty #2263
[ 8693.503335]  b6a1423a b6f92731d7a76ba3 8802b349f918 
b6a1423a
[ 8693.503355]   8802b349f9a8 8802b349f998 
ad5c70ee
[ 8693.503375]  ad2eb58e 0004 0086 
11011953cbb4
[ 8693.503379] Call Trace:
[ 8693.503409] ? dump_stack (lib/dump_stack.c:52)
[ 8693.503426] dump_stack (lib/dump_stack.c:52)
[ 8693.503454] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[ 8693.503463] ? __pv_queued_spin_lock_slowpath 
(./arch/x86/include/asm/paravirt.h:730 kernel/locking/qspinlock.c:410)
[ 8693.503474] ? kasan_report_error (mm/kasan/report.c:186)
[ 8693.503488] ? trace_hardirqs_off_caller (./arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:2652)
[ 8693.503504] __asan_report_load8_noabort (mm/kasan/report.c:230 
mm/kasan/report.c:251)
[ 8693.503517] ? __pv_queued_spin_lock_slowpath 
(./arch/x86/include/asm/paravirt.h:730 kernel/locking/qspinlock.c:410)
[ 8693.503526] __pv_queued_spin_lock_slowpath 
(./arch/x86/include/asm/paravirt.h:730 kernel/locking/qspinlock.c:410)
[ 8693.503541] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503557] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503566] ? trace_hardirqs_off_caller (./arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:2652)
[ 8693.503578] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503589] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503605] ? native_iret (arch/x86/kernel/entry_64.S:806)
[ 8693.503614] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503623] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503631] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503639] ? async_page_fault (arch/x86/kernel/entry_64.S:1261)
[ 8693.503663] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503681] ? native_iret (arch/x86/kernel/entry_64.S:806)
[ 8693.503691] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503699] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503730] ? trace_hardirqs_off_caller (./arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:2652)
[ 8693.503743] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503754] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503772] ? native_iret (arch/x86/kernel/entry_64.S:806)
[ 8693.503784] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503794] ? trace_hardirqs_off_thunk (arch/x86/lib/thunk_64.S:43)
[ 8693.503802] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503814] ? async_page_fault (arch/x86/kernel/entry_64.S:1261)
[ 8693.503829] ? error_sti (arch/x86/kernel/entry_64.S:1334)
[ 8693.503845] ? native_iret (arch/x86/kernel/entry_64.S:806)
[ 8693.503854] ? error_sti 

[tip:locking/core] locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

2015-05-08 Thread tip-bot for Peter Zijlstra (Intel)
Commit-ID:  f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
Gitweb: http://git.kernel.org/tip/f233f7f1581e78fd9b4023f2e7d8c1ed89020cc9
Author: Peter Zijlstra (Intel) 
AuthorDate: Fri, 24 Apr 2015 14:56:38 -0400
Committer:  Ingo Molnar 
CommitDate: Fri, 8 May 2015 12:37:09 +0200

locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching

We use the regular paravirt call patching to switch between:

  native_queued_spin_lock_slowpath()__pv_queued_spin_lock_slowpath()
  native_queued_spin_unlock()   __pv_queued_spin_unlock()
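
(The arch/x86/kernel/paravirt-spinlocks.c hunk from the diffstat below is
truncated off the end of this mail; here is a sketch of the default ops
table it plausibly installs. Names such as PV_CALLEE_SAVE,
PV_CALLEE_SAVE_REGS_THUNK and paravirt_nop are assumptions based on the
existing pvops conventions, not taken from the surviving hunks:)

    /* arch/x86/kernel/paravirt-spinlocks.c -- sketch, hunk assumed */
    #ifdef CONFIG_QUEUED_SPINLOCK
    __visible void __native_queued_spin_unlock(struct qspinlock *lock)
    {
            native_queued_spin_unlock(lock);
    }
    PV_CALLEE_SAVE_REGS_THUNK(__native_queued_spin_unlock);

    struct pv_lock_ops pv_lock_ops = {
            /* Native defaults; a hypervisor backend (Xen/KVM) rewrites
             * these slots at boot, before the call sites are patched. */
            .queued_spin_lock_slowpath = native_queued_spin_lock_slowpath,
            .queued_spin_unlock = PV_CALLEE_SAVE(__native_queued_spin_unlock),
            .wait = paravirt_nop,
            .kick = paravirt_nop,
    };
    #endif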

We use a callee saved call for the unlock function which reduces the
i-cache footprint and allows 'inlining' of SPIN_UNLOCK functions
again.
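
(Similarly, the six-line arch/x86/include/asm/qspinlock_paravirt.h in the
diffstat is truncated away; presumably it does little more than emit the
callee-saved thunk mentioned above. A sketch, with the macro name assumed
from the existing paravirt code:)

    /* arch/x86/include/asm/qspinlock_paravirt.h -- sketch, contents assumed */
    #ifndef __ASM_QSPINLOCK_PARAVIRT_H
    #define __ASM_QSPINLOCK_PARAVIRT_H

    /*
     * Emit an asm thunk around __pv_queued_spin_unlock() that saves and
     * restores all caller-clobbered registers, so calling it clobbers
     * nothing but %arg1 and the call site stays a single small call.
     */
    PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock);

    #endif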

We further optimize the unlock path by patching the direct call with a
"movb $0,%arg1" if we are indeed using the native unlock code. This
makes the unlock code almost as fast as the !PARAVIRT case.
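
(The paravirt_patch_32.c/paravirt_patch_64.c hunks are likewise truncated;
here is a sketch of what the patching amounts to on x86-64, where %rdi is
arg1 and the qspinlock's locked byte sits at offset 0. The DEF_NATIVE
string and the pv_is_native_spin_unlock() check are assumptions following
the existing patching code:)

    /* arch/x86/kernel/paravirt_patch_64.c -- sketch, hunk assumed */
    DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%rdi)");

    /* In native_patch(): if the unlock slot still points at the native
     * version, copy the movb over the call site instead of emitting a
     * call, making unlock nearly identical to the !PARAVIRT code. */
    case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
            if (pv_is_native_spin_unlock()) {
                    start = start_pv_lock_ops_queued_spin_unlock;
                    end   = end_pv_lock_ops_queued_spin_unlock;
                    goto patch_site;
            }
            break;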

This significantly lowers the overhead of having
CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Waiman Long 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Morton 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Daniel J Blueman 
Cc: David Vrabel 
Cc: Douglas Hatch 
Cc: H. Peter Anvin 
Cc: Konrad Rzeszutek Wilk 
Cc: Linus Torvalds 
Cc: Oleg Nesterov 
Cc: Paolo Bonzini 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Raghavendra K T 
Cc: Rik van Riel 
Cc: Scott J Norton 
Cc: Thomas Gleixner 
Cc: virtualizat...@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-10-git-send-email-waiman.l...@hp.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/Kconfig  |  2 +-
 arch/x86/include/asm/paravirt.h   | 29 -
 arch/x86/include/asm/paravirt_types.h | 10 ++
 arch/x86/include/asm/qspinlock.h  | 25 -
 arch/x86/include/asm/qspinlock_paravirt.h |  6 ++
 arch/x86/kernel/paravirt-spinlocks.c  | 24 +++-
 arch/x86/kernel/paravirt_patch_32.c   | 22 ++
 arch/x86/kernel/paravirt_patch_64.c   | 22 ++
 8 files changed, 128 insertions(+), 12 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 90b1b54..50ec043 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -667,7 +667,7 @@ config PARAVIRT_DEBUG
 config PARAVIRT_SPINLOCKS
bool "Paravirtualization layer for spinlocks"
depends on PARAVIRT && SMP
-   select UNINLINE_SPIN_UNLOCK
+   select UNINLINE_SPIN_UNLOCK if !QUEUED_SPINLOCK
---help---
  Paravirtualized spinlocks allow a pvops backend to replace the
  spinlock implementation with something virtualization-friendly
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 8957810..266c353 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -712,6 +712,31 @@ static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
 
 #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS)
 
+#ifdef CONFIG_QUEUED_SPINLOCK
+
+static __always_inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock,
+   u32 val)
+{
+   PVOP_VCALL2(pv_lock_ops.queued_spin_lock_slowpath, lock, val);
+}
+
+static __always_inline void pv_queued_spin_unlock(struct qspinlock *lock)
+{
+   PVOP_VCALLEE1(pv_lock_ops.queued_spin_unlock, lock);
+}
+
+static __always_inline void pv_wait(u8 *ptr, u8 val)
+{
+   PVOP_VCALL2(pv_lock_ops.wait, ptr, val);
+}
+
+static __always_inline void pv_kick(int cpu)
+{
+   PVOP_VCALL1(pv_lock_ops.kick, cpu);
+}
+
+#else /* !CONFIG_QUEUED_SPINLOCK */
+
 static __always_inline void __ticket_lock_spinning(struct arch_spinlock *lock,
__ticket_t ticket)
 {
@@ -724,7 +749,9 @@ static __always_inline void __ticket_unlock_kick(struct arch_spinlock *lock,
PVOP_VCALL2(pv_lock_ops.unlock_kick, lock, ticket);
 }
 
-#endif
+#endif /* CONFIG_QUEUED_SPINLOCK */
+
+#endif /* SMP && PARAVIRT_SPINLOCKS */
 
 #ifdef CONFIG_X86_32
 #define PV_SAVE_REGS "pushl %ecx; pushl %edx;"
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index f7b0b5c..76cd684 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -333,9 +333,19 @@ struct arch_spinlock;
 typedef u16 __ticket_t;
 #endif
 
+struct qspinlock;
+
 struct pv_lock_ops {
+#ifdef CONFIG_QUEUED_SPINLOCK
+   void (*queued_spin_lock_slowpath)(struct qspinlock *lock, u32 val);
+   struct paravirt_callee_save queued_spin_unlock;
+
+   void (*wait)(u8 *ptr, u8 val);
+   void (*kick)(int cpu);
+#else /* !CONFIG_QUEUED_SPINLOCK */
struct paravirt_callee_save lock_spinning;
void (*unlock_kick)(struct arch_spinlock *lock, __ticket_t ticket);
