Re: [PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
On Tue, Jan 08, 2013 at 05:32:41PM -0500, Rik van Riel wrote:
> Subject: x86,smp: proportional backoff for ticket spinlocks
>
> Simple fixed value proportional backoff for ticket spinlocks.
> By pounding on the cacheline with the spin lock less often,
> bus traffic is reduced. In cases of a data structure with
> embedded spinlock, the lock holder has a better chance of
> making progress.
>
> If we are next in line behind the current holder of the
> lock, we do a fast spin, so as not to waste any time when
> the lock is released.
>
> The number 50 is likely to be wrong for many setups, and
> this patch is mostly to illustrate the concept of proportional
> backoff. The next patch automatically tunes the delay value.
>
> Signed-off-by: Rik van Riel
> Signed-off-by: Michel Lespinasse
> ---

Acked-by: Rafael Aquini

>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>  1 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 20da354..aa743e9 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>    */
>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>  {
> +        __ticket_t head = inc.head, ticket = inc.tail;
> +        __ticket_t waiters_ahead;
> +        unsigned loops;
> +
>          for (;;) {
> -                cpu_relax();
> -                inc.head = ACCESS_ONCE(lock->tickets.head);
> +                waiters_ahead = ticket - head - 1;
> +                /*
> +                 * We are next after the current lock holder. Check often
> +                 * to avoid wasting time when the lock is released.
> +                 */
> +                if (!waiters_ahead) {
> +                        do {
> +                                cpu_relax();
> +                        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
> +                        break;
> +                }
> +                loops = 50 * waiters_ahead;
> +                while (loops--)
> +                        cpu_relax();
>
> -                if (inc.head == inc.tail)
> +                head = ACCESS_ONCE(lock->tickets.head);
> +                if (head == ticket)
>                          break;
>          }
>  }
Re: [PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
On 01/08/2013 05:50 PM, Eric Dumazet wrote:
> On Tue, 2013-01-08 at 17:32 -0500, Rik van Riel wrote:
>> Subject: x86,smp: proportional backoff for ticket spinlocks
>>
>> Simple fixed value proportional backoff for ticket spinlocks.
>> By pounding on the cacheline with the spin lock less often,
>> bus traffic is reduced. In cases of a data structure with
>> embedded spinlock, the lock holder has a better chance of
>> making progress.
>>
>> If we are next in line behind the current holder of the
>> lock, we do a fast spin, so as not to waste any time when
>> the lock is released.
>>
>> The number 50 is likely to be wrong for many setups, and
>> this patch is mostly to illustrate the concept of proportional
>> backoff. The next patch automatically tunes the delay value.
>>
>> Signed-off-by: Rik van Riel
>> Signed-off-by: Michel Lespinasse
>> ---
>>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>>  1 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
>> index 20da354..aa743e9 100644
>> --- a/arch/x86/kernel/smp.c
>> +++ b/arch/x86/kernel/smp.c
>> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>>    */
>>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>>  {
>> +        __ticket_t head = inc.head, ticket = inc.tail;
>> +        __ticket_t waiters_ahead;
>> +        unsigned loops;
>> +
>>          for (;;) {
>> -                cpu_relax();
>> -                inc.head = ACCESS_ONCE(lock->tickets.head);
>> +                waiters_ahead = ticket - head - 1;
>> +                /*
>> +                 * We are next after the current lock holder. Check often
>> +                 * to avoid wasting time when the lock is released.
>> +                 */
>> +                if (!waiters_ahead) {
>> +                        do {
>> +                                cpu_relax();
>> +                        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
>> +                        break;
>> +                }
>> +                loops = 50 * waiters_ahead;
>> +                while (loops--)
>> +                        cpu_relax();
>>
>> -                if (inc.head == inc.tail)
>> +                head = ACCESS_ONCE(lock->tickets.head);
>> +                if (head == ticket)
>>                          break;
>>          }
>>  }
>
> Reviewed-by: Eric Dumazet
>
> In my tests, I used the following formula:
>
> loops = 50 * ((ticket - head) - 1/2);
>
> or:
>
> delta = ticket - head;
> loops = delay * delta - (delay >> 1);

I suppose that rounding down the delta might result in more stable
results, due to undersleeping less often.

> (And I didn't use the special:
>
> if (!waiters_ahead) {
>         do {
>                 cpu_relax();
>         } while (ACCESS_ONCE(lock->tickets.head) != ticket);
>         break;
> }
>
> Because it means this won't help machines with 2 cpus.
>
> (or more generally if there _is_ contention, but with one lock
> holder and one lock waiter)

Machines with 2 CPUs should not need help, because the cpu_relax()
alone gives enough of a pause that the lock holder can make progress.

It may be interesting to try out your rounding-down of delta, to see
if that makes things better.

-- 
All rights reversed
Re: [PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
On Tue, 2013-01-08 at 17:32 -0500, Rik van Riel wrote:
> Subject: x86,smp: proportional backoff for ticket spinlocks
>
> Simple fixed value proportional backoff for ticket spinlocks.
> By pounding on the cacheline with the spin lock less often,
> bus traffic is reduced. In cases of a data structure with
> embedded spinlock, the lock holder has a better chance of
> making progress.
>
> If we are next in line behind the current holder of the
> lock, we do a fast spin, so as not to waste any time when
> the lock is released.
>
> The number 50 is likely to be wrong for many setups, and
> this patch is mostly to illustrate the concept of proportional
> backoff. The next patch automatically tunes the delay value.
>
> Signed-off-by: Rik van Riel
> Signed-off-by: Michel Lespinasse
> ---
>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>  1 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 20da354..aa743e9 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>    */
>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>  {
> +        __ticket_t head = inc.head, ticket = inc.tail;
> +        __ticket_t waiters_ahead;
> +        unsigned loops;
> +
>          for (;;) {
> -                cpu_relax();
> -                inc.head = ACCESS_ONCE(lock->tickets.head);
> +                waiters_ahead = ticket - head - 1;
> +                /*
> +                 * We are next after the current lock holder. Check often
> +                 * to avoid wasting time when the lock is released.
> +                 */
> +                if (!waiters_ahead) {
> +                        do {
> +                                cpu_relax();
> +                        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
> +                        break;
> +                }
> +                loops = 50 * waiters_ahead;
> +                while (loops--)
> +                        cpu_relax();
>
> -                if (inc.head == inc.tail)
> +                head = ACCESS_ONCE(lock->tickets.head);
> +                if (head == ticket)
>                          break;
>          }
>  }

Reviewed-by: Eric Dumazet

In my tests, I used the following formula:

loops = 50 * ((ticket - head) - 1/2);

or:

delta = ticket - head;
loops = delay * delta - (delay >> 1);

(And I didn't use the special:

if (!waiters_ahead) {
        do {
                cpu_relax();
        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
        break;
}

Because it means this won't help machines with 2 cpus.

(or more generally if there _is_ contention, but with one lock
holder and one lock waiter)
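To make the two formulations above concrete, here is a small stand-alone
sketch (illustrative only, not part of the patch; "delay" stands in for
the constant 50). Note that, read literally as C, the "1/2" in the first
expression truncates to zero under integer division, which is presumably
why the second form spells the half-step as "delay >> 1":

#include <stdio.h>

int main(void)
{
        unsigned int delay = 50;
        unsigned int delta;

        for (delta = 1; delta <= 4; delta++) {
                /*
                 * Rik's patch: 50 loops per waiter ahead of us
                 * (delta == 1 means we are next and fast-spin instead,
                 * so loops_patch is 0 in that case).
                 */
                unsigned int loops_patch = delay * (delta - 1);

                /*
                 * Eric's variant: delay * (delta - 1/2).  A literal 1/2
                 * is 0 in C integer arithmetic, so the half-step is
                 * written as a shift instead.
                 */
                unsigned int loops_eric = delay * delta - (delay >> 1);

                printf("delta=%u patch=%u eric=%u\n",
                       delta, loops_patch, loops_eric);
        }
        return 0;
}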
[PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
Subject: x86,smp: proportional backoff for ticket spinlocks

Simple fixed value proportional backoff for ticket spinlocks.
By pounding on the cacheline with the spin lock less often,
bus traffic is reduced. In cases of a data structure with
embedded spinlock, the lock holder has a better chance of
making progress.

If we are next in line behind the current holder of the
lock, we do a fast spin, so as not to waste any time when
the lock is released.

The number 50 is likely to be wrong for many setups, and
this patch is mostly to illustrate the concept of proportional
backoff. The next patch automatically tunes the delay value.

Signed-off-by: Rik van Riel
Signed-off-by: Michel Lespinasse
---
 arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 20da354..aa743e9 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
  */
 void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
 {
+        __ticket_t head = inc.head, ticket = inc.tail;
+        __ticket_t waiters_ahead;
+        unsigned loops;
+
         for (;;) {
-                cpu_relax();
-                inc.head = ACCESS_ONCE(lock->tickets.head);
+                waiters_ahead = ticket - head - 1;
+                /*
+                 * We are next after the current lock holder. Check often
+                 * to avoid wasting time when the lock is released.
+                 */
+                if (!waiters_ahead) {
+                        do {
+                                cpu_relax();
+                        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
+                        break;
+                }
+                loops = 50 * waiters_ahead;
+                while (loops--)
+                        cpu_relax();

-                if (inc.head == inc.tail)
+                head = ACCESS_ONCE(lock->tickets.head);
+                if (head == ticket)
                         break;
         }
 }
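For readers following along outside the kernel, the wait loop above can
be sketched in user space roughly as follows. This is a minimal,
hypothetical translation: C11 atomics and the x86 _mm_pause() intrinsic
stand in for the kernel's arch_spinlock_t, ACCESS_ONCE(), and
cpu_relax(), and the type and function names are illustrative, not
taken from the kernel:

#include <stdatomic.h>
#include <immintrin.h>        /* _mm_pause(), x86 only */

/* User-space stand-ins for the kernel types (assumed, not from the patch). */
typedef unsigned short ticket_t;

struct ticket_lock {
        _Atomic ticket_t head;        /* ticket currently being served */
        _Atomic ticket_t tail;        /* next ticket to hand out */
};

static void ticket_lock(struct ticket_lock *lock)
{
        /* Take a ticket; the atomic increment orders us in the queue. */
        ticket_t ticket = atomic_fetch_add(&lock->tail, 1);
        ticket_t head = atomic_load_explicit(&lock->head,
                                             memory_order_acquire);

        while (head != ticket) {
                /* Unsigned arithmetic handles counter wraparound. */
                ticket_t waiters_ahead = (ticket_t)(ticket - head - 1);

                if (!waiters_ahead) {
                        /* Next in line: poll tightly to take the lock fast. */
                        do {
                                _mm_pause();
                        } while (atomic_load_explicit(&lock->head,
                                        memory_order_acquire) != ticket);
                        return;
                }

                /* Back off in proportion to our distance from the head. */
                for (unsigned loops = 50 * waiters_ahead; loops--; )
                        _mm_pause();

                head = atomic_load_explicit(&lock->head,
                                            memory_order_acquire);
        }
}

static void ticket_unlock(struct ticket_lock *lock)
{
        atomic_fetch_add_explicit(&lock->head, 1, memory_order_release);
}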
Re: [RFC PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
On 01/03/2013 05:12 PM, Michel Lespinasse wrote:
> On Thu, Jan 3, 2013 at 3:35 AM, Raghavendra KT wrote:
>> [Ccing IBM id]
>> On Thu, Jan 3, 2013 at 10:52 AM, Rik van Riel wrote:
>>> Simple fixed value proportional backoff for ticket spinlocks.
>>> By pounding on the cacheline with the spin lock less often,
>>> bus traffic is reduced. In cases of a data structure with
>>> embedded spinlock, the lock holder has a better chance of
>>> making progress.
>>>
>>> If we are next in line behind the current holder of the
>>> lock, we do a fast spin, so as not to waste any time when
>>> the lock is released.
>>>
>>> The number 50 is likely to be wrong for many setups, and
>>> this patch is mostly to illustrate the concept of proportional
>>> backoff. The next patch automatically tunes the delay value.
>>>
>>> Signed-off-by: Rik van Riel
>>> Signed-off-by: Michel Lespinasse
>>> ---
>>>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>>>  1 files changed, 20 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
>>> index 20da354..9c56fe3 100644
>>> --- a/arch/x86/kernel/smp.c
>>> +++ b/arch/x86/kernel/smp.c
>>> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>>>    */
>>>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>>>  {
>>> +        __ticket_t head = inc.head, ticket = inc.tail;
>>> +        __ticket_t waiters_ahead;
>>> +        unsigned loops;
>>> +
>>>          for (;;) {
>>> -                cpu_relax();
>>> -                inc.head = ACCESS_ONCE(lock->tickets.head);
>>> +                waiters_ahead = ticket - head - 1;
>>                                   ^^
>> Just wondering,
>> Does wraparound affect this?
>
> The result gets stored in waiters_ahead, which is unsigned and has the
> same bit size as ticket and head. So, this takes care of the
> wraparound issue.
>
> In other words, you may have to add 1<<8 or 1<<16 if the integer
> difference was negative; but you get that for free by just computing
> the difference as an 8 or 16 bit unsigned value.

Michel,
Sorry for the noise and for missing the simple math :) and thanks for
the explanation.
Re: [RFC PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
On Thu, Jan 3, 2013 at 3:35 AM, Raghavendra KT wrote:
> [Ccing IBM id]
> On Thu, Jan 3, 2013 at 10:52 AM, Rik van Riel wrote:
>> Simple fixed value proportional backoff for ticket spinlocks.
>> By pounding on the cacheline with the spin lock less often,
>> bus traffic is reduced. In cases of a data structure with
>> embedded spinlock, the lock holder has a better chance of
>> making progress.
>>
>> If we are next in line behind the current holder of the
>> lock, we do a fast spin, so as not to waste any time when
>> the lock is released.
>>
>> The number 50 is likely to be wrong for many setups, and
>> this patch is mostly to illustrate the concept of proportional
>> backoff. The next patch automatically tunes the delay value.
>>
>> Signed-off-by: Rik van Riel
>> Signed-off-by: Michel Lespinasse
>> ---
>>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>>  1 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
>> index 20da354..9c56fe3 100644
>> --- a/arch/x86/kernel/smp.c
>> +++ b/arch/x86/kernel/smp.c
>> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>>    */
>>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>>  {
>> +        __ticket_t head = inc.head, ticket = inc.tail;
>> +        __ticket_t waiters_ahead;
>> +        unsigned loops;
>> +
>>          for (;;) {
>> -                cpu_relax();
>> -                inc.head = ACCESS_ONCE(lock->tickets.head);
>> +                waiters_ahead = ticket - head - 1;
>                                   ^^
> Just wondering,
> Does wraparound affect this?

The result gets stored in waiters_ahead, which is unsigned and has the
same bit size as ticket and head. So, this takes care of the wraparound
issue.

In other words, you may have to add 1<<8 or 1<<16 if the integer
difference was negative; but you get that for free by just computing
the difference as an 8 or 16 bit unsigned value.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
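A stand-alone demonstration of that point, with made-up values and
uint8_t standing in for an 8-bit __ticket_t:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /*
         * Suppose an 8-bit ticket counter has wrapped: our ticket is 2,
         * but the current head is still 250.  The "true" distance is
         * (2 + 256) - 250 = 8 tickets.
         */
        uint8_t ticket = 2, head = 250;

        /*
         * Computed in 8-bit unsigned arithmetic, the subtraction wraps
         * modulo 256 and gives the right answer with no special casing:
         * 7 waiters sit between us and the current lock holder.
         */
        uint8_t waiters_ahead = ticket - head - 1;

        printf("waiters_ahead = %u\n", (unsigned)waiters_ahead); /* 7 */
        return 0;
}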
Re: [RFC PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
[Ccing IBM id]
On Thu, Jan 3, 2013 at 10:52 AM, Rik van Riel wrote:
> Simple fixed value proportional backoff for ticket spinlocks.
> By pounding on the cacheline with the spin lock less often,
> bus traffic is reduced. In cases of a data structure with
> embedded spinlock, the lock holder has a better chance of
> making progress.
>
> If we are next in line behind the current holder of the
> lock, we do a fast spin, so as not to waste any time when
> the lock is released.
>
> The number 50 is likely to be wrong for many setups, and
> this patch is mostly to illustrate the concept of proportional
> backoff. The next patch automatically tunes the delay value.
>
> Signed-off-by: Rik van Riel
> Signed-off-by: Michel Lespinasse
> ---
>  arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
>  1 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
> index 20da354..9c56fe3 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>    */
>  void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>  {
> +        __ticket_t head = inc.head, ticket = inc.tail;
> +        __ticket_t waiters_ahead;
> +        unsigned loops;
> +
>          for (;;) {
> -                cpu_relax();
> -                inc.head = ACCESS_ONCE(lock->tickets.head);
> +                waiters_ahead = ticket - head - 1;
                                  ^^
Just wondering,
Does wraparound affect this?
[RFC PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks
Simple fixed value proportional backoff for ticket spinlocks.
By pounding on the cacheline with the spin lock less often,
bus traffic is reduced. In cases of a data structure with
embedded spinlock, the lock holder has a better chance of
making progress.

If we are next in line behind the current holder of the
lock, we do a fast spin, so as not to waste any time when
the lock is released.

The number 50 is likely to be wrong for many setups, and
this patch is mostly to illustrate the concept of proportional
backoff. The next patch automatically tunes the delay value.

Signed-off-by: Rik van Riel
Signed-off-by: Michel Lespinasse
---
 arch/x86/kernel/smp.c |   23 +++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 20da354..9c56fe3 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
  */
 void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
 {
+        __ticket_t head = inc.head, ticket = inc.tail;
+        __ticket_t waiters_ahead;
+        unsigned loops;
+
         for (;;) {
-                cpu_relax();
-                inc.head = ACCESS_ONCE(lock->tickets.head);
+                waiters_ahead = ticket - head - 1;
+                /*
+                 * We are next after the current lock holder. Check often
+                 * to avoid wasting time when the lock is released.
+                 */
+                if (!waiters_ahead) {
+                        do {
+                                cpu_relax();
+                        } while (ACCESS_ONCE(lock->tickets.head) != ticket);
+                        break;
+                }
+                loops = 50 * waiters_ahead;
+                while (loops--)
+                        cpu_relax();

-                if (inc.head == inc.tail)
+                head = ACCESS_ONCE(lock->tickets.head);
+                if (head == ticket)
                         break;
         }
 }