Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-21 Thread Oleg Nesterov
On 11/21, Ivo Sieben wrote:
> Hi
>
> 2012/11/19 Oleg Nesterov <o...@redhat.com>:
> >
> > Because on a second thought I suspect this change is wrong.
> >
> > Just for example, please look at kauditd_thread(). It does
> >
> > 	set_current_state(TASK_INTERRUPTIBLE);
> >
> > 	add_wait_queue(&kauditd_wait, &wait);
> >
> > 	if (!CONDITION) // <-- LOAD
> > 		schedule();
> >
> > And the last LOAD can leak into the critical section protected by
> > wait_queue_head_t->lock, and it can be reordered with list_add()
> > inside this critical section. In this case we can race with wake_up()
> > unless it takes the same lock.
> >
> > Oleg.
> >
>
> I agree that I should solve my problem using the waitqueue_active()
> function locally. I'll abandon this patch and fix it in
> tty_ldisc.c.
>
> But we are trying to understand your failure scenario: how can the LOAD
> leak into the critical section? As far as we understand, the
> spin_unlock() function also contains a memory barrier
                                         ^^^^^^^^^^^^^^

Not really, in general unlock is a one-way barrier.

> to prevent such a reordering
> from happening.

Please look at the comment above prepare_to_wait(), for example. Or
look at wmb() in try_to_wake_up().

I guess this is not possible on x86, but in general

X;
LOCK();
UNLOCK();
Y;

can be reordered as

LOCK();
Y;
X;
UNLOCK();

UNLOCK + LOCK is a full memory barrier.
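
Applied to the waiter above, with add_wait_queue() expanded inline
(a sketch, not the literal code):

	set_current_state(TASK_INTERRUPTIBLE);
	spin_lock(&q->lock);		/* LOCK */
	list_add(&wait.task_list, &q->task_list);
	spin_unlock(&q->lock);		/* UNLOCK: one-way, later loads
					   can move up into the critical
					   section */
	if (!CONDITION)			/* <-- LOAD: can be reordered
					   before the list_add() above */
		schedule();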

Oleg.



Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-21 Thread Alan Cox
> But we are trying to understand your failure scenario: how can the LOAD
> leak into the critical section? As far as we understand, the spin_unlock()
> function also contains a memory barrier to prevent such a reordering
> from happening.

It does - it would be very interesting for someone to look at the assembly
to see whether this is making a difference.

Alan


Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-21 Thread Ivo Sieben
Hi

2012/11/19 Oleg Nesterov <o...@redhat.com>:
>
> Because on a second thought I suspect this change is wrong.
>
> Just for example, please look at kauditd_thread(). It does
>
> 	set_current_state(TASK_INTERRUPTIBLE);
>
> 	add_wait_queue(&kauditd_wait, &wait);
>
> 	if (!CONDITION) // <-- LOAD
> 		schedule();
>
> And the last LOAD can leak into the critical section protected by
> wait_queue_head_t->lock, and it can be reordered with list_add()
> inside this critical section. In this case we can race with wake_up()
> unless it takes the same lock.
>
> Oleg.
>

I agree that I should solve my problem using the waitqueue_active()
function locally. I'll abandon this patch and fix it in
tty_ldisc.c.
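
Something like this, I think (a sketch of the local fix, assuming the
idle wait queue in drivers/tty/tty_ldisc.c is tty_ldisc_idle; the
lockless check should be safe there because the waiter goes through
wait_event_timeout(), whose prepare_to_wait() supplies the waiter-side
ordering):

	/* in put_ldisc(), instead of an unconditional wake_up() */
	if (waitqueue_active(&tty_ldisc_idle))
		wake_up(&tty_ldisc_idle);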

But we are trying to understand your failure scenario: how can the LOAD
leak into the critical section? As far as we understand, the spin_unlock()
function also contains a memory barrier to prevent such a reordering
from happening.

Regards,
Ivo


Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-19 Thread Oleg Nesterov
On 11/19, Ivo Sieben wrote:
>
> Hi
>
> 2012/11/19 Oleg Nesterov <o...@redhat.com>:
> >
> > I am wondering if it makes sense unconditionally. A lot of callers do
> >
> > 	if (waitqueue_active(q))
> > 		wake_up(...);
> >
> > this patch makes the optimization above pointless and adds an mb().
> >
> >
> > But I won't argue.
> >
> > Oleg.
> >
>
> This patch solved an issue I had with the TTY line discipline idle
> handling while testing on a PREEMPT_RT system with TTY serial
> communication. Each time the TTY line discipline is dereferenced, the
> idle-handling wait queue is woken up (see put_ldisc() in
> drivers/tty/tty_ldisc.c). However, line discipline idle handling is not
> used very often, so the wait queue is empty most of the time. But still
> the wake_up() function enters the critical section guarded by the spin
> lock. This causes additional scheduling overhead when a lower-priority
> thread has control of that same lock.
>
> drivers/tty/tty_ldisc.c did not use the waitqueue_active() call to
> check if the waitqueue was filled; maybe I should solve this problem
> the other way around and make tty_ldisc.c do the waitqueue_active()
> call first?

IMHO yes...

Because on a second thought I suspect this change is wrong.

Just for example, please look at kauditd_thread(). It does

	set_current_state(TASK_INTERRUPTIBLE);

	add_wait_queue(&kauditd_wait, &wait);

	if (!CONDITION) // <-- LOAD
		schedule();

And the last LOAD can leak into the critical section protected by
wait_queue_head_t->lock, and it can be reordered with list_add()
inside this critical section. In this case we can race with wake_up()
unless it takes the same lock.
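
For concreteness, the lost wakeup looks like this (a sketch; CONDITION
stands for whatever flag pairs with the wake_up(), and the waker does
the lockless check from the patch). Note that the barrier implied by
set_current_state() sits before the list_add(), so it does not order
the list_add() against the later CONDITION load:

	WAITER					WAKER

	set_current_state(TASK_INTERRUPTIBLE);
	add_wait_queue(&q, &wait);		CONDITION = 1;
						smp_mb();
	if (!CONDITION)				if (!list_empty(&q->task_list))
		schedule();				__wake_up_common(...);

The waiter's LOAD of CONDITION can execute before its list_add() is
globally visible, and the waker's load of the list happens after its
store to CONDITION. Both sides can then read the old value: the waker
sees an empty list and skips the wakeup, the waiter sees CONDITION == 0
and sleeps. The wakeup is lost.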

Oleg.



Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-19 Thread Ivo Sieben
Hi

2012/11/19 Oleg Nesterov <o...@redhat.com>:
>
> I am wondering if it makes sense unconditionally. A lot of callers do
>
> 	if (waitqueue_active(q))
> 		wake_up(...);
>
> this patch makes the optimization above pointless and adds an mb().
>
>
> But I won't argue.
>
> Oleg.
>

This patch solved an issue I had with the TTY line discipline idle
handling while testing on a PREEMPT_RT system with TTY serial
communication. Each time the TTY line discipline is dereferenced, the
idle-handling wait queue is woken up (see put_ldisc() in
drivers/tty/tty_ldisc.c). However, line discipline idle handling is not
used very often, so the wait queue is empty most of the time. But still
the wake_up() function enters the critical section guarded by the spin
lock. This causes additional scheduling overhead when a lower-priority
thread has control of that same lock.

drivers/tty/tty_ldisc.c did not use the waitqueue_active() call to
check if the waitqueue was filled; maybe I should solve this problem
the other way around and make tty_ldisc.c do the waitqueue_active()
call first?
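
For reference, the usual shape of that pattern (a sketch, assuming a
simple flag-style condition; the smp_mb() on the waker side pairs with
the barrier implied by prepare_to_wait() on the waiter side):

	/* waker */
	CONDITION = 1;			/* update the waited-on state */
	smp_mb();			/* order the store above against
					   the waitqueue_active() check */
	if (waitqueue_active(&q))	/* lockless: skip the lock when
					   nobody is waiting */
		wake_up(&q);

	/* waiter */
	DEFINE_WAIT(wait);
	for (;;) {
		prepare_to_wait(&q, &wait, TASK_INTERRUPTIBLE);
		if (CONDITION)
			break;
		schedule();
	}
	finish_wait(&q, &wait);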

Regards,
Ivo


Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-19 Thread Oleg Nesterov
On 11/19, Ivo Sieben wrote:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3090,9 +3090,22 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
>  {
>   unsigned long flags;
>
> -	spin_lock_irqsave(&q->lock, flags);
> -	__wake_up_common(q, mode, nr_exclusive, 0, key);
> -	spin_unlock_irqrestore(&q->lock, flags);
> +	/*
> +	 * Check for list emptiness outside the lock. This prevents the
> +	 * wakeup from entering the critical section needlessly when the
> +	 * task list is empty.
> +	 *
> +	 * A full memory barrier is placed before checking list emptiness
> +	 * to make 100% sure this function sees an up-to-date list
> +	 * administration. Note that other code that manipulates the list
> +	 * holds the spin lock and therefore doesn't need additional
> +	 * memory barriers.
> +	 */
> +	smp_mb();
> +	if (!list_empty(&q->task_list)) {

waitqueue_active() ?

> +		spin_lock_irqsave(&q->lock, flags);
> +		__wake_up_common(q, mode, nr_exclusive, 0, key);
> +		spin_unlock_irqrestore(&q->lock, flags);
> +	}

I am wondering if it makes sense unconditionally. A lot of callers do

	if (waitqueue_active(q))
		wake_up(...);

this patch makes the optimization above pointless and adds an mb().


But I won't argue.

Oleg.



Re: [REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-19 Thread Preeti U Murthy
Hi Ivo,

On 11/19/2012 01:00 PM, Ivo Sieben wrote:
> Check that the waitqueue task list is non-empty before entering the
> critical section. This prevents taking the spin lock needlessly when the
> queue is empty, and therefore also prevents scheduling overhead on a
> PREEMPT_RT system.
> 
> Signed-off-by: Ivo Sieben <meltedpiano...@gmail.com>
> ---
> 
>  a second repost of this patch v2: Can anyone respond?
>  Did I apply the memory barrier correctly?
> 
>  v2:
>  - We don't need the "careful" list_empty variant; a normal list_empty()
>    is sufficient: if you miss an update, it is just as if it happened a
>    little later.
>  - Because of memory ordering problems we can observe a stale list
>    administration. This can cause wait_event-like code to miss an event.
>    Adding a memory barrier before checking the list for emptiness
>    guarantees we evaluate a 100% up-to-date list administration.
> 
>  kernel/sched/core.c |   19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2d8927f..168a9b2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3090,9 +3090,22 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
>  {
>   unsigned long flags;
>  
> -	spin_lock_irqsave(&q->lock, flags);
> -	__wake_up_common(q, mode, nr_exclusive, 0, key);
> -	spin_unlock_irqrestore(&q->lock, flags);
> +	/*
> +	 * Check for list emptiness outside the lock. This prevents the
> +	 * wakeup from entering the critical section needlessly when the
> +	 * task list is empty.
> +	 *
> +	 * A full memory barrier is placed before checking list emptiness
> +	 * to make 100% sure this function sees an up-to-date list
> +	 * administration. Note that other code that manipulates the list
> +	 * holds the spin lock and therefore doesn't need additional
> +	 * memory barriers.
> +	 */
> +	smp_mb();
> +	if (!list_empty(&q->task_list)) {
> +		spin_lock_irqsave(&q->lock, flags);
> +		__wake_up_common(q, mode, nr_exclusive, 0, key);
> +		spin_unlock_irqrestore(&q->lock, flags);
> +	}
>  }
>  EXPORT_SYMBOL(__wake_up);
>  
> 
Looks good to me.
Reviewed-by: Preeti U Murthy <pre...@linux.vnet.ibm.com>

Regards
Preeti U Murthy



[REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-11-18 Thread Ivo Sieben
Check that the waitqueue task list is non-empty before entering the
critical section. This prevents taking the spin lock needlessly when the
queue is empty, and therefore also prevents scheduling overhead on a
PREEMPT_RT system.

Signed-off-by: Ivo Sieben <meltedpiano...@gmail.com>
---

 a second repost of this patch v2: Can anyone respond?
 Did I apply the memory barrier correctly?

 v2:
 - We don't need the "careful" list_empty variant; a normal list_empty() is
   sufficient: if you miss an update, it is just as if it happened a little
   later.
 - Because of memory ordering problems we can observe a stale list
   administration. This can cause wait_event-like code to miss an event.
   Adding a memory barrier before checking the list for emptiness
   guarantees we evaluate a 100% up-to-date list administration.

 kernel/sched/core.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d8927f..168a9b2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3090,9 +3090,22 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
 {
unsigned long flags;
 
-	spin_lock_irqsave(&q->lock, flags);
-	__wake_up_common(q, mode, nr_exclusive, 0, key);
-	spin_unlock_irqrestore(&q->lock, flags);
+	/*
+	 * Check for list emptiness outside the lock. This prevents the
+	 * wakeup from entering the critical section needlessly when the
+	 * task list is empty.
+	 *
+	 * A full memory barrier is placed before checking list emptiness
+	 * to make 100% sure this function sees an up-to-date list
+	 * administration. Note that other code that manipulates the list
+	 * holds the spin lock and therefore doesn't need additional
+	 * memory barriers.
+	 */
+	smp_mb();
+	if (!list_empty(&q->task_list)) {
+		spin_lock_irqsave(&q->lock, flags);
+		__wake_up_common(q, mode, nr_exclusive, 0, key);
+		spin_unlock_irqrestore(&q->lock, flags);
+	}
 }
 EXPORT_SYMBOL(__wake_up);
 
-- 
1.7.9.5




[REPOST-v2] sched: Prevent wakeup to enter critical section needlessly

2012-10-25 Thread Ivo Sieben
Check that the waitqueue task list is non-empty before entering the
critical section. This prevents taking the spin lock needlessly when the
queue is empty, and therefore also prevents scheduling overhead on a
PREEMPT_RT system.

Signed-off-by: Ivo Sieben <meltedpiano...@gmail.com>
---

 repost:
 Did I apply the memory barrier correctly?

 v2:
 - We don't need the "careful" list_empty variant; a normal list_empty() is
   sufficient: if you miss an update, it is just as if it happened a little
   later.
 - Because of memory ordering problems we can observe a stale list
   administration. This can cause wait_event-like code to miss an event.
   Adding a memory barrier before checking the list for emptiness
   guarantees we evaluate a 100% up-to-date list administration.

 kernel/sched/core.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d8927f..168a9b2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3090,9 +3090,22 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
 {
unsigned long flags;
 
-	spin_lock_irqsave(&q->lock, flags);
-	__wake_up_common(q, mode, nr_exclusive, 0, key);
-	spin_unlock_irqrestore(&q->lock, flags);
+	/*
+	 * Check for list emptiness outside the lock. This prevents the
+	 * wakeup from entering the critical section needlessly when the
+	 * task list is empty.
+	 *
+	 * A full memory barrier is placed before checking list emptiness
+	 * to make 100% sure this function sees an up-to-date list
+	 * administration. Note that other code that manipulates the list
+	 * holds the spin lock and therefore doesn't need additional
+	 * memory barriers.
+	 */
+	smp_mb();
+	if (!list_empty(&q->task_list)) {
+		spin_lock_irqsave(&q->lock, flags);
+		__wake_up_common(q, mode, nr_exclusive, 0, key);
+		spin_unlock_irqrestore(&q->lock, flags);
+	}
 }
 EXPORT_SYMBOL(__wake_up);
 
-- 
1.7.9.5



