RE: Add memory barrier when waiting on futex

2013-11-26 Thread Ma, Xindong

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Tuesday, November 26, 2013 4:26 PM
> To: Ma, Xindong
> Cc: sta...@vger.kernel.org; stable-comm...@vger.kernel.org; Wysocki, Rafael
> J; ccr...@google.com; t...@linutronix.de; dvh...@linux.intel.com;
> mi...@kernel.org; linux-kernel@vger.kernel.org; gre...@linuxfoundation.org;
> Tu, Xiaobing
> Subject: Re: Add memory barrier when waiting on futex
> 
> On Tue, Nov 26, 2013 at 01:07:25AM +, Ma, Xindong wrote:
> > I'm already aware that they're protected by the spinlock; that is why I added a memory barrier to fix it.
> 
> That doesn't make sense: the spinlocks should provide the required
> serialization; there's nothing to fix.
> 
> > I reproduced this issue several times on Android running on an IA dual core.
> 
> I think you need to be more specific; in particular, if spinlocks do not
> provide serialization on your IA32 device, I'd call that thing broken.
> Adding random memory barriers is _WRONG_.
> 
I agree with you. Thanks, Peter.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Add memory barrier when waiting on futex

2013-11-26 Thread Peter Zijlstra
On Tue, Nov 26, 2013 at 01:07:25AM +, Ma, Xindong wrote:
> [ 1038.694701] putmetho-11202   1...1 1035007289001: futex_wait: LEON, wait ==, addr:41300384, pid:11202
> [ 1038.694716] putmetho-11202   1...1 1035007308860: futex_wait_queue_me: LEON, q->task => 11202
> [ 1038.694731] SharedPr-11272   0...1 1035007319703: futex_wake: LEON, wake xx, addr:41300384, NULL task

> From the captured log, task 11202 runs on cpu1, waits on the futex, and sets
> q->task to itself in queue_me(). Then task 11272 gets scheduled on cpu0 and
> tries to wake up 11202. But the q->task value set by cpu1 is not visible to
> cpu0 at first; several instructions later it becomes visible to cpu0 again.
> So the problem may be in the cache or in out-of-order execution. After adding
> the memory barrier, the issue no longer reproduces.

So that suggests the spinlock implementation doesn't actually serialize
properly; why would you place an extra memory barrier at the site that
shows this and not try to fix the spinlock implementation?

That's FAIL 1.

Secondly, the current x86 spinlocks are correct and work for all known
chips. This leads me to believe your chip is broken; esp. since you
haven't specified what kind of chip you're running on (and are somewhat
avoiding the issue).

That's FAIL 2.

Having this information and not enriching the initial changelog with it
to more fully explain your reasoning,

that's FAIL 3.

Peter A, can you please figure out wth these guys are doing?


Re: Add memory barrier when waiting on futex

2013-11-26 Thread Peter Zijlstra
On Tue, Nov 26, 2013 at 01:07:25AM +, Ma, Xindong wrote:
> I'm already aware that they're protected by the spinlock; that is why I added a memory barrier to fix it.

That doesn't make sense: the spinlocks should provide the required
serialization; there's nothing to fix.

> I reproduced this issue several times on Android running on an IA dual core.

I think you need to be more specific; in particular, if spinlocks do not
provide serialization on your IA32 device, I'd call that thing broken.
Adding random memory barriers is _WRONG_.




RE: Add memory barrier when waiting on futex

2013-11-25 Thread Ma, Xindong

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Monday, November 25, 2013 10:40 PM
> To: Ma, Xindong
> Cc: sta...@vger.kernel.org; stable-comm...@vger.kernel.org; Wysocki, Rafael J;
> ccr...@google.com; t...@linutronix.de; dvh...@linux.intel.com;
> mi...@kernel.org; linux-kernel@vger.kernel.org; gre...@linuxfoundation.org;
> Tu, Xiaobing
> Subject: Re: Add memory barrier when waiting on futex
> 
> On Mon, Nov 25, 2013 at 01:15:17PM +, Ma, Xindong wrote:
> > We encountered the following panic several times:
> 
> > [   74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008
> > [   74.672101] IP: [] wake_futex+0x47/0x80
> 
> > [   74.674144]  [] futex_wake+0xc9/0x110
> > [   74.674195]  [] do_futex+0xeb/0x950
> > [   74.674484]  [] SyS_futex+0x9b/0x140
> > [   74.674582]  [] syscall_call+0x7/0xb
> >
> > On SMP systems, the store of the current task to q->task in queue_me()
> > may not be visible immediately to another cpu; sometimes this causes a
> > panic in wake_futex(). Adding a memory barrier avoids this.
> >
> > Signed-off-by: Leon Ma 
> > Signed-off-by: xiaobing tu 
> > ---
> >  kernel/futex.c |1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/kernel/futex.c b/kernel/futex.c index 80ba086..792cd41
> > 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -1529,6 +1529,7 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
> > plist_node_init(&q->list, prio);
> > plist_add(&q->list, &hb->chain);
> > q->task = current;
> > +   smp_mb();
> > spin_unlock(&hb->lock);
> >  }
> 
> This is wrong, because an uncommented barrier is wrong per definition.
> 
> This is further wrong because wake_futex() is always called with
> hb->lock held, and since queue_me also has hb->lock held, this is in
> fact guaranteed.
> 
> This is even more wrong for adding stable.

Sorry for sending this to the stable tree.

I'm already aware that they're protected by the spinlock; that is why I added a memory barrier to fix it.

I reproduced this issue several times on Android running on an IA dual core.
I reproduced it with the following stress tests running at the same time:
a). glbenchmark 2.7
b). while true; do date; adb wait-for-device; adb shell busybox dd of=/data/tt.txt if=/dev/zero bs=350M count=1; done

To troubleshoot this issue, I constructed the following debug patch to see
what the problem is:
diff --git a/kernel/futex.c b/kernel/futex.c
index 8996c97..dce00a3 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -830,10 +830,17 @@ retry:
 static void __unqueue_futex(struct futex_q *q)
 {
struct futex_hash_bucket *hb;
-
+#if 0
if (WARN_ON_SMP(!q->lock_ptr || !spin_is_locked(q->lock_ptr))
|| WARN_ON(plist_node_empty(&q->list)))
return;
+#endif
+   if (WARN(!q->lock_ptr, "LEON, lock_ptr null"))
+   return;
+   if (WARN(!spin_is_locked(q->lock_ptr), "LEON, lock_ptr not locked"))
+   return;
+   if (WARN(plist_node_empty(&q->list), "LEON, plist_node_empty"))
+   return;
 
hb = container_of(q->lock_ptr, struct futex_hash_bucket, lock);
plist_del(&q->list, &hb->chain);
@@ -868,7 +875,7 @@ static void wake_futex(struct futex_q *q)
 */
smp_wmb();
q->lock_ptr = NULL;
-
+   trace_printk("waking up:%d..\n", p->pid);
wake_up_state(p, TASK_NORMAL);
put_task_struct(p);
 }
@@ -980,6 +987,7 @@ double_unlock_hb(struct futex_hash_bucket *hb1, struct futex_hash_bucket *hb2)
 /*
  * Wake up waiters matching bitset queued on this futex (uaddr).
  */
+#include 
 static int
 futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 {
@@ -987,7 +995,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
struct futex_q *this, *next;
struct plist_head *head;
union futex_key key = FUTEX_KEY_INIT;
-   int ret;
+   int ret, should_p = 0;
 
if (!bitset)
return -EINVAL;
@@ -1010,8 +1018,20 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
/* Check if one of the bits is set in both bitsets */
if (!(this->bitset & bitset))
continue;
+   if(this->task)
+   trace_printk("LEON, wake xx, addr:%p, pid:%d\n", uaddr, this->task->pid);
+   else {
+   trace_printk("

Re: Add memory barrier when waiting on futex

2013-11-25 Thread Darren Hart
On Mon, 2013-11-25 at 15:39 +0100, Peter Zijlstra wrote:
> On Mon, Nov 25, 2013 at 01:15:17PM +, Ma, Xindong wrote:
> > We encountered the following panic several times:
> 
> > [   74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008
> > [   74.672101] IP: [] wake_futex+0x47/0x80
> 
> > [   74.674144]  [] futex_wake+0xc9/0x110
> > [   74.674195]  [] do_futex+0xeb/0x950
> > [   74.674484]  [] SyS_futex+0x9b/0x140
> > [   74.674582]  [] syscall_call+0x7/0xb
> > 
> > On SMP systems, the store of the current task to q->task in queue_me()
> > may not be visible immediately to another cpu; sometimes this causes a
> > panic in wake_futex(). Adding a memory barrier avoids this.
> > 
> > Signed-off-by: Leon Ma 
> > Signed-off-by: xiaobing tu 
> > ---
> >  kernel/futex.c |1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index 80ba086..792cd41 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -1529,6 +1529,7 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
> > plist_node_init(&q->list, prio);
> > plist_add(&q->list, &hb->chain);
> > q->task = current;
> > +   smp_mb();
> > spin_unlock(&hb->lock);
> >  }
> 
> This is wrong, because an uncommented barrier is wrong per definition.
> 
> This is further wrong because wake_futex() is always called with
> hb->lock held, and since queue_me also has hb->lock held, this is in
> fact guaranteed.

Indeed. Leon Ma, would you have a test case to share? I'm interested to
see how this was possible.

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel




Re: Add memory barrier when waiting on futex

2013-11-25 Thread Peter Zijlstra
On Mon, Nov 25, 2013 at 01:15:17PM +, Ma, Xindong wrote:
> We encountered the following panic several times:

> [   74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008
> [   74.672101] IP: [] wake_futex+0x47/0x80

> [   74.674144]  [] futex_wake+0xc9/0x110
> [   74.674195]  [] do_futex+0xeb/0x950
> [   74.674484]  [] SyS_futex+0x9b/0x140
> [   74.674582]  [] syscall_call+0x7/0xb
> 
> On SMP systems, the store of the current task to q->task in queue_me()
> may not be visible immediately to another cpu; sometimes this causes a
> panic in wake_futex(). Adding a memory barrier avoids this.
> 
> Signed-off-by: Leon Ma 
> Signed-off-by: xiaobing tu 
> ---
>  kernel/futex.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 80ba086..792cd41 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1529,6 +1529,7 @@ static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
>   plist_node_init(&q->list, prio);
>   plist_add(&q->list, &hb->chain);
>   q->task = current;
> + smp_mb();
>   spin_unlock(&hb->lock);
>  }

This is wrong, because an uncommented barrier is wrong per definition.

This is further wrong because wake_futex() is always called with
hb->lock held, and since queue_me also has hb->lock held, this is in
fact guaranteed.

This is even more wrong for adding stable.


Re: Add memory barrier when waiting on futex

2013-11-25 Thread gre...@linuxfoundation.org
On Mon, Nov 25, 2013 at 01:15:17PM +, Ma, Xindong wrote:
> We encountered the following panic several times:
> [   74.671982] BUG: unable to handle kernel NULL pointer dereference at 0008
> [   74.672101] IP: [] wake_futex+0x47/0x80
> [   74.672185] *pdpt = 10108001 *pde =  
> [   74.672278] Oops: 0002 [#1] PREEMPT SMP 
> [   74.672403] Modules linked in: atomisp_css2400b0_v2 atomisp_css2400_v2 dfrgx bcm_bt_lpm videobuf_vmalloc videobuf_core hdmi_audio tngdisp bcm4335 kct_daemon(O) cfg80211
> [   74.672815] CPU: 0 PID: 1477 Comm: zygote Tainted: GW  O 3.10.1-259934-g0bfb86e #1
> [   74.672855] Hardware name: Intel Corporation Merrifield/SALT BAY, BIOS 404 2013.10.09:15.29.48
> [   74.672894] task: d4c97220 ti: cfaa8000 task.ti: cfaa8000
> [   74.672933] EIP: 0060:[] EFLAGS: 00210246 CPU: 0
> [   74.672975] EIP is at wake_futex+0x47/0x80
> [   74.673012] EAX:  EBX:  ECX:  EDX: 
> [   74.673049] ESI: def4de5c EDI:  EBP: cfaa9eb4 ESP: cfaa9ea0
> [   74.673086]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [   74.673123] CR0: 8005003b CR2: 0008 CR3: 10109000 CR4: 001007f0
> [   74.673160] DR0:  DR1:  DR2:  DR3: 
> [   74.673196] DR6: 0ff0 DR7: 0400
> [   74.673229] Stack:
> [   74.673260]   0001  def4de5c c225eb50 cfaa9ee4 c129bc29 
> [   74.673536]   7fff c225eb30 b4f38000 ec1a4b40 0f90 7fff 0001
> [   74.673814]  b4f38f90 cfaa9f58 c129da0b   cfaa9f10 c195d835 0001
> [   74.674092] Call Trace:
> [   74.674144]  [] futex_wake+0xc9/0x110
> [   74.674195]  [] do_futex+0xeb/0x950
> [   74.674246]  [] ? sub_preempt_count+0x55/0xe0
> [   74.674293]  [] ? wake_up_new_task+0xee/0x190
> [   74.674341]  [] ? _raw_spin_unlock_irqrestore+0x3b/0x70
> [   74.674388]  [] ? wake_up_new_task+0xee/0x190
> [   74.674436]  [] ? do_fork+0xec/0x350
> [   74.674484]  [] SyS_futex+0x9b/0x140
> [   74.674533]  [] ? SyS_mprotect+0x188/0x1e0
> [   74.674582]  [] syscall_call+0x7/0xb
> 
> On SMP systems, the store of the current task to q->task in queue_me()
> may not be visible immediately to another cpu; sometimes this causes a
> panic in wake_futex(). Adding a memory barrier avoids this.
> 
> Signed-off-by: Leon Ma 
> Signed-off-by: xiaobing tu 
> ---
>  kernel/futex.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.
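
For context, the rules Greg references boil down to: don't send the patch to
the stable list directly; submit the fix to mainline as usual, carrying a tag
in the sign-off area that the stable team picks up automatically once the
commit lands. A sketch of the tag placement (subject, summary, and names are
placeholders, not from this thread):

```
Subject: [PATCH] futex: <one-line summary of the fix>

<changelog explaining the bug, the analysis, and why the fix is correct>

Cc: stable@vger.kernel.org
Signed-off-by: Author Name <author@example.com>
```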

