Re: 3.6rc6 slab corruption.

2012-09-20 Thread Konrad Rzeszutek Wilk
> >An alternative to this, though, might be to never test for *ppos == 0 in > >u32_array_read() and do the format_array_alloc() in u32_array_open() to > >initialize file->private_data. If that allocation fails, just return > >-ENOMEM. Then you never need to add a mutex in the read path. > > > >

Re: 3.6rc6 slab corruption.

2012-09-20 Thread Konrad Rzeszutek Wilk
On Wed, Sep 19, 2012 at 07:46:57PM -0700, David Rientjes wrote: > On Thu, 20 Sep 2012, Raghavendra K T wrote: > > > Only problem, I find is histogram data expands dynamically (because it > > changes). I think having static allocation of 352 bytes as suggested > > Linus is a good idea. > > > > Ce

Re: 3.6rc6 slab corruption.

2012-09-20 Thread Konrad Rzeszutek Wilk
On Wed, Sep 19, 2012 at 04:20:25PM -0700, David Rientjes wrote: > On Wed, 19 Sep 2012, Linus Torvalds wrote: > > > That does look simpler, and avoiding the lock is a good idea. Since we > > don't support lseek() (or pread/pwrite) on that thing anyway, there's > > no way to keep the fd open and jus

Re: 3.6rc6 slab corruption.

2012-09-20 Thread Raghavendra K T
On 09/20/2012 03:19 AM, David Rientjes wrote: On Wed, 19 Sep 2012, David Rientjes wrote: From 0806b133b5b28081adf23d0d04a99636ed3b861b Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 19 Sep 2012 11:23:01 -0400 Subject: [PATCH 1/2] debugfs: Add lock for u32_array_read Dave Jone

Re: 3.6rc6 slab corruption.

2012-09-20 Thread Raghavendra K T
On 09/20/2012 12:46 AM, Konrad Rzeszutek Wilk wrote: On Tue, Sep 18, 2012 at 01:49:52PM -0700, Linus Torvalds wrote: On Tue, Sep 18, 2012 at 1:37 PM, Konrad Rzeszutek Wilk wrote: 30 words. ~360 + 29 spaces + NULL = 390? Just allocate the max then. That's tiny. And it's actually just 330:

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Raghavendra K T
On 09/20/2012 08:16 AM, David Rientjes wrote: On Thu, 20 Sep 2012, Raghavendra K T wrote: Only problem, I find is histogram data expands dynamically (because it changes). I think having static allocation of 352 bytes as suggested Linus is a good idea. Certainly, but it's a different topic an

Re: 3.6rc6 slab corruption.

2012-09-19 Thread David Rientjes
On Thu, 20 Sep 2012, Raghavendra K T wrote: > Only problem, I find is histogram data expands dynamically (because it > changes). I think having static allocation of 352 bytes as suggested > Linus is a good idea. > Certainly, but it's a different topic and would be a subsequent patch to either m

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Raghavendra K T
On 09/20/2012 03:19 AM, David Rientjes wrote: On Wed, 19 Sep 2012, David Rientjes wrote: From 0806b133b5b28081adf23d0d04a99636ed3b861b Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Wed, 19 Sep 2012 11:23:01 -0400 Subject: [PATCH 1/2] debugfs: Add lock for u32_array_read Dave Jone

Re: 3.6rc6 slab corruption.

2012-09-19 Thread David Rientjes
On Wed, 19 Sep 2012, Linus Torvalds wrote: > That does look simpler, and avoiding the lock is a good idea. Since we > don't support lseek() (or pread/pwrite) on that thing anyway, there's > no way to keep the fd open and just re-use it to read the data over > and over, so populating it at open tim

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Linus Torvalds
On Wed, Sep 19, 2012 at 2:49 PM, David Rientjes wrote: > > An alternative to this, though, might be to never test for *ppos == 0 in > u32_array_read() and do the format_array_alloc() in u32_array_open() to > initialize file->private_data. If that allocation fails, just return > -ENOMEM. Then you

Re: 3.6rc6 slab corruption.

2012-09-19 Thread David Rientjes
On Wed, 19 Sep 2012, David Rientjes wrote: > > From 0806b133b5b28081adf23d0d04a99636ed3b861b Mon Sep 17 00:00:00 2001 > > From: Konrad Rzeszutek Wilk > > Date: Wed, 19 Sep 2012 11:23:01 -0400 > > Subject: [PATCH 1/2] debugfs: Add lock for u32_array_read > > > > Dave Jones spotted that the u32_ar

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Dave Jones
On Wed, Sep 19, 2012 at 02:27:37PM -0700, David Rientjes wrote: > On Wed, 19 Sep 2012, Linus Torvalds wrote: > > > > Create a 350 processes reading > > > /sys/kernel/debug/kvm/spinlocks/histo_blocked > > > file simultaneously in while loop for more than 3 hours on my box. > > > > You need

Re: 3.6rc6 slab corruption.

2012-09-19 Thread David Rientjes
On Wed, 19 Sep 2012, Konrad Rzeszutek Wilk wrote: > From 0806b133b5b28081adf23d0d04a99636ed3b861b Mon Sep 17 00:00:00 2001 > From: Konrad Rzeszutek Wilk > Date: Wed, 19 Sep 2012 11:23:01 -0400 > Subject: [PATCH 1/2] debugfs: Add lock for u32_array_read > > Dave Jones spotted that the u32_array_r

Re: 3.6rc6 slab corruption.

2012-09-19 Thread David Rientjes
On Wed, 19 Sep 2012, Linus Torvalds wrote: > > Create a 350 processes reading /sys/kernel/debug/kvm/spinlocks/histo_blocked > > file simultaneously in while loop for more than 3 hours on my box. > > You need to open the file a single time, and then after that sinelg > open (either threaded or wit

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Konrad Rzeszutek Wilk
On Tue, Sep 18, 2012 at 01:49:52PM -0700, Linus Torvalds wrote: > On Tue, Sep 18, 2012 at 1:37 PM, Konrad Rzeszutek Wilk > wrote: > > > > 30 words. ~360 + 29 spaces + NULL = 390? > > Just allocate the max then. That's tiny. > > And it's actually just 330: max ten characters for an unsigned 32-bi

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Linus Torvalds
On Wed, Sep 19, 2012 at 7:00 AM, Raghavendra K T wrote: > > Create a 350 processes reading /sys/kernel/debug/kvm/spinlocks/histo_blocked > file simultaneously in while loop for more than 3 hours on my box. You need to open the file a single time, and then after that sinelg open (either threaded o

Re: 3.6rc6 slab corruption.

2012-09-19 Thread Raghavendra K T
On 09/19/2012 12:23 AM, Dave Jones wrote: On Tue, Sep 18, 2012 at 11:38:44AM -0700, Linus Torvalds wrote: > Quoting the entire email, since I added Greg to the list of people (as > the documented maintainer of debugfs) along with what I think are the > guilty parties. > > Dave, is t

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Raghavendra K T
On 09/19/2012 02:19 AM, Linus Torvalds wrote: On Tue, Sep 18, 2012 at 1:37 PM, Konrad Rzeszutek Wilk wrote: 30 words. ~360 + 29 spaces + NULL = 390? Just allocate the max then. That's tiny. And it's actually just 330: max ten characters for an unsigned 32-bit number. Okay, I think you m

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Raghavendra K T
On 09/19/2012 01:54 AM, David Rientjes wrote: On Tue, 18 Sep 2012, Konrad Rzeszutek Wilk wrote: diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index 2340f69..309b235 100644 --- a/fs/debugfs/file.c +++ b/fs/debugfs/file.c @@ -524,6 +524,7 @@ EXPORT_SYMBOL_GPL(debugfs_create_blob); struct a

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Linus Torvalds
On Tue, Sep 18, 2012 at 1:37 PM, Konrad Rzeszutek Wilk wrote: > > 30 words. ~360 + 29 spaces + NULL = 390? Just allocate the max then. That's tiny. And it's actually just 330: max ten characters for an unsigned 32-bit number. Linus -- To unsubscribe from this list: send the line "unsu

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Konrad Rzeszutek Wilk
> >> It should be easyish to fix by just adding a lock around those things. > > > > Like this: > > Not quite. .. new patch in another email... .. snip.. > > Yikes. The fix could be to allocate a buffer large enough for the maximum > > that %u could take * array_size and not bother with the first

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Konrad Rzeszutek Wilk
On Tue, Sep 18, 2012 at 01:24:15PM -0700, David Rientjes wrote: > On Tue, 18 Sep 2012, Konrad Rzeszutek Wilk wrote: > > > diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c > > index 2340f69..309b235 100644 > > --- a/fs/debugfs/file.c > > +++ b/fs/debugfs/file.c > > @@ -524,6 +524,7 @@ EXPORT_SYMB

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Linus Torvalds
On Tue, Sep 18, 2012 at 1:24 PM, David Rientjes wrote: > On Tue, 18 Sep 2012, Konrad Rzeszutek Wilk wrote: > >> u32 elements; >> + struct mutex lock; > > This should be a spinlock. I don't think it can be a spinlock since it needs to protect the user-space accesses (which Konrad admitte

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Linus Torvalds
On Tue, Sep 18, 2012 at 12:23 PM, Konrad Rzeszutek Wilk wrote: >> >> It should be easyish to fix by just adding a lock around those things. > > Like this: Not quite. I suspect you need to protect the "read_from_buffer()" call too, since otherwise the buffer can be free'd by another thread while

Re: 3.6rc6 slab corruption.

2012-09-18 Thread David Rientjes
On Tue, 18 Sep 2012, Konrad Rzeszutek Wilk wrote: > diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c > index 2340f69..309b235 100644 > --- a/fs/debugfs/file.c > +++ b/fs/debugfs/file.c > @@ -524,6 +524,7 @@ EXPORT_SYMBOL_GPL(debugfs_create_blob); > struct array_data { > void *array; >

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Konrad Rzeszutek Wilk
On Tue, Sep 18, 2012 at 11:38:44AM -0700, Linus Torvalds wrote: > Quoting the entire email, since I added Greg to the list of people (as > the documented maintainer of debugfs) along with what I think are the > guilty parties. > > Dave, is trinity perhaps doing read calls on the same file in paral

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Dave Jones
On Tue, Sep 18, 2012 at 11:38:44AM -0700, Linus Torvalds wrote: > Quoting the entire email, since I added Greg to the list of people (as > the documented maintainer of debugfs) along with what I think are the > guilty parties. > > Dave, is trinity perhaps doing read calls on the same file in

Re: 3.6rc6 slab corruption.

2012-09-18 Thread Linus Torvalds
Quoting the entire email, since I added Greg to the list of people (as the documented maintainer of debugfs) along with what I think are the guilty parties. Dave, is trinity perhaps doing read calls on the same file in parallel? Because it looks to me like debugfs is racy for that case. At least

3.6rc6 slab corruption.

2012-09-18 Thread Dave Jones
I was chasing a networking bug, and had trinity reduced to just making read & setsockopt calls, and let that run overnight. I woke up to 800mb of traces from a different bug.. The traces look mostly like this.. = BUG k