Re: dio_get_page() lockdep complaints

2007-11-12 Thread Peter Zijlstra
On Mon, 2007-11-12 at 09:45 +0100, Martin Schwidefsky wrote:
> On Sun, 2007-11-11 at 20:49 +0100, Peter Zijlstra wrote:
> > Right, which gets us into all kinds of trouble because some sites need
> > mmap_sem to resolve some races, notably s390 31-bit and shm.
>
> You are referring to the mmap_sem use in

Re: dio_get_page() lockdep complaints

2007-11-12 Thread Martin Schwidefsky
On Sun, 2007-11-11 at 20:49 +0100, Peter Zijlstra wrote:
> Right, which gets us into all kinds of trouble because some sites need
> mmap_sem to resolve some races, notably s390 31-bit and shm.
You are referring to the mmap_sem use in compat_linux.c:do_mmap2, aren't you? That check for addresses > 2GB

Re: dio_get_page() lockdep complaints

2007-11-11 Thread Peter Zijlstra
On Fri, 2007-11-09 at 12:45 -0500, Trond Myklebust wrote:
> On Fri, 2007-11-09 at 09:30 -0800, Zach Brown wrote:
> > So, reiserfs and NFS are nesting i_mutex inside the mmap_sem.
> >
> > >>[b038c6e5] mutex_lock+0x1c/0x1f
> > >>[b01b17e9] reiserfs_file_release+0x54/0x447
> > >>[b016afe7]

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Chris Mason
On Fri, 09 Nov 2007 11:16:53 -0800 Zach Brown <[EMAIL PROTECTED]> wrote:
> > Ugh, I thought the preallocation was getting freed elsewhere, but it
> > looks like I was wrong. We can't just skip the i_mutex after all,
> > sorry.
>
> Ah, so none of those tests at the top will stop tail packing if there's

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Zach Brown
> Ugh, I thought the preallocation was getting freed elsewhere, but it
> looks like I was wrong. We can't just skip the i_mutex after all,
> sorry.
Ah, so none of those tests at the top will stop tail packing if there's been pre-allocation? Like, uh, the inode reference count test?
- z [

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Chris Mason
On Fri, 9 Nov 2007 13:53:27 -0500 Chris Mason <[EMAIL PROTECTED]> wrote:
> On Fri, 09 Nov 2007 10:35:04 -0800
> Zach Brown <[EMAIL PROTECTED]> wrote:
>
> > > Without getting into a huge patch, the best fix would just be
> > > switching to try lock. If the tail doesn't get packed, the world
> > > doesn't end.

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Chris Mason
On Fri, 09 Nov 2007 10:35:04 -0800 Zach Brown <[EMAIL PROTECTED]> wrote:
> > Without getting into a huge patch, the best fix would just be
> > switching to try lock. If the tail doesn't get packed, the world
> > doesn't end.
>
> So, something like this?
Reviewed-by: Chris Mason <[EMAIL PROTECTED]>
-chris

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Zach Brown
> Without getting into a huge patch, the best fix would just be switching
> to try lock. If the tail doesn't get packed, the world doesn't end.
So, something like this?
---
reiserfs: trylock i_mutex in file release when packing
The mmap_sem is nested under the i_mutex. reiserfs_file_release()
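The trylock idea in this message can be sketched in userspace C. This is only an illustration, not the patch posted in the thread: toy_mutex, toy_inode and toy_file_release() are hypothetical names, and an atomic flag stands in for i_mutex. The point is just that an optional step (tail packing) is skipped when the lock is busy instead of sleeping on it. Note that a follow-up in this thread points out that skipping the lock may not be safe, because preallocation is also freed on this path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock standing in for inode->i_mutex (hypothetical, userspace only). */
struct toy_mutex {
    atomic_flag held;
};

/* Returns true if the lock was taken, false if it was already held. */
static bool toy_mutex_trylock(struct toy_mutex *m)
{
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set(&m->held);
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
    atomic_flag_clear(&m->held);
}

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct toy_inode {
    struct toy_mutex i_mutex;
    bool tail_packed;
};

static void toy_inode_init(struct toy_inode *inode)
{
    atomic_flag_clear(&inode->i_mutex.held);
    inode->tail_packed = false;
}

/*
 * Sketch of a release path that treats tail packing as optional:
 * if i_mutex is busy, give up rather than block.
 * Returns true if the tail was packed, false if packing was skipped.
 */
static bool toy_file_release(struct toy_inode *inode)
{
    assert(inode != NULL);

    if (!toy_mutex_trylock(&inode->i_mutex))
        return false;              /* contended: skip the optimisation */

    inode->tail_packed = true;     /* stand-in for the real packing work */
    toy_mutex_unlock(&inode->i_mutex);
    return true;
}
```

Calling toy_file_release() while another holder has the lock returns false and leaves the inode untouched; calling it on an idle inode packs the tail.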

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Chris Mason
On Fri, 09 Nov 2007 09:48:22 -0800 Zach Brown <[EMAIL PROTECTED]> wrote:
>
> >> So reiser and NFS need to be fixed. No?
> >
> > Actually, it is rather mmap() needs to be fixed.
>
> Sure, I'm willing to have that demonstrated. My point was that DIO
> getting the mmap_sem inside i_mutex is currently

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Zach Brown
>> So reiser and NFS need to be fixed. No?
>
> Actually, it is rather mmap() needs to be fixed.
Sure, I'm willing to have that demonstrated. My point was that DIO getting the mmap_sem inside i_mutex is currently correct. reiserfs, though, seems to be out on a more precarious limb ;).
- z

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Trond Myklebust
On Fri, 2007-11-09 at 09:30 -0800, Zach Brown wrote:
> So, reiserfs and NFS are nesting i_mutex inside the mmap_sem.
>
> >>[b038c6e5] mutex_lock+0x1c/0x1f
> >>[b01b17e9] reiserfs_file_release+0x54/0x447
> >>[b016afe7] __fput+0x53/0x101
> >>[b016b0ee] fput+0x19/0x1c
> >>[b015bcd5]

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Zach Brown
So, reiserfs and NFS are nesting i_mutex inside the mmap_sem.
>>[b038c6e5] mutex_lock+0x1c/0x1f
>>[b01b17e9] reiserfs_file_release+0x54/0x447
>>[b016afe7] __fput+0x53/0x101
>>[b016b0ee] fput+0x19/0x1c
>>[b015bcd5] remove_vma+0x3b/0x4d
>>[b015c659] do_munmap+0x17f/0x1cf
>[]

Re: dio_get_page() lockdep complaints

2007-11-09 Thread Peter Zijlstra
On Thu, 2007-04-19 at 09:38 +0200, Jens Axboe wrote:
> Hi,
>
> Doing some testing on CFQ, I ran into this 100% reproducible report:
>
> ===
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc7 #5
>

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Andrew Morton
On Thu, 19 Apr 2007 18:57:41 +0400 "Vladimir V. Saveliev" <[EMAIL PROTECTED]> wrote:
> > It's a bit odd that reiserfs is playing with file contents within
> > file_operations.release(): there could be other files open against that
> > inode. One would expect this sort of thing to be happening in an

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Vladimir V. Saveliev
Hello
On Thursday 19 April 2007 12:25, Andrew Morton wrote:
> On Thu, 19 Apr 2007 10:01:57 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Apr 19 2007, Andrew Morton wrote:
> > > On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > Doing some testing on

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Vladimir V. Saveliev
Hello
On Thursday 19 April 2007 18:15, Jens Axboe wrote:
> On Thu, Apr 19 2007, Jens Axboe wrote:
> > > Is it possible that fio was changed? That it was changed to close() the fd
> > > before doing the munmapping whereas it used to hold the file open?
> >
> > It's been a while since I tested on this

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Chris Mason
On Thu, Apr 19, 2007 at 01:01:42AM -0700, Andrew Morton wrote:
> On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Doing some testing on CFQ, I ran into this 100% reproducible report:
> >
> > ===
> > [ INFO: possible

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Roland Dreier wrote:
> Maybe you could add some hack really early on (say at the beginning of
> the reiserfs mount code) that took instances of the locks in the
> correct order, so you would get a lockdep trace of where the ordering
> is violated when it first happens?
See the
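Roland's suggestion can be illustrated with a very small lock-order tracker. This is a sketch under loud assumptions: order_tracker and tracker_nest() are made-up names, only direct two-lock inversions are caught (real lockdep follows longer dependency chains), and no actual locking happens. It shows why the report lands on whichever path is seen second, and how recording the intended order first moves the report to the path that actually violates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The two lock classes discussed in the thread. */
enum lock_class { MMAP_SEM, I_MUTEX, NR_CLASSES };

/* seen[a][b] is true once "b acquired while a held" has been recorded. */
struct order_tracker {
    bool seen[NR_CLASSES][NR_CLASSES];
};

static void tracker_init(struct order_tracker *t)
{
    memset(t, 0, sizeof(*t));
}

/*
 * Record that `inner` is being acquired while `outer` is held.
 * Returns false if the opposite nesting was recorded earlier,
 * i.e. this acquisition is the one that gets reported.
 */
static bool tracker_nest(struct order_tracker *t,
                         enum lock_class outer, enum lock_class inner)
{
    if (t->seen[inner][outer])
        return false;              /* inversion against a known order */

    t->seen[outer][inner] = true;  /* first sighting defines the order */
    return true;
}
```

Without priming, a release path that nests i_mutex inside mmap_sem during boot is accepted silently, and a later DIO path nesting mmap_sem inside i_mutex is the one flagged. If the tracker is first fed (I_MUTEX, MMAP_SEM) as the intended order, the release path is flagged the first time it runs.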

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Jens Axboe wrote:
> > Is it possible that fio was changed? That it was changed to close() the fd
> > before doing the munmapping whereas it used to hold the file open?
>
> It's been a while since I tested on this box, so I don't really recall.
> But fio does close() the fd before
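The close()-before-munmap pattern discussed here can be reproduced with a few lines of userspace C. This is a minimal sketch, not fio's code: close_then_munmap() is a hypothetical helper and the /tmp path is an assumption. POSIX says closing the descriptor does not unmap the region, so the mapping holds the last reference to the file; going by the call chain quoted elsewhere in this thread (do_munmap, remove_vma, fput, __fput, reiserfs_file_release), the filesystem's release then runs from the munmap path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a scratch file, close the descriptor, then unmap.
 * Returns 0 if the mapping was still readable after close(), -1 otherwise.
 */
static int close_then_munmap(void)
{
    char path[] = "/tmp/close_munmap_XXXXXX";  /* assumed scratch location */
    const char msg[] = "tail";
    char *map;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                  /* file lives on until the last reference */

    if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg)) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, sizeof(msg), PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    close(fd);                     /* the mapping keeps the file open */
    ok = memcmp(map, msg, sizeof(msg)) == 0;

    munmap(map, sizeof(msg));      /* last reference goes away here */
    return ok ? 0 : -1;
}
```

A return value of 0 confirms the data was still visible through the mapping after the descriptor was closed.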

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Roland Dreier
> As I mentioned, the rootfs is on reiser. So something in the boot up
> scripts may trigger something that gets reiser to run through that path
> with the wrong locking order. After the box is done booting, the dmesg
> is clean. I then mount the ext3 fs and run the fio test, the lockdep
> trace

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Jens Axboe wrote:
> > I tried fio (1.15) with this job file and did not get the possible
> > circular locking dependency detected
>
> Perhaps some of the preempt settings? The box is an emc centera, it's a
> lowly p4/ht.
As I mentioned, the rootfs is on reiser. So something in the

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Vladimir V. Saveliev wrote:
> Hello
>
> On Thursday 19 April 2007 12:34, Jens Axboe wrote:
> > On Thu, Apr 19 2007, Andrew Morton wrote:
> > > On Thu, 19 Apr 2007 10:01:57 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Thu, Apr 19 2007, Andrew Morton wrote:
> > > > > On Thu, 19 Apr

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Vladimir V. Saveliev
Hello
On Thursday 19 April 2007 12:34, Jens Axboe wrote:
> On Thu, Apr 19 2007, Andrew Morton wrote:
> > On Thu, 19 Apr 2007 10:01:57 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> >
> > > On Thu, Apr 19 2007, Andrew Morton wrote:
> > > > On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Andrew Morton wrote:
> On Thu, 19 Apr 2007 10:01:57 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Apr 19 2007, Andrew Morton wrote:
> > > On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > Doing some testing on CFQ, I ran into this

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Andrew Morton
On Thu, 19 Apr 2007 10:01:57 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> On Thu, Apr 19 2007, Andrew Morton wrote:
> > On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > Doing some testing on CFQ, I ran into this 100% reproducible report:
> > >

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
On Thu, Apr 19 2007, Andrew Morton wrote:
> On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Doing some testing on CFQ, I ran into this 100% reproducible report:
> >
> > ===
> > [ INFO: possible circular locking

Re: dio_get_page() lockdep complaints

2007-04-19 Thread Andrew Morton
On Thu, 19 Apr 2007 09:38:30 +0200 Jens Axboe <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Doing some testing on CFQ, I ran into this 100% reproducible report:
>
> ===
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc7 #5
>

dio_get_page() lockdep complaints

2007-04-19 Thread Jens Axboe
Hi,
Doing some testing on CFQ, I ran into this 100% reproducible report:
===
[ INFO: possible circular locking dependency detected ]
2.6.21-rc7 #5
---
fio/9741 is trying to acquire lock:
