Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Christoph Hellwig
On Sat, Dec 01, 2012 at 09:40:41AM +1100, Dave Chinner wrote: > So it was based on this interface? Based on that. I had dropped the inode operation as it's not really a generic operation but a callback for either the buffered I/O code or direct I/O and should be treated as such. I've also split

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Dave Chinner
On Fri, Nov 30, 2012 at 11:36:01AM -0500, Christoph Hellwig wrote: > On Fri, Nov 30, 2012 at 01:49:10PM +1100, Dave Chinner wrote: > > > Ugh. That's a big violation of how buffer-heads are supposed to work: > > > the block number is very much defined to be in multiples of b_size > > > (see for

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Linus Torvalds
On Fri, Nov 30, 2012 at 6:31 AM, Chris Mason wrote: > On Thu, Nov 29, 2012 at 07:49:10PM -0700, Dave Chinner wrote: >> >> Same with mpage_readpages(), so it's not just direct IO that has >> this problem > > I guess the good news is that block devices don't have readpages. The > bad news

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Christoph Hellwig
On Fri, Nov 30, 2012 at 01:49:10PM +1100, Dave Chinner wrote: > > Ugh. That's a big violation of how buffer-heads are supposed to work: > > the block number is very much defined to be in multiples of b_size > > (see for example "submit_bh()" that turns it into a sector number). > > > > But you're

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Chris Mason
On Thu, Nov 29, 2012 at 07:49:10PM -0700, Dave Chinner wrote: > On Thu, Nov 29, 2012 at 02:16:50PM -0800, Linus Torvalds wrote: > > On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason > > wrote: > > > > > > Just reading the new blkdev_get_blocks, it looks like we're mixing > > > shifts. In direct-io.c

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Chris Mason
On Thu, Nov 29, 2012 at 07:49:10PM -0700, Dave Chinner wrote: On Thu, Nov 29, 2012 at 02:16:50PM -0800, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason chris.ma...@fusionio.com wrote: Just reading the new blkdev_get_blocks, it looks like we're mixing shifts. In

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Christoph Hellwig
On Fri, Nov 30, 2012 at 01:49:10PM +1100, Dave Chinner wrote: Ugh. That's a big violation of how buffer-heads are supposed to work: the block number is very much defined to be in multiples of b_size (see for example submit_bh() that turns it into a sector number). But you're right. The

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Linus Torvalds
On Fri, Nov 30, 2012 at 6:31 AM, Chris Mason chris.ma...@fusionio.com wrote: On Thu, Nov 29, 2012 at 07:49:10PM -0700, Dave Chinner wrote: Same with mpage_readpages(), so it's not just direct IO that has this problem I guess the good news is that block devices don't have readpages. The

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Dave Chinner
On Fri, Nov 30, 2012 at 11:36:01AM -0500, Christoph Hellwig wrote: On Fri, Nov 30, 2012 at 01:49:10PM +1100, Dave Chinner wrote: Ugh. That's a big violation of how buffer-heads are supposed to work: the block number is very much defined to be in multiples of b_size (see for example

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-30 Thread Christoph Hellwig
On Sat, Dec 01, 2012 at 09:40:41AM +1100, Dave Chinner wrote: So it was based on this interface? Based on that. I had dropped the inode operation as it's not really a generic operation but a callback for either the buffered I/O code or direct I/O and should be treated as such. I've also split

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Dave Chinner
On Thu, Nov 29, 2012 at 02:16:50PM -0800, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason wrote: > > > > Just reading the new blkdev_get_blocks, it looks like we're mixing > > shifts. In direct-io.c map_bh->b_size is how much we'd like to map, and > > it has no relation at

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 07:13:02PM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 5:16 PM, Chris Mason wrote: > > > > I searched through filemap.c for the magic i_size check that would let > > us get away with ignoring i_blkbits in get_blocks, but its just not > > there. The whole

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 5:16 PM, Chris Mason wrote: > > I searched through filemap.c for the magic i_size check that would let > us get away with ignoring i_blkbits in get_blocks, but its just not > there. The whole fallback-to-buffered scheme seems to rely on > get_blocks checking for i_size.

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 03:36:38PM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 2:16 PM, Linus Torvalds > wrote: > > > > But you're right. The direct-IO code really *is* violating that, and > > knows that get_block() ends up being defined in i_blkbits regardless > > of b_size. > > It

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 2:16 PM, Linus Torvalds wrote: > > But you're right. The direct-IO code really *is* violating that, and > knows that get_block() ends up being defined in i_blkbits regardless > of b_size. It turns out fs/ioctl.c does the same - it fills in the buffer head with some random

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason wrote: > > Just reading the new blkdev_get_blocks, it looks like we're mixing > shifts. In direct-io.c map_bh->b_size is how much we'd like to map, and > it has no relation at all to the actual block size of the device. The > interface is abusing

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 01:52:22PM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason > wrote: > > > > It was all a trick to get you to say the AIO code was sane. > > It's only sane compared to the DIO code. > > That said, I hate AIO much less these days that we've

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason wrote: > > It was all a trick to get you to say the AIO code was sane. It's only sane compared to the DIO code. That said, I hate AIO much less these days that we've largely merged the code with the regular IO. It's still a horrible interface, but

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:55 AM, Linus Torvalds wrote: > > The blkdev_get_blocks() this is just sad. > > The underlying data structure is actually byte-based (it's > "i_size_read(bdev->bd_inode"), but we convert it to a block-based > number. > > Oops. Oh, it's even worse than that. The DIO code

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason wrote: > > blkdev_get_blocks (called during DIO) is also checking i_blkbits, but I > really don't get why that isn't byte based instead. DIO is already > doing the shift & mask game. The blkdev_get_blocks() this is just sad. The underlying data

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:26 AM, Linus Torvalds wrote: > > (I now realize that Mikulas was talking about this mess, while I > thought he was talking about the AIO code which is largely sane). Oh wow. The direct-IO code really doesn't seem to care at all. I don't think it needs locking either

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 12:26:06PM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 11:15 AM, Chris Mason > wrote: > > > > The fs/buffer.c part makes sense during a quick read. But > > fs/direct-io.c plays with i_blkbits too. The semaphore was fixing real > > bugs there. > > Ugh. I

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:15 AM, Chris Mason wrote: > > The fs/buffer.c part makes sense during a quick read. But > fs/direct-io.c plays with i_blkbits too. The semaphore was fixing real > bugs there. Ugh. I _hate_ direct-IO. What a mess. And yeah, it seems to be incestuously playing games

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 12:02:17PM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 9:19 AM, Linus Torvalds > wrote: > > > > I think I'll apply this for 3.7 (since it's too late to do anything > > fancier), and then for 3.8 I will rip out all the locking entirely, > > because looking at the

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 9:19 AM, Linus Torvalds wrote: > > I think I'll apply this for 3.7 (since it's too late to do anything > fancier), and then for 3.8 I will rip out all the locking entirely, > because looking at the fs/buffer.c patch I wrote up, it's all totally > unnecessary. > > Adding a

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 10:23 AM, Mikulas Patocka wrote: > > > If you remove that percpu rw lock, you also need to rewrite direct i/o > code. > > In theory, block device direct i/o doesn't need buffer block size at all. > But in practice, it shares a lot of code with filesystem direct i/o, it >

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Mikulas Patocka
On Thu, 29 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 2:01 PM, Mikulas Patocka wrote: > > > > This sounds sensible. I'm sending this patch. > > This looks much better. > > I think I'll apply this for 3.7 (since it's too late to do anything > fancier), and then for 3.8 I will

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 2:01 PM, Mikulas Patocka wrote: > > This sounds sensible. I'm sending this patch. This looks much better. I think I'll apply this for 3.7 (since it's too late to do anything fancier), and then for 3.8 I will rip out all the locking entirely, because looking at the

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 2:01 PM, Mikulas Patocka mpato...@redhat.com wrote: This sounds sensible. I'm sending this patch. This looks much better. I think I'll apply this for 3.7 (since it's too late to do anything fancier), and then for 3.8 I will rip out all the locking entirely, because

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Mikulas Patocka
On Thu, 29 Nov 2012, Linus Torvalds wrote: On Wed, Nov 28, 2012 at 2:01 PM, Mikulas Patocka mpato...@redhat.com wrote: This sounds sensible. I'm sending this patch. This looks much better. I think I'll apply this for 3.7 (since it's too late to do anything fancier), and then for 3.8

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 10:23 AM, Mikulas Patocka mpato...@redhat.com wrote: If you remove that percpu rw lock, you also need to rewrite direct i/o code. In theory, block device direct i/o doesn't need buffer block size at all. But in practice, it shares a lot of code with filesystem direct

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 9:19 AM, Linus Torvalds torva...@linux-foundation.org wrote: I think I'll apply this for 3.7 (since it's too late to do anything fancier), and then for 3.8 I will rip out all the locking entirely, because looking at the fs/buffer.c patch I wrote up, it's all totally

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 12:02:17PM -0700, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 9:19 AM, Linus Torvalds torva...@linux-foundation.org wrote: I think I'll apply this for 3.7 (since it's too late to do anything fancier), and then for 3.8 I will rip out all the locking entirely,

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:15 AM, Chris Mason chris.ma...@fusionio.com wrote: The fs/buffer.c part makes sense during a quick read. But fs/direct-io.c plays with i_blkbits too. The semaphore was fixing real bugs there. Ugh. I _hate_ direct-IO. What a mess. And yeah, it seems to be

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 12:26:06PM -0700, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 11:15 AM, Chris Mason chris.ma...@fusionio.com wrote: The fs/buffer.c part makes sense during a quick read. But fs/direct-io.c plays with i_blkbits too. The semaphore was fixing real bugs there.

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:26 AM, Linus Torvalds torva...@linux-foundation.org wrote: (I now realize that Mikulas was talking about this mess, while I thought he was talking about the AIO code which is largely sane). Oh wow. The direct-IO code really doesn't seem to care at all. I don't think

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason chris.ma...@fusionio.com wrote: blkdev_get_blocks (called during DIO) is also checking i_blkbits, but I really don't get why that isn't byte based instead. DIO is already doing the shift mask game. The blkdev_get_blocks() this is just sad. The

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:55 AM, Linus Torvalds torva...@linux-foundation.org wrote: The blkdev_get_blocks() this is just sad. The underlying data structure is actually byte-based (it's i_size_read(bdev-bd_inode), but we convert it to a block-based number. Oops. Oh, it's even worse than

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason chris.ma...@fusionio.com wrote: It was all a trick to get you to say the AIO code was sane. It's only sane compared to the DIO code. That said, I hate AIO much less these days that we've largely merged the code with the regular IO. It's still a

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 01:52:22PM -0700, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 11:48 AM, Chris Mason chris.ma...@fusionio.com wrote: It was all a trick to get you to say the AIO code was sane. It's only sane compared to the DIO code. That said, I hate AIO much less these days

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason chris.ma...@fusionio.com wrote: Just reading the new blkdev_get_blocks, it looks like we're mixing shifts. In direct-io.c map_bh-b_size is how much we'd like to map, and it has no relation at all to the actual block size of the device. The

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 2:16 PM, Linus Torvalds torva...@linux-foundation.org wrote: But you're right. The direct-IO code really *is* violating that, and knows that get_block() ends up being defined in i_blkbits regardless of b_size. It turns out fs/ioctl.c does the same - it fills in the

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 03:36:38PM -0700, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 2:16 PM, Linus Torvalds torva...@linux-foundation.org wrote: But you're right. The direct-IO code really *is* violating that, and knows that get_block() ends up being defined in i_blkbits regardless

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 5:16 PM, Chris Mason chris.ma...@fusionio.com wrote: I searched through filemap.c for the magic i_size check that would let us get away with ignoring i_blkbits in get_blocks, but its just not there. The whole fallback-to-buffered scheme seems to rely on get_blocks

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 07:13:02PM -0700, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 5:16 PM, Chris Mason chris.ma...@fusionio.com wrote: I searched through filemap.c for the magic i_size check that would let us get away with ignoring i_blkbits in get_blocks, but its just not there.

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Dave Chinner
On Thu, Nov 29, 2012 at 02:16:50PM -0800, Linus Torvalds wrote: On Thu, Nov 29, 2012 at 1:29 PM, Chris Mason chris.ma...@fusionio.com wrote: Just reading the new blkdev_get_blocks, it looks like we're mixing shifts. In direct-io.c map_bh-b_size is how much we'd like to map, and it has no

[PATCH v2] Do a proper locking for mmap and block size change

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Al Viro wrote: > On Wed, Nov 28, 2012 at 11:15:12AM -0800, Linus Torvalds wrote: > > No, this is crap. > > > > We don't introduce random hooks like this just because the block layer > > has shit-for-brains and cannot be bothered to do things right. > > > > The fact is,

[PATCH v2] Do a proper locking for mmap and block size change

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Al Viro wrote: On Wed, Nov 28, 2012 at 11:15:12AM -0800, Linus Torvalds wrote: No, this is crap. We don't introduce random hooks like this just because the block layer has shit-for-brains and cannot be bothered to do things right. The fact is, the whole