Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 9:51 AM, Chris Mason wrote: > > The bigger question is do we have users that expect to be able to set > the blocksize after mmaping the block device (no writes required)? I > actually feel a little bad for taking up internet bandwidth asking, but > it is a change in behavi

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 10:26:56AM -0700, Linus Torvalds wrote: > On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason wrote: > > > > Jumping in based on Linus original patch, which is doing something like > > this: > > > > set_blocksize() { > > block new calls to writepage, prepare/commit_write >

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Linus Torvalds
On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason wrote: > > Jumping in based on Linus original patch, which is doing something like > this: > > set_blocksize() { > block new calls to writepage, prepare/commit_write > set the block size > unblock > > < --- can race in he

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Chris Mason
On Thu, Nov 29, 2012 at 07:12:49AM -0700, Chris Mason wrote: > On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote: > > On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds > > wrote: > > > > > > But the fact that the code wants to do things like > > > > > > block = (sector_t)page->i

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Chris Mason
On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds > wrote: > > > > But the fact that the code wants to do things like > > > > block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits); > > > > seriously seems to be the main

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-29 Thread Jeff Chua
On Thu, Nov 29, 2012 at 2:45 PM, Al Viro wrote: > On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote: >> On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote: >> > >> > Note that sync_blockdev() a few lines prior to that is good only if we >> > have no other processes doing write(2) (or di

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Al Viro
On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote: > > > > Note that sync_blockdev() a few lines prior to that is good only if we > > have no other processes doing write(2) (or dirtying the mmapped pages, > > for that matter). The wi

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote: > > Note that sync_blockdev() a few lines prior to that is good only if we > have no other processes doing write(2) (or dirtying the mmapped pages, > for that matter). The window isn't too wide, but... So with Mikulas' patches, the write actually

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 10:25 PM, Al Viro wrote: > > Umm... set_blocksize() is calling kill_bdev(), which does > truncate_inode_pages(mapping, 0). What's going to happen to data in > the dirty pages? IO in progress is not the only thing to worry about... Hmm. Yes. I think it works by v

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Al Viro
On Thu, Nov 29, 2012 at 06:25:19AM +, Al Viro wrote: > Umm... set_blocksize() is calling kill_bdev(), which does > truncate_inode_pages(mapping, 0). What's going to happen to data in > the dirty pages? IO in progress is not the only thing to worry about... Note that sync_blockdev() a

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Al Viro
On Wed, Nov 28, 2012 at 10:16:21PM -0800, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds > wrote: > > > > But the fact that the code wants to do things like > > > > block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits); > > > > seriously seems to be the main

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds wrote: > > But the fact that the code wants to do things like > > block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits); > > seriously seems to be the main thing that keeps us using > 'inode->i_blkbits'. Calculating bbits from bh->b_siz

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 6:04 PM, Linus Torvalds wrote: > > Interesting. The code *has* the block size (it's in "bh->b_size"), but > it actually then uses the inode blocksize instead, and verifies the > two against each other. It could just have used the block size > directly (and then used the ino

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > A bigger issue is for things that emulate what blkdev.c does, and > doesn't do the locking. I see code in md/bitmap.c that seems a bit > suspicious, for example. That said, it's not *new* breakage, and the > "lock at mmap/read/write() time" approach d

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
[ Sorry, I was offline for a while driving kids around ] On Wed, Nov 28, 2012 at 4:38 PM, Mikulas Patocka wrote: > > It can happen. Take your patch (the one that moves bd_block_size_semaphore > into blkdev_readpage, blkdev_writepage and blkdev_write_begin). Interesting. The code *has* the block

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > > For example, __block_write_full_page and __block_write_begin do > > if (!page_has_buffers(page)) { create_empty_buffers... } > > and then they do > > WARN_ON(bh->b_size != blocksize) > > err = get_block(inode, block, bh, 1) >

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 2:52 PM, Linus Torvalds wrote: > >> For example, __block_write_full_page and __block_write_begin do >> if (!page_has_buffers(page)) { create_empty_buffers... } >> and then they do >> WARN_ON(bh->b_size != blocksize) >> err = get_block(inode, block, b

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 1:29 PM, Mikulas Patocka wrote: > > The problem with this approach is that it is very easy to miss points > where it is assumed that the block size doesn't change - and if you miss a > point, it results in a hidden bug that has a little possibility of being > found. Umm. M

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds > wrote: > > > > Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably > > do unspeakable things to your family and pets. > > Btw, *if* this approach works, I suspect we could just sw

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds > wrote: > > > > mmap() is in *no* way special. The exact same thing happens for > > regular read/write. Yet somehow the mmap code is special-cased, while > > the normal read-write code is not. > > I

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds wrote: > > Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably > do unspeakable things to your family and pets. Btw, *if* this approach works, I suspect we could just switch the bd_block_size_semaphore semaphore to be a regular r

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 12:13 PM, Linus Torvalds wrote: > > I dunno. Maybe there is some fundamental reason why the above is > broken, but it seems to be a much simpler approach. Sure, you need to > guarantee that the people who get the write-lock cannot possibly cause > IO while holding it, but s

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds wrote: > > mmap() is in *no* way special. The exact same thing happens for > regular read/write. Yet somehow the mmap code is special-cased, while > the normal read-write code is not. I just double-checked, because it's been a long time since I act

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 11:50 AM, Mikulas Patocka wrote: > > mmap_region() doesn't care about the block size. But a lot of > page-in/page-out code does. That seems a bogus argument. mmap() is in *no* way special. The exact same thing happens for regular read/write. Yet somehow the mmap code is s

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
On Wed, Nov 28, 2012 at 11:43 AM, Al Viro wrote: > Have a > private vm_operations - a copy of generic_file_vm_ops with ->open()/->close() > added to it. That sounds more reasonable. However, I suspect the *most* reasonable thing to do is to just remove the whole damn thing. W

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > No, this is crap. > > We don't introduce random hooks like this just because the block layer > has shit-for-brains and cannot be bothered to do things right. > > The fact is, the whole locking in the block layer open routine is > total and utter cra

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Al Viro
On Wed, Nov 28, 2012 at 11:15:12AM -0800, Linus Torvalds wrote: > No, this is crap. > > We don't introduce random hooks like this just because the block layer > has shit-for-brains and cannot be bothered to do things right. > > The fact is, the whole locking in the block layer open routine is > t

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Linus Torvalds
No, this is crap. We don't introduce random hooks like this just because the block layer has shit-for-brains and cannot be bothered to do things right. The fact is, the whole locking in the block layer open routine is total and utter crap. It doesn't lock the right thing, even with your change *a

[PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Jens Axboe wrote: > On 2012-11-28 04:57, Mikulas Patocka wrote: > > > > This patch is wrong because you must check if the device is mapped while > > holding bdev->bd_block_size_semaphore (because > > bdev->bd_block_size_semaphore prevents new mappings from being created)