Re: [RFC] fsblock

2007-07-09 Thread Dave McCracken
On Monday 09 July 2007, Christoph Lameter wrote: > On Tue, 10 Jul 2007, Nick Piggin wrote: > > There are no changes to the filesystem API for large pages (although I > > am adding a couple of helpers to do page based bitmap ops). And I don't > > want to rely on contiguous memory. Why do you think h

Re: [RFC] fsblock

2007-07-09 Thread Nick Piggin
On Mon, Jul 09, 2007 at 05:59:47PM -0700, Christoph Lameter wrote: > On Tue, 10 Jul 2007, Nick Piggin wrote: > > > > Hmmm I did not notice that yet but then I have not done much work > > > there. > > > > Notice what? > > The bad code for the buffer heads. Oh. Well my first mail in this thr

Re: [RFC] fsblock

2007-07-09 Thread Christoph Lameter
On Tue, 10 Jul 2007, Nick Piggin wrote: > > Hmmm I did not notice that yet but then I have not done much work > > there. > > Notice what? The bad code for the buffer heads. > > > - A real "nobh" mode. nobh was created I think mainly to avoid problems > > > with buffer_head memory consump

Re: [RFC] fsblock

2007-07-09 Thread Nick Piggin
On Mon, Jul 09, 2007 at 10:14:06AM -0700, Christoph Lameter wrote: > On Sun, 24 Jun 2007, Nick Piggin wrote: > > > Firstly, what is the buffer layer? The buffer layer isn't really a > > buffer layer as in the buffer cache of unix: the block device cache > > is unified with the pagecache (in terms

Re: [RFC] fsblock

2007-07-09 Thread Christoph Lameter
On Sun, 24 Jun 2007, Nick Piggin wrote: > Firstly, what is the buffer layer? The buffer layer isn't really a > buffer layer as in the buffer cache of unix: the block device cache > is unified with the pagecache (in terms of the pagecache, a blkdev > file is just like any other, but with a 1:1 map

Re: [RFC] fsblock

2007-06-30 Thread Christoph Hellwig
On Sat, Jun 30, 2007 at 07:10:27AM -0400, Jeff Garzik wrote: > >Not really, the current behaviour is a bug. And it's not actually buffer > >layer specific - XFS now has a fix for that bug and it's generic enough > >that everyone could use it. > > I'm not sure I follow. If you require block alloc

Re: [RFC] fsblock

2007-06-30 Thread Jeff Garzik
Christoph Hellwig wrote: On Sat, Jun 23, 2007 at 11:07:54PM -0400, Jeff Garzik wrote: - In line with the above item, filesystem block allocation is performed before a page is dirtied. In the buffer layer, mmap writes can dirty a page with no backing blocks which is a problem if the filesystem

Re: [RFC] fsblock

2007-06-30 Thread Christoph Hellwig
Warning ahead: I've only briefly skimmed over the pages, so the comments in this mail are very high-level. On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: > fsblock is a rewrite of the "buffer layer" (ding dong the witch is > dead), which I have been working on, on and off and is now at

Re: [RFC] fsblock

2007-06-30 Thread Christoph Hellwig
On Mon, Jun 25, 2007 at 08:25:21AM -0400, Chris Mason wrote: > > write_begin/write_end is a step in that direction (and it helps > > OCFS and GFS quite a bit). I think there is also not much reason > > for writepage sites to require the page to lock the page and clear > > the dirty bit themselves (

Re: [RFC] fsblock

2007-06-30 Thread Christoph Hellwig
On Sat, Jun 23, 2007 at 11:07:54PM -0400, Jeff Garzik wrote: > >- In line with the above item, filesystem block allocation is performed > > before a page is dirtied. In the buffer layer, mmap writes can dirty a > > page with no backing blocks which is a problem if the filesystem is > > ENOSPC (p

Re: [RFC] fsblock

2007-06-28 Thread Nick Piggin
On Thu, Jun 28, 2007 at 08:20:31AM -0400, Chris Mason wrote: > On Thu, Jun 28, 2007 at 04:44:43AM +0200, Nick Piggin wrote: > > > > That's true but I don't think an extent data structure means we can > > become too far divorced from the pagecache or the native block size > > -- what will end up ha

Re: [RFC] fsblock

2007-06-28 Thread David Chinner
On Thu, Jun 28, 2007 at 08:20:31AM -0400, Chris Mason wrote: > On Thu, Jun 28, 2007 at 04:44:43AM +0200, Nick Piggin wrote: > > That's true but I don't think an extent data structure means we can > > become too far divorced from the pagecache or the native block size > > -- what will end up happeni

Re: [RFC] fsblock

2007-06-28 Thread Chris Mason
On Thu, Jun 28, 2007 at 04:44:43AM +0200, Nick Piggin wrote: > On Thu, Jun 28, 2007 at 08:35:48AM +1000, David Chinner wrote: > > On Wed, Jun 27, 2007 at 07:50:56AM -0400, Chris Mason wrote: > > > Let's look at a typical example of how IO actually gets done today, > > > starting with sys_write(): >

Re: [RFC] fsblock

2007-06-27 Thread Nick Piggin
On Thu, Jun 28, 2007 at 08:35:48AM +1000, David Chinner wrote: > On Wed, Jun 27, 2007 at 07:50:56AM -0400, Chris Mason wrote: > > Let's look at a typical example of how IO actually gets done today, > > starting with sys_write(): > > > > sys_write(file, buffer, 1MB) > > for each page: > > prepar

Re: [RFC] fsblock

2007-06-27 Thread David Chinner
On Wed, Jun 27, 2007 at 07:50:56AM -0400, Chris Mason wrote: > On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: > > On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote: > > > On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > > > > On Tue, Jun 26, 2007 at 01:55:11P

Re: [RFC] fsblock

2007-06-27 Thread Anton Altaparmakov
On 27 Jun 2007, at 12:50, Chris Mason wrote: On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote: On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: [

Re: [RFC] fsblock

2007-06-27 Thread Kyle Moffett
On Jun 26, 2007, at 07:14:14, Nick Piggin wrote: On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: Can we call it a block mapping layer or something like that? e.g. struct blkmap? I'm not fixed on fsblock, but blkmap doesn't grab me either. It is a map from the pagecache to the

Re: [RFC] fsblock

2007-06-27 Thread Chris Mason
On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: > On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote: > > On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > > > On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: > > > > [ ... fsblocks vs extent range m

Re: [RFC] fsblock

2007-06-26 Thread David Chinner
On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote: > I think using fsblock to drive the IO and keep the pagecache flags > uptodate and using a btree in the filesystem to manage extents of block > allocations wouldn't be a bad idea though. Do any filesystems actually > do this? Yes. XFS.

Re: [RFC] fsblock

2007-06-26 Thread Nick Piggin
On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote: > On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > > On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: > > [ ... fsblocks vs extent range mapping ] > > > iomaps can double as range locks simply because iomaps

Re: [RFC] fsblock

2007-06-26 Thread Chris Mason
On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: [ ... fsblocks vs extent range mapping ] > iomaps can double as range locks simply because iomaps are > expressions of ranges within the file. Seeing as you can only > ac

Re: [RFC] fsblock

2007-06-26 Thread Nick Piggin
On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote: > On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: > > > > > >Realistically, this is not about "filesystem blocks", this is > > >about file offset to disk blocks. i.e. it's a mapping. > > > > Yeah, fsblock ~= the layer betw

Re: [RFC] fsblock

2007-06-26 Thread David Chinner
On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote: > David Chinner wrote: > >On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: > >>I'm announcing "fsblock" now because it is quite intrusive and so I'd > >>like to get some thoughts about significantly changing this core part > >

Re: [RFC] fsblock

2007-06-25 Thread Nick Piggin
David Chinner wrote: On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: I'm announcing "fsblock" now because it is quite intrusive and so I'd like to get some thoughts about significantly changing this core part of the kernel. Can you rename it to something other than shorthand for

Re: [RFC] fsblock

2007-06-25 Thread David Chinner
On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: > > I'm announcing "fsblock" now because it is quite intrusive and so I'd > like to get some thoughts about significantly changing this core part > of the kernel. Can you rename it to something other than shorthand for "filesystem block

Re: [RFC] fsblock

2007-06-25 Thread Chris Mason
On Mon, Jun 25, 2007 at 04:58:48PM +1000, Nick Piggin wrote: > > >Using buffer heads instead allows the FS to send file data down inside > >the transaction code, without taking the page lock. So, locking wrt > >data=ordered is definitely going to be tricky. > > > >The best long term option may be

Re: [RFC] fsblock

2007-06-25 Thread Nick Piggin
Andi Kleen wrote: Nick Piggin <[EMAIL PROTECTED]> writes: - Structure packing. A page gets a number of buffer heads that are allocated in a linked list. fsblocks are allocated contiguously, so cacheline footprint is smaller in the above situation. It would be interesting to test if that ma

Re: [RFC] fsblock

2007-06-25 Thread Nick Piggin
Chris Mason wrote: On Sun, Jun 24, 2007 at 05:47:55AM +0200, Nick Piggin wrote: My gut feeling is that there are several problem areas you haven't hit yet, with the new code. I would agree with your gut :) Without having read the code yet (light reading for monday morning ;), ext3 and re

Re: [RFC] fsblock

2007-06-24 Thread Chris Mason
On Sun, Jun 24, 2007 at 05:47:55AM +0200, Nick Piggin wrote: > On Sat, Jun 23, 2007 at 11:07:54PM -0400, Jeff Garzik wrote: > > > >- Large block support. I can mount and run an 8K block size minix3 fs on > > > my 4K page system and it didn't require anything special in the fs. We > > > can go up

Re: [RFC] fsblock

2007-06-24 Thread Andi Kleen
Nick Piggin <[EMAIL PROTECTED]> writes: > > - Structure packing. A page gets a number of buffer heads that are > allocated in a linked list. fsblocks are allocated contiguously, so > cacheline footprint is smaller in the above situation. It would be interesting to test if that makes a differe

Re: [RFC] fsblock

2007-06-23 Thread William Lee Irwin III
On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: > fsblock is a rewrite of the "buffer layer" (ding dong the witch is > dead), which I have been working on, on and off and is now at the stage > where some of the basics are working-ish. This email is going to be > long... Long overdue.

Re: [RFC] fsblock

2007-06-23 Thread Nick Piggin
On Sat, Jun 23, 2007 at 11:07:54PM -0400, Jeff Garzik wrote: > Nick Piggin wrote: > >- No deadlocks (hopefully). The buffer layer is technically deadlocky by > > design, because it can require memory allocations at page writeout-time. > > It also has one path that cannot tolerate memory allocatio

Re: [RFC] fsblock

2007-06-23 Thread Jeff Garzik
Nick Piggin wrote: - No deadlocks (hopefully). The buffer layer is technically deadlocky by design, because it can require memory allocations at page writeout-time. It also has one path that cannot tolerate memory allocation failures. No such problems for fsblock, which keeps fsblock metada

Re: [RFC] fsblock

2007-06-23 Thread Nick Piggin
Just to clarify a few things. Don't you hate rereading a long work you wrote? (oh, you're supposed to do that *before* you press send?). On Sun, Jun 24, 2007 at 03:45:28AM +0200, Nick Piggin wrote: > > I'm announcing "fsblock" now because it is quite intrusive and so I'd > like to get some thoughts

[RFC] fsblock

2007-06-23 Thread Nick Piggin
I'm announcing "fsblock" now because it is quite intrusive and so I'd like to get some thoughts about significantly changing this core part of the kernel. fsblock is a rewrite of the "buffer layer" (ding dong the witch is dead), which I have been working on, on and off and is now at the stage whe