Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Nick Piggin
On Tuesday 26 February 2008 18:59, Jamie Lokier wrote: > Andrew Morton wrote: > > On Tue, 26 Feb 2008 07:26:50 + Jamie Lokier <[EMAIL PROTECTED]> wrote: > > > (It would be nicer if sync_file_range() > > > took a vector of ranges for better elevator scheduling, but let's > > > ignore that :-) >

Re: [PATCH] [0/18] Implement some low hanging BKL removal fruit in fs/*

2008-01-27 Thread Nick Piggin
but the work is done so I guess I should send it along. The minix filesystem uses bkl to protect access to metadata. Switch to a per-superblock mutex. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/minix/bitmap.c =

Re: [PATCH][RFC] fast file mapping for loop

2008-01-09 Thread Nick Piggin
On Wednesday 09 January 2008 19:52, Jens Axboe wrote: > So how does it work? Instead of punting IO to a thread and passing it > through the page cache, we instead attempt to send the IO directly to the > filesystem block that it maps to. You told Christoph that just using direct-IO from kernel st

Re: [PATCH 36/42] VFS: export drop_pagecache_sb

2007-12-13 Thread Nick Piggin
On Friday 14 December 2007 02:24, Erez Zadok wrote: > In message <[EMAIL PROTECTED]>, Nick Piggin writes: > > On Monday 10 December 2007 13:42, Erez Zadok wrote: > > > Needed to maintain cache coherency after branch management. > > > > Hmm, I

Re: [PATCH 36/42] VFS: export drop_pagecache_sb

2007-12-11 Thread Nick Piggin
On Monday 10 December 2007 13:42, Erez Zadok wrote: > Needed to maintain cache coherency after branch management. > Hmm, I'd much prefer to be able to sleep in invalidate_mapping_pages before this function gets exported. As it is, it can cause massive latencies on preemption and the inode_lock so

Re: [patch] mm: fix XIP file writes

2007-12-11 Thread Nick Piggin
flush_dcache_page(page); > > I asked myself why this problem never happened before. So I asked our testers > to reproduce this problem on 2.6.23 and service levels. As the testcase did > not trigger, I looked into the 2.6.23 code. This problem was introduced by >

Re: [patch] ext2: xip check fix

2007-12-06 Thread Nick Piggin
On Thu, Dec 06, 2007 at 10:17:39PM -0600, Rob Landley wrote: > On Thursday 06 December 2007 21:22:25 Jared Hulbert wrote: > > > > I haven't looked at it yet. I do appreciate it, I think it might > > > > broaden the user-base of this feature which is up to now s390 only due > > > > to the fact that

Re: [patch] ext2: xip check fix

2007-12-06 Thread Nick Piggin
On Thu, Dec 06, 2007 at 10:59:02AM +0100, Carsten Otte wrote: > Nick Piggin wrote: > >After my patch, we can do XIP in a hardsect size < PAGE_SIZE block > >device -- this seems to be a fine thing to do at least for the > >ramdisk code. Would this situation be problema

Re: [patch] ext2: xip check fix

2007-12-06 Thread Nick Piggin
On Thu, Dec 06, 2007 at 09:43:27AM +0100, Carsten Otte wrote: > Nick Piggin wrote: > >>Xip does only work, if both do match PAGE_SIZE because it > >>doesn't support multiple calls to direct_access in the get_xip_page > >>address space operation. Thus we

Re: [patch] ext2: xip check fix

2007-12-05 Thread Nick Piggin
On Wed, Dec 05, 2007 at 04:43:16PM +0100, Carsten Otte wrote: > Nick Piggin wrote: > >Am I missing something here? I wonder how s390 works without this change? > > > >-- > >ext2 should not worry about checking sb->s_blocksize for XIP before the > >sb's b

[patch] rd: support XIP (updated)

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 12:06:23PM +, Duane Griffin wrote: > On 04/12/2007, Nick Piggin <[EMAIL PROTECTED]> wrote: > > + gfp_flags = GFP_NOIO | __GFP_ZERO; > > +#ifndef CONFIG_BLK_DEV_XIP > > + gfp_flags |= __GFP_HIGHMEM; > > +#endif >

[patch] mm: fix XIP file writes

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 12:35:49PM +0100, Nick Piggin wrote: > On Tue, Dec 04, 2007 at 03:26:20AM -0800, Andrew Morton wrote: > > On Tue, 4 Dec 2007 12:21:00 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > + * > > > + * Cannot support XIP a

Re: [patch] rd: support XIP

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 03:26:20AM -0800, Andrew Morton wrote: > On Tue, 4 Dec 2007 12:21:00 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > +* > > +* Cannot support XIP and highmem, because our ->direct_access > > +* routine for XIP must return

[patch] ext2: xip check fix

2007-12-04 Thread Nick Piggin
Am I missing something here? I wonder how s390 works without this change? -- ext2 should not worry about checking sb->s_blocksize for XIP before the sb's blocksize actually gets set. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/f

[patch] rd: support XIP

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 11:10:09AM +0100, Nick Piggin wrote: > > > > This is just an idea, I don't know if it is worth the trouble, but have you > > thought about implementing direct_access for brd? That would allow > > execute-in-place (xip) on brd eliminating the extra co

Re: [patch] rewrite rd

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 10:54:51AM +0100, Christian Borntraeger wrote: > Am Dienstag, 4. Dezember 2007 schrieb Nick Piggin: > [...] > > There is one slight downside -- direct block device access and filesystem > > metadata access goes through an extra copy and gets stored in RAM

Re: [patch] rewrite rd

2007-12-04 Thread Nick Piggin
On Tue, Dec 04, 2007 at 01:55:17AM -0600, Rob Landley wrote: > On Monday 03 December 2007 22:26:28 Nick Piggin wrote: > > There is one slight downside -- direct block device access and filesystem > > metadata access goes through an extra copy and gets stored in RAM twice.

Re: [patch] rewrite rd

2007-12-03 Thread Nick Piggin
On Tue, Dec 04, 2007 at 08:01:31AM +0100, Nick Piggin wrote: > Thanks for the review, I'll post an incremental patch in a sec. Index: linux-2.6/drivers/block/brd.c === --- linux-2.6.orig/drivers/block/brd.c +++ linux-2.6

Re: [patch] rewrite rd

2007-12-03 Thread Nick Piggin
On Mon, Dec 03, 2007 at 10:29:03PM -0800, Andrew Morton wrote: > On Tue, 4 Dec 2007 05:26:28 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > There is one slight downside -- direct block device access and filesystem > > metadata access goes through an extra copy a

[patch] rewrite rd

2007-12-03 Thread Nick Piggin
ze (because it is no longer part of the ramdisk code). - Boot / load time flexible ramdisk size, which could easily be extended to a per-ramdisk runtime changeable size (eg. with an ioctl). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- MAINTAINERS|5 drivers/block/

[rfc][patch 2/2] inotify: remove debug code

2007-12-02 Thread Nick Piggin
nding problems anyway. So remove it for now. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/fs/dcache.c === --- linux-2.6.orig/fs/dcache.c +++ linux-2.6/fs/dcache.c @@ -1408,9 +1408,6 @@ void d_delete(st

[rfc][patch 1/2] inotify: fix race

2007-12-02 Thread Nick Piggin
ld_flags after adding the watch. Locking is taken care of, because both set_dentry_child_flags and inotify_d_instantiate hold dcache_lock and child->d_locks. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/fs/inotify.c ===

Re: [rfc][patch 3/5] afs: new aops

2007-11-15 Thread Nick Piggin
On Thu, Nov 15, 2007 at 12:15:41PM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > So you're saying a struct page controls an area of PAGE_CACHE_SIZE, not an > > > area of PAGE_SIZE? > > > > No, a pagecache page

Re: Should PAGE_CACHE_SIZE be discarded?

2007-11-15 Thread Nick Piggin
On Thu, Nov 15, 2007 at 02:46:46PM +, David Howells wrote: > Benny Halevy <[EMAIL PROTECTED]> wrote: > > > I think that what Nick was trying to say is that PAGE_CACHE_SIZE should > > always be used properly as the size of the memory struct Page covers (while > > PAGE_SIZE is the hardware page

Re: [PATCH 3/3] nfs: use ->mmap_prepare() to avoid an AB-BA deadlock

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 05:18:50PM -0500, Trond Myklebust wrote: > > On Wed, 2007-11-14 at 22:50 +0100, Peter Zijlstra wrote: > > Right, but I guess what Nick asked is, if pages could be stale to start > > with, how is that avoided in the future. > > > > The way I understand it, this re-validate

Re: Should PAGE_CACHE_SIZE be discarded?

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 03:59:39PM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > Christoph Lameter has patches exactly to make PAGE_CACHE_SIZE larger than > > PAGE_SIZE, and they seem to work without much effort. I happen to hate the >

Re: [rfc][patch 3/5] afs: new aops

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 03:57:46PM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > In core code, the PAGE_CACHE_SIZE is for page cache struct pages. Single > > struct pages (not page arrays). Take a look at generic mapping read or > > some

Re: [PATCH 3/3] nfs: use ->mmap_prepare() to avoid an AB-BA deadlock

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 09:01:39PM +0100, Peter Zijlstra wrote: > Normal locking order is: > > i_mutex > mmap_sem > > However NFS's ->mmap hook, which is called under mmap_sem, can take i_mutex. > Avoid this potential deadlock by doing the work that requires i_mutex from > the new ->mmap_pr
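
The lock-ordering idea in the quoted description can be illustrated with a small user-space sketch. This is not the NFS patch itself: the pthread mutexes are stand-ins for the kernel's i_mutex and mmap_sem, and the functions `mmap_prepare`, `mmap_hook` and `do_mmap_sketch` are hypothetical names chosen for illustration. It only shows why moving the i_mutex work ahead of taking mmap_sem avoids the inverted (AB-BA) order.

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-ins for the kernel locks named in the message. */
static pthread_mutex_t i_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mmap_sem = PTHREAD_MUTEX_INITIALIZER;

static int revalidated; /* set by work that needs i_mutex */
static int mapped;      /* set by work done under mmap_sem */

/* Hypothetical prepare hook: runs before mmap_sem is taken, so taking
 * i_mutex here respects the "i_mutex, then mmap_sem" order. */
void mmap_prepare(void)
{
    pthread_mutex_lock(&i_mutex);
    revalidated = 1;
    pthread_mutex_unlock(&i_mutex);
}

/* Hypothetical mmap hook: called with mmap_sem held and never touches
 * i_mutex, so it cannot invert the lock order. */
void mmap_hook(void)
{
    assert(revalidated);
    mapped = 1;
}

/* Caller: do the i_mutex work first, then take mmap_sem for the hook. */
int do_mmap_sketch(void)
{
    mmap_prepare();
    pthread_mutex_lock(&mmap_sem);
    mmap_hook();
    pthread_mutex_unlock(&mmap_sem);
    return mapped;
}
```

With this split, no code path acquires i_mutex while already holding mmap_sem, which is the condition an AB-BA deadlock would require.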

Re: Should PAGE_CACHE_SIZE be discarded?

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 01:56:53PM +, David Howells wrote: > > Are we ever going to have PAGE_CACHE_SIZE != PAGE_SIZE? If not, why not > discard PAGE_CACHE_SIZE as it's then redundant. > Christoph Lameter has patches exactly to make PAGE_CACHE_SIZE larger than PAGE_SIZE, and they seem to wo

Re: [rfc][patch 3/5] afs: new aops

2007-11-14 Thread Nick Piggin
On Wed, Nov 14, 2007 at 12:18:43PM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > The problem is that the code called assumes that the struct page * > > > argument points to a single page, not an array of pages as would > > > pr

Re: [rfc][patch 3/5] afs: new aops

2007-11-13 Thread Nick Piggin
On Tue, Nov 13, 2007 at 10:56:25AM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > It takes a pagecache page, yes. If you follow convention, you use > > PAGE_CACHE_SIZE for that guy. You don't have to allow PAGE_CACHE_SIZE != > > PAG

Re: [rfc][patch 3/5] afs: new aops

2007-11-12 Thread Nick Piggin
On Tue, Nov 13, 2007 at 12:30:05AM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > PAGE_CACHE_SIZE should be used to address the pagecache. > > Perhaps, but the function being called from there takes pages not page cache > slots. If I have t

Re: [rfc][patch 3/5] afs: new aops

2007-11-12 Thread Nick Piggin
On Mon, Nov 12, 2007 at 03:29:14PM +, David Howells wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > - ASSERTCMP(start + len, <=, PAGE_SIZE); > > + ASSERTCMP(len, <=, PAGE_CACHE_SIZE); > > Do you guarantee this will work if PAGE_CACHE_SIZE !=

[rfc][patch 5/5] remove prepare_write

2007-11-11 Thread Nick Piggin
Index: linux-2.6/drivers/block/loop.c === --- linux-2.6.orig/drivers/block/loop.c +++ linux-2.6/drivers/block/loop.c @@ -40,8 +40,7 @@ * Heinz Mauelshagen <[EMAIL PROTECTED]>, Feb 2002 * * Support for falling back on the write f

[rfc][patch 4/5] rd: rewrite rd

2007-11-11 Thread Nick Piggin
use it can also reclaim buffer heads. The fact that it now goes through all the regular vm/fs paths makes it much more useful for testing, too. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/drivers/block/Kconfig

[rfc][patch 3/5] afs: new aops

2007-11-11 Thread Nick Piggin
Convert afs to new aops. Cannot assume writes will fully complete, so this conversion goes the easy way and always brings the page uptodate before the write. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/fs/afs/

[rfc][patch 2/5] cifs: new aops

2007-11-11 Thread Nick Piggin
witch to writeback mode in the case that the full page was dirtied. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/fs/cifs/file.c === --- linux-2.6.orig/fs/cifs/file.c +++ linux-2.6/fs/cifs/file.c @@ -1

[rfc][patch 1/5] ecryptfs new aops

2007-11-11 Thread Nick Piggin
Convert ecryptfs to new aops. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/fs/ecryptfs/mmap.c === --- linux-2.6.orig/fs/ecryptfs/mmap.c +++ linux-2.6/fs/ecryptfs/mmap.c @@ -263,31 +263,38

[rfc][patches] remove ->prepare_write

2007-11-11 Thread Nick Piggin
Hi, These are a set of patches to convert the last few filesystems to use the new deadlock-free write aops, and remove the core code to handle the legacy write path. I don't really have setups to sufficiently test these filesystems. So I would really appreciate if filesystem maintainers can pick

Re: [patch] fs: restore nobh

2007-10-25 Thread Nick Piggin
On Thu, Oct 25, 2007 at 09:07:36PM +0200, Jan Kara wrote: > Hi, > > > This is overdue, sorry. Got a little complicated, and I've been away from > > my filesystem test setup so I didn't want to send it (lucky, coz I found > > a bug after more substantial testing). > > > > Anyway, RFC? > Hmm, m

Re: [PATCH 00/31] Remove iget() and read_inode() [try #4]

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 19:07, David Howells wrote: > Hi Linus, > > Here's a set of patches that remove all calls to iget() and all > read_inode() functions. They should be removed for two reasons: firstly > they don't lend themselves to good error handling, and secondly their > presence is a te

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:51, Andrew Morton wrote: > On Mon, 8 Oct 2007 10:28:43 -0700 > > I'll now add remap_file_pages soon. > > Maybe those other 2 tests aren't strong enough (?). > > Or maybe they don't return a non-0 exit status even when they fail... > > (I'll check.) > > Perhaps Yan Zhe

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:04, Andrew Morton wrote: > On Mon, 8 Oct 2007 19:45:08 +0800 "Yan Zheng" <[EMAIL PROTECTED]> wrote: > > Hi all > > > > The test for VM_CAN_NONLINEAR always fails > > > > Signed-off-by: Yan Zheng<[EMAIL PROTECTED]> > > > > diff -ur linux-2.6.23-rc9/mm/fremap.c linu

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
instead of failing. I doubt anybody will be using nonlinear mappings on anything but regular files for the time being, but as a trivial fix, I think this probably should go into 2.6.23. Thanks for spotting this problem Acked-by: Nick Piggin <[EMAIL PROTECTED]> > I hope Nick or Miklos

[patch] fs: restore nobh

2007-10-07 Thread Nick Piggin
Hi, This is overdue, sorry. Got a little complicated, and I've been away from my filesystem test setup so I didn't want to send it (lucky, coz I found a bug after more substantial testing). Anyway, RFC? --- Implement nobh in new aops. This is a bit tricky. FWIW, nobh_truncate is now implemented

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 06:50, Christoph Lameter wrote: > On Fri, 28 Sep 2007, Nick Piggin wrote: > > I thought it was slower. Have you fixed the performance regression? > > (OK, I read further down that you are still working on it but not > > confirmed yet...) > > Th

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 07:01, Christoph Lameter wrote: > On Sat, 29 Sep 2007, Peter Zijlstra wrote: > > On Fri, 2007-09-28 at 11:20 -0700, Christoph Lameter wrote: > > > Really? That means we can no longer even allocate stacks for forking. > > > > I think I'm running with 4k stacks... > > 4k st

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-30 Thread Nick Piggin
On Monday 01 October 2007 06:12, Andrew Morton wrote: > On Sun, 30 Sep 2007 05:09:28 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: > > On Sunday 30 September 2007 05:20, Andrew Morton wrote: > > > We can't "run out of unfragmented memory" for an orde

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-30 Thread Nick Piggin
On Sunday 30 September 2007 05:20, Andrew Morton wrote: > On Sat, 29 Sep 2007 06:19:33 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: > > On Saturday 29 September 2007 19:27, Andrew Morton wrote: > > > On Sat, 29 Sep 2007 11:14:02 +0200 Peter Zijlstra > > >

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-29 Thread Nick Piggin
On Saturday 29 September 2007 04:41, Christoph Lameter wrote: > On Fri, 28 Sep 2007, Peter Zijlstra wrote: > > memory got massively fragmented, as anti-frag gets easily defeated. > > setting min_free_kbytes to 12M does seem to solve it - it forces 2 max > > order blocks to stay available, so we do

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-29 Thread Nick Piggin
On Saturday 29 September 2007 19:27, Andrew Morton wrote: > On Sat, 29 Sep 2007 11:14:02 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > > oom-killings, or page allocation failures? The latter, one hopes. > > > > Linux version 2.6.23-rc4-mm1-dirty ([EMAIL PROTECTED]) (gcc version 4.1.2 > > (

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-28 Thread Nick Piggin
On Saturday 29 September 2007 03:33, Christoph Lameter wrote: > On Fri, 28 Sep 2007, Nick Piggin wrote: > > On Wednesday 19 September 2007 13:36, Christoph Lameter wrote: > > > SLAB_VFALLBACK can be specified for selected slab caches. If fallback > > > is available the

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-28 Thread Nick Piggin
On Thursday 20 September 2007 11:38, David Chinner wrote: > On Wed, Sep 19, 2007 at 04:04:30PM +0200, Andrea Arcangeli wrote: > > Plus of course you don't like fsblock because it requires work to > > adapt a fs to it, I can't argue about that. > > No, I don't like fsblock because it is inherently

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-28 Thread Nick Piggin
On Wednesday 19 September 2007 13:36, Christoph Lameter wrote: > SLAB_VFALLBACK can be specified for selected slab caches. If fallback is > available then the conservative settings for higher order allocations are > overridden. We then request an order that can accommodate at minimum > 100 objects.

Re: [13/17] Virtual compound page freeing in interrupt context

2007-09-20 Thread Nick Piggin
On Wednesday 19 September 2007 13:36, Christoph Lameter wrote: > If we are in an interrupt context then simply defer the free via a > workqueue. > > In an interrupt context it is not possible to use vmalloc_addr() to > determine the vmalloc address. So add a variant that does that too. > > Removing

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-19 Thread Nick Piggin
On Wednesday 19 September 2007 04:30, Linus Torvalds wrote: > On Tue, 18 Sep 2007, Nick Piggin wrote: > > ROFL! Yeah of course, how could I have forgotten about our trusty OOM > > killer as the solution to the fragmentation problem? It would only have > > been funnier if y

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-18 Thread Nick Piggin
On Tuesday 18 September 2007 08:05, Christoph Lameter wrote: > On Sun, 16 Sep 2007, Nick Piggin wrote: > > > > fsblock doesn't need any of those hacks, of course. > > > > > > Nor does mine for the low orders that we are considering. For order > > >

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-18 Thread Nick Piggin
On Tuesday 18 September 2007 08:21, Christoph Lameter wrote: > On Sun, 16 Sep 2007, Nick Piggin wrote: > > > > So if you argue that vmap is a downside, then please tell me how you > > > > consider the -ENOMEM of your approach to be better? > > > > > >

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-18 Thread Nick Piggin
On Tuesday 18 September 2007 08:00, Christoph Lameter wrote: > On Sun, 16 Sep 2007, Nick Piggin wrote: > > I don't know how it would prevent fragmentation from building up > > anyway. It's commonly the case that potentially unmovable objects > > are allowed to fill

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-17 Thread Nick Piggin
On Monday 17 September 2007 14:07, David Chinner wrote: > On Fri, Sep 14, 2007 at 06:48:55AM +1000, Nick Piggin wrote: > > OK, the vunmap batching code wipes your TLB flushing and IPIs off > > the table. Diffstat below, but the TLB portions are here (besides that > > _ev

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-17 Thread Nick Piggin
On Saturday 15 September 2007 04:08, Christoph Lameter wrote: > On Fri, 14 Sep 2007, Nick Piggin wrote: > > However fsblock can do everything that higher order pagecache can > > do in terms of avoiding vmap and giving contiguous memory to block > > devices by opportunistica

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-17 Thread Nick Piggin
On Monday 17 September 2007 04:13, Mel Gorman wrote: > On (15/09/07 14:14), Goswin von Brederlow didst pronounce: > > I keep coming back to the fact that movable objects should be moved > > out of the way for unmovable ones. Anything else just allows > > fragmentation to build up. > > This is easi

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-17 Thread Nick Piggin
On Saturday 15 September 2007 03:52, Christoph Lameter wrote: > On Fri, 14 Sep 2007, Nick Piggin wrote: > > > > [*] ok, this isn't quite true because if you can actually put a hard > > > > limit on unmovable allocations then anti-frag will fundamentally help >

Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-17 Thread Nick Piggin
On Saturday 15 September 2007 20:22, Soeren Sonnenburg wrote: > On Sat, 2007-09-15 at 09:47 +, Soeren Sonnenburg wrote: > > Memtest did not find anything after 16 passes so I finally stopped it > > applied your patch and used > > > > CONFIG_DEBUG_SLAB=y > > CONFIG_DEBUG_SLAB_LEAK=y > > > > and

Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-14 Thread Nick Piggin
On Friday 14 September 2007 16:02, Soeren Sonnenburg wrote: > On Thu, 2007-09-13 at 09:51 +1000, Nick Piggin wrote: > > On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote: > > > Dear all, > > > > > > I've just seen this in dmesg on a AMD K

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-14 Thread Nick Piggin
On Thursday 13 September 2007 09:17, Christoph Lameter wrote: > On Wed, 12 Sep 2007, Nick Piggin wrote: > > I will still argue that my approach is the better technical solution for > > large block support than yours, I don't think we made progress on that. > > And I

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-14 Thread Nick Piggin
On Thursday 13 September 2007 09:06, Christoph Lameter wrote: > On Wed, 12 Sep 2007, Nick Piggin wrote: > > So lumpy reclaim does not change my formula nor significantly help > > against a fragmentation attack. AFAIKS. > > Lumpy reclaim improves the situation sign

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-14 Thread Nick Piggin
On Thursday 13 September 2007 12:01, Nick Piggin wrote: > On Thursday 13 September 2007 23:03, David Chinner wrote: > > Then just do operations on directories with lots of files in them > > (tens of thousands). Every directory operation will require at > > least one vmap in th

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-13 Thread Nick Piggin
On Thursday 13 September 2007 23:03, David Chinner wrote: > On Thu, Sep 13, 2007 at 03:23:21AM +1000, Nick Piggin wrote: > > Well, it may not be easy to _fix_, but it's easy to try a few > > improvements ;) > > > > How do I make an image and run a workload that

Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-13 Thread Nick Piggin
On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote: > Dear all, > > I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine > (config attached). > > Any ideas / which further information needed ? Thanks for the report. Is it reproducible? It seems like the locks_free_lock cal

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-13 Thread Nick Piggin
On Thursday 13 September 2007 11:49, David Chinner wrote: > On Wed, Sep 12, 2007 at 01:27:33AM +1000, Nick Piggin wrote: > > I just gave 4 things which combined might easily reduce xfs vmap overhead > > by several orders of magnitude, all without changing much code at all. >

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread Nick Piggin
On Wednesday 12 September 2007 10:00, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > Yes. I think we differ on our interpretations of "okay". In my > > interpretation, it is not OK to use this patch as a way to solve VM or FS > > or IO sca

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread Nick Piggin
On Wednesday 12 September 2007 07:52, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > > No you have not explained why the theoretical issues continue to exist > > > given even just considering Lumpy Reclaim in .23 nor what effect the > > >

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread Nick Piggin
On Wednesday 12 September 2007 11:49, David Chinner wrote: > On Tue, Sep 11, 2007 at 04:00:17PM +1000, Nick Piggin wrote: > > > > OTOH, I'm not sure how much buy-in there was from the filesystems > > > > guys. Particularly Christoph H and XFS (which is strange

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-12 Thread Nick Piggin
On Wednesday 12 September 2007 11:49, David Chinner wrote: > On Tue, Sep 11, 2007 at 04:00:17PM +1000, Nick Piggin wrote: > > > > OTOH, I'm not sure how much buy-in there was from the filesystems > > > > guys. Particularly Christoph H and XFS (which is strange

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 07:48, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > But that's not my place to say, and I'm actually not arguing that high > > order pagecache does not have uses (especially as a practical, > > shorter-term s

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 07:41, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > I think I would have as good a shot as any to write a fragmentation > > exploit, yes. I think I've given you enough info to do the same, so I'd > > like t

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 06:53, Mel Gorman wrote: > On (11/09/07 11:44), Nick Piggin didst pronounce: > However, this discussion belongs more with the non-existent-remove-slab > patch. Based on what we've seen since the summits, we need a thorough > analysis with benchmark

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 06:42, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > > I guess you would have to run that without my targeted slab reclaim > > > patchset? Otherwise the slab that are in the way could be reclaimed and > > > y

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 06:01, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > There is a limitation in the VM. Fragmentation. You keep saying this > > is a solved issue and just assuming you'll be able to fix any cases > > that come up as th

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 06:11, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > It would be interesting to craft an attack. If you knew roughly the > > layout and size of your dentry slab for example... maybe you could stat a > > whole lot of fil

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 06:01, Christoph Lameter wrote: > On Tue, 11 Sep 2007, Nick Piggin wrote: > > There is a limitation in the VM. Fragmentation. You keep saying this > > is a solved issue and just assuming you'll be able to fix any cases > > that come up as th

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 04:25, Maxim Levitsky wrote: > Hi, > > I think that fundamental problem is no fragmentation/large pages/... > > The problem is the VM itself. > The vm doesn't use virtual memory, that's all, that's the problem. > Although this will be probably linux 3.0, I think that th

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 04:31, Mel Gorman wrote: > On Tue, 2007-09-11 at 18:47 +0200, Andrea Arcangeli wrote: > > Hi Mel, > > Hi, > > > On Tue, Sep 11, 2007 at 04:36:07PM +0100, Mel Gorman wrote: > > > that increasing the pagesize like what Andrea suggested would lead to > > > internal fragm

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Wednesday 12 September 2007 01:36, Mel Gorman wrote: > On Tue, 2007-09-11 at 04:52 +1000, Nick Piggin wrote: > > On Tuesday 11 September 2007 16:03, Christoph Lameter wrote: > > > 5. VM scalability > > >Large block sizes mean less state keeping for the information

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Tuesday 11 September 2007 22:12, Jörn Engel wrote: > On Tue, 11 September 2007 04:52:19 +1000, Nick Piggin wrote: > > On Tuesday 11 September 2007 16:03, Christoph Lameter wrote: > > > 5. VM scalability > > >Large block sizes mean less state keepin

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-11 Thread Nick Piggin
On Tuesday 11 September 2007 16:03, Christoph Lameter wrote: > 5. VM scalability >Large block sizes mean less state keeping for the information being >transferred. For a 1TB file one needs to handle 256 million page >structs in the VM if one uses 4k page size. A 64k page size reduces >
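
As a rough check of the page-count figures quoted above, the arithmetic can be sketched in a few lines of C. This is only an illustration, not code from the thread: `pages_needed` is a hypothetical helper, and "1TB" is assumed to mean 2^40 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of pages needed to cover file_bytes,
 * rounding up to a whole page. */
uint64_t pages_needed(uint64_t file_bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (file_bytes + page_size - 1) / page_size;
}
```

Under that assumption, `pages_needed(1ULL << 40, 4096)` gives 268,435,456 (2^28, the "256 million" in the quoted text), while `pages_needed(1ULL << 40, 65536)` gives 16,777,216 (2^24), a sixteenfold reduction in the number of page structs to track.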

Re: [07/36] Use page_cache_xxx in mm/filemap_xip.c

2007-08-28 Thread Nick Piggin
Christoph Hellwig wrote: On Tue, Aug 28, 2007 at 09:49:38PM +0200, Jörn Engel wrote: On Tue, 28 August 2007 12:05:58 -0700, [EMAIL PROTECTED] wrote: - index = *ppos >> PAGE_CACHE_SHIFT; - offset = *ppos & ~PAGE_CACHE_MASK; + index = page_cache_index(mapping, *ppos); +

[rfc] block-based page writeout

2007-08-27 Thread Nick Piggin
Hi, I've always liked the idea of being able to do writeout directly based on block number, rather than the valiant but doomed-to-be-suboptimal heuristics that our current dirty writeout system does, as it is running above the pagecache and doesn't know about poor file layouts, or interaction betw

[patch] 2.6.23-rc3: fsblock

2007-08-24 Thread Nick Piggin
Hi, I'm still plugging away at fsblock slowly. Haven't really got around to finishing up any big new features, but there has been a lot of bug fixing and little API changes since last release. I still think fsblock has merit, and even if a more extent-based approach ends up working better for

Re: [patch][rfc] fs: fix nobh error handling

2007-08-19 Thread Nick Piggin
On Tue, Aug 07, 2007 at 07:33:47PM -0700, Andrew Morton wrote: > On Wed, 8 Aug 2007 04:18:38 +0200 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > You could try making it up as you go along, but of course if we _can_ > > attach the buffers here then it would be pr

Re: [patch][rfc] fs: fix nobh error handling

2007-08-08 Thread Nick Piggin
On Wed, Aug 08, 2007 at 07:39:42AM -0700, Mingming Cao wrote: > On Wed, 2007-08-08 at 08:07 -0500, Dave Kleikamp wrote: > > > > For jfs's sake, I don't really care if it ever uses nobh again. I > > originally started using it because I figured the movement was away from > > buffer heads and jfs s

Re: [patch][rfc] fs: fix nobh error handling

2007-08-07 Thread Nick Piggin
On Tue, Aug 07, 2007 at 07:33:47PM -0700, Andrew Morton wrote: > On Wed, 8 Aug 2007 04:18:38 +0200 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > On Tue, Aug 07, 2007 at 06:09:03PM -0700, Andrew Morton wrote: > > > > > > With this change, nobh_prepare_write()

Re: [patch][rfc] fs: fix nobh error handling

2007-08-07 Thread Nick Piggin
On Tue, Aug 07, 2007 at 06:09:03PM -0700, Andrew Morton wrote: > On Tue, 7 Aug 2007 07:51:29 +0200 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > nobh mode error handling is not just pretty slack, it's wrong. > > > > One cannot zero out the whole page to ens

[patch][rfc] fs: fix nobh error handling

2007-08-06 Thread Nick Piggin
s the regular buffer path does, then attach the buffers to the page so that it can actually be written out correctly and be subject to the normal IO error handling paths. As an upshot, we save 1K of kernel stack on ia64 or powerpc 64K page systems. Signe

Re: [PATCH RFC] extent mapped page cache

2007-07-26 Thread Nick Piggin
On Thu, Jul 26, 2007 at 09:05:15AM -0400, Chris Mason wrote: > On Thu, 26 Jul 2007 04:36:39 +0200 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > [ are state trees a good idea? ] > > > > One thing it gains us is finding the start of the cluster. Even if > >

Re: [PATCH RFC] extent mapped page cache

2007-07-25 Thread Nick Piggin
On Wed, Jul 25, 2007 at 10:10:07PM -0400, Chris Mason wrote: > On Thu, 26 Jul 2007 03:37:28 +0200 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > > One advantage to the state tree is that it separates the state from > > > the memory being described,

Re: [PATCH RFC] extent mapped page cache

2007-07-25 Thread Nick Piggin
On Wed, Jul 25, 2007 at 08:18:53AM -0400, Chris Mason wrote: > On Wed, 25 Jul 2007 04:32:17 +0200 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > Having another tree to store block state I think is a good idea as I > > said in the fsblock thread with Dave, but I haven

Re: [PATCH RFC] extent mapped page cache

2007-07-24 Thread Nick Piggin
On Tue, Jul 24, 2007 at 07:25:09PM -0400, Chris Mason wrote: > On Tue, 24 Jul 2007 23:25:43 +0200 > Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > The tree is a critical part of the patch, but it is also the easiest to > rip out and replace. Basically the code stores a range by inserting > an obje

[patch 4/4] ufs convert to new aops fix

2007-07-24 Thread Nick Piggin
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/ufs/dir.c === --- linux-2.6.orig/fs/ufs/dir.c +++ linux-2.6/fs/ufs/dir.c @@ -89,7 +89,7 @@ ino_t ufs_inode_by_name(struct inode *di void ufs_set_link(struct
