Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > mapping_gfp_mask is a pretty foul thing. Adding > > struct page (*alloc_page)(struct address_space *mapping); > > to address_space_operations would be a quite nice cleanup. Ummm... If things would be that simple... I think we need struct page

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 13:58:35 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 24 Apr 2007, Andrew Morton wrote: > > > > Then I think we should disable page migration for allocations that do not > > > allow access to the policy zone. That would fix it. > > > > Can't we use map

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > > Then I think we should disable page migration for allocations that do not > > allow access to the policy zone. That would fix it. > > Can't we use mapping_gfp_mask() when allocating the destination page? There is no point in migrating something if y

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 13:44:50 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 24 Apr 2007, Andrew Morton wrote: > > > > I would say that the filesystem is broke if it has such expectations > > > regardless of page migration. > > > > Others disagree ;) > > > > The filesystem

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > > I would say that the filesystem is broke if it has such expectations > > regardless of page migration. > > Others disagree ;) > > The filesystem has *told* the core kernel what its allocation constraints > are by setting up mapping_gfp_mask(). If t

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 13:30:33 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > If the system has both high memory and normal memory then only > > > allocations > > > to highmemory are subject to memory policies etc etc. The block device > > > allocations would be in zone normal/dm

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > On Tue, 24 Apr 2007, Christoph Lameter wrote: > > On Tue, 24 Apr 2007, Hugh Dickins wrote: > > > > > Or Christoph may prevail in persuading there's no such problem. > > > > This is pointless. NUMA allocations can only be controlled for the highest > >

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 13:12:42 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 24 Apr 2007, Andrew Morton wrote: > > > > A highmem page can have buffers??? > > > > yep. Take a 4k page which is stored in four discontiguous 1k disk blocks. > > The > > data at page_buffers(page)

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Hugh Dickins
On Tue, 24 Apr 2007, Christoph Lameter wrote: > On Tue, 24 Apr 2007, Hugh Dickins wrote: > > > Or Christoph may prevail in persuading there's no such problem. > > This is pointless. NUMA allocations can only be controlled for the highest > zone. If we switch to a lower zone then we allocate on a

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > > A highmem page can have buffers??? > > yep. Take a 4k page which is stored in four discontiguous 1k disk blocks. The > data at page_buffers(page) is the sole way in which we track which parts of > the page belong to which blocks of the disk. But I s
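The point quoted above, that the buffers attached to a page are the only record of which part of the page belongs to which disk block, can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct page_model, struct buffer_model and disk_block_for_offset() are hypothetical names, and a fixed array stands in for the kernel's buffer_head list. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE  4096u
#define MODEL_BLOCK_SIZE 1024u
#define MODEL_BLOCKS_PER_PAGE (MODEL_PAGE_SIZE / MODEL_BLOCK_SIZE)

/* Hypothetical stand-in for a buffer_head: one entry per 1k slice of the page. */
struct buffer_model {
    uint64_t disk_block;   /* on-disk block number backing this slice */
};

/* Hypothetical stand-in for a 4k page with four buffers attached. */
struct page_model {
    struct buffer_model buffers[MODEL_BLOCKS_PER_PAGE];
};

/* Return the disk block that backs the byte at `offset` within the page. */
static uint64_t disk_block_for_offset(const struct page_model *page, size_t offset)
{
    assert(page != NULL);
    assert(offset < MODEL_PAGE_SIZE);
    return page->buffers[offset / MODEL_BLOCK_SIZE].disk_block;
}
```

With four discontiguous block numbers stored in the array, offsets 0..1023 resolve to the first block, 1024..2047 to the second, and so on. Without the per-page buffer information there would be no way to recover that mapping.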

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 12:59:17 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 24 Apr 2007, Andrew Morton wrote: > > > No, think of the following scenario: > > > > - file I/O causes a read of an ext2 file's bitmap. The bitmap is > > brought into /dev/hda1's pagecache using !

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > No, think of the following scenario: > > - file I/O causes a read of an ext2 file's bitmap. The bitmap is > brought into /dev/hda1's pagecache using !__GFP_HIGHMEM > > - references are released against that page and it's now just clean > reclaimab

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > Or Christoph may prevail in persuading there's no such problem. This is pointless. NUMA allocations can only be controlled for the highest zone. If we switch to a lower zone then we allocate on a different zone than the user requested.

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 12:34:53 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > Not as metadata, no. But someone (let's hope only root, though I may > > be wrong on that) can map any part of the block device into userspace. > > Concurrent access to a block device by a filesystem and t

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > I was certainly ignorant of that; but I'm not convinced it eliminates > the potential issue. For a start, sys_move_pages seems not to involve > mempolicies at all - I don't see what prevents it migrating blockdev > pages away from the only node which has

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Hugh Dickins
On Tue, 24 Apr 2007, Andrew Morton wrote: > > From my reading it would be pretty simple to teach unmap_and_move() > to pass mapping_gfp_mask(page_mapping(page)) down into > (*get_new_page)() to get the correct type of page. Or even simpler, since they're already passed the source page, just get i
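The idea discussed here, taking the allocation constraints for a migration target from the source page's mapping instead of a hard-wired GFP_HIGHUSER, might look roughly like the following userspace model. Everything in it is an assumption made for illustration: the MODEL_* flag values, the struct names and migration_target_gfp() are hypothetical, and the fallback to a highmem-capable mask for pages with no mapping is a guess, not something the thread states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; not the kernel's real gfp bit layout. */
typedef uint32_t gfp_model_t;
#define MODEL_GFP_USER     0x1u
#define MODEL_GFP_HIGHMEM  0x2u
#define MODEL_GFP_HIGHUSER (MODEL_GFP_USER | MODEL_GFP_HIGHMEM)

/* Stand-in for struct address_space: only the gfp mask is modelled. */
struct mapping_model {
    gfp_model_t gfp_mask;
};

/* Stand-in for struct page: mapping is NULL when the page has no mapping. */
struct page_model {
    const struct mapping_model *mapping;
};

/*
 * Pick the allocation flags for the destination page of a migration.
 * Pages with a mapping inherit that mapping's constraints, so a page
 * from a mapping that excludes highmem never moves into highmem.
 * The fallback for pages without a mapping is an assumption.
 */
static gfp_model_t migration_target_gfp(const struct page_model *src)
{
    assert(src != NULL);
    if (src->mapping != NULL)
        return src->mapping->gfp_mask;
    return MODEL_GFP_HIGHUSER;
}
```

In this model a page whose mapping was set up with MODEL_GFP_USER keeps that restriction across migration, which is the property the filesystem relies on when it reads metadata without kmap calls.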

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Hugh Dickins
On Tue, 24 Apr 2007, Christoph Lameter wrote: > On Tue, 24 Apr 2007, Hugh Dickins wrote: > > > I've not yet looked at the patch under discussion, but this remark > > prompts me... a couple of days ago I got very worried by the various > > hard-wired GFP_HIGHUSER allocations in mm/migrate.c and mm

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Andrew Morton wrote: > > I think that much is also true, but not where the problem lies. > > Isn't the problem that filesystems using these block devices > > expect their metadata to be accessible without kmap calls? > > > > yup. wherever we dereference buffer_head.b_data w

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 18:45:03 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote: > On Tue, 24 Apr 2007, Christoph Lameter wrote: > > On Tue, 24 Apr 2007, Hugh Dickins wrote: > > > > > I've not yet looked at the patch under discussion, but this remark > > > prompts me... a couple of days ago I go

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > > And if a page is in the wrong area then it can be bounced before I/O > > is performed on it. > > I think that much is also true, but not where the problem lies. > Isn't the problem that filesystems using these block devices > expect their metadata to

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > I've not yet looked at the patch under discussion, but this remark > prompts me... a couple of days ago I got very worried by the various > hard-wired GFP_HIGHUSER allocations in mm/migrate.c and mm/mempolicy.c, > and wondered how those would work out if

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Hugh Dickins
On Tue, 24 Apr 2007, Christoph Lameter wrote: > On Tue, 24 Apr 2007, Hugh Dickins wrote: > > > I've not yet looked at the patch under discussion, but this remark > > prompts me... a couple of days ago I got very worried by the various > > hard-wired GFP_HIGHUSER allocations in mm/migrate.c and mm

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2007, Hugh Dickins wrote: > I've not yet looked at the patch under discussion, but this remark > prompts me... a couple of days ago I got very worried by the various > hard-wired GFP_HIGHUSER allocations in mm/migrate.c and mm/mempolicy.c, > and wondered how those would work out if

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Andrew Morton
On Tue, 24 Apr 2007 14:09:33 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote: > On Mon, 23 Apr 2007, Andrew Morton wrote: > > > > OK. I hope. the mapping_gfp_mask() here will have come from bdget()'s > > mapping_set_gfp_mask(&inode->i_data, GFP_USER); If anyone is accidentally > > setting _

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-24 Thread Hugh Dickins
On Mon, 23 Apr 2007, Andrew Morton wrote: > > OK. I hope. the mapping_gfp_mask() here will have come from bdget()'s > mapping_set_gfp_mask(&inode->i_data, GFP_USER); If anyone is accidentally > setting __GFP_HIGHMEM on a blockdev address_space we'll cause ghastly > explosions. Albeit ones whic

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 15:33:07 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > grow_dev_page() simply passes GFP_NOFS to find_or_create_page(). This means the > allocation of radix tree nodes is done with GFP_NOFS and the allocation > of a new page is done using GFP_NOFS as well. > > The map

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-23 Thread Christoph Lameter
And the second fix (cleanup patch will follow): Pagecache: find_or_create_page does not spread memory. The find_or_create function calls alloc_page with the gfp_mask passed to it, which is derived from the mapping's gfp mask. So the allocation flags are right (assuming my bugfix to fs/buffer.c is ap

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-23 Thread Christoph Lameter
On Mon, 23 Apr 2007, Andrew Morton wrote: > There are few calls to page_cache_alloc(). Would it not be simpler to just > add the additional argument to page_cache_alloc() (called "extra_gfp", > please) and to update all callers? And to remove page_cache_alloc_cold() > and replace all its callers

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-23 Thread Christoph Lameter
On Mon, 23 Apr 2007, Andrew Morton wrote: > > +static inline struct page *page_cache_alloc_mask(struct address_space *x, > > + gfp_t gfp) > > +{ > > + return __page_cache_alloc(mapping_gfp_mask(x) | gfp); > > +} > > Usually we use the term "mask" to imply an AND function, not
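The naming objection quoted above turns on the difference between OR-ing extra flags into the mapping's gfp mask, as the quoted helper does, and AND-ing the mask with a set of allowed flags, which is what "mask" usually suggests. A minimal userspace sketch of the two behaviours follows; the MODEL_* flag values, struct mapping_model, alloc_flags_or() and alloc_flags_and() are all hypothetical names chosen for illustration and do not reflect the kernel's real bit layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
typedef uint32_t gfp_model_t;
#define MODEL_GFP_WAIT    0x01u
#define MODEL_GFP_IO      0x02u
#define MODEL_GFP_FS      0x04u
#define MODEL_GFP_HIGHMEM 0x08u
#define MODEL_GFP_COLD    0x10u

/* Stand-in for struct address_space: only the gfp mask is modelled. */
struct mapping_model {
    gfp_model_t gfp_mask;
};

/* OR semantics, as in the quoted helper: add extra flags to the mapping's. */
static gfp_model_t alloc_flags_or(const struct mapping_model *m, gfp_model_t extra)
{
    assert(m != NULL);
    return m->gfp_mask | extra;
}

/* AND semantics, what "mask" conventionally implies: restrict the flags. */
static gfp_model_t alloc_flags_and(const struct mapping_model *m, gfp_model_t allowed)
{
    assert(m != NULL);
    return m->gfp_mask & allowed;
}
```

The OR form can only widen what the mapping asked for (for example, adding a "cold" hint), while the AND form can only narrow it (for example, clearing the FS bit). That is why a name such as "extra_gfp" describes the OR argument better than "mask".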

Re: Pagecache: find_or_create_page does not call a proper page allocator function

2007-04-23 Thread Andrew Morton
On Mon, 23 Apr 2007 14:11:57 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > The find_or_create function calls alloc_page with a local gfp mask instead > of using page_cache_alloc. This means that the page allocation will not > obey cpuset memory spreading and page allocation will not p