On Wed, 30 May 2007 20:57:38 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:
> I've taken a look at last. It looks like a good fix to a real problem, > but may I suggest a simpler version? The anon_vma isn't usually held > by a refcount, but by having a vma on its linked list: why not just > put a dummy vma into that linked list? No need to add a refcount. > > The NUMA shmem_alloc_page already uses a dummy vma on its stack, Oh, I didn't notice that. If dummy-vma works well now, I'll use it. thank you. > > > static int unmap_and_move(new_page_t get_new_page, unsigned long private, > > - struct page *page, int force) > > + struct page *page, int force, int nocontext) > > { > > An "int context" would be a lot better than the negative "int nocontext"; > even better would be "int holds_mmap_sem". Or even skip the additional > argument completely, use the anon_vma_hold method always without relying > on whether or not mmap_sem is held. I don't know how significant it is > to avoid extra locking here: on the one hand we like to avoid unnecessary > locking; on the other hand there's probably a thousand commoner places in > the kernel where we could pass down an arg to say, actually you won't > need to lock in such and such a case. Hmm, ok. I'd like to try make things simpler. > > > int rc = 0; > > int *result = NULL; > > struct page *newpage = get_new_page(page, private, &result); > > + struct anon_vma *anon_vma = NULL; > > > > if (!newpage) > > return -ENOMEM; > > @@ -632,17 +633,23 @@ static int unmap_and_move(new_page_t get > > goto unlock; > > wait_on_page_writeback(page); > > } > > - > > + /* hold this anon_vma until page migration ends */ > > + if (nocontext && PageAnon(page) && page_mapped(page)) > > + anon_vma = anon_vma_hold(page); > > /* > > * Establish migration ptes or remove ptes > > */ > > - try_to_unmap(page, 1); > > + if (page_mapped(page)) > > + try_to_unmap(page, 1); > > + > > All these preliminary tests: yes, I suppose they avoid unnecessary > locking, so I guess they're good; but it should work without them. > > > if (!page_mapped(page)) > > rc = move_to_new_page(newpage, page); > > > > if (rc) > > remove_migration_ptes(page, page); > > > > + anon_vma_release(anon_vma); > > + > > unlock: > > unlock_page(page); > > > > @@ -686,8 +693,8 @@ move_newpage: > > * > > * Return: Number of pages not migrated or error code. > > */ > > -int migrate_pages(struct list_head *from, > > - new_page_t get_new_page, unsigned long private) > > +int __migrate_pages(struct list_head *from, > > + new_page_t get_new_page, unsigned long private, int nocontext) > > { > > Remarks on nocontext as above: mmm, I think keep the patch small > and don't add that extra argument at all. > > > int retry = 1; > > int nr_failed = 0; > > @@ -707,7 +714,7 @@ int migrate_pages(struct list_head *from > > cond_resched(); > > > > rc = unmap_and_move(get_new_page, private, > > - page, pass > 2); > > + page, pass > 2, nocontext); > > > > switch(rc) { > > case -ENOMEM: > > @@ -737,6 +744,22 @@ out: > > return nr_failed + retry; > > } > > > > +int migrate_pages(struct list_head *from, > > + new_page_t get_new_page, unsigned long private) > > +{ > > + return __migrate_pages(from, get_new_page, private, 0); > > +} > > + > > +/* > > + * When page migration is issued by the kernel itself without page mapper's > > + * mm->sem, we have to be more careful to do page migration. > > + */ > > +int migrate_pages_nocontext(struct list_head *from, > > + new_page_t get_new_page, unsigned long private) > > +{ > > + return __migrate_pages(from, get_new_page, private, 1); > > +} > > + > > #ifdef CONFIG_NUMA > > /* > > * Move a list of individual pages > > Index: linux-2.6.22-rc2-mm1/include/linux/rmap.h > > =================================================================== > > --- linux-2.6.22-rc2-mm1.orig/include/linux/rmap.h > > +++ linux-2.6.22-rc2-mm1/include/linux/rmap.h > > @@ -26,6 +26,9 @@ > > struct anon_vma { > > spinlock_t lock; /* Serialize access to vma list */ > > struct list_head head; /* List of private "related" vmas */ > > +#ifdef CONFIG_MIGRATION > > + int ref; /* special refcnt for migration */ > > +#endif > > }; > > > > #ifdef CONFIG_MMU > > @@ -42,6 +45,14 @@ static inline void anon_vma_free(struct > > kmem_cache_free(anon_vma_cachep, anon_vma); > > } > > > > +#ifdef CONFIG_MIGRATION > > +extern struct anon_vma *anon_vma_hold(struct page *page); > > +extern void anon_vma_release(struct anon_vma *anon_vma); > > +#else > > +#define anon_vma_hold(page) do{}while(0) > > +#define anon_vma_release(anon) do{}while(0) > > Rather than change those to "do {} while (0)", to which others > will ask for static inlines, just delete them, can't you - > they're simply not needed in the !CONFIG_MIGRATION case, right? > Ok. they are not necessary if !CONFIG_MIGRATION. I'll delete. Maybe I was confused at deleting CONFIG_MIGRATON_BY_KERNEL...which needed ifdef. Thank you!. -Kame - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/