Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, On Sat, Mar 24, 2001 at 10:05:18PM -0300, Rik van Riel wrote: > On Sun, 25 Mar 2001, Stephen C. Tweedie wrote: > > > Rik, do you think it is really necessary to take the page lock and > > release it inside lookup_swap_cache? I may be overlooking something, > > but I can't see the benefit of it --- > > I don't think we need to do this, except to protect us from > using a page which isn't up-to-date yet and locked because > of disk IO. But it doesn't --- page_launder can try to lock the page after it checks the refcount, without taking any locks which protect us against running lookup_swap_cache in parallel. If we get our reference after page_launder checks the count, we can find the page getting locked out from underneath our feet. > Reclaim_page() takes the pagecache_lock before trying to > free anything, so there's no reason to lock against that. Exactly. We're not in danger of _losing_ the page, because reclaim_page is locked more aggressively than page_launder. We still risk having the page locked against us after lookup_swap_cache does its own UnlockPage. So, if lookup_swap_cache doesn't actually ensure that the page is unlocked, are there any callers which implicitly rely on lookup_swap_cache() doing a wait_on_page? --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, On Sat, Mar 24, 2001 at 10:05:18PM -0300, Rik van Riel wrote: On Sun, 25 Mar 2001, Stephen C. Tweedie wrote: Rik, do you think it is really necessary to take the page lock and release it inside lookup_swap_cache? I may be overlooking something, but I can't see the benefit of it --- I don't think we need to do this, except to protect us from using a page which isn't up-to-date yet and locked because of disk IO. But it doesn't --- page_launder can try to lock the page after it checks the refcount, without taking any locks which protect us against running lookup_swap_cache in parallel. If we get our reference after page_launder checks the count, we can find the page getting locked out from underneath our feet. Reclaim_page() takes the pagecache_lock before trying to free anything, so there's no reason to lock against that. Exactly. We're not in danger of _losing_ the page, because reclaim_page is locked more aggressively than page_launder. We still risk having the page locked against us after lookup_swap_cache does its own UnlockPage. So, if lookup_swap_cache doesn't actually ensure that the page is unlocked, are there any callers which implicitly rely on lookup_swap_cache() doing a wait_on_page? --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Sun, 25 Mar 2001, Stephen C. Tweedie wrote: > Rik, do you think it is really necessary to take the page lock and > release it inside lookup_swap_cache? I may be overlooking something, > but I can't see the benefit of it --- I don't think we need to do this, except to protect us from using a page which isn't up-to-date yet and locked because of disk IO. Reclaim_page() takes the pagecache_lock before trying to free anything, so there's no reason to lock against that. regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, On Fri, Mar 23, 2001 at 11:58:50AM -0800, Linus Torvalds wrote: > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Uggh --- the shmem code already does, see: shmem_truncate->shmem_truncate_part->shmem_free_swp-> lookup_swap_cache->find_lock_page It looks messy: lookup_swap_cache seems to be abusing the page lock gratuitously, but there are probably callers of it which rely on the assumption that it performs an implicit wait_on_page(). Rik, do you think it is really necessary to take the page lock and release it inside lookup_swap_cache? I may be overlooking something, but I can't see the benefit of it --- we can still race against page_launder, so the page may still get locked behind our backs after we get the reference from lookup_swap_cache (page_launder explicitly avoids taking the pagecache hash spinlock which might avoid this particular race). --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, On Fri, Mar 23, 2001 at 11:58:50AM -0800, Linus Torvalds wrote: Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Uggh --- the shmem code already does, see: shmem_truncate-shmem_truncate_part-shmem_free_swp- lookup_swap_cache-find_lock_page It looks messy: lookup_swap_cache seems to be abusing the page lock gratuitously, but there are probably callers of it which rely on the assumption that it performs an implicit wait_on_page(). Rik, do you think it is really necessary to take the page lock and release it inside lookup_swap_cache? I may be overlooking something, but I can't see the benefit of it --- we can still race against page_launder, so the page may still get locked behind our backs after we get the reference from lookup_swap_cache (page_launder explicitly avoids taking the pagecache hash spinlock which might avoid this particular race). --Stephen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Sun, 25 Mar 2001, Stephen C. Tweedie wrote: Rik, do you think it is really necessary to take the page lock and release it inside lookup_swap_cache? I may be overlooking something, but I can't see the benefit of it --- I don't think we need to do this, except to protect us from using a page which isn't up-to-date yet and locked because of disk IO. Reclaim_page() takes the pagecache_lock before trying to free anything, so there's no reason to lock against that. regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: > > __find_get_page has I think a misleading comment ? Ehh.. I only said the _naming_ makes sense. [ Wild hand-waving ] I suspect that what happened was that we split off the functions (one to just get the page, one to lock it), and the comment that was associated with the original "find_page()" never got removed, and just happens to sit above one of the helper functions now - the one that didn't lock. I'll fix the comment. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
> If you don't want to sleep, you need to use one of the wrappers for > "__find_page_nolock()". Something like "find_get_page()", which only > "gets" the page. * a rather lightweight function, finding and getting a reference to a * hashed page atomically, waiting for it if it's locked. __find_get_page has I think a misleading comment ? > The naming actually does make sense in this area. Yep - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Alan Cox writes: > Umm find_lock_page doesnt sleep does it ? It does lock_page, which sleeps to get the lock if necessary. Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
> > > + page = find_lock_page(mapping, idx); > > > > > > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. > > > > Umm find_lock_page doesnt sleep does it ? > > It certainly does. find_lock_page() -> __find_lock_page() -> lock_page() -> > -> __lock_page() -> schedule(). Ok I missed the lock page one. Yep. Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: > > On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: > > > > > > The patch below is for two races in sysV shared memory. > > > > + spin_lock (>lock); > > + > > + /* The shmem_swp_entry() call may have blocked, and > > +* shmem_writepage may have been moving a page between the page > > +* cache and swap cache. We need to recheck the page cache > > +* under the protection of the info->lock spinlock. */ > > + > > + page = find_lock_page(mapping, idx); > > > > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. > > Umm find_lock_page doesnt sleep does it ? It certainly does. find_lock_page() -> __find_lock_page() -> lock_page() -> -> __lock_page() -> schedule(). - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: > > > > + spin_lock (>lock); > > + > > + /* The shmem_swp_entry() call may have blocked, and > > +* shmem_writepage may have been moving a page between the page > > +* cache and swap cache. We need to recheck the page cache > > +* under the protection of the info->lock spinlock. */ > > + > > + page = find_lock_page(mapping, idx); > > > > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. > > Umm find_lock_page doesnt sleep does it ? Sure it does. Note the "lock" in find_lock_page(). It will lock the page, which implies sleeping if somebody is accessing it at the same time. If you don't want to sleep, you need to use one of the wrappers for "__find_page_nolock()". Something like "find_get_page()", which only "gets" the page. The naming actually does make sense in this area. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
> On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: > > > > The patch below is for two races in sysV shared memory. > > + spin_lock (>lock); > + > + /* The shmem_swp_entry() call may have blocked, and > +* shmem_writepage may have been moving a page between the page > +* cache and swap cache. We need to recheck the page cache > +* under the protection of the info->lock spinlock. */ > + > + page = find_lock_page(mapping, idx); > > Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Umm find_lock_page doesnt sleep does it ? Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: > > The patch below is for two races in sysV shared memory. + spin_lock (>lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info->lock spinlock. */ + + page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: The patch below is for two races in sysV shared memory. + spin_lock (info-lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info-lock spinlock. */ + + page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: The patch below is for two races in sysV shared memory. + spin_lock (info-lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info-lock spinlock. */ + + page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Umm find_lock_page doesnt sleep does it ? Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: On Fri, 23 Mar 2001, Stephen C. Tweedie wrote: The patch below is for two races in sysV shared memory. + spin_lock (info-lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info-lock spinlock. */ + + page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Umm find_lock_page doesnt sleep does it ? It certainly does. find_lock_page() - __find_lock_page() - lock_page() - - __lock_page() - schedule(). - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: + spin_lock (info-lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info-lock spinlock. */ + + page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Umm find_lock_page doesnt sleep does it ? Sure it does. Note the "lock" in find_lock_page(). It will lock the page, which implies sleeping if somebody is accessing it at the same time. If you don't want to sleep, you need to use one of the wrappers for "__find_page_nolock()". Something like "find_get_page()", which only "gets" the page. The naming actually does make sense in this area. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
+ page = find_lock_page(mapping, idx); Ehh.. Sleeping with the spin-lock held? Sounds like a truly bad idea. Umm find_lock_page doesnt sleep does it ? It certainly does. find_lock_page() - __find_lock_page() - lock_page() - - __lock_page() - schedule(). Ok I missed the lock page one. Yep. Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
If you don't want to sleep, you need to use one of the wrappers for "__find_page_nolock()". Something like "find_get_page()", which only "gets" the page. * a rather lightweight function, finding and getting a reference to a * hashed page atomically, waiting for it if it's locked. __find_get_page has I think a misleading comment ? The naming actually does make sense in this area. Yep - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
On Fri, 23 Mar 2001, Alan Cox wrote: __find_get_page has I think a misleading comment ? Ehh.. I only said the _naming_ makes sense. [ Wild hand-waving ] I suspect that what happened was that we split off the functions (one to just get the page, one to lock it), and the comment that was associated with the original "find_page()" never got removed, and just happens to sit above one of the helper functions now - the one that didn't lock. I'll fix the comment. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Alan Cox writes: Umm find_lock_page doesnt sleep does it ? It does lock_page, which sleeps to get the lock if necessary. Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, The patch below is for two races in sysV shared memory. The first (minor) one is in shmem_free_swp: swap_free (entry); *ptr = (swp_entry_t){0}; freed++; if (!(page = lookup_swap_cache(entry))) continue; delete_from_swap_cache(page); page_cache_release(page); has a window between the first swap_free() and the lookup_swap_cache(). If the swap_free() frees the last ref to the swap entry and another cpu allocates and caches the same entry before the lookup, we'll end up destroying another task's swap cache. The second is nastier. shmem_nopage() uses the inode semaphore to serialise access to shmem_getpage_locked() for paging in shared memory segments. Lookups in the page cache and in the shmem swap vector are done to locate the entry. _getpage_ can move entries from swap to page cache under protection of the shmem's info->lock spinlock. shmem_writepage() is locked via the page lock, and moves shmem pages from the page cache to the swap cache under protection of the same info->lock spinlock. However, shmem_nopage() does not hold this spinlock while doing its lookups in the page cache and swap vectors, so it can race with writepage, with once cpu in the middle of moving the page out of the page cache in writepage and the other cpu then failing to find the entry either in the page cache or in the shm swap entry vector. Feedback welcome. Cheers, Stephen --- mm/shmem.c.~1~ Fri Mar 23 00:26:49 2001 +++ mm/shmem.c Fri Mar 23 00:42:21 2001 @@ -121,13 +121,13 @@ if (!ptr->val) continue; entry = *ptr; - swap_free (entry); *ptr = (swp_entry_t){0}; freed++; - if (!(page = lookup_swap_cache(entry))) - continue; - delete_from_swap_cache(page); - page_cache_release(page); + if ((page = lookup_swap_cache(entry)) != NULL) { + delete_from_swap_cache(page); + page_cache_release(page); + } + swap_free (entry); } return freed; } @@ -218,15 +218,24 @@ } /* - * Move the page from the page cache to the swap cache + * Move the page from the page cache to the swap cache. + * + * The page lock prevents multiple occurences of shmem_writepage at + * once. We still need to guard against racing with + * shmem_getpage_locked(). */ static int shmem_writepage(struct page * page) { int error; struct shmem_inode_info *info; swp_entry_t *entry, swap; + struct inode *inode; - info = >mapping->host->u.shmem_i; + if (!PageLocked(page)) + BUG(); + + inode = page->mapping->host; + info = >u.shmem_i; swap = __get_swap_page(2); if (!swap.val) { set_page_dirty(page); @@ -234,11 +243,11 @@ return -ENOMEM; } - spin_lock(>lock); - shmem_recalc_inode(page->mapping->host); entry = shmem_swp_entry(info, page->index); if (IS_ERR(entry)) /* this had been allocted on page allocation */ BUG(); + spin_lock(>lock); + shmem_recalc_inode(page->mapping->host); error = -EAGAIN; if (entry->val) { __swap_free(swap, 2); @@ -268,6 +277,10 @@ * If we allocate a new one we do not mark it dirty. That's up to the * vm. If we swap it in we mark it dirty since we also free the swap * entry since a page cannot live in both the swap and page cache + * + * Called with the inode locked, so it cannot race with itself, but we + * still need to guard against racing with shm_writepage(), which might + * be trying to move the page to the swap cache as we run. */ static struct page * shmem_getpage_locked(struct inode * inode, unsigned long idx) { @@ -276,31 +289,57 @@ struct page * page; swp_entry_t *entry; - page = find_lock_page(mapping, idx);; + info = >u.shmem_i; + +repeat: + page = find_lock_page(mapping, idx); if (page) return page; - info = >u.shmem_i; entry = shmem_swp_entry (info, idx); if (IS_ERR(entry)) return (void *)entry; + + spin_lock (>lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info->lock spinlock. */ + + page = find_lock_page(mapping, idx); + if (page) { + spin_unlock (>lock); + return page; + } + if (entry->val) { unsigned long flags; /* Look it up and read it in.. */ page =
[PATCH] Fix races in 2.4.2-ac22 SysV shared memory
Hi, The patch below is for two races in sysV shared memory. The first (minor) one is in shmem_free_swp: swap_free (entry); *ptr = (swp_entry_t){0}; freed++; if (!(page = lookup_swap_cache(entry))) continue; delete_from_swap_cache(page); page_cache_release(page); has a window between the first swap_free() and the lookup_swap_cache(). If the swap_free() frees the last ref to the swap entry and another cpu allocates and caches the same entry before the lookup, we'll end up destroying another task's swap cache. The second is nastier. shmem_nopage() uses the inode semaphore to serialise access to shmem_getpage_locked() for paging in shared memory segments. Lookups in the page cache and in the shmem swap vector are done to locate the entry. _getpage_ can move entries from swap to page cache under protection of the shmem's info-lock spinlock. shmem_writepage() is locked via the page lock, and moves shmem pages from the page cache to the swap cache under protection of the same info-lock spinlock. However, shmem_nopage() does not hold this spinlock while doing its lookups in the page cache and swap vectors, so it can race with writepage, with once cpu in the middle of moving the page out of the page cache in writepage and the other cpu then failing to find the entry either in the page cache or in the shm swap entry vector. Feedback welcome. Cheers, Stephen --- mm/shmem.c.~1~ Fri Mar 23 00:26:49 2001 +++ mm/shmem.c Fri Mar 23 00:42:21 2001 @@ -121,13 +121,13 @@ if (!ptr-val) continue; entry = *ptr; - swap_free (entry); *ptr = (swp_entry_t){0}; freed++; - if (!(page = lookup_swap_cache(entry))) - continue; - delete_from_swap_cache(page); - page_cache_release(page); + if ((page = lookup_swap_cache(entry)) != NULL) { + delete_from_swap_cache(page); + page_cache_release(page); + } + swap_free (entry); } return freed; } @@ -218,15 +218,24 @@ } /* - * Move the page from the page cache to the swap cache + * Move the page from the page cache to the swap cache. + * + * The page lock prevents multiple occurences of shmem_writepage at + * once. We still need to guard against racing with + * shmem_getpage_locked(). */ static int shmem_writepage(struct page * page) { int error; struct shmem_inode_info *info; swp_entry_t *entry, swap; + struct inode *inode; - info = page-mapping-host-u.shmem_i; + if (!PageLocked(page)) + BUG(); + + inode = page-mapping-host; + info = inode-u.shmem_i; swap = __get_swap_page(2); if (!swap.val) { set_page_dirty(page); @@ -234,11 +243,11 @@ return -ENOMEM; } - spin_lock(info-lock); - shmem_recalc_inode(page-mapping-host); entry = shmem_swp_entry(info, page-index); if (IS_ERR(entry)) /* this had been allocted on page allocation */ BUG(); + spin_lock(info-lock); + shmem_recalc_inode(page-mapping-host); error = -EAGAIN; if (entry-val) { __swap_free(swap, 2); @@ -268,6 +277,10 @@ * If we allocate a new one we do not mark it dirty. That's up to the * vm. If we swap it in we mark it dirty since we also free the swap * entry since a page cannot live in both the swap and page cache + * + * Called with the inode locked, so it cannot race with itself, but we + * still need to guard against racing with shm_writepage(), which might + * be trying to move the page to the swap cache as we run. */ static struct page * shmem_getpage_locked(struct inode * inode, unsigned long idx) { @@ -276,31 +289,57 @@ struct page * page; swp_entry_t *entry; - page = find_lock_page(mapping, idx);; + info = inode-u.shmem_i; + +repeat: + page = find_lock_page(mapping, idx); if (page) return page; - info = inode-u.shmem_i; entry = shmem_swp_entry (info, idx); if (IS_ERR(entry)) return (void *)entry; + + spin_lock (info-lock); + + /* The shmem_swp_entry() call may have blocked, and +* shmem_writepage may have been moving a page between the page +* cache and swap cache. We need to recheck the page cache +* under the protection of the info-lock spinlock. */ + + page = find_lock_page(mapping, idx); + if (page) { + spin_unlock (info-lock); + return page; + } + if (entry-val) { unsigned long flags; /* Look it up and read it in.. */