> Your idea is nice, but the patch lacks a few things:
>
> - SMP locking, what if some other process faults in this page
>   between the atomic_read of the page count and the test later?

It can't happen. free_pte is called with the page_table_lock held in
addition to having the mmap_sem downed.

> - testing if our process is the _only_ user of this swap page,
>   for eg. apache you'll have lots of COW-shared pages .. it would
>   be good to keep the page in memory for our siblings

This is already done in free_page_and_swap_cache.

There is a problem with a possible kernel panic in that
try_to_free_buffers is called with a wait of 1 (thanks to Andrew Morton
for pointing that out) and we might reschedule while we wait on io.

So, to fix it, here is an even newer (and simpler) patch. Everything is
handled in free_page_and_swap_cache, so we just call that if we can
successfully look up the entry.

Rich

diff --recursive -u -N linux-2.4.1/mm/memory.c linux-2.4.1-paging-fix/mm/memory.c
--- linux-2.4.1/mm/memory.c	Sat Jan 27 22:12:35 2001
+++ linux-2.4.1-paging-fix/mm/memory.c	Thu Feb 15 13:36:06 2001
@@ -281,7 +281,16 @@
 		return 1;
 	}
 	swap_free(pte_to_swp_entry(pte));
-	return 0;
+	{
+		struct page *page = lookup_swap_cache(pte_to_swp_entry(pte));
+		/* returns the page and takes a reference */
+
+		if (!page || (!VALID_PAGE(page)) || PageReserved(page))
+			return 0;
+
+		free_page_and_swap_cache(page); /* Expects the page to be mapped, so will */
+		return 1;                       /* account for the reference we have */
+	}
 }
 
 static inline void forget_pte(pte_t page)
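
For anyone who wants to follow the reference counting without reading
the kernel source, here is a rough userspace model of what the patched
branch is meant to do. It is only a sketch under assumptions: the
model_* names and struct model_page are made up for illustration and are
not kernel code, locking is left out, and the "only user" test is
reduced to "the only references are the swap cache's and ours", which
ignores the swap map counts the real free_page_and_swap_cache also
looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct model_page {
	int  count;          /* reference count, in the spirit of page->count */
	bool in_swap_cache;  /* true while the swap cache holds one reference */
};

/* Models lookup_swap_cache(): returns the page and takes a reference. */
static struct model_page *model_lookup_swap_cache(struct model_page *slot)
{
	if (slot == NULL || !slot->in_swap_cache)
		return NULL;
	slot->count++;
	return slot;
}

/*
 * Models free_page_and_swap_cache(): if the caller and the swap cache are
 * the only users, remove the page from the swap cache, then drop the
 * caller's reference either way.
 */
static void model_free_page_and_swap_cache(struct model_page *page)
{
	assert(page->count > 0);
	if (page->in_swap_cache && page->count == 2) {
		page->in_swap_cache = false;
		page->count--;	/* the swap cache's reference */
	}
	page->count--;		/* the caller's reference */
}

/*
 * Models the swapped-out branch of free_pte() after the patch.
 * Returns 1 if a page was found and released, 0 otherwise.
 */
static int model_free_swapped_pte(struct model_page *slot)
{
	struct model_page *page = model_lookup_swap_cache(slot);

	if (page == NULL)
		return 0;
	model_free_page_and_swap_cache(page);
	return 1;
}
```

In this model a page held only by the swap cache ends up freed (count
drops to 0), while a page that a sibling still references stays in the
swap cache, which is the behaviour asked for in the COW-sharing question
above.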