Hi, On Tue, 2005-03-08 at 09:28, Stephen C. Tweedie wrote:
> I think it should be OK just to move the page->mapping != mapping test > above the page>index > end test. Sure, if all the pages have been > stolen by the time we see them, then we'll repeat without advancing > "next"; but we're still making progress in that case because pages that > were previously in this mapping *have* been removed, just by a different > process. If we're really concerned about that case we can play the > trick that invalidate_mapping_pages() tries and do a "next++" in that > case. Variant on akpm's last patch with this change, plus one extra s/page->index/page_index/ that he missed. I've moved the whole next=page_index+1 and wrapping check up to just after we've tested the mapping, so even if we break early all of that updating has already been done reliably and we don't end up setting next in two places in the code. A 400-task fsstress used to livelock reliably within minutes inside O_DIRECT without any fix; this patch has been running for over an hour now without problem. --Stephen
Fix livelock in invalidate_inode_pages2_range invalidate_inode_pages2_range can loop indefinitely if pagevec_lookup returns only pages beyond the end of the range being invalidated. Fix it by advancing the restart pointer, "next", every time we see a page on the correct mapping, not just if the page is within the range; and break out early when we encounter such a page. Based on a proposed fix by akpm. Signed-off-by: Stephen Tweedie <[EMAIL PROTECTED]> --- linux-2.6-ext3/mm/truncate.c.=K0002=.orig +++ linux-2.6-ext3/mm/truncate.c @@ -268,25 +268,31 @@ int invalidate_inode_pages2_range(struct min(end - next, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) { for (i = 0; !ret && i < pagevec_count(&pvec); i++) { struct page *page = pvec.pages[i]; + pgoff_t page_index; int was_dirty; lock_page(page); - if (page->mapping != mapping || page->index > end) { + if (page->mapping != mapping) { unlock_page(page); continue; } - wait_on_page_writeback(page); - next = page->index + 1; + page_index = page->index; + next = page_index + 1; if (next == 0) wrapped = 1; + if (page_index > end) { + unlock_page(page); + break; + } + wait_on_page_writeback(page); while (page_mapped(page)) { if (!did_range_unmap) { /* * Zap the rest of the file in one hit. */ unmap_mapping_range(mapping, - page->index << PAGE_CACHE_SHIFT, - (end - page->index + 1) + page_index << PAGE_CACHE_SHIFT, + (end - page_index + 1) << PAGE_CACHE_SHIFT, 0); did_range_unmap = 1; @@ -295,7 +301,7 @@ int invalidate_inode_pages2_range(struct * Just zap this page */ unmap_mapping_range(mapping, - page->index << PAGE_CACHE_SHIFT, + page_index << PAGE_CACHE_SHIFT, PAGE_CACHE_SIZE, 0); } }