> @@ -1691,17 +1680,34 @@ static unsigned long isolate_lru_pages(unsigned long > nr_to_scan, > * only when the page is being freed somewhere else. > */ > scan += nr_pages; > - switch (__isolate_lru_page(page, mode)) { > + switch (__isolate_lru_page_prepare(page, mode)) { > case 0: > + /* > + * Be careful not to clear PageLRU until after we're > + * sure the page is not being freed elsewhere -- the > + * page release code relies on it. > + */ > + if (unlikely(!get_page_unless_zero(page))) > + goto busy; > + > + if (!TestClearPageLRU(page)) { > + /* > + * This page may in other isolation path, > + * but we still hold lru_lock. > + */ > + put_page(page); > + goto busy; > + } > +
So I was reviewing the code and came across this piece. It has me a bit concerned since we are calling put_page while holding the LRU lock which was taken before calling the function. We should be fine in terms of not encountering a deadlock since the LRU bit is cleared the page shouldn't grab the LRU lock again, however we could end up grabbing the zone lock while holding the LRU lock which would be an issue. One other thought I had is that this might be safe because the assumption would be that another thread is holding a reference on the page, has already called TestClearPageLRU on the page and retrieved the LRU bit, and is waiting on us to release the LRU lock before it can pull the page off of the list. In that case put_page will never decrement the reference count to 0. I believe that is the current case but I cannot be certain. I'm just wondering if we should just replace the put_page(page) with a WARN_ON(put_page_testzero(page)) and a bit more documentation. If I am not mistaken it should never be possible for the reference count to actually hit zero here. Thanks. - Alex