Linus Torvalds <torva...@linux-foundation.org> wrote:

> > Yeah, I have trouble with the private2 vs fscache bit too.  I've been
> > trying to persuade David that he doesn't actually need an fscache
> > bit at all; he can just increment the page's refcount to prevent it
> > from being freed while he writes data to the cache.
>
> Does the code not hold a refcount already?

AIUI, Willy wanted me to drop the refcount and rely on PG_locked alone during
I/O triggered by the new ->readahead() method, so when it comes to setting
PG_fscache after a successful read from the server, I don't hold any page refs
- the assumption being that the waits in releasepage and invalidatepage
suffice.  If that isn't sufficient, I can make it take page refs on the pages
to be written out - that should be easy enough to do.

> Honestly, the fact that writeback doesn't take a refcount, and then
> has magic "if writeback is set, don't free" code in other parts of the
> VM layer has been a problem already, when the wakeup ended up
> "leaking" from a previous page to a new allocation.
>
> I very much hope the fscache bit does not make similar mistakes,
> because the rest of the VM will _not_ have special "if fscache is set,
> then we won't do X" the way we do for writeback.

The VM can't do that because PG_private_2 might not be being used for
PG_fscache.  It does, however, treat PG_private_2 like PG_private when
triggering calls to releasepage and invalidatepage.

> So I think the fscache code needs to hold a refcount regardless, and
> that the fscache bit is set the page has to have a reference.
>
> So what are the current lifetime rules for the fscache bit?

It depends which 'current' you're referring to.

The old fscache I/O API (ie. what's upstream) - in which PG_fscache is set on a
page to note that fscache knows about the page - does not keep a separate ref
on such pages.

The new fscache I/O API simplifies things.  With that, pages are only known
about for the duration of a write to the cache.  I've tried to analogise the
way PG_writeback works[*], including waiting for it in places like
invalidation, releasepage, page_mkwrite (though in the netfs, not the core VM)
as it may represent DMA.

Note that with the new I/O API, fscache and cachefiles know nothing about the
PG_fscache bit or netfs pages; they just deal with an iov_iter and a completion
function.
Dealing with PG_fscache is done by the netfs and the new netfs helper lib.

[*] Though I see that 073861ed77b6b made a change to end_page_writeback() for
    an issue that probably affects unlock_page_fscache() too[**].

[**] This may mean that both PG_fscache and PG_writeback need to hold a ref on
     the page.

David
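
For illustration only, the rule floated in [**] (hold a page ref for as long as
PG_fscache is set) could be modelled in user space roughly as below.  This is a
sketch, not kernel code: struct model_page and every model_*() function are
hypothetical names invented for this example, and a plain int stands in for the
page refcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a page; not the kernel's struct page. */
struct model_page {
	int  refcount;	/* models the page reference count */
	bool fscache;	/* models PG_fscache (PG_private_2) */
};

/* Start a write to the cache: pin the page first, then set the bit. */
static void model_set_fscache(struct model_page *p)
{
	p->refcount++;
	p->fscache = true;
}

/*
 * End the write to the cache: clear the bit while still pinned, then drop
 * the pin.  Real code would wake any waiters before dropping the ref.
 */
static void model_end_fscache(struct model_page *p)
{
	p->fscache = false;
	p->refcount--;
}

/* In this model a page may be freed only once nobody holds a ref. */
static bool model_can_free(const struct model_page *p)
{
	return p->refcount == 0;
}

/* Walk through one read-then-write-to-cache cycle; returns 0 on success. */
int model_demo(void)
{
	struct model_page p = { .refcount = 1, .fscache = false };

	model_set_fscache(&p);		/* cache write begins */
	p.refcount--;			/* the read path drops its own ref */
	assert(!model_can_free(&p));	/* still pinned by the cache write */

	model_end_fscache(&p);		/* cache write completes */
	assert(model_can_free(&p));	/* only now can the page go away */
	return 0;
}
```

The point of the ordering in model_end_fscache() is that the bit is cleared
before the pin is dropped, so nothing can free and reuse the page while the
bit, or a wakeup for it, is still in flight.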