On Wed, Apr 02, 2014 at 10:40:16AM -0700, John Stultz wrote:
> On Wed, Apr 2, 2014 at 9:36 AM, Johannes Weiner <han...@cmpxchg.org> wrote:
> > I'm just dying to hear a "normal" use case then. :)
>
> So the more "normal" use case would be marking objects volatile and
> then non-volatile without accessing them in between. In this case the
> zero-fill vs. SIGBUS semantics don't really matter; it's really just
> a trade-off in how we handle applications deviating (intentionally or
> not) from this use case.
>
> So to flesh out the context here for folks who are following along
> (but weren't in the hallway at LSF :), Johannes made a fairly
> interesting proposal (Johannes: please correct me where I'm slightly
> off) to use only the dirty bits of the ptes to mark a page as
> volatile. The kernel could then reclaim these clean pages as it
> needed, and when we marked the range as non-volatile, the pages would
> be re-dirtied and, if any of the pages were missing, we could return
> a flag with the purged state. This had somewhat different semantics
> than what I've been working with for a while (for example, any write
> to a page would implicitly clear volatility), so I wasn't completely
> comfortable with it, but figured I'd think about it to see if it
> could be done. Particularly since it would in some ways simplify the
> tmpfs/shm shared volatility that I'd eventually like to do.
...
> Now, while for the case I'm personally most interested in (ashmem),
> zero-fill would technically be OK, since that's what Android does.
> Even so, I don't think it's the best approach for the interface,
> since applications may end up quite surprised by the results when
> they accidentally don't follow the "don't touch volatile pages" rule.
>
> That point aside, I think the other problem with the page-cleaning
> volatility approach is that there are other awkward side effects. For
> example: say an application marks a range as volatile. One page in
> the range is then purged. The application, due to a bug or otherwise,
> reads the volatile range. This causes the page to be zero-filled in,
> and the application silently uses the corrupted data (which isn't
> great). More problematic, though, is that by faulting the page in,
> they have in effect lost the purge state for that page. When the
> application then goes to mark the range as non-volatile, all pages
> are present, so we'd return that no pages were purged. From an
> application perspective this is pretty ugly.
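Just to make the failure sequence you describe concrete, here is
roughly how I read it. mark_volatile()/mark_nonvolatile() are
hypothetical stand-in names for whatever the interface ends up being,
not a real API, and the purge happens asynchronously under memory
pressure:

    /*
     * Hypothetical stand-ins for the eventual volatile-range
     * interface -- not a real API.  No-op stubs so the sketch
     * compiles; mark_nonvolatile() is assumed to report whether
     * anything in the range was purged.
     */
    #include <stddef.h>
    #include <stdio.h>

    static void mark_volatile(void *addr, size_t len) { (void)addr; (void)len; }
    static int  mark_nonvolatile(void *addr, size_t len) { (void)addr; (void)len; return 0; }

    static char cache[4096];            /* one page of cached data */

    int main(void)
    {
        mark_volatile(cache, sizeof(cache));

        /* ...memory pressure: the kernel purges the now-clean page... */

        /*
         * Buggy read while the range is still volatile: under the
         * page-cleaning scheme the purged page is faulted back in
         * zero-filled, so we silently consume zeroes -- and the fault
         * re-instantiates the page.
         */
        char c = cache[0];

        /*
         * Every page is present again, so the unmark reports "nothing
         * purged" even though the data is gone.
         */
        int purged = mark_nonvolatile(cache, sizeof(cache));
        printf("purged=%d first byte=%d\n", purged, c);
        return 0;
    }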
The write-implicitly-clears-volatile semantics would actually be an
advantage for some use cases. If you have a volatile cache of many
sub-page-size objects, the application can just include at the start
of each page "int present, in_use;". "present" is set to non-zero
before the page is marked volatile, and when the application wants to
unmark it as volatile it writes to "in_use" (which re-dirties the page
and so clears its volatility) and then tests the value of "present"
(which reads zero if the page was purged and zero-filled). No syscall
is needed at all, although it does take a minor fault. The syscall
would be better for the case of large objects, though. Or is that
fatally flawed?

- Kevin
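P.S. Here's very roughly what I had in mind, assuming the proposed
semantics (a write clears volatility, a purged page comes back
zero-filled). mark_volatile() below is just a hypothetical stand-in
for the eventual syscall, not a real API:

    /*
     * Per-page header for a volatile cache of sub-page-size objects,
     * assuming the proposed semantics: a write re-dirties the page
     * (clearing its volatility) and a purged page comes back
     * zero-filled.  mark_volatile() is a hypothetical stand-in.
     */
    #include <stdbool.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096

    struct page_hdr {
        int present;    /* non-zero while the cached objects are valid */
        int in_use;     /* written to re-dirty the page on reacquire   */
        /* sub-page-size cached objects fill the rest of the page */
    };

    static void mark_volatile(void *addr, size_t len) { (void)addr; (void)len; }

    /* Hand the page over to the volatile cache. */
    static void cache_release(struct page_hdr *page)
    {
        page->present = 1;                  /* contents are valid      */
        mark_volatile(page, PAGE_SIZE);     /* kernel may now purge it */
    }

    /*
     * Take the page back without any "unmark" syscall: the store to
     * in_use dirties the page, which clears its volatility, and
     * present then tells us whether the contents survived (a purged
     * page was zero-filled, so present reads 0).
     */
    static bool cache_reacquire(struct page_hdr *page)
    {
        page->in_use = 1;
        return page->present != 0;          /* false => objects purged */
    }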