On 4/14/26 16:23, Kiryl Shutsemau (Meta) wrote:
> This series adds userfaultfd support for tracking the working set of
> VM guest memory, enabling VMMs to identify cold pages and evict them
> to tiered or remote storage.
>
> == Problem ==
>
> VMMs managing guest memory need to:
> 1. Track which pages are actively used (working set detection)
> 2. Safely evict cold pages to slower storage
> 3. Fetch pages back on demand when accessed again
>
> For shmem-backed guest memory, working set tracking partially works
> today: MADV_DONTNEED zaps PTEs while pages stay in page cache, and
> re-access auto-resolves from cache. But safe eviction still requires
> synchronous fault interception to prevent data loss races.
>
> For anonymous guest memory (needed for KSM cross-VM deduplication),
> there is no mechanism at all — clearing a PTE loses the page.
>
> == Solution ==
>
> The series introduces a unified userfaultfd interface that works
> across both anonymous and shmem-backed memory:
>
> UFFD_FEATURE_MINOR_ANON: extends MODE_MINOR registration to anonymous
> private memory. Uses the PROT_NONE hinting mechanism (same as NUMA
> balancing) to make pages inaccessible without freeing them.

I would rather tackle this from the other direction: it's another form
of protection (like WP), not really a "minor" mode.

Could we add a UFFDIO_REGISTER_MODE_RWP (or however we would call it)
and support it for anon+shmem, avoiding the zapping for shmem completely?

-- 
Cheers,

David

