On Tue, Apr 14, 2026 at 05:37:50PM +0200, David Hildenbrand (Arm) wrote:
> On 4/14/26 16:23, Kiryl Shutsemau (Meta) wrote:
> > This series adds userfaultfd support for tracking the working set of
> > VM guest memory, enabling VMMs to identify cold pages and evict them
> > to tiered or remote storage.
> >
> > == Problem ==
> >
> > VMMs managing guest memory need to:
> > 1. Track which pages are actively used (working set detection)
> > 2. Safely evict cold pages to slower storage
> > 3. Fetch pages back on demand when accessed again
> >
> > For shmem-backed guest memory, working set tracking partially works
> > today: MADV_DONTNEED zaps PTEs while pages stay in page cache, and
> > re-access auto-resolves from cache. But safe eviction still requires
> > synchronous fault interception to prevent data loss races.
> >
> > For anonymous guest memory (needed for KSM cross-VM deduplication),
> > there is no mechanism at all — clearing a PTE loses the page.
> >
> > == Solution ==
> >
> > The series introduces a unified userfaultfd interface that works
> > across both anonymous and shmem-backed memory:
> >
> > UFFD_FEATURE_MINOR_ANON: extends MODE_MINOR registration to anonymous
> > private memory. Uses the PROT_NONE hinting mechanism (same as NUMA
> > balancing) to make pages inaccessible without freeing them.
>
> I would rather tackle this from the other direction: it's another form
> of protection (like WP), not really a "minor" mode.
>
> Could we add a UFFDIO_REGISTER_MODE_RWP (or however we would call it)
> and support it for anon+shmem, avoiding the zapping for shmem completely?

I like this idea. It should be functionally equivalent, but your
interface idea fits better with the rest.

Thanks! Will give it a try.

-- 
Kiryl Shutsemau / Kirill A. Shutemov

