On Wed, Jun 21, 2023 at 06:17:37PM +0200, David Hildenbrand wrote:
> As documented, ram_block_discard_range() guarantees two things
>
> a) Read 0 after discarding succeeded
> b) Make postcopy work by triggering a fault on next access
>
> And if we'd simply want to drop the FALLOC_FL_PUNCH_HOLE:
>
> 1) For hugetlb, only newer kernels support MADV_DONTNEED. So there is no way
> to just discard in a private mapping here that works for kernels we still
> care about.
>
> 2) free-page-reporting wants to read 0's when re-accessing discarded memory.
> If there is still something there in the file, that won't work.

Ah right.  The semantics are indeed slightly different.

IMHO, ideally what we need here is a zero page installed as private,
ignoring the page cache underneath and freeing any existing private page.
But I don't know how to do that easily with the current default mm
infrastructure; without it, free-page-reporting over private mem just won't
really work at all, it seems to me.

Maybe UFFDIO_ZEROPAGE would work?  That needs uffd registered by default,
though, which is slightly tricky.

>
> 3) Regarding postcopy on MAP_PRIVATE shmem, I am not sure if it will
> actually do what you want if the pagecache holds a page. Maybe it works, but
> I am not so sure. Needs investigation.

For MINOR I think it will.  I actually already implemented some of that (I
think all of what is required) in the HGM qemu RFC series, and smoke-tested
it a bit with the HGM kernel, without any known issue so far.

IIUC we can work on MINOR support without HGM; I can separate it out.  It's
really a matter of whether it'll be worth the effort and time.

Thanks,

-- 
Peter Xu
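
For reference, the difference behind point 2) can be sketched in a few lines of C. This is only an illustration, not code from QEMU or from this thread: the function names and the temporary file path are made up for the example, and it covers plain (non-hugetlb) pages only. It shows that MADV_DONTNEED alone gives "read 0" on a private anonymous mapping, but on a MAP_PRIVATE file mapping the next access repopulates from whatever the file still holds.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: dirty a private anonymous page, discard it with
 * MADV_DONTNEED, and return the first byte read back (-1 on setup error). */
int anon_byte_after_discard(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    memset(p, 0xAA, psz);
    if (madvise(p, psz, MADV_DONTNEED) != 0) {
        munmap(p, psz);
        return -1;
    }
    int byte = p[0];            /* zero-fill-on-demand page */
    munmap(p, psz);
    return byte;
}

/* Hypothetical helper: same discard, but on a MAP_PRIVATE mapping of a file
 * whose page is filled with 0x11. The private (COW) copy is dropped, and the
 * next read comes from the file content instead of being zero. */
int file_private_byte_after_discard(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/discard-demo-XXXXXX";   /* illustrative path */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file lives on until fd is closed */

    unsigned char *buf = malloc(psz);
    if (!buf) {
        close(fd);
        return -1;
    }
    memset(buf, 0x11, psz);
    ssize_t written = write(fd, buf, psz);
    free(buf);
    if (written != psz) {
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, 0xAA, psz);       /* creates a private copy of the page */
    int byte = -1;
    if (madvise(p, psz, MADV_DONTNEED) == 0)
        byte = p[0];            /* repopulated from the file: 0x11 */
    munmap(p, psz);
    close(fd);
    return byte;
}
```

Called from a small main(), the first helper should return 0 and the second 0x11, which is why dropping the hole punch on the file would leave stale data visible after a discard.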