On 06.01.21 04:46, Liang Li wrote:
> A typical usage of hugetlbfs it's to reserve amount of memory
> during the kernel booting stage, and the reserved pages are
> unlikely to return to the buddy system. When application need
> hugepages, kernel will allocate them from the reserved pool.
> when application terminates, huge pages will return to the
> reserved pool and are kept in the free list for hugetlbfs,
> these free pages will not return to buddy freelist unless the
> size of reserved pool is changed.
> Free page reporting only supports buddy pages, it can't report
> the free pages reserved for hugetlbfs. On the other hand,
> hugetlbfs is a good choice for system with a huge amount of RAM,
> because it can help to reduce the memory management overhead and
> improve system performance.
> This patch add the support for reporting hugepages in the free
> list of hugetlbfs, it can be used by virtio_balloon driver for
> memory overcommit and pre zero out free pages for speeding up
> memory population and page fault handling.
You should lay out the use case + measurements. Further, you should
describe what this patch set actually does, how behavior can be tuned,
pros and cons, etc. And you should most probably keep this RFC.

>
> Most of the code are 'copied' from free page reporting because
> they are working in the same way. So the code can be refined to
> remove duplication. It can be done later.

Nothing speaks against getting it right from the beginning. Otherwise
it will most likely never happen.

>
> Since some guys have some concern about side effect of the 'buddy
> free page pre zero out' feature brings, I remove it from this
> serier.

You should really point out what changed since the last version. I
remember Alex and Mike had some pretty solid points about what they
don't want to see (especially: don't use the free page reporting
infrastructure and don't temporarily allocate huge pages for
processing them).

I am not convinced that we want to use the free page reporting
infrastructure for this (pre-zeroing huge pages). What speaks against
a thread simply iterating over huge pages one at a time, zeroing them?
The whole free page reporting infrastructure was invented because we
have to do expensive coordination (+ locking) when going via the
hypervisor. For the main use case of zeroing huge pages in the
background, I don't see a real need for that. If you believe this is
the right thing to do, please add a discussion regarding this.

-- 
Thanks,

David / dhildenb