Re: [External] Re: [PATCH v20 4/9] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page

2021-04-16 Thread Muchun Song
On Sat, Apr 17, 2021 at 5:10 AM Mike Kravetz  wrote:
>
> On 4/15/21 1:40 AM, Muchun Song wrote:
> > Every HugeTLB page has more than one struct page structure. We __know__
> > that only the first 4 (__NR_USED_SUBPAGE) struct page structures are
> > used to store metadata associated with each HugeTLB page.
> >
> > There are many struct page structures associated with each HugeTLB
> > page. For the tail pages, the value of compound_head is the same, so we
> > can reuse the first page of the tail page structures: the virtual
> > addresses of the remaining pages of tail page structures are remapped
> > to that first tail page, and the page frames backing them are then
> > freed. Therefore, we only need to reserve two pages as vmemmap areas.
> >
> > When we allocate a HugeTLB page from the buddy, we can free some vmemmap
> > pages associated with each HugeTLB page. It is more appropriate to do it
> > in the prep_new_huge_page().
> >
> > The free_vmemmap_pages_per_hpage(), which indicates how many vmemmap
> > pages associated with a HugeTLB page can be freed, returns zero for
> > now, which means the feature is disabled. We will enable it once all
> > the infrastructure is there.
> >
> > Signed-off-by: Muchun Song 
> > Reviewed-by: Oscar Salvador 
> > Tested-by: Chen Huang 
> > Tested-by: Bodeddula Balasubramaniam 
> > Acked-by: Michal Hocko 
>
> There may need to be some trivial rebasing due to Oscar's changes
> when they go in.

Yeah, thanks for the reminder.

>
> Reviewed-by: Mike Kravetz 
> --
> Mike Kravetz


Re: [PATCH v20 4/9] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page

2021-04-16 Thread Mike Kravetz
On 4/15/21 1:40 AM, Muchun Song wrote:
> Every HugeTLB page has more than one struct page structure. We __know__
> that only the first 4 (__NR_USED_SUBPAGE) struct page structures are
> used to store metadata associated with each HugeTLB page.
> 
> There are many struct page structures associated with each HugeTLB
> page. For the tail pages, the value of compound_head is the same, so we
> can reuse the first page of the tail page structures: the virtual
> addresses of the remaining pages of tail page structures are remapped
> to that first tail page, and the page frames backing them are then
> freed. Therefore, we only need to reserve two pages as vmemmap areas.
> 
> When we allocate a HugeTLB page from the buddy, we can free some vmemmap
> pages associated with each HugeTLB page. It is more appropriate to do it
> in the prep_new_huge_page().
> 
> The free_vmemmap_pages_per_hpage(), which indicates how many vmemmap
> pages associated with a HugeTLB page can be freed, returns zero for
> now, which means the feature is disabled. We will enable it once all
> the infrastructure is there.
> 
> Signed-off-by: Muchun Song 
> Reviewed-by: Oscar Salvador 
> Tested-by: Chen Huang 
> Tested-by: Bodeddula Balasubramaniam 
> Acked-by: Michal Hocko 

There may need to be some trivial rebasing due to Oscar's changes
when they go in.

Reviewed-by: Mike Kravetz 
-- 
Mike Kravetz


[PATCH v20 4/9] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page

2021-04-15 Thread Muchun Song
Every HugeTLB page has more than one struct page structure. We __know__
that only the first 4 (__NR_USED_SUBPAGE) struct page structures are
used to store metadata associated with each HugeTLB page.

There are many struct page structures associated with each HugeTLB
page. For the tail pages, the value of compound_head is the same, so we
can reuse the first page of the tail page structures: the virtual
addresses of the remaining pages of tail page structures are remapped
to that first tail page, and the page frames backing them are then
freed. Therefore, we only need to reserve two pages as vmemmap areas.
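
As a concrete illustration of the savings (the numbers and names below are
mine, not part of the patch): with 4 KiB base pages and a 64-byte struct
page, one 2 MiB HugeTLB page is described by 8 vmemmap pages, 6 of which
become freeable once the feature is enabled; one 1 GiB page is described by
4096 vmemmap pages, 4094 of which become freeable. A small userspace sketch
of that arithmetic:

#include <stdio.h>

#define BASE_PAGE_SIZE		4096UL
#define STRUCT_PAGE_SIZE	64UL	/* assumed sizeof(struct page) */
#define RESERVED_VMEMMAP_PAGES	2UL	/* the two pages kept mapped */

/* How many vmemmap pages backing one HugeTLB page can be freed. */
static unsigned long freeable_vmemmap_pages(unsigned long hpage_size)
{
	unsigned long nr_subpages = hpage_size / BASE_PAGE_SIZE;
	unsigned long nr_vmemmap = nr_subpages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;

	return nr_vmemmap - RESERVED_VMEMMAP_PAGES;
}

int main(void)
{
	printf("2M: %lu freeable\n", freeable_vmemmap_pages(2UL << 20)); /* 6 */
	printf("1G: %lu freeable\n", freeable_vmemmap_pages(1UL << 30)); /* 4094 */
	return 0;
}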

When we allocate a HugeTLB page from the buddy, we can free some vmemmap
pages associated with each HugeTLB page. It is more appropriate to do it
in the prep_new_huge_page().

The free_vmemmap_pages_per_hpage(), which indicates how many vmemmap
pages associated with a HugeTLB page can be freed, returns zero for
now, which means the feature is disabled. We will enable it once all
the infrastructure is there.
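
Concretely, the no-op behaviour comes down to an early return. A minimal
sketch of that guard, assuming kernel context (the bodies here are
illustrative, not the contents of the new files):

#include <linux/hugetlb.h>

/* Sketch only: zero means "no vmemmap pages to free", i.e. feature disabled. */
static inline unsigned long free_vmemmap_pages_per_hpage(struct hstate *h)
{
	return 0;	/* A later patch makes this return the real count. */
}

void free_huge_page_vmemmap(struct hstate *h, struct page *head)
{
	if (!free_vmemmap_pages_per_hpage(h))
		return;	/* Nothing to do; prep_new_huge_page() is unaffected. */
}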

Signed-off-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Tested-by: Chen Huang 
Tested-by: Bodeddula Balasubramaniam 
Acked-by: Michal Hocko 
---
 include/linux/bootmem_info.h |  28 +-
 include/linux/mm.h   |   3 +
 mm/Makefile  |   1 +
 mm/hugetlb.c |   2 +
 mm/hugetlb_vmemmap.c | 218 +++
 mm/hugetlb_vmemmap.h |  20 
 mm/sparse-vmemmap.c  | 194 ++
 7 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 mm/hugetlb_vmemmap.c
 create mode 100644 mm/hugetlb_vmemmap.h

diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
index 4ed6dee1adc9..2bc8b1f69c93 100644
--- a/include/linux/bootmem_info.h
+++ b/include/linux/bootmem_info.h
@@ -2,7 +2,7 @@
 #ifndef __LINUX_BOOTMEM_INFO_H
 #define __LINUX_BOOTMEM_INFO_H
 
-#include <linux/mmzone.h>
+#include <linux/mm.h>
 
 /*
  * Types for free bootmem stored in page->lru.next. These have to be in
@@ -22,6 +22,27 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
 void get_page_bootmem(unsigned long info, struct page *page,
  unsigned long type);
 void put_page_bootmem(struct page *page);
+
+/*
+ * Any memory allocated via the memblock allocator and not via the
+ * buddy will already be marked reserved in the memmap. For those
+ * pages, we can call this function to free them to the buddy allocator.
+ */
+static inline void free_bootmem_page(struct page *page)
+{
+   unsigned long magic = (unsigned long)page->freelist;
+
+   /*
+* The reserve_bootmem_region sets the reserved flag on bootmem
+* pages.
+*/
+   VM_BUG_ON_PAGE(page_ref_count(page) != 2, page);
+
+   if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
+   put_page_bootmem(page);
+   else
+   VM_BUG_ON_PAGE(1, page);
+}
 #else
 static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
@@ -35,6 +56,11 @@ static inline void get_page_bootmem(unsigned long info, struct page *page,
 					    unsigned long type)
 {
 }
+
+static inline void free_bootmem_page(struct page *page)
+{
+   free_reserved_page(page);
+}
 #endif
 
 #endif /* __LINUX_BOOTMEM_INFO_H */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 25b9041f9925..a4d160ddb749 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3046,6 +3046,9 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 }
 #endif
 
+void vmemmap_remap_free(unsigned long start, unsigned long end,
+   unsigned long reuse);
+
 void *sparse_buffer_alloc(unsigned long size);
 struct page * __populate_section_memmap(unsigned long pfn,
unsigned long nr_pages, int nid, struct vmem_altmap *altmap);
diff --git a/mm/Makefile b/mm/Makefile
index d0ccddae7a45..40ee404e200e 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -75,6 +75,7 @@ obj-$(CONFIG_FRONTSWAP)	+= frontswap.o
 obj-$(CONFIG_ZSWAP)	+= zswap.o
 obj-$(CONFIG_HAS_DMA)	+= dmapool.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
+obj-$(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)	+= hugetlb_vmemmap.o
 obj-$(CONFIG_NUMA)	+= mempolicy.o
 obj-$(CONFIG_SPARSEMEM)	+= sparse.o
 obj-$(CONFIG_SPARSEMEM_VMEMMAP) += sparse-vmemmap.o
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 54d81d5947ed..923d05e2806b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include "internal.h"
+#include "hugetlb_vmemmap.h"
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
@@ -1485,6 +1486,7 @@ void free_huge_page(struct page *page)
 
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 {
+   free_huge_page_vmemmap(h, page);
INIT_LIST_HEAD(&page->lru);
set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
hugetlb_set_page_subpool(page, NULL);
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
new file mode 100644
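
Once a later patch makes free_vmemmap_pages_per_hpage() return a non-zero
count, the entry point only has to compute the range handed to
vmemmap_remap_free(), whose declaration is added to include/linux/mm.h
above. A rough sketch of that computation, assuming two reserved vmemmap
pages per HugeTLB page (the RESERVE_VMEMMAP_* names and the function body
are illustrative, not taken from this patch):

#include <linux/hugetlb.h>
#include <linux/mm.h>

#define RESERVE_VMEMMAP_NR	2U
#define RESERVE_VMEMMAP_SIZE	(RESERVE_VMEMMAP_NR << PAGE_SHIFT)

/* Returns 0 in this patch; a later patch returns the real count. */
static inline unsigned long free_vmemmap_pages_per_hpage(struct hstate *h)
{
	return 0;
}

void free_huge_page_vmemmap(struct hstate *h, struct page *head)
{
	unsigned long vmemmap_addr = (unsigned long)head;
	unsigned long vmemmap_end, vmemmap_reuse;

	if (!free_vmemmap_pages_per_hpage(h))
		return;

	/* Skip the two reserved pages; everything after them is freeable. */
	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
	vmemmap_end = vmemmap_addr +
		      (free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT);
	/* The last reserved page is the one the freed range is remapped to. */
	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;

	/*
	 * Remap [vmemmap_addr, vmemmap_end) to the page backing vmemmap_reuse
	 * and hand the page frames that used to back the range to the buddy.
	 */
	vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse);
}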