Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
On Fri, Nov 04, 2016 at 09:48:14AM -0600, Catalin Marinas wrote:
> On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> > On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > > index 2e49bd2..4811ef1 100644
> > > > --- a/arch/arm64/mm/hugetlbpage.c
> > > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > > >  		return 1;
> > > >  	}
> > > >  	pmd = pmd_offset(pud, addr);
> > > > -	if (!pmd_present(*pmd)) {
> > > > -		VM_BUG_ON(!pmd_present(*pmd));
> > > > -		return 1;
> > > > -	}
> > > >  	if ((pte_t *)pmd == ptep) {
> > > >  		*pgsize = PMD_SIZE;
> > > >  		return CONT_PMDS;
> > >
> > > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > > find_num_contig() actually return 0? These are more likely real bugs, so
> > > no point in setting the huge pte.
> >
> > The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> > Please see the code in the hugetlb_fault().
> >
> >--
> > 	ptep = huge_pte_offset(mm, address);
> > 	if (ptep) {
> > 		...
> > 	} else {
> > 		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> > 		if (!ptep)
> > 			return VM_FAULT_OOM;
> > 	}
> >--
>
> Exactly. So what is the reason for returning 1 if !pgd_present()? Would

I think the author was too cautious for returning 1 if !pgd_present(). :)

> removing the checks entirely or adding BUG() be a better option?

I will remove the checks in the next version.

Thanks
Huang Shijie
Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > index 2e49bd2..4811ef1 100644
> > > --- a/arch/arm64/mm/hugetlbpage.c
> > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > >  		return 1;
> > >  	}
> > >  	pmd = pmd_offset(pud, addr);
> > > -	if (!pmd_present(*pmd)) {
> > > -		VM_BUG_ON(!pmd_present(*pmd));
> > > -		return 1;
> > > -	}
> > >  	if ((pte_t *)pmd == ptep) {
> > >  		*pgsize = PMD_SIZE;
> > >  		return CONT_PMDS;
> >
> > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > find_num_contig() actually return 0? These are more likely real bugs, so
> > no point in setting the huge pte.
>
> The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> Please see the code in the hugetlb_fault().
>
>--
> 	ptep = huge_pte_offset(mm, address);
> 	if (ptep) {
> 		...
> 	} else {
> 		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> 		if (!ptep)
> 			return VM_FAULT_OOM;
> 	}
>--

Exactly. So what is the reason for returning 1 if !pgd_present()? Would
removing the checks entirely or adding BUG() be a better option?

-- 
Catalin
Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > index 2e49bd2..4811ef1 100644
> > --- a/arch/arm64/mm/hugetlbpage.c
> > +++ b/arch/arm64/mm/hugetlbpage.c
> > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> >  		return 1;
> >  	}
> >  	pmd = pmd_offset(pud, addr);
> > -	if (!pmd_present(*pmd)) {
> > -		VM_BUG_ON(!pmd_present(*pmd));
> > -		return 1;
> > -	}
> >  	if ((pte_t *)pmd == ptep) {
> >  		*pgsize = PMD_SIZE;
> >  		return CONT_PMDS;
>
> BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> find_num_contig() actually return 0? These are more likely real bugs, so
> no point in setting the huge pte.

The kernel will not call the find_num_contig() if the PGD/PUD are empty.
Please see the code in the hugetlb_fault().

--
	ptep = huge_pte_offset(mm, address);
	if (ptep) {
		...
	} else {
		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
		if (!ptep)
			return VM_FAULT_OOM;
	}
--

Thanks
Huang Shijie
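
The argument in the message above is that the lookup-or-allocate step in hugetlb_fault() guarantees the upper table levels exist before find_num_contig() can ever run. A minimal user-space sketch of that reasoning follows. It is only a model, not kernel code: the two-level table, the struct names and the model_* functions are all hypothetical stand-ins, and the real huge_pte_offset()/huge_pte_alloc() walk more levels and take different arguments.

```c
#include <stddef.h>
#include <stdlib.h>

#define ENTRIES 4	/* illustrative table size, not an arm64 value */

/* Hypothetical two-level table: a top level pointing at leaf tables. */
struct leaf_table {
	int pte[ENTRIES];
};

struct top_table {
	struct leaf_table *pud[ENTRIES];
};

/* Model of huge_pte_offset(): walk only, NULL if the upper level is absent. */
static int *model_pte_offset(struct top_table *t, size_t hi, size_t lo)
{
	if (!t->pud[hi])
		return NULL;
	return &t->pud[hi]->pte[lo];
}

/* Model of huge_pte_alloc(): create the missing level, NULL on allocation failure. */
static int *model_pte_alloc(struct top_table *t, size_t hi, size_t lo)
{
	if (!t->pud[hi]) {
		t->pud[hi] = calloc(1, sizeof(*t->pud[hi]));
		if (!t->pud[hi])
			return NULL;
	}
	return &t->pud[hi]->pte[lo];
}

/*
 * Model of the quoted hugetlb_fault() snippet: try the lookup first and
 * allocate only when it fails. A non-NULL result implies the upper level
 * is populated, which is why a later "upper level not present" check in
 * the model would be dead code.
 */
static int *model_fault_lookup(struct top_table *t, size_t hi, size_t lo)
{
	int *ptep = model_pte_offset(t, hi, lo);

	if (!ptep)
		ptep = model_pte_alloc(t, hi, lo);
	return ptep;
}
```

In this model, any caller that holds a non-NULL pointer from model_fault_lookup() can rely on t->pud[hi] being set, which mirrors the claim that the !pgd_present() and !pud_present() branches cannot be reached from the fault path.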
Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 2e49bd2..4811ef1 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>  		return 1;
>  	}
>  	pmd = pmd_offset(pud, addr);
> -	if (!pmd_present(*pmd)) {
> -		VM_BUG_ON(!pmd_present(*pmd));
> -		return 1;
> -	}
>  	if ((pte_t *)pmd == ptep) {
>  		*pgsize = PMD_SIZE;
>  		return CONT_PMDS;

BTW, for the !pud_present() and !pgd_present() cases, shouldn't
find_num_contig() actually return 0? These are more likely real bugs, so
no point in setting the huge pte.

-- 
Catalin
[PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
find_num_contig() returns 1 when the pmd is not present. This causes a
kernel dead loop (an endless page-fault loop) in the following scenario:

   1.) The pmd entry is not present.

   2.) A page fault occurs:
       ... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()

   3.) set_huge_pte_at() sets only the first PMD entry, because
       find_num_contig() just returns 1 in this case. So all of the
       PMD entries are empty except the first one.

   4.) When the kernel accesses the address mapped by the second PMD
       entry, a new page fault occurs:
       ... hugetlb_fault() --> huge_ptep_set_access_flags()

       The second PMD entry is still empty at this point.

   5.) When the kernel returns from the fault, the access faults
       again, and the kernel repeats step 4.). From here on we see
       a dead loop.

The dead loop was caught with a 32M hugetlb page (2M PMD + Contiguous bit).

This patch removes the wrong pmd check and fixes the dead loop.

Acked-by: Steve Capper
Signed-off-by: Huang Shijie
---
 arch/arm64/mm/hugetlbpage.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 2e49bd2..4811ef1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 		return 1;
 	}
 	pmd = pmd_offset(pud, addr);
-	if (!pmd_present(*pmd)) {
-		VM_BUG_ON(!pmd_present(*pmd));
-		return 1;
-	}
 	if ((pte_t *)pmd == ptep) {
 		*pgsize = PMD_SIZE;
 		return CONT_PMDS;
-- 
2.5.5
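
The five-step scenario in the patch description can be reproduced in a small user-space model. This is a rough sketch under stated assumptions, not the kernel's logic: struct pmd_block, model_set_huge_pte() and faults_until_mapped() are hypothetical names, CONT_PMDS is taken as 16 (32M / 2M, per the description), and MAX_FAULTS is an artificial cap so that the model terminates where the real kernel would keep faulting.

```c
#include <stdbool.h>

#define CONT_PMDS	16	/* 16 x 2M PMD entries = one 32M contiguous huge page */
#define MAX_FAULTS	8	/* artificial cap; the real loop never ends */

/* Hypothetical stand-in for one run of contiguous PMD entries. */
struct pmd_block {
	bool present[CONT_PMDS];
};

/* Model of set_huge_pte_at(): populate the first `ncontig` entries. */
static void model_set_huge_pte(struct pmd_block *blk, int ncontig)
{
	for (int i = 0; i < ncontig; i++)
		blk->present[i] = true;
}

/*
 * Count the faults needed before the address covered by entry `idx`
 * becomes accessible, given the value find_num_contig() reported.
 * Returns -1 if the cap is hit, i.e. the model is stuck in the loop.
 */
static int faults_until_mapped(int ncontig, int idx)
{
	struct pmd_block blk = { { false } };
	int faults = 1;

	/* Steps 1-3: the first fault populates `ncontig` entries. */
	model_set_huge_pte(&blk, ncontig);

	/* Steps 4-5: each retry takes the access-flags path, which
	 * populates nothing, so an empty entry stays empty. */
	while (!blk.present[idx]) {
		if (++faults > MAX_FAULTS)
			return -1;
	}
	return faults;
}
```

With these definitions, faults_until_mapped(CONT_PMDS, 1) returns 1 because the first fault maps all sixteen entries, while faults_until_mapped(1, 1) returns -1 because the second entry is never filled in. The second call corresponds to the behaviour with the removed pmd check in place.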