On Tue, Jan 17, 2017 at 12:36:21PM -0600, Reza Arbab wrote:
> On Tue, Jan 17, 2017 at 12:52:51PM +0530, Balbir Singh wrote:
> > Shouldn't most of these functions have __meminit?
>
> I don't think so. The mapping functions are __meminit, but the unmapping
> functions are completely within #ifdef CONFIG_MEMORY_HOTPLUG already.
>
> > On Mon, Jan 16, 2017 at 01:07:45PM -0600, Reza Arbab wrote:
> > > #ifdef CONFIG_MEMORY_HOTPLUG
> > > +static void free_pte_table(pte_t *pte_start, pmd_t *pmd)
> > > +{
> > > +	pte_t *pte;
> > > +	int i;
> > > +
> > > +	for (i = 0; i < PTRS_PER_PTE; i++) {
> > > +		pte = pte_start + i;
> > > +		if (!pte_none(*pte))
> > > +			return;
> >
> > If !pte_none() we fail the hotplug? Or silently
> > leave the allocated pte's around. I guess this is
> > the same as x86
>
> The latter--it's not a failure. If you provided remove_pagetable() an
> unaligned address range, there could be a pte left unremoved at either end.
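The "leave the table alone if any entry is live" behaviour described above can be modeled in plain userspace C. This is only an illustrative sketch: pte_t, pte_none(), and the PTRS_PER_PTE value here are stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>

#define PTRS_PER_PTE 512	/* illustrative value; the real one is config-dependent */

typedef unsigned long pte_t;	/* stand-in for the kernel type */

static int pte_none(pte_t pte)
{
	return pte == 0;
}

/*
 * Model of free_pte_table(): free the pte page only if every entry is
 * empty.  A single live pte (e.g. one left at either end of an unaligned
 * remove_pagetable() range) makes us return without freeing -- silently,
 * because it is not a failure, just a table that is still in use.
 */
static int free_pte_table(pte_t *pte_start)
{
	int i;

	for (i = 0; i < PTRS_PER_PTE; i++)
		if (!pte_none(pte_start[i]))
			return 0;	/* still in use; leave it alone */

	free(pte_start);
	return 1;			/* table freed */
}
```

The early return is the whole point under discussion: a populated entry does not abort the hotplug operation, it just keeps that one pte page allocated.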
OK.

> > > +static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
> > > +			       unsigned long end)
> > > +{
> > > +	unsigned long next;
> > > +	pte_t *pte_base;
> > > +	pmd_t *pmd;
> > > +
> > > +	pmd = pmd_start + pmd_index(addr);
> > > +	for (; addr < end; addr = next, pmd++) {
> > > +		next = pmd_addr_end(addr, end);
> > > +
> > > +		if (!pmd_present(*pmd))
> > > +			continue;
> > > +
> > > +		if (pmd_huge(*pmd)) {
> > > +			pte_clear(&init_mm, addr, (pte_t *)pmd);
> >
> > pmd_clear()?
>
> I used pte_clear() to mirror what happens in radix__map_kernel_page():
>
> 	if (map_page_size == PMD_SIZE) {
> 		ptep = (pte_t *)pmdp;
> 		goto set_the_pte;
> 	}
>
> 	[...]
>
> set_the_pte:
> 	set_pte_at(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT, flags));
>
> Would pmd_clear() be equivalent, since the pointer got set like a pte?

But we are still setting a pmdp. pmd_clear() will set the pmd to 0,
whereas pte_clear() will go through the pte_update() mechanism, which is
expensive IMHO, and we may not need to do it.

> > > +static void remove_pagetable(unsigned long start, unsigned long end)
> > > +{
> > > +	unsigned long addr, next;
> > > +	pud_t *pud_base;
> > > +	pgd_t *pgd;
> > > +
> > > +	spin_lock(&init_mm.page_table_lock);
> > > +
> >
> > x86 does more granular lock acquisition, only while clearing the
> > relevant entries. I suppose we don't have to worry about it since
> > this is not a fast or frequent path.
>
> Yep. Ben thought the locking in remove_pte_table() was actually too
> granular, and Aneesh questioned what was being protected in the first
> place. So I left one lock/unlock in the outermost function for now.

Fair enough,

Balbir Singh.
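To make the pmd_clear()-vs-pte_clear() cost point concrete, here is a userspace C sketch. Everything in it is an illustrative stand-in (the types, the atomic model, and the function bodies are not the actual powerpc implementations): pmd_clear() is modeled as a plain store of zero, while the pte_clear() path funnels through a pte_update()-style atomic read-modify-write that returns the old entry, which is the extra expense being pointed out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal stand-in for a pmd entry -- not the kernel definition. */
typedef struct {
	_Atomic unsigned long val;
} pmd_t;

/* Model of pmd_clear(): just write zero into the entry -- a plain store. */
static void pmd_clear(pmd_t *pmdp)
{
	atomic_store_explicit(&pmdp->val, 0, memory_order_relaxed);
}

/*
 * Model of pte_update(): an atomic read-modify-write that clears the
 * requested bits and returns the old value (so e.g. dirty/accessed
 * state could be harvested).  The RMW is what makes this costlier
 * than pmd_clear()'s plain store.
 */
static unsigned long pte_update(_Atomic unsigned long *ptep, unsigned long clr)
{
	return atomic_fetch_and_explicit(ptep, ~clr, memory_order_seq_cst);
}

/* Model of pte_clear(): clear every bit via pte_update(). */
static unsigned long pte_clear_entry(_Atomic unsigned long *ptep)
{
	return pte_update(ptep, ~0UL);
}
```

Both paths leave the entry zeroed; the argument in the thread is that when nobody needs the returned old value, the cheaper plain store suffices.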