Hi Michal, On Thu, Mar 12, 2020 at 06:02:37AM +0100, Michal Suchánek wrote: > On Wed, Mar 11, 2020 at 06:08:15PM -0500, Scott Cheloha wrote: > > At memory hot-remove time we can retrieve an LMB's nid from its > > corresponding memory_block. There is no need to store the nid > > in multiple locations. > > > > Signed-off-by: Scott Cheloha <chel...@linux.ibm.com> > > --- > > The linear search in powerpc's memory_add_physaddr_to_nid() has become a > > bottleneck at boot on systems with many LMBs. > > > > As described in this patch here: > > > > https://lore.kernel.org/linuxppc-dev/20200221172901.1596249-2-chel...@linux.ibm.com/ > > > > the linear search seriously cripples drmem_init(). > > > > The obvious solution (shown in that patch) is to just make the search > > in memory_add_physaddr_to_nid() faster. An XArray seems well-suited > > to the task of mapping an address range to an LMB object. > > > > The less obvious approach is to just call memory_add_physaddr_to_nid() > > in fewer places. > > > > I'm not sure which approach is correct, hence the RFC. > > You basically revert the below which will likely cause the very error > that was fixed there: > > commit b2d3b5ee66f2a04a918cc043cec0c9ed3de58f40 > Author: Nathan Fontenot <nf...@linux.vnet.ibm.com> > Date: Tue Oct 2 10:35:59 2018 -0500 > > powerpc/pseries: Track LMB nid instead of using device tree > > When removing memory we need to remove the memory from the node > it was added to instead of looking up the node it should be in > in the device tree. > > During testing we have seen scenarios where the affinity for a > LMB changes due to a partition migration or PRRN event. In these > cases the node the LMB exists in may not match the node the device > tree indicates it belongs in. This can lead to a system crash > when trying to DLPAR remove the LMB after a migration or PRRN > event. The current code looks up the node in the device tree to > remove the LMB from, the crash occurs when we try to offline this > node and it does not have any data, i.e. node_data[nid] == NULL.
I'm aware of this patch and that this is a near-total revert. I'm not reintroducing the original behavior, though. Instead of going to the device tree to recompute the expected node id I'm retrieving it from the LMB's corresponding memory_block. That crucial difference is this chunk: --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -376,25 +376,29 @@ static int dlpar_add_lmb(struct drmem_lmb *); static int dlpar_remove_lmb(struct drmem_lmb *lmb) { + struct memory_block *mem_block; unsigned long block_sz; int rc; if (!lmb_is_removable(lmb)) return -EINVAL; + mem_block = lmb_to_memblock(lmb); + if (mem_block == NULL) + return -EINVAL; + rc = dlpar_offline_lmb(lmb); if (rc) return rc; block_sz = pseries_memory_block_size(); - __remove_memory(lmb->nid, lmb->base_addr, block_sz); + __remove_memory(mem_block->nid, lmb->base_addr, block_sz); /* Update memory regions for memory remove */ memblock_remove(lmb->base_addr, block_sz); invalidate_lmb_associativity_index(lmb); - lmb_clear_nid(lmb); lmb->flags &= ~DRCONF_MEM_ASSIGNED; return 0; --- Unless it's possible that the call: __add_memory(my_nid, my_addr, my_size); does not yield the following: memory_block.nid == my_nid on success, then I think the solution in my patch is equivalent to and simpler than the one introduced in the patch you quote. Can you see a way the above would not hold? Then again, maybe there's a good reason not to retrieve the node id in this way. I'm thinking David Hildenbrand and/or Nathan Fontenont may have some insight on that.