Scott Cheloha <chel...@linux.ibm.com> writes: > Hi Michal, > > On Thu, Mar 12, 2020 at 06:02:37AM +0100, Michal Suchánek wrote: >> >> You basically revert the below which will likely cause the very error >> that was fixed there: >> >> commit b2d3b5ee66f2a04a918cc043cec0c9ed3de58f40 >> Author: Nathan Fontenot <nf...@linux.vnet.ibm.com> >> Date: Tue Oct 2 10:35:59 2018 -0500 >> >> powerpc/pseries: Track LMB nid instead of using device tree >> >> When removing memory we need to remove the memory from the node >> it was added to instead of looking up the node it should be in >> in the device tree. >> >> During testing we have seen scenarios where the affinity for a >> LMB changes due to a partition migration or PRRN event. In these >> cases the node the LMB exists in may not match the node the device >> tree indicates it belongs in. This can lead to a system crash >> when trying to DLPAR remove the LMB after a migration or PRRN >> event. The current code looks up the node in the device tree to >> remove the LMB from, the crash occurs when we try to offline this >> node and it does not have any data, i.e. node_data[nid] == NULL. > > I'm aware of this patch and that this is a near-total revert. > > I'm not reintroducing the original behavior, though. Instead of going > to the device tree to recompute the expected node id I'm retrieving it > from the LMB's corresponding memory_block. > > That crucial difference is this chunk: > > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > @@ -376,25 +376,29 @@ static int dlpar_add_lmb(struct drmem_lmb *); > > static int dlpar_remove_lmb(struct drmem_lmb *lmb) > { > + struct memory_block *mem_block; > unsigned long block_sz; > int rc; > > if (!lmb_is_removable(lmb)) > return -EINVAL; > > + mem_block = lmb_to_memblock(lmb); > + if (mem_block == NULL) > + return -EINVAL; > +
Assuming lmb_to_memblock() -> find_memory_block() isn't engaging in O(n) behavior or worse (which is the case in linux-next), then I think Scott's change makes sense and is a net win.