On Wed 05-04-17 10:48:52, Reza Arbab wrote: > On Wed, Apr 05, 2017 at 08:42:39AM +0200, Michal Hocko wrote: > >On Tue 04-04-17 16:43:39, Reza Arbab wrote: > >>Okay, getting further. With this I can again repeatedly add and remove, > >>but now I'm seeing a weird variation of that earlier issue: > >> > >>1. add_memory(), online_movable > >> /sys/devices/system/node/nodeX/memoryY symlinks are created. > >> > >>2. offline, remove_memory() > >> The node is offlined, since all memory has been removed, so all of > >> /sys/devices/system/node/nodeX is gone. This is normal. > >> > >>3. add_memory(), online_movable > >> The node is onlined, so /sys/devices/system/node/nodeX is recreated, > >> and the memory is added, but just like earlier in this email thread, > >> the memoryY links are not there. > > > >Could you add some printks to see why the sysfs creation failed please? > > Ah, simple enough. It's this, right at the top of > register_mem_sect_under_node(): > > if (!node_online(nid)) > return 0; > > That being the case, I really don't understand why your patches make any > difference. Is node_set_online() being called later than before somehow?
This is really interesting. Because add_memory_resource does the following /* call arch's memory hotadd */ ret = arch_add_memory(nid, start, size); if (ret < 0) goto error; /* we online node here. we can't roll back from here. */ node_set_online(nid); so we are setting the node online _after_ arch_add_memory but the code which adds those sysfs file is called from arch_add_memory __add_pages __add_section register_new_memory register_mem_sect_under_node node_online check I haven't touched this part. What is the point of this check anyway? We have already associated all the pages with a node (and with a zone prior to my patches) so we _know_ how to create those links. The check goes back to the initial submissions. Gary is not available anymore so we cannot ask. But I completely fail to see how my changes could have made any difference. I assume that things start working after you remove that check? Btw. if you put printk to the original kernel does it see the node online? I would be also interested whether you see try_offline_node setting the node offline in the original code. Thanks! -- Michal Hocko SUSE Labs