Brian King <[email protected]> writes: > On 2/7/23 9:14 AM, Nathan Lynch wrote: >> Brian King <[email protected]> writes: >>> While testing fixes to the hvcs hotplug code, kmemleak was reporting >>> potential memory leaks. This was tracked down to the struct device_node >>> object associated with the hvcs device. Looking at the leaked >>> object in crash showed that the kref in the kobject in the device_node >>> had a reference count of 1 still, and the release function was never >>> getting called as a result of this. This adds an of_node_put in >>> pSeries_reconfig_remove_node in order to balance the refcounting >>> so that we actually free the device_node in the case of it being >>> allocated in pSeries_reconfig_add_node. >> >> My concern here would be whether the additional put is the right thing >> to do in all cases. The questions it raises for me are: >> >> - Is it safe for nodes that were present at boot, instead of added >> dynamically? > > Yes. of_node_release has a check to see if OF_DYNAMIC is set. If it is not > set, > the release function is a noop.
Yes, but to be more specific - does the additional of_node_put() risk underflowing the refcount on nodes without the OF_DYNAMIC flag? I suspect it's OK. If it's not, then I would expect to see warnings from the refcount code when that case is exercised. > >> - Is it correct for all types of nodes, or is there something specific >> to hvcs that leaves a dangling refcount? > > I would welcome more testing and I shared the same concern. I did do some > DLPARs of a virtual ethernet device with the change along with > CONFIG_PAGE_POISONING > enabled and did not run into any issues. However if I do a DLPAR remove of a > virtual > ethernet device without the change with kmemleak enabled it does not detect > any > leaked memory. Seems odd. If the change is generically correct, then without it applied I would expect kmemleak to flag a leak on removal of any type of dynamically-added node. On the other hand, if the change is for some reason not correct for virtual ethernet devices, then I would expect it to cause complaints from the refcount code and/or allocator debug facilities. But if I understand correctly, neither of those things is happening.
