On Mon, Dec 01, 2025 at 03:35:04AM -0800, Breno Leitao wrote: > Given you are completely lockless here, so, there is a chance you hit > a TOCTOU, also. > > I think you want to have dynamic_netconsole_mutex held during the > operation of process_resume_target(). > > * mutex_lock(&dynamic_netconsole_mutex); > * remove from the list > * resume > * re-add to the list > * mutex_unlock(&dynamic_netconsole_mutex); >
This is a pretty good point. Will address this on the next version. > > + if (bound_by_mac(nt)) > > + /* ensure netpoll_setup will retrieve device by mac */ > > + memset(&nt->np.dev_name, 0, IFNAMSIZ); > > This is a clean-up step that was missing whent the target is getting > down, and htis is just a work around that doesn't belong in here. > > Please move it to netconsole_process_cleanups_core(), in a separate > patch. Sounds good. Will include this as a separated patch on the next version of this series. > Something as: > > list_for_each_entry_safe(nt, tmp, &target_cleanup_list, list) > do_netpoll_cleanup(&nt->np); > if (bound_by_mac(nt)) > memset(&nt->np.dev_name, 0, IFNAMSIZ); > > > Ideally this should belong to do_netpoll_cleanup(), but let's keep it in > netconsole_process_cleanups_core() for three reasons: > > > 1) Bounding by mac is a netconsole concept > 2) do_netpoll_cleanup() is only used by netconsole, and I plan to move > it back to netconsole. Some PoC in [1] > 3) bound_by_mac() should be in netconsole and we do not want to export > it. > > [1]: > https://lore.kernel.org/all/[email protected]/ The reasoning makes sense to me. I considered performing this cleanup on netpoll, but given your patch opted for this 'hack' before setup - I think doing it on netconsole_process_cleanups_core makes more sense. I need to check this more, but I'm wondering if we would be able to completely remove dev_name and dev_mac from netpoll and make it part of the netconsole_target. Perhaps as a future refactoring after your patch series. > > It needs to be initialized earlier before the kzalloc, otherwise we > might hit a similar problem to the one fixed by e5235eb6cfe0 ("net: > netpoll: initialize work queue before error checks") > > The code path would be: > * alloc_param_target() > * alloc_and_init() > * kzalloc() fails and return NULL. > * resume_wq() is still not initialized > fail: > * free_param_target() > * cancel_work_sync(&nt->resume_wq); and resume_wq is not > initialized Ack. Will fix this. > > Thanks for the patch, > --breno Thanks again for the review. Will submit a new version addressing all the comments once net-next re-opens. -- Andre Carvalho

