On Mon, Dec 01, 2025 at 03:35:04AM -0800, Breno Leitao wrote:
> Given you are completely lockless here, so, there is a chance you hit
> a TOCTOU, also.
> 
> I think you want to have dynamic_netconsole_mutex held during the
> operation of process_resume_target().
> 
>   * mutex_lock(&dynamic_netconsole_mutex);
>   * remove from the list
>   * resume
>   * re-add to the list
>   * mutex_unlock(&dynamic_netconsole_mutex);
>   

This is a pretty good point. Will address this on the next version.

> > +   if (bound_by_mac(nt))
> > +           /* ensure netpoll_setup will retrieve device by mac */
> > +           memset(&nt->np.dev_name, 0, IFNAMSIZ);
> 
> This is a clean-up step that was missing whent the target is getting
> down, and htis is just a work around that doesn't belong in here.
> 
> Please move it to netconsole_process_cleanups_core(), in a separate
> patch.

Sounds good. Will include this as a separated patch on the next version of this
series.

> Something as: 
> 
>       list_for_each_entry_safe(nt, tmp, &target_cleanup_list, list)
>               do_netpoll_cleanup(&nt->np);
>               if (bound_by_mac(nt))
>                       memset(&nt->np.dev_name, 0, IFNAMSIZ);
>                       
> 
> Ideally this should belong to do_netpoll_cleanup(), but let's keep it in
> netconsole_process_cleanups_core() for three reasons:
> 
> 
> 1) Bounding by mac is a netconsole concept
> 2) do_netpoll_cleanup() is only used by netconsole, and I plan to move
>    it back to netconsole. Some PoC in [1]
> 3) bound_by_mac() should be in netconsole and we do not want to export
>    it.
> 
> [1]:
> https://lore.kernel.org/all/[email protected]/

The reasoning makes sense to me. I considered performing this cleanup on 
netpoll,
but given your patch opted for this 'hack' before setup - I think doing it on
netconsole_process_cleanups_core makes more sense. 

I need to check this more, but I'm wondering if we would be able to completely
remove dev_name and dev_mac from netpoll and make it part of the 
netconsole_target.
Perhaps as a future refactoring after your patch series.

> 
> It needs to be initialized earlier before the kzalloc, otherwise we
> might hit a similar problem to the one fixed by e5235eb6cfe0  ("net:
> netpoll: initialize work queue before error checks")
> 
> The code path would be:
>   * alloc_param_target()
>         * alloc_and_init()
>                 * kzalloc() fails and return NULL.
>                 * resume_wq() is still not initialized
>   fail:
>       * free_param_target()
>               * cancel_work_sync(&nt->resume_wq); and resume_wq is not
>                 initialized

Ack. Will fix this.

> 
> Thanks for the patch,
> --breno

Thanks again for the review. Will submit a new version addressing all the 
comments
once net-next re-opens.

-- 
Andre Carvalho

Reply via email to