Re: running network stack forwarding in parallel

Martin Pieuchot Mon, 17 May 2021 03:09:14 -0700

On 16/05/21(Sun) 15:56, Vitaliy Makkoveev wrote:
> 
> 
> > On 14 May 2021, at 14:43, Martin Pieuchot <m...@openbsd.org> wrote:
> > 
> > On 13/05/21(Thu) 14:50, Vitaliy Makkoveev wrote:
> >> On Thu, May 13, 2021 at 01:15:05PM +0200, Hrvoje Popovski wrote:
> >>> On 13.5.2021. 1:25, Vitaliy Makkoveev wrote:
> >>>> It seems this lock order issue is not parallel diff specific.
> >>> 
> >>> 
> >>> 
> >>> Yes, you are right ... it seemed familiar, but I couldn't reproduce it
> >>> on lacp trunk or without this diff, so I thought that the parallel diff
> >>> is the one to blame ..
> >>> 
> >>> 
> >>> sorry for noise ..
> >>> 
> >> 
> >> The timeout thread and the interface destroy thread are both
> >> serialized by the kernel lock, so it's hard to catch this issue. So
> >> your report is useful :)
> > 
> > The use of the NET_LOCK() in *clone_destroy() is problematic.  tpmr(4)
> > has a similar problem as reported by Hrvoje in a different thread.  I
> > don't know what it is serializing, hopefully David can tell us more.
> > 
> 
> It serializes the detach hooks and clone_detach. Detach hooks are
> executed with the netlock held. Unfortunately this problem is much more
> complicated, and we can’t just introduce a new lock to solve it because
> that would introduce a lock order issue.


We're talking about different uses of the NET_LOCK().  if_detach() and
if_deactivate() internally grab the NET_LOCK() for the reason you
mentioned.

I'm asking what in aggr_down() and aggr_p_dtor(), or respectively
tpmr_down() and tpmr_p_dtor(), requires the NET_LOCK() and whether this
could be done differently.
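
To make the "lock order issue" mentioned above concrete, here is a
minimal userland sketch of a rank-based order check. It is only an
illustration under stated assumptions: struct ranked_lock,
ranked_enter() and ranked_exit() are hypothetical names, the locks are
plain pthread mutexes, and none of this is how NET_LOCK() or the kernel
lock checker is actually implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical ranked lock: locks with lower ranks must be taken first. */
struct ranked_lock {
	pthread_mutex_t	mtx;
	int		rank;
};

#define MAX_HELD 8

/* Per-thread stack of the ranks currently held. */
static _Thread_local int held[MAX_HELD];
static _Thread_local int nheld;

void
ranked_init(struct ranked_lock *l, int rank)
{
	pthread_mutex_init(&l->mtx, NULL);
	l->rank = rank;
}

/*
 * Take the lock only if doing so respects the rank order.
 * Returns 0 on success, -1 if the acquisition would invert the order
 * (or re-take a lock of the same rank).
 */
int
ranked_enter(struct ranked_lock *l)
{
	assert(nheld < MAX_HELD);
	if (nheld > 0 && held[nheld - 1] >= l->rank)
		return -1;	/* refuse instead of risking a deadlock */
	pthread_mutex_lock(&l->mtx);
	held[nheld++] = l->rank;
	return 0;
}

/* Release the most recently taken lock; returns -1 on misuse. */
int
ranked_exit(struct ranked_lock *l)
{
	if (nheld == 0 || held[nheld - 1] != l->rank)
		return -1;
	nheld--;
	pthread_mutex_unlock(&l->mtx);
	return 0;
}
```

With a stand-in for the net lock at rank 1 and a driver-private lock at
rank 2, taking them as net then driver succeeds, while a path that
already holds the driver lock and then asks for the net lock gets -1
back. Adding a new lock to the destroy path means every existing path
has to agree on where that lock sits in such an order.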
