> Date: Fri, 10 Aug 2018 17:51:11 +0200
> From: Edgar Fuß <e...@math.uni-bonn.de>
>
> I'm currently running an 8.0_STABLE kernel on the machine (with 6.1_STABLE
> userland) and no panics so far. This may be
> -- luck
> -- different timing that doesn't trigger the race
> -- a bug fixed since 6.1
>
> If someone remembers a bug in this area fixed since 6.1, we can stop here.

I don't remember a specific bug, but that is entirely plausible. There
have been some possibly relevant changes, e.g. uipc_usrreq.c 1.175.

> What I guess is that that -16L is MUTEX_THREAD and was put there by
> MUTEX_DESTROY(), called by mutex_destroy() called by soput() by another
> thread that ran during the preemption-enabled phase.
>
> Any other ideas on how that -16L could go there?

This sounds plausible.

> Could I install some hack, that, in soput(), would panic if the socket to be
> freed is the one unp_gc() is currently working on? If that would trigger,
> we'd get a useful traceback, no? And if that panic doesn't trigger, but the
> other one does, we'd know that some of my assumptions were wrong.

The two are unlikely to coincide like that. More likely, the socket is
prematurely freed before unp_gc grabs it at all.

You could do something like create a global variable that stores the
pointer to the socket unp_gc is currently working on, set shortly
before it tries solock, and kassert that soclose isn't given that
socket. But I'm not optimistic that there's enough of a window there
for you to catch anyone in the act.
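
To make the global-variable idea concrete, here is a rough userland model of
it, not a kernel patch. Everything in it is made up for illustration: the
names unp_gc_cursocket, unp_gc_debug_enter(), unp_gc_debug_exit() and
soclose_debug_ok() do not exist in the tree, and struct socket is a dummy
stand-in. In the kernel, the check would presumably be a KASSERT at the top
of soclose (or soput), and the set/clear calls would bracket the solock in
unp_gc.

```c
#include <stdbool.h>
#include <stddef.h>

/* Dummy stand-in for the kernel's struct socket (assumption). */
struct socket {
	int so_dummy;
};

/*
 * Hypothetical debug global: the socket unp_gc() is about to lock,
 * or NULL when unp_gc() is not holding on to any socket.
 */
static struct socket *volatile unp_gc_cursocket = NULL;

/* Would be called in unp_gc() shortly before it tries solock(so). */
static void
unp_gc_debug_enter(struct socket *so)
{
	unp_gc_cursocket = so;
}

/* Would be called in unp_gc() once it is done with the socket. */
static void
unp_gc_debug_exit(void)
{
	unp_gc_cursocket = NULL;
}

/*
 * Would become KASSERT(so != unp_gc_cursocket) in soclose()/soput().
 * Returns true if freeing so does not collide with unp_gc().
 */
static bool
soclose_debug_ok(const struct socket *so)
{
	return so != unp_gc_cursocket;
}
```

Note that the model ignores concurrency entirely: a real hack would still
race between the store and the check, which is part of why the window may
be too small to catch anything.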