On Fri, May 29, 2020 at 08:48:14AM +0200, Martin Pieuchot wrote:
> On 28/05/20(Thu) 14:59, Vitaliy Makkoveev wrote:
> > socket(2) layer is already protected by solock(). It grabs NET_LOCK()
> > for inet{,6}(4) sockets, but all other sockets are still under
> > KERNEL_LOCK().
> >
> > I guess solock is already placed everythere it's required. Also `struct
> > file' is already mp-safe.
> >
> > Diff below introduces rwlock `unp_lock'. It's like NET_LOCK()'s
> > `netlock' but for PF_UNIX sockets. This lock protects the whole PF_UNIX
> > layer. Vnode layer accessed in "connect" "bind" and "desconnect" cases
> > is still under KERNEL_LOCK().
> >
> > I'am experimenting with his diff since 19.04.2020 and my test systems,
> > include Gnome workstaion are stable, without any issue.
>
> Looks great, more tests are required :)
>
> Your diff has many trailing spaces, could you remove them? Commonly
> used editors or diff programs have a way to highlight them :)
Ok I will fix it.
>
> One comment:
>
> > Index: sys/kern/uipc_usrreq.c
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> > retrieving revision 1.142
> > diff -u -p -r1.142 uipc_usrreq.c
> > --- sys/kern/uipc_usrreq.c 16 Jul 2019 21:41:37 -0000 1.142
> > +++ sys/kern/uipc_usrreq.c 1 May 2020 13:47:11 -0000
> > @@ -409,12 +418,17 @@ unp_detach(struct unpcb *unp)
> > {
> > struct vnode *vp;
> >
> > + rw_assert_wrlock(&unp_lock);
> > +
> > LIST_REMOVE(unp, unp_link);
> > if (unp->unp_vnode) {
> > + /* this is an unlocked cleaning of `v_socket', but it's safe */
> > unp->unp_vnode->v_socket = NULL;
> > vp = unp->unp_vnode;
> > unp->unp_vnode = NULL;
> > + KERNEL_LOCK();
> > vrele(vp);
> > + KERNEL_UNLOCK();
>
> Why is it safe? That's what the comment should explain :) If the code
> doesn't take the lock that implies it is not required. Why it is not
> required is not obvious.
>
The only place where `v_socket' is accessed outside PF_UNIX layer is
kern/kern_sysctl.c:1143. This is just pointer to integer cast. Pointer
assignment is atomic.
Also we are accessing this `unp' in unp_detach(), unp_bind() and
unp_connect() but they are serialized by `unp_lock'.
This is the last reference to this `unp' and also there is the only
reference in socket layer to `unp_vnode'. Since `unpcb' has no reference
counter we can't add assertion to be sure this is the last reference to
`unp'. We can check reference count of vnode. But since sys_open() and
unp_detach() are not serialized by KERNEL_LOCK(), vn_open() called by
doopenat() will raise `v_usecount' before `v_type' check
(kern/vfs_vnops.c:112). And `vn->v_usecount' in this unp_detach() call
can be "2".
In fact, only vrele(9) should be called under KERNEL_LOCK() to protect
vnode. If we want to put "KASSERT(vn->v_usecount==1)" we should move it
under KERNEL_LOCK() before `v_socket' cleaning to be sure there is no
concurency with sys_open().
Something like this:
if (unp->unp_vnode) {
KERNEL_LOCK();
KASSERT(unp->unp_vnode->v_usecount == 1);
unp->unp_vnode->v_socket = NULL;
vp = unp->unp_vnode;
unp->unp_vnode = NULL;
vrele(vp);
KERNEL_UNLOCK();
}
I will update diff if this is required.