On Thu, Apr 27, 2023 at 02:54:38PM +0200, Claudio Jeker wrote: > On Thu, Apr 27, 2023 at 01:55:33PM +0300, Vitaliy Makkoveev wrote: > > Now only direct netlock used for inet sockets protection. The unlocked > > access to all other sockets is safe, but we could lost consistency for a > > little. Since the solock() used for sockets protection, make locking > > path common and use it. Make it shared because this is read-only access > > to sockets data. For same reason use shared netlock while performing > > inpcb tables foreach walkthrough. > > This is still wrong. fill_file() is not allowed to sleep. So you can not > use a rwlock in here. fill_file is called inside a allprocess > LIST_FOREACH loop and sleeping there will blow up.
Yes. > Maybe it is enough to grab a shared NETLOCK around the LIST_FOREACH() to > ensure that we don't sleep inside. I have tired to fix this for years. It is not easy. Taking the net lock to early results in a lock ordering problem. Down in fill_file() fdplock() is called: #define fdplock(fdp) NET_ASSERT_UNLOCKED(); rw_enter_write(&(fdp)->fd_lock); This guarantees lock ordering fdlock before netlock. You probably run into NFS problems if you want to relax that. > Please fix this properly. Right now running fstat is like playing russian > roulette. Shared netlock is possible russian roulette with less bullets compared to exclusive netlock. For network sockets pcb mutex and pru_lock the situation gets better. For non-network sockets this diff does rw_enter_write(&so->so_lock). Do we need that? It means more sleeping points and more opportunity to crash. bluhm > > Index: sys/kern/kern_sysctl.c > > =================================================================== > > RCS file: /cvs/src/sys/kern/kern_sysctl.c,v > > retrieving revision 1.411 > > diff -u -p -r1.411 kern_sysctl.c > > --- sys/kern/kern_sysctl.c 22 Jan 2023 12:05:44 -0000 1.411 > > +++ sys/kern/kern_sysctl.c 27 Apr 2023 10:18:01 -0000 > > @@ -1173,14 +1173,11 @@ fill_file(struct kinfo_file *kf, struct > > > > if (so == NULL) { > > so = (struct socket *)fp->f_data; > > + solock_shared(so); > > + locked = 1; > > + } else { > > /* if so is passed as parameter it is already locked */ > > - switch (so->so_proto->pr_domain->dom_family) { > > - case AF_INET: > > - case AF_INET6: > > - NET_LOCK(); > > - locked = 1; > > - break; > > - } > > + soassertlocked(so); > > } > > > > kf->so_type = so->so_type; > > @@ -1203,14 +1200,13 @@ fill_file(struct kinfo_file *kf, struct > > kf->so_splicelen = -1; > > if (so->so_pcb == NULL) { > > if (locked) > > - NET_UNLOCK(); > > + sounlock_shared(so); > > break; > > } > > switch (kf->so_family) { > > case AF_INET: { > > struct inpcb *inpcb = so->so_pcb; > > > > - NET_ASSERT_LOCKED(); > > if (show_pointers) > > kf->inp_ppcb = PTRTOINT64(inpcb->inp_ppcb); > > kf->inp_lport = inpcb->inp_lport; > > @@ -1232,7 +1228,6 @@ fill_file(struct kinfo_file *kf, struct > > case AF_INET6: { > > struct inpcb *inpcb = so->so_pcb; > > > > - NET_ASSERT_LOCKED(); > > if (show_pointers) > > kf->inp_ppcb = PTRTOINT64(inpcb->inp_ppcb); > > kf->inp_lport = inpcb->inp_lport; > > @@ -1279,7 +1274,7 @@ fill_file(struct kinfo_file *kf, struct > > } > > } > > if (locked) > > - NET_UNLOCK(); > > + sounlock_shared(so); > > break; > > } > > > > @@ -1375,7 +1370,7 @@ sysctl_file(int *name, u_int namelen, ch > > if (arg == DTYPE_SOCKET) { > > struct inpcb *inp; > > > > - NET_LOCK(); > > + NET_LOCK_SHARED(); > > mtx_enter(&tcbtable.inpt_mtx); > > TAILQ_FOREACH(inp, &tcbtable.inpt_queue, inp_queue) > > FILLSO(inp->inp_socket); > > @@ -1395,7 +1390,7 @@ sysctl_file(int *name, u_int namelen, ch > > FILLSO(inp->inp_socket); > > mtx_leave(&rawin6pcbtable.inpt_mtx); > > #endif > > - NET_UNLOCK(); > > + NET_UNLOCK_SHARED(); > > } > > fp = NULL; > > while ((fp = fd_iterfile(fp, p)) != NULL) { > > > > -- > :wq Claudio