Hello, </snip> > > as above, copyout with a sleeping lock is fine. > > the whole point of my change is to give us the ability to lock in the > forwarding path separately to locking in the ioctl path. half of that is > so we can copyout safely. the other half is to avoid letting the ioctl > path block packet processing if we can avoid it as an alternative to > having the network stack having to yield the cpu.
I see. my confusion came from fact I've forgot pflock got turned to mutex, when we saw the crash. > > > let's take a look at this part of pf_purge_expired_states() > > from your diff: > > > > 1543 NET_LOCK(); > > 1544 rw_enter_write(&pf_state_list.pfs_rwl); > > 1545 PF_LOCK(); > > 1546 PF_STATE_ENTER_WRITE(); > > 1547 SLIST_FOREACH(st, &gcl, gc_list) { > > 1548 if (st->timeout != PFTM_UNLINKED) > > 1549 pf_remove_state(st); > > 1550 > > 1551 pf_free_state(st); > > 1552 } > > 1553 PF_STATE_EXIT_WRITE(); > > 1554 PF_UNLOCK(); > > 1555 rw_exit_write(&pf_state_list.pfs_rwl); > > > > at line 1543 we grab NET_LOCK(), at line 1544 we are trying > > to grab new lock (pf_state_list.pfs_rwl) exclusively. > > > > with your change we might be running into situation, where we do copyout() > > as a > > reader on pf_state_list.pfs_rwl. Then we grab NET_LOCK() and attempt to > > acquire > > pf_state_list.pfs_rwl exclusively, which is still occupied by guy, who > > might be > > doing uvm_fault() in copyout(9f). > > > > I'm just worried we may be trading one bug for another bug. may be my > > concern > > is just a false alarm here. I don't know. > > no, these things are all worth discussing. > > it's definitely possible there's bugs in here, but im pretty confident > it's not the copyout one. > it seems to work. I'm running your diff with bluhm's parallel diff and do occasional pfctl -Fs/pfctl -ss under a load. so far so good. </snip> > > I guess 'pfgpurge_expired_fragment(s);' is unintentional change, right? > > yeah, i dont know how i did that. vi is hard? sure it is... thanks for fixing the nits. </snip> > Index: if_pfsync.c > =================================================================== > RCS file: /cvs/src/sys/net/if_pfsync.c,v > retrieving revision 1.292 > diff -u -p -r1.292 if_pfsync.c > --- if_pfsync.c 15 Jun 2021 10:10:22 -0000 1.292 > +++ if_pfsync.c 15 Jun 2021 11:21:20 -0000 > @@ -2545,22 +2545,34 @@ pfsync_bulk_start(void) > { > struct pfsync_softc *sc = pfsyncif; have not spot anything suspicious in if_pfsync.c the new diff reads fine to me. OK sashan