On Wed, Jun 6, 2018 at 12:03 PM, Peter Holm <pe...@holm.cc> wrote: > On Thu, May 17, 2018 at 05:59:35PM +0000, Matt Macy wrote: >> Author: mmacy >> Date: Thu May 17 17:59:35 2018 >> New Revision: 333744 >> URL: https://svnweb.freebsd.org/changeset/base/333744 >> >> Log: >> AF_UNIX: make unix socket locking finer grained >> >> This change moves to using a reference count across lock drop / reacquire >> to guarantee liveness. >> >> Currently sends on unix sockets contend heavily on read locking the list >> lock. >> unix1_processes in will-it-scale peaks at 6 processes and then declines. >> >> With this change I get a substantial improvement in number of operations >> per second >> with 96 processes: >> >> x before >> + after >> N Min Max Median Avg Stddev >> x 11 1688420 1696389 1693578 1692766.3 2971.1702 >> + 10 63417955 71030114 70662504 69576423 2374684.6 >> Difference at 95.0% confidence >> 6.78837e+07 +/- 1.49463e+06 >> 4010.22% +/- 88.4246% >> (Student's t, pooled s = 1.63437e+06) >> >> And even for 2 processes shows a ~18% improvement. >> "Small" iron changes (1, 2, and 4 processes): >> >> x before1 >> + after1.2 >> +------------------------------------------------------------------------+ >> | + | >> | x + | >> | x + | >> | x + | >> | x ++ | >> | xx ++ | >> |x x xx ++ | >> | |__________________A_____M_____AM____|| >> +------------------------------------------------------------------------+ >> N Min Max Median Avg Stddev >> x 10 1131648 1197750 1197138.5 1190369.3 20651.839 >> + 10 1203840 1205056 1204919 1204827.9 353.27404 >> Difference at 95.0% confidence >> 14458.6 +/- 13723 >> 1.21463% +/- 1.16683% >> (Student's t, pooled s = 14605.2) >> >> x before2 >> + after2.2 >> +------------------------------------------------------------------------+ >> | +| >> | +| >> | +| >> | +| >> | +| >> | +| >> | x +| >> | x +| >> | x xx +| >> |x xxxx +| >> | |___AM_| A| >> +------------------------------------------------------------------------+ >> N Min Max Median Avg Stddev >> x 10 1972843 2045866 2038186.5 2030443.8 21367.694 >> + 10 2400853 2402196 2401043.5 2401172.7 385.40024 >> Difference at 95.0% confidence >> 370729 +/- 14198.9 >> 18.2585% +/- 0.826943% >> (Student's t, pooled s = 15111.7) >> >> x before4 >> + after4.2 >> N Min Max Median Avg Stddev >> x 10 3986994 3991728 3990137.5 3989985.2 1300.0164 >> + 10 4799990 4806664 4806116.5 4805194 1990.6625 >> Difference at 95.0% confidence >> 815209 +/- 1579.64 >> 20.4314% +/- 0.0421713% >> (Student's t, pooled s = 1681.19) >> >> Tested by: pho >> Reported by: mjg >> Approved by: sbruno >> Sponsored by: Limelight Networks >> Differential Revision: https://reviews.freebsd.org/D15430 >> >> Modified: >> head/sys/kern/uipc_usrreq.c >> head/sys/sys/unpcb.h >> >> Modified: head/sys/kern/uipc_usrreq.c >> ============================================================================== >> --- head/sys/kern/uipc_usrreq.c Thu May 17 17:57:41 2018 >> (r333743) > > This commit seems to cause this: > > 20180606 20:44:37 all (1/1): datagram.sh > panic: _mtx_lock_sleep: recursed on non-recursive mutex unp_mtx @ > ../../../kern/uipc_usrreq.c:354 > > cpuid = 91 > time = 1528310678 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f3a2bcd0 > vpanic() at vpanic+0x1a3/frame 0xfffffe00f3a2bd30 > doadump() at doadump/frame 0xfffffe00f3a2bdb0 > __mtx_lock_sleep() at __mtx_lock_sleep+0x49d/frame 0xfffffe00f3a2be30 > __mtx_lock_flags() at __mtx_lock_flags+0xf9/frame 0xfffffe00f3a2be80 > unp_connectat() at unp_connectat+0x2ec/frame 0xfffffe00f3a2c110 > soconnectat() at soconnectat+0xe8/frame 0xfffffe00f3a2c160 > kern_connectat() at kern_connectat+0x10f/frame 0xfffffe00f3a2c1c0 > sys_connect() at sys_connect+0x77/frame 0xfffffe00f3a2c200 > amd64_syscall() at amd64_syscall+0x2a7/frame 0xfffffe00f3a2c330 > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00f3a2c330 > --- syscall (98, FreeBSD ELF64, sys_connect), rip = 0x80040081a, rsp = > 0x7fffffffe2e8, rbp = 0x7fffffffe770 --- > KDB: enter: panic > [ thread pid 1276 tid 100966 ] > Stopped at kdb_enter+0x3b: movq $0,kdb_why > db> x/s version > version: FreeBSD 12.0-CURRENT #0 r333744: Wed Jun 6 20:35:18 CEST > 2018\012 p...@flix1a.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012 > db>
I fixed this in 334175 and 334185. > > The stack trace is different on HEAD: > > panic: mutex unp not owned at ../../../kern/uipc_usrreq.c:879 > cpuid = 47 > time = 1528311629 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ee593ca0 > vpanic() at vpanic+0x1a3/frame 0xfffffe00ee593d00 > panic() at panic+0x43/frame 0xfffffe00ee593d60 > __mtx_assert() at __mtx_assert+0xb4/frame 0xfffffe00ee593d70 > uipc_disconnect() at uipc_disconnect+0x93/frame 0xfffffe00ee593dc0 > soclose() at soclose+0xa5/frame 0xfffffe00ee593e30 > closef() at closef+0x1f5/frame 0xfffffe00ee593ec0 > fdescfree_fds() at fdescfree_fds+0x90/frame 0xfffffe00ee593f10 > fdescfree() at fdescfree+0x480/frame 0xfffffe00ee593fd0 > exit1() at exit1+0x500/frame 0xfffffe00ee594040 > sigexit() at sigexit+0xae8/frame 0xfffffe00ee594220 > postsig() at postsig+0x1cf/frame 0xfffffe00ee5942f0 > ast() at ast+0x4b8/frame 0xfffffe00ee594330 > doreti_ast() at doreti_ast+0x1f/frame 0x7fffffffe760 > KDB: enter: panic > [ thread pid 1013 tid 100834 ] > Stopped at kdb_enter+0x3b: movq $0,kdb_why > db> x/s version > version: FreeBSD 12.0-CURRENT #0 r334723: Wed Jun 6 20:53:32 CEST > 2018\012 p...@flix1a.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012 > db> I'll take a look. -M _______________________________________________ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"