On 26 August 2016 at 14:36, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> On Fri, Aug 26, 2016 at 02:32:00PM -0700, Adrian Chadd wrote:
>
>> Hi,
>>
>> It's pcb lock contention.
>
> Not sure: only 5% of all time.
> And the same 5% for tcbhashsize = 65K and 256K.
> Or are you talking about some subtler effect?
You're taking the inpcb lock from multiple places. The tcbhashsize doesn't
influence the pcb lock contention - it just affects how long you spend doing
lookups. If your hash table is too small, then you end up doing lots of O(n)
walks of a hash bucket to find a pcb entry. :)

-adrian

>>
>> On 26 August 2016 at 08:13, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
>> > On Fri, Aug 26, 2016 at 04:01:14PM +0100, Bruce Simpson wrote:
>> >
>> >> Slawa,
>> >>
>> >> I'm afraid this may be a bit of a non-sequitur. Sorry.. I seem to be
>> >> missing something. As I understand it, this thread is about Ryan's change
>> >> to netinet for broadcast.
>> >>
>> >> On 26/08/16 15:49, Slawa Olhovchenkov wrote:
>> >> > On Sun, Aug 21, 2016 at 03:04:00AM +0300, Slawa Olhovchenkov wrote:
>> >> >> On Sun, Aug 21, 2016 at 12:25:46AM +0100, Bruce Simpson wrote:
>> >> >>> Whilst I agree with your concerns about multipoint, I support the
>> >> >>> motivation behind Ryan's original change: optimize the common case.
>> >> >>
>> >> >> Oh, the common case...
>> >> >> I have pmc profiling for TCP output; looking at the SVG picture,
>> >> >> I don't find any simple way. Do you want to look too?
>> >> >
>> >> > At peak network traffic (more than 25K connections, about 20 Gbit
>> >> > total traffic), half of the cores are fully utilised by the network
>> >> > stack.
>> >> >
>> >> > This is a flamegraph from one core: http://zxy.spb.ru/cpu10.svg
>> >> > This is the same, but with stacks cut off at ixgbe_rxeof for a more
>> >> > unified tcp/ip stack view: http://zxy.spb.ru/cpu10u.svg
>> >> ...
>> >>
>> >> I appreciate that you've taken the time to post a flamegraph (a
>> >> fashionable visualization) of relative performance in the FreeBSD
>> >> networking stack.
>> >>
>> >> Sadly, I am mostly out of my depth for looking at stack-wide performance
>> >> for the moment; for the things I look at involving FreeBSD at work just
>> >> at the moment, I would not generally go down there except for specific
>> >> performance issues (e.g.
>> >> with IEEE 1588).
>> >>
>> >> It sounds as though perhaps you should raise a wider discussion about
>> >> your results on -net. I would caution you, however, that the Function
>> >> Boundary Trace (FBT) provider for DTrace can introduce a fair amount of
>> >> noise to the raw performance data because of the trap mechanism it uses.
>> >> This ruled it out for one of my own studies requiring packet-level
>> >> accuracy.
>> >>
>> >> Whilst raw pmc(4) profiles may require more post-processing, they will
>> >> provide less equivocal data (and a better fix) on the hot path, due also
>> >> to being sampled effectively on a PMC interrupt (a gather stage: poll
>> >> core+uncore MSRs), not purely a software timer interrupt.
>> >
>> > Thanks for the answer; I will now try to start a discussion on -net.
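[Editor's note: Adrian's point above, that tcbhashsize only shortens the per-bucket walk and does not touch lock contention, can be sketched with a toy model. This is NOT FreeBSD's actual inpcb code; the struct names, fields, and functions below are invented purely for illustration.]

```c
#include <stddef.h>

/* Hypothetical, simplified PCB hash table -- not the kernel's
 * struct inpcb / in_pcblookup machinery. */
struct pcb {
	unsigned lport;		/* lookup key (think: local TCP port) */
	struct pcb *next;	/* chain within one hash bucket */
};

struct pcbtable {
	struct pcb **buckets;
	unsigned hashmask;	/* nbuckets - 1; nbuckets a power of two */
};

static unsigned
pcb_hash(const struct pcbtable *t, unsigned lport)
{
	return (lport & t->hashmask);
}

static void
pcb_insert(struct pcbtable *t, struct pcb *p)
{
	unsigned i = pcb_hash(t, p->lport);

	p->next = t->buckets[i];	/* insert at bucket head */
	t->buckets[i] = p;
}

/*
 * The loop below is the O(n)-per-bucket walk: the hash only narrows
 * the search to one bucket, and every colliding entry in that bucket
 * may have to be traversed.  *steps reports how many entries were
 * visited, i.e. the cost a too-small table inflates.
 */
static struct pcb *
pcb_lookup(struct pcbtable *t, unsigned lport, unsigned *steps)
{
	struct pcb *p;

	*steps = 0;
	for (p = t->buckets[pcb_hash(t, lport)]; p != NULL; p = p->next) {
		(*steps)++;
		if (p->lport == lport)
			return (p);
	}
	return (NULL);
}
```

With 64 connections hashed into 4 buckets, a lookup can walk a 16-entry chain; with 64 buckets, it visits one entry. Growing the table shortens lookups, but does nothing about how often the pcb lock itself is taken, which matches the ~5% contention staying flat across tcbhashsize values.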