Hi!

Can you give us actual numbers? Something that we can easily graph?

Do you have a diff against -HEAD? I'd like to stare at it a bunch and think
about merging some stuff in. :-)

How are you testing it? Is it something I can set up in our lab and
thoroughly thrash?

I'm very close to starting an mbuf batching thing to use in a few places
like receive, transmit and transmit completion -> free path. I'd be
interested in your review/feedback and testing as it sounds like something
you can easily stress test there. :)

Thanks,


-Adrian


On 24 August 2013 05:48, Alexander V. Chernikov <melif...@ipfw.ru> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 22.08.2013 00:51, Andre Oppermann wrote:
> > On 19.08.2013 13:42, Alexander V. Chernikov wrote:
> >> On 14.08.2013 19:48, Luigi Rizzo wrote:
> >>> On Wed, Aug 14, 2013 at 05:40:28PM +0200, Marko Zec wrote:
> >>>> On Wednesday 14 August 2013 14:40:24 Luigi Rizzo wrote:
> >>>>> On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V.
> >>>>> Chernikov wrote:
> >>> ...
> >>>> FWIW, apparently we already have that infrastructure in place
> >>>> - if_rele() calls if_free_internal() only when the last
> >>>> reference to the ifnet is dropped, so with little care this
> >>>> should be usable for caching ifp pointers w/o fears for
> >>>> kernel crashes mentioned above.
> >>> maybe Alexander was referring to holding references to the rte
> >>> entries returned as a result of the lookup. The rte holds a
> >>> reference to the ifp.
> >>
> >> Yes. Since that is the only refcount which is protected (and is
> >> also a huge performance killer).
> >>
> >> Btw, there is a picture describing IPv4 packet flow from my
> >> still-not-written post related to network stack performance, maybe
> >> it can be useful:
> >> http://static.ipfw.ru/images/freebsd_ipv4_flow.png
> >
> > Wow, that's really cool.  Please note that a rmlock doesn't cost
> > anything for the read case (unless contended of course).  Whereas
> > normal rlocks or
> We're running this entire stack without a single rwlock (everything is
> either converted to rmlock or uses lockless data copies with delayed
> GC (in_adrr_local and other similar)). It really is faster; however,
> due to the current process-to-completion routing architecture,
> this is limited to 5-6 MPPS for 12 cores on 2xE5645.
>
> > rwlocks write to the lock memory location and cause atomic bus lock
> > cycles as well as a lot of cache line invalidations across cores.
> > The same is true for refcounts.
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.20 (FreeBSD)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iEYEARECAAYFAlIYq7QACgkQwcJ4iSZ1q2nFZwCfZLckg4b/iny2CK+bYJa20XxE
> y7UAnRZHVr4AZRYnB8acrN54KtRMpvNQ
> =0kPb
> -----END PGP SIGNATURE-----
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>
