On Sun, 30 Mar 2008, Alexander Motin wrote:

My initial leaning would be that we would like to avoid adding too many more threads that will do per-packet work, as that leads to excessive context switching.

Netgraph uses queueing only as a last resort, when a direct call is not possible due to locking or stack limitations. For example, while working with kernel sockets (*upcall)() I ran into many issues that make it impossible to freely use received data without queueing, because the upcall() caller holds some locks, leading to unpredictable LORs in the socket/TCP/UDP code. When queueing is forced in this way, the node becomes an independent data source which can be pinned to, and processed by, whatever specialized thread or netisr is able to do it most effectively.
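To make the "direct call first, queue only when forced" pattern described above concrete, here is a minimal userland sketch in C. It is not netgraph code: struct node, deliver(), drain() and the busy flag are all hypothetical names made up for illustration, and busy merely stands in for "the caller holds a lock or the stack is too deep".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 16                 /* hypothetical fixed queue capacity */

struct item {
    int payload;
};

struct node {
    bool busy;                  /* stand-in for "direct call is unsafe right now" */
    struct item queue[QCAP];    /* fallback queue, used only when busy */
    size_t qlen;
    int delivered;              /* number of items processed so far */
};

/* The per-item work; here it only counts deliveries. */
static void process(struct node *n, const struct item *it)
{
    (void)it;
    n->delivered++;
}

/*
 * Try direct dispatch in the caller's context. Fall back to queueing
 * only when a direct call is not possible. Returns true if the item
 * was processed directly, false if it was queued.
 */
static bool deliver(struct node *n, struct item it)
{
    if (!n->busy) {
        process(n, &it);
        return true;
    }
    assert(n->qlen < QCAP);     /* sketch only: no overflow handling */
    n->queue[n->qlen++] = it;
    return false;
}

/*
 * Run later from whatever context ends up owning the queue (in the
 * discussion above, a specialized thread or netisr). Returns the
 * number of items processed.
 */
static size_t drain(struct node *n)
{
    size_t done = 0;

    for (size_t i = 0; i < n->qlen; i++) {
        process(n, &n->queue[i]);
        done++;
    }
    n->qlen = 0;
    return done;
}
```

In this sketch the queue is touched only on the slow path, so the common case costs a plain function call and no context switch; the queued items are the only work that has to be handed to another context.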

I guess my caution is that it does not necessarily follow from a design that allows for explicit parallelism that the implementation will use it well, and that any time context switches are necessarily introduced, cost goes up. The move to direct dispatch from the ithread, despite reducing opportunities for parallelism, significantly increased performance for many local workloads. If we have a netisr thread, an ithread, and a netgraph thread, the potential context switch overhead is significant, even if we are doing a good job at batching work transfer between them. Oftentimes, the way this behaves in practice is quite dependent on scheduling, and right now we have known deficiencies in this area, so give it a try on an SMP box and see what happens. Since you're a FreeBSD committer, you can sign up to use the netperf cluster, which might not be a bad idea if you don't have local access to a good SMP test setup.

Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers