On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:

We currently make this a lot worse than it needs to be by handing off the received packets one at a time, unlocking and relocking for every packet. It would be better if the driver's receive interrupt handler would harvest all of the incoming packets and queue them locally. Then, at the end, hand off the linked list of packets to the network stack wholesale, unlocking and relocking only once. (Actually, the list could probably be handed off at the very end of the interrupt service routine, after the driver has already dropped its lock.) We wouldn't even need a new primitive, if ether_input() and the other if_input() functions were enhanced to deal with a possible list of packets instead of just a single one.
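To make the idea concrete, here is a rough sketch of what a batched receive path might look like. The names em_rxeof_batch(), em_rxeof_one(), and the softc layout are invented for illustration and are not the actual if_em code:

#include <sys/param.h>
#include <sys/socket.h>
#include <sys/mbuf.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <net/if.h>
#include <net/if_var.h>

struct em_softc {                       /* minimal stand-in for the real softc */
        struct ifnet    *ifp;
        struct mtx       mtx;
};

/* Hypothetical helper: pull the next completed rx packet off the ring. */
static struct mbuf *em_rxeof_one(struct em_softc *sc);

static void
em_rxeof_batch(struct em_softc *sc)
{
        struct ifnet *ifp = sc->ifp;
        struct mbuf *head = NULL, *tail = NULL, *m;

        /*
         * Harvest everything the hardware has posted, linking the packets
         * locally via m_nextpkt while the driver lock is held.
         */
        mtx_lock(&sc->mtx);
        while ((m = em_rxeof_one(sc)) != NULL) {
                m->m_nextpkt = NULL;
                if (tail == NULL)
                        head = m;
                else
                        tail->m_nextpkt = m;
                tail = m;
        }
        mtx_unlock(&sc->mtx);

        /*
         * Hand the list to the stack with the driver lock dropped.  This
         * still calls if_input() once per packet; a list-aware ether_input()
         * could take 'head' in a single call instead.
         */
        while ((m = head) != NULL) {
                head = m->m_nextpkt;
                m->m_nextpkt = NULL;
                (*ifp->if_input)(ifp, m);
        }
}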

I try this experiment every few years, and generally don't measure much improvement. I'll try it again with 10 Gbps early next year once I'm back in the office. The more interesting transition is between the link layer and the network layer, which is high on my list of topics to look into in the next few weeks, in particular reworking the ifqueue handoff. The tricky bit is balancing latency, overhead, and concurrency...

FYI, there are several sets of patches floating around to modify if_em to hand off queues of packets to the link layer, etc. They probably need updating, of course, since if_em has changed quite a bit in the last year. In my implementation, I add a new input routine that accepts mbuf packet queues.

I'm just curious, do you remember the average length of the mbuf queue in your tests? While experimenting with the bge(4) driver (taskqueue, interrupt moderation, bge_rxeof() converted to the above scheme), I've found it's quite easy to exhaust the available mbuf clusters under load (trying to queue hundreds of received packets), so I had to limit the rx queue to a rather low length.
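For illustration, bounding the harvest loop from the earlier sketch might look like the following; EM_RX_BATCH_MAX and the surrounding names are invented, not the actual bge(4) or if_em change:

#define EM_RX_BATCH_MAX 64              /* arbitrary example cap */

/*
 * Same harvest as before, but stop after EM_RX_BATCH_MAX packets so a
 * flooded interface cannot pin an unbounded number of mbuf clusters;
 * whatever is left stays on the ring for the next pass.
 */
static struct mbuf *
em_rxeof_bounded(struct em_softc *sc)
{
        struct mbuf *head = NULL, *tail = NULL, *m;
        int n = 0;

        mtx_lock(&sc->mtx);
        while (n < EM_RX_BATCH_MAX && (m = em_rxeof_one(sc)) != NULL) {
                m->m_nextpkt = NULL;
                if (tail == NULL)
                        head = m;
                else
                        tail->m_nextpkt = m;
                tail = m;
                n++;
        }
        mtx_unlock(&sc->mtx);

        return (head);  /* caller hands this bounded list to the stack */
}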

Off-hand, I don't remember. I do remember it being very important to maintain bounds on the size of in-flight packet sets at all levels in the stack, for the same reason the netisr dispatch queue is bounded: otherwise, if the device is able to keep the device driver entirely busy, you'll effectively live-lock, since you never dispatch to the next layer, exhaust available memory, and so on. One of the ideas I've been futzing with is "back-pressure" across the netisr and a "checkout" model in which the queue spanning the device driver and dispatch through to the protocol has a single total bound, with reservations taken by components as they process sets of packets. That way, the ithread would know the netisr was already executing and not perform a wakeup (avoiding a trip through the scheduler), avoid excessive memory consumption, etc.

Ed Maste has also suggested changing our notion of mbuf packet queues: the current model requires following linked lists, which makes inefficient use of CPU caches, and could instead use arrays of mbuf pointers. I've done a bit of experimentation along these lines, but not enough to investigate the properties well.
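To make the array idea concrete, a batch structure along those lines might look like the following; all names here are invented for illustration, and a real version would still need to sort out ownership and the bounding discussed above:

#define PKTBATCH_MAX    32              /* arbitrary example batch size */

/*
 * Fixed array of mbuf pointers instead of an m_nextpkt-linked list, so
 * consumers walk a contiguous array rather than chasing pointers.
 */
struct pktbatch {
        int              pb_count;
        struct mbuf     *pb_pkts[PKTBATCH_MAX];
};

/* Append a packet; returns 0 when the batch is full and must be flushed. */
static __inline int
pktbatch_add(struct pktbatch *pb, struct mbuf *m)
{

        if (pb->pb_count == PKTBATCH_MAX)
                return (0);
        pb->pb_pkts[pb->pb_count++] = m;
        return (1);
}

/* Flush the batch into the stack through the interface's input routine. */
static __inline void
pktbatch_input(struct pktbatch *pb, struct ifnet *ifp)
{
        int i;

        for (i = 0; i < pb->pb_count; i++)
                (*ifp->if_input)(ifp, pb->pb_pkts[i]);
        pb->pb_count = 0;
}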

Robert N M Watson
Computer Laboratory
University of Cambridge