On Sun, 11 Oct 1998, Matti Aarnio wrote:
> The version of the Tulip driver at 2.1.124 contains various other
> bugfixes, most important of which (I think) is SKB allocation out-of-memory
> handling code, which Donald has not yet believed to be necessary for those
> drivers which have rx_skbuff[] rings. :-(
Look again: the older versions of the tulip driver (and most other
descriptor-based bus-master drivers) used a drop-immediately scheme. If
they could not allocate a refill skbuff, they dropped the packet and kept
the old skbuff on the receive ring. This was to avoid having the chip ever
shut down from lack of receive buffers.
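
A minimal user-space sketch of that drop-immediately policy, for illustration only. This is not code from tulip.c: the names rx_slot, alloc_rx_buf and low_memory are made up for the sketch, malloc()/free() stand in for skbuff allocation and for handing the packet to the protocol layer, and the buffer size is an arbitrary assumption.

```c
#include <stdlib.h>

#define PKT_BUF_SZ 1536     /* illustrative buffer size */

static int low_memory;      /* simulation knob: nonzero makes allocation fail */

static void *alloc_rx_buf(void)
{
    return low_memory ? NULL : malloc(PKT_BUF_SZ);
}

/* One receive-ring slot plus counters (hypothetical layout). */
struct rx_slot {
    void *buf;              /* buffer currently owned by the chip */
    unsigned int delivered;
    unsigned int dropped;
};

/* Handle one received packet. Returns 1 if delivered, 0 if dropped. */
static int rx_drop_immediately(struct rx_slot *s)
{
    void *fresh = alloc_rx_buf();

    if (fresh == NULL) {
        /* No replacement available: drop the packet and leave the old
           buffer on the ring so the chip never runs out of buffers. */
        s->dropped++;
        return 0;
    }
    free(s->buf);           /* stand-in for passing the packet upward */
    s->buf = fresh;         /* refill the slot right away */
    s->delivered++;
    return 1;
}
```

The point of the sketch is that the slot is never left empty: a failed allocation costs one packet, but the ring stays full.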
This worked pretty well until there was a kernel version that tended to keep
too little free memory around for interrupt handlers. Even though there was
plenty of physical memory, it was all being used for buffer cache. This
caused drivers to drop a lot of packets even though a few milliseconds later
the kernel could come up with a whole bunch of free memory.
So I slowly changed all of my descriptor-based drivers to instead use a
"buffer deficit" scheme, where the driver passes all of the received
packets to the queue layer in the first phase, and then refills the receive
ring later. If the machine runs temporarily short of memory, the receive
ring will be refilled at the next Rx interrupt. Only if it's persistently
short of memory will the receive ring empty of buffers and drop packets.
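
The two phases can be sketched as a small user-space simulation. Everything here is an assumption made for illustration, not the actual driver source: the struct layout, the four-entry ring, the low_memory knob, and the use of malloc()/free() in place of skbuff allocation and delivery to the queue layer. The cur_rx/dirty_rx pair is just one way to track the deficit.

```c
#include <assert.h>
#include <stdlib.h>

#define RX_RING_SIZE 4      /* tiny ring so the behavior is easy to follow */
#define PKT_BUF_SZ   1536   /* illustrative buffer size */

static int low_memory;      /* simulation knob: nonzero makes allocation fail */

static void *alloc_rx_buf(void)
{
    return low_memory ? NULL : malloc(PKT_BUF_SZ);
}

struct rx_ring {
    void *buf[RX_RING_SIZE];    /* NULL means the slot has no buffer */
    unsigned int cur_rx;        /* next slot to receive from */
    unsigned int dirty_rx;      /* oldest slot still needing a refill */
    unsigned int delivered;
    unsigned int dropped;
};

/* Current buffer deficit: slots emptied but not yet refilled. */
static unsigned int rx_deficit(const struct rx_ring *r)
{
    return r->cur_rx - r->dirty_rx;
}

/* Fill every slot; returns 0 on success, -1 if an allocation failed. */
static int rx_ring_init(struct rx_ring *r)
{
    *r = (struct rx_ring){0};
    for (int i = 0; i < RX_RING_SIZE; i++) {
        r->buf[i] = alloc_rx_buf();
        if (r->buf[i] == NULL)
            return -1;
    }
    return 0;
}

/* Phase 2: refill emptied slots until done or memory runs out. */
static void rx_refill(struct rx_ring *r)
{
    while (rx_deficit(r) > 0) {
        unsigned int entry = r->dirty_rx % RX_RING_SIZE;
        void *fresh = alloc_rx_buf();

        if (fresh == NULL)
            break;              /* keep the deficit; retry next interrupt */
        r->buf[entry] = fresh;
        r->dirty_rx++;
    }
}

/* One simulated Rx interrupt with `arrived` packets on the wire. */
static void rx_interrupt(struct rx_ring *r, unsigned int arrived)
{
    /* Phase 1: pass up every packet that found a buffer. The limit check
       is at the top of the loop, so an empty ring yields nothing. */
    while (arrived > 0 && rx_deficit(r) < RX_RING_SIZE) {
        unsigned int entry = r->cur_rx % RX_RING_SIZE;

        assert(r->buf[entry] != NULL);
        free(r->buf[entry]);    /* stand-in for handing the skbuff upward */
        r->buf[entry] = NULL;
        r->cur_rx++;
        r->delivered++;
        arrived--;
    }
    r->dropped += arrived;      /* ring was empty: nowhere to put these */
    rx_refill(r);               /* phase 2 */
}
```

In this model, setting low_memory and calling rx_interrupt(&r, 3) on a freshly initialized ring delivers all three packets and leaves a deficit of three. Packets are counted as dropped only once the deficit reaches RX_RING_SIZE, and the first interrupt after low_memory is cleared brings the deficit back to zero.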
This "buffer deficit" scheme has another beneficial effect: most new chips
which implement hardware flow control can only trigger FC packets when
running out of buffers, not in response to a driver-level request. BTW,
hardware flow control is unrelated to the misnamed code in the new kernel.
That new code should be called "randomly drop packets when actually doing
work".
> Speaking of which, a quick browse shows me that:
>
> tulip.c at 2.0.35 has this rx_skbuff[] ring problem.
See above -- it just has a different structure.
> 3c59x.c (up to and including version 0.99G!) driver does blow up if that
> ring can not be refilled successfully...
The 3c59x.c driver has a related bug: it has a limit check at the end of a
loop instead of the beginning. (A mistaken optimization on my part.) If
the receive ring cannot be filled and empties completely, there are
obviously no packets to receive. But with the limit check at the end of the
loop, the driver will think that an old, already-processed packet is a new
one. Disaster occurs. This same bug exists in the eepro100.c driver.
(However the current tulip driver *does* have the check in the correct
place.)
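
To make the difference concrete, here is a simplified model of the two loop shapes. It is a sketch under assumptions, not the 3c59x.c or eepro100.c source: the chip_owns flag, the ring struct and the function names are invented, and "handling" a packet is reduced to advancing cur_rx.

```c
#define RX_RING_SIZE 4

struct rx_desc {
    int chip_owns;      /* 1: chip may write here; 0: descriptor is the driver's */
};

struct ring {
    struct rx_desc desc[RX_RING_SIZE];
    unsigned int cur_rx;    /* next descriptor to examine */
    unsigned int dirty_rx;  /* oldest descriptor still waiting for a buffer */
};

/* Limit check at the beginning: never looks at a slot that has no buffer. */
static unsigned int rx_check_first(struct ring *r)
{
    unsigned int handled = 0;

    while (r->cur_rx - r->dirty_rx < RX_RING_SIZE &&
           !r->desc[r->cur_rx % RX_RING_SIZE].chip_owns) {
        r->cur_rx++;
        handled++;
    }
    return handled;
}

/* Limit check at the end: one stale descriptor slips through first. */
static unsigned int rx_check_last(struct ring *r)
{
    unsigned int handled = 0;

    while (!r->desc[r->cur_rx % RX_RING_SIZE].chip_owns) {
        r->cur_rx++;
        handled++;
        if (r->cur_rx - r->dirty_rx >= RX_RING_SIZE)
            break;
    }
    return handled;
}

/* A ring that emptied completely: every packet was processed, no slot was
   refilled, so no descriptor has been handed back to the chip. */
static struct ring drained_ring(void)
{
    struct ring r = { .cur_rx = RX_RING_SIZE, .dirty_rx = 0 };

    for (int i = 0; i < RX_RING_SIZE; i++)
        r.desc[i].chip_owns = 0;
    return r;
}
```

On a drained ring, rx_check_first() handles nothing, while rx_check_last() treats the first descriptor as a new packet even though it was already processed and has no buffer behind it. On a ring that still has buffers, the two loops behave the same, which is why the bug only shows up under memory pressure.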
> eepro100.c's code (at 2.1.124, v0.36) is rather unclear at what it does,
> but it looks likely to be able to survive the situation.
That old driver version has another low-memory receive bug, and a multicast
list race condition. The new driver, v1.04, supports newer board versions, is
much more robust when rapidly changing large multicast filters (Appletalk
environments!), cache aligns transmit descriptors, and enables hardware flow
control on the i82558.
> de4x5.c (v0.542) driver will always alloc a new skb at an Alpha, and
> then copy the data there. If the alloc fails, it is skipped. (No kernel
> crash from that...) (But this isn't Donald's rx_skbuff[] beasts either..)
That's similar to what my drivers used to do. It's a natural progression
from the original driver model, which was "copy into a fresh skbuff", to the
zero-copy "receive directly into an skbuff and refill the slot immediately".
Copying has an advantage on the Alpha: the Tulip must have word-aligned
receive buffers, and with 14-byte Ethernet headers that leaves the IP headers
always misaligned. The Linux protocol code doesn't (didn't) have
get_unaligned() calls for disassembling packets, so uncopied packets cause
alignment traps. When copying packets (or with a better chip) you can do
skb_reserve(skb,2) to align the IP header.
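
The arithmetic behind the 2-byte reserve, as a stand-alone sketch. The helper name is hypothetical and ETH_HLEN is defined locally for the example; the only inputs are the 14-byte Ethernet header and a 4-byte alignment requirement for the IP header.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14     /* Ethernet header: 6 + 6 address bytes, 2 type bytes */

/* How far the IP header lands from a 4-byte boundary when the frame
   starts `reserve` bytes into a buffer located at `buf_addr`. */
static size_t ip_header_misalignment(uintptr_t buf_addr, size_t reserve)
{
    return (size_t)((buf_addr + reserve + ETH_HLEN) % 4);
}
```

For a word-aligned buffer address such as 0x1000, a reserve of 0 gives a misalignment of 2 (14 mod 4), and a reserve of 2 gives 0, since 2 + 14 = 16 is a multiple of 4.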
Donald Becker [EMAIL PROTECTED]
USRA-CESDIS, Center of Excellence in Space Data and Information Sciences.
Code 930.5, Goddard Space Flight Center, Greenbelt, MD. 20771
301-286-0882 http://cesdis.gsfc.nasa.gov/people/becker/whoiam.html