On Sun, 25 Apr 2004, David Burns wrote:
No argument that a DELAY(x) delays for a minimum of x microseconds - this is what we're seeing. The fact that we're using a DELAY() which can be interrupted inside locked code seems problematic - although I guess it just slows driver operation down.
Mike Silbersack wrote:
On Sat, 24 Apr 2004, David Burns wrote:
NB this assumes that a DELAY(1) is really a delay of 1 µs! Which I don't think it is ... :-(
Correct, DELAY takes far longer than it should.
Actually, it takes at least as long as it should (normally a few microseconds longer than the specified delay, but hundreds or thousands of microseconds longer if it is interrupted). The ISA bus accesses in it can also be delayed by (PCI) bus activity (I've measured 170 usec for the 3 accesses in getit() which normally take 3-4 usec).
If you're really interested in fixing the problem and not inadvertently breaking older cards, what you should do is implement a nanodelay function that actually delays for the time it's supposed to, and then delay by the rated amount. Removing all delays will probably break something somewhere.
We could probably build a driver specific nanodelay function based on dummy PCI operations. Some will say this sucks but then I'd argue it's better than the current DELAY implementation.
No, it would be considerably worse. DELAY() has poor resolution because the non-dummy ISA operations that it uses to read the time are slow. Dummy PCI operations aren't much faster, depending on which address they are at. They would still be at least 3 times faster in practice, since the current implementation of DELAY() needs 3 ISA operations. DELAY() could probably use only the low byte of an unlatched counter if its efficiency were important. I think it is unimportant, since only broken code busy-waits.
Sorry, I should have made myself clearer. Given the evidence that a DELAY(1) delays for far more than 1 microsecond, we just need some other kind of known delay which will allow us to wait a few hundred nanoseconds (the MDIO clock period of most 100 Mb/s PHYs) instead of a DELAY which is an order of magnitude longer (and is subject to interrupts). A dummy PCI operation would achieve this.
Yes, the term nanosecond delay is inappropriate - it is only a submicrosecond delay we need.
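Something like the following (untested) sketch is what I have in mind; the function and register names are only placeholders, not from any particular driver, and the real cost of the dummy read would have to be measured on the hardware:

#include <machine/bus.h>

/*
 * Untested sketch: a driver-private sub-microsecond "delay" made of a
 * single dummy PCI read of a side-effect-free register.  Which register
 * is safe to read, and how long the read actually takes, are per-chip
 * assumptions that would need to be verified.
 */
static void
xx_mdio_delay(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t reg)
{
	/*
	 * One read crosses the PCI bus and costs a few hundred
	 * nanoseconds - already close to the MDC half-period we need -
	 * without touching the ISA timer or being stretched by
	 * interrupts the way DELAY() is.
	 */
	(void)bus_space_read_4(bst, bsh, reg);
}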
Anyway, you won't get near nanosecond resolution or PCI clock resolution (30 nsec) using PCI i/o instructions. rdtsc on i386's and delay loops on all machines can easily do better provided the CPU doesn't get throttled.
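For example, a rough sketch of a TSC-based spin, assuming tsc_freq holds the calibrated TSC frequency (as on i386) and the CPU clock isn't being throttled:

#include <sys/types.h>
#include <machine/clock.h>	/* tsc_freq */
#include <machine/cpufunc.h>	/* rdtsc() */

/*
 * Rough sketch: busy-wait for approximately 'ns' nanoseconds using the
 * TSC.  Only valid if tsc_freq is calibrated and the CPU isn't
 * throttled, as noted above.
 */
static void
tsc_ndelay(u_int ns)
{
	uint64_t start, ticks;

	ticks = (tsc_freq * ns) / 1000000000ULL;
	start = rdtsc();
	while (rdtsc() - start < ticks)
		continue;
}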
Of course, just sending one bit of data on the MDIO will take us about 600 nanoseconds - resulting in a clock of roughly 1.6 MHz.
Except some machines add lots of wait states. I have a PCI card which can usually be accessed in 467 nsec (write) and 150 nsec (read) on one machine, but on a newer machine which is otherwise 6 times faster yet appears to have a slow PCI bus (ASUS A7N8X-E), the access times increase to 943 nsec (write) and 290 nsec (read).
A PCI implementation built from ISA components perhaps ... :-)
It still comes back to how to slow down PHY accesses without using DELAY().
The fact that removing DELAY() from the ste driver provided a small but non-trivial improvement in network performance (including for other network cards on the same PCI bus) underlines how horrible the use of DELAY() is.
I'm only after a simple fix - experiment with removing the MII code DELAY() calls in the affected drivers, and commit the change only where testing results are good.
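For the drivers that bit-bang MII through a register, the change amounts to dropping the DELAY(1) calls around the clock toggles and letting the register access latency itself pace MDC. An untested sketch, with SIO_SET()/SIO_CLR() and the bit names as placeholders rather than any particular driver's macros:

/*
 * Untested sketch: clock one bit out on the MDIO pins with the DELAY(1)
 * calls removed.  Each register write is itself a slow PCI access
 * (several hundred nanoseconds), which keeps MDC well within spec.
 */
static void
xx_mii_writebit(struct xx_softc *sc, int bit)
{
	if (bit)
		SIO_SET(sc, XX_MII_DATAOUT);
	else
		SIO_CLR(sc, XX_MII_DATAOUT);

	/* The writes themselves take long enough; no DELAY(1) needed. */
	SIO_SET(sc, XX_MII_CLK);
	SIO_CLR(sc, XX_MII_CLK);
}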
David
