Bruce Evans wrote:

On Sun, 25 Apr 2004, David Burns wrote:

Mike Silbersack wrote:

On Sat, 24 Apr 2004, David Burns wrote:

NB this assumes that a DELAY(1) is really a delay of 1 µs! Which I don't
think it is ... :-(

Correct, DELAY takes far longer than it should.

Actually, it takes at least as long as it should (normally a few microseconds longer than the specified delay, but hundreds or thousands of microseconds longer if it is interrupted). The ISA bus accesses in it can also be delayed by (PCI) bus activity (I've measured 170 usec for the 3 accesses in getit(), which normally take 3-4 usec).
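For reference, the 3 accesses in getit() are roughly the latch-and-read sequence below (a sketch only, not the exact kernel code; the real routine also takes the clock lock and handles counter rollover). Each outb()/inb() is an ISA cycle on the order of a microsecond, which is where the 3-4 usec per time sample comes from:

#include <sys/types.h>
#include <machine/cpufunc.h>	/* inb(), outb() */

#define	I8254_CNTR0	0x40	/* counter 0 data port */
#define	I8254_MODE	0x43	/* mode/command port */
#define	I8254_LATCH0	0x00	/* select counter 0 + latch command */

static u_int
i8254_sample(void)
{
	u_int high, low;

	outb(I8254_MODE, I8254_LATCH0);	/* ISA write: latch counter 0 */
	low = inb(I8254_CNTR0);		/* ISA read: low byte */
	high = inb(I8254_CNTR0);	/* ISA read: high byte */

	return ((high << 8) | low);
}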

No argument that a DELAY(x) delays for a minimum of x microseconds - this is what we're seeing. The fact that we're using a DELAY() which can be interrupted inside locked code seems problematic - although I guess it just slows driver operation down.

If you're really interested in fixing the problem and not inadvertently
breaking older cards, what you should do is implement a nanodelay function
that actually delays for the time it's supposed to and then delay the
rated amount.  Removing all delays will probably break something
somewhere.

We could probably build a driver-specific nanodelay function based on dummy PCI operations. Some will say this sucks, but I'd argue it's better than the current DELAY() implementation.
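Something along these lines is what I have in mind (the register offset and softc fields below are made up for illustration, and the read count would have to be calibrated per machine, since the cost of a PCI read varies with the chipset):

#include <sys/types.h>
#include <machine/bus.h>

/* Hypothetical softc fragment; real drivers already carry these handles. */
struct ste_softc_sketch {
	bus_space_tag_t		ste_btag;
	bus_space_handle_t	ste_bhandle;
};

/*
 * Spin for 'reads' dummy PCI read cycles of a harmless register
 * (offset 0 here is a placeholder).  Each read costs a few hundred
 * nanoseconds on the PCI bus, so small counts give sub-microsecond waits.
 */
static void
ste_nanodelay(struct ste_softc_sketch *sc, int reads)
{
	while (reads-- > 0)
		(void)bus_space_read_4(sc->ste_btag, sc->ste_bhandle, 0);
}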


No, it would be considerably worse.  DELAY() has a poor resolution
because the non-dummy ISA operations that it uses to read the time are
slow.  A dummy PCI operation isn't much faster than an ISA one,
depending on which address it is at; a scheme based on them would still
be at least 3 times faster in practice, since the current implementation
of DELAY() needs 3 ISA operations per time sample.  DELAY() could
probably use only the low byte of an unlatched counter if its efficiency
were important.  I think it is unimportant, since only broken code
busy-waits.

Sorry, I should have made myself clearer. Given the evidence that a DELAY(1) delays for far more than 1 microsecond, we just need some other kind of known delay that lets us wait a few hundred nanoseconds (the MDIO clock period of most 100Mb/s PHYs), instead of a DELAY() that is an order of magnitude longer (and is subject to interrupts). A dummy PCI operation would achieve this.

Anyway, you won't get near nanosecond resolution or PCI clock resolution (30 nsec) using PCI i/o instructions. rdtsc on i386s and delay loops on all machines can easily do better, provided the CPU doesn't get throttled.
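Something like the following TSC spin would do for sub-microsecond waits on i386 (a sketch only: it assumes the caller supplies the TSC frequency from the kernel's calibration, and that the TSC isn't being throttled or varying in rate, as noted above):

#include <sys/types.h>
#include <machine/cpufunc.h>	/* rdtsc() */

static void
tsc_nsec_spin(uint64_t tsc_hz, u_int nsec)
{
	uint64_t start, ticks;

	/* Convert the requested wait into TSC ticks. */
	ticks = (tsc_hz * nsec) / 1000000000ULL;

	start = rdtsc();
	while (rdtsc() - start < ticks)
		;	/* busy-wait */
}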

Yes, the term nanosecond delay is inappropriate - it is only a sub-microsecond delay we need.

Of course just sending one bit of data on the MDIO will take us about
600 nanoseconds - resulting in a 1.6 MHz clock.
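That figure assumes each bit-banged MDIO bit costs at least two writes to the MII management register (clock low with the data bit set up, then clock high), at a few hundred nanoseconds per PCI write. Roughly like this, reusing the hypothetical softc sketched earlier (register name and bit definitions are made up for illustration):

#define	STE_MIIMGMT	0x40		/* hypothetical MII management register */
#define	STE_MII_MDC	0x00000001	/* management clock */
#define	STE_MII_MDO	0x00000002	/* management data out */

static void
mii_send_bit(struct ste_softc_sketch *sc, int bit)
{
	uint32_t data;

	data = bit ? STE_MII_MDO : 0;

	/* Clock low, data bit set up: one PCI write (a few hundred nsec). */
	bus_space_write_4(sc->ste_btag, sc->ste_bhandle, STE_MIIMGMT, data);
	/* Clock high, PHY samples MDIO: a second PCI write. */
	bus_space_write_4(sc->ste_btag, sc->ste_bhandle, STE_MIIMGMT,
	    data | STE_MII_MDC);
}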


Except some machines add lots of wait states.  I have a PCI card which
can usually be accessed in 467 nsec (write) and 150 nsec (read) on one
machine, but on a newer machine, which is otherwise 6 times faster yet
appears to have a slow PCI bus (ASUS A7N8X-E), the access times
increase to 943 nsec (write) and 290 nsec (read).


A PCI implementation built from ISA components perhaps ... :-)


It still comes back to slowing down PHY accesses without using DELAY().
The fact that removing DELAY() from the ste driver provided a small but non-trivial improvement in network performance (including for other network cards on the same PCI bus) underlines how horrible the use of DELAY() is.


I'm only after a simple fix: experiment with removing the MII code's DELAY() calls in the affected drivers and commit the change only where the testing results are good.

David
