Re: NIC card problems

2006-01-07 Thread Peter C. Lai
Peter Jeremy wrote:
>Real DEC Tulip cards do this when running Tru64 as well.  My guess is that
>it's a bug in the NIC.  (And it looks like ADMtek have copied it).

Peter, Warner, Stefan, et al.:

I just found this thread on the mailing list, and am responding to it a year
later :) I also believe the problem is a bug in the NIC, since the ADMtek
AN985 appears not to honor the "automagic buffer underrun recovery" command.
Silby added some patches to mbuf allocation in 2003 after stress testing
dc(4), which improved the situation somewhat (the card can sustain the
traffic longer) but didn't solve it.

While my system doesn't reboot (panic), it will often hang as a result of
this. What happens then is that when the interface tries to transmit, a 
"No buffer space available" error occurs. If one can access the console, it 
can be rescued by bringing the interface down and then up again using 
ifconfig(8). This will reset the card and presumably flush the buffers.
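For the archives, the recovery I use is a minimal sketch of exactly that --
dc0 is just the interface name on my box, so substitute your own:

  # cycle the interface to reset the card and flush its buffers
  ifconfig dc0 down
  ifconfig dc0 up

run from the console, since the network itself is unusable at that point.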

I wonder if any work has been done on the driver in -CURRENT (I am too lazy
to look), but in the next few weeks the machine is getting overhauled from
4.11 to 6 (reformat/reinstall), so we shall see whether that changes anything.

-- 
Peter C. Lai
Dept. of Neurobiology
Yale University School of Medicine
http://cowbert.2y.net/



Re: NIC card problems....

2005-01-24 Thread Wilko Bulte
On Mon, Jan 24, 2005 at 11:14:26AM +1100, Peter Jeremy wrote:
> On Mon, 2005-Jan-24 00:27:38 +0100, Stefan Eßer wrote:
> >The TX threshold messages issued by the dc driver appear more as an
> >indication that the PCI bus is under severe load, than as a hint that
> >the dc driver is causing the reboots, IMHO.
> 
> Under Tru64, I've seen Tulip cards report backing off to a 1024-byte FIFO
> threshold and then switching to store-and-forward.  That's about 100µsec
> latency and nothing should be holding the PCI bus that long.  I am inclined
> to believe that something is stuffed in the PCI interface logic in the NIC.

According to my Tru64 colleagues, not all Tulip chip versions are born
equal; some are better than others.  I do not have more detail here.

-- 
Wilko Bulte [EMAIL PROTECTED]


Re: NIC card problems....

2005-01-24 Thread Nick Barnes
At 2005-01-24 00:00:32, "Net Virtual Mailing Lists" writes:
> Hello Stefan (and everyone else!),
> 
> Thank you for your great comments!  I think I have a 3c9xx card around
> here somewhere, I will give that a shot when it reboots the next time
> (just to see).  It looks like for future systems I'll standardize on the
> Intel fxp-based cards, I really appreciate that advice!

One last piece of advice: get real EtherExpress cards.  A number of
Intel motherboards come with onboard Ethernet interfaces (maybe they
all do now) which the fxp driver understands, but the performance of
those onboard interfaces is reputed to be much lower.

Nick Barnes


Re: NIC card problems....

2005-01-23 Thread Warner Losh
> The TX threshold messages issued by the dc driver appear more as an
> indication that the PCI bus is under severe load, than as a hint that
> the dc driver is causing the reboots, IMHO.

I've also seen them when the card is a CardBus card, which may
indicate that some slightly pessimal performance parameters are set at
the PCI-CardBus bridge... 

Warner


Re: NIC card problems....

2005-01-23 Thread Peter Jeremy
On Mon, 2005-Jan-24 00:27:38 +0100, Stefan Eßer wrote:
>The TX threshold messages issued by the dc driver appear more as an
>indication that the PCI bus is under severe load, than as a hint that
>the dc driver is causing the reboots, IMHO.

Under Tru64, I've seen Tulip cards report backing off to a 1024-byte FIFO
threshold and then switching to store-and-forward.  That's about 100µsec
latency and nothing should be holding the PCI bus that long.  I am inclined
to believe that something is stuffed in the PCI interface logic in the NIC.

-- 
Peter Jeremy


Re: NIC card problems....

2005-01-23 Thread Net Virtual Mailing Lists
Hello Stefan (and everyone else!),

Thank you for your great comments!  I think I have a 3c9xx card around
here somewhere, I will give that a shot when it reboots the next time
(just to see).  It looks like for future systems I'll standardize on the
Intel fxp-based cards, I really appreciate that advice!


As for what might actually be causing this crash: I just checked the PCI
configuration and don't see anything in the BIOS that would let me change
any of the settings you mentioned - is that correct?  Or are you saying
that I am simply pushing past the limits of what this hardware (and its
PCI bus) is capable of?

For whatever it is worth, the only PCI cards I have are this NIC and a VGA
card (which doesn't run anything GUI-like).  I am using the onboard IDE
controller; I'm not sure whether that counts as a "PCI card" for this
purpose.  There are no sound cards or anything like that installed, and the
motherboard has no built-in audio.  I can copy down all of the PCI settings
in my BIOS setup, and what they are set to, if you think that would be
helpful here (I would have done it already, but I am not sure whether you
are hinting that something may be wrong with the BIOS configuration).
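
(If it is easier, I could also just send the output of something like the
following, which I assume captures most of the PCI information you would
want to see:)

  pciconf -l            # list the PCI devices the kernel found
  dmesg | grep -i pci   # boot-time PCI probe messages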

I will say that what you are describing could very well be the case; I've
got two disks (one on each of the two built-in controllers) running
pretty hot-and-heavy during most of this too.

- Greg

>A master latency timer value of 32 (0x20) should keep the bus-master
>switch overhead down to 20% (i.e. 80% left for data transfers) and
>should keep the latency in the range of 1 microsecond per bus-master
>(i.e. 5 microseconds if there are 2 Ethernet cards, 2 disk controllers
>and one host bridge active at the same time). In that case, each PCI
>device could expect to transfer 100 bytes every 5 microseconds. A
>buffer of 128 bytes ought to suffice for a fast Ethernet card, in
>that case.
>
...
>
>The TX threshold messages issued by the dc driver appear more as an
>indication that the PCI bus is under severe load, than as a hint that
>the dc driver is causing the reboots, IMHO.
>
>Regards, STefan
>




Re: NIC card problems....

2005-01-23 Thread Stefan Eßer
On 2005-01-23 05:57 -0800, Net Virtual Mailing Lists <[EMAIL PROTECTED]> wrote:
> My latest problem is with a:
> 
> dc0:  port 0xe800-0xe8ff mem 0xe6080000-0xe60803ff irq 11 at device 10.0 on pci0
> dc0: Ethernet address: 00:0c:41:ed:ae:85
> 
> ...  after several hours of *HEAVY* (I'm probably understating this)
> utilization I get:
> 
> dc0: TX underrun -- increasing TX threshold
> dc0: TX underrun -- increasing TX threshold
> .. repeats numerous times..
> dc0: TX underrun -- using store and forward mode

Well, that's nothing to worry too much about ...

The device has a data FIFO that is filled with data words fetched
by its bus-master DMA engine via the PCI bus.  As an optimization,
transmission of an Ethernet frame may optionally start before all data
for the frame has been put into the FIFO.  Normally, data is fetched by
DMA faster than it is sent out on the Ethernet, but if there is a
significant number of simultaneous PCI data transfers by other devices,
there is a risk that the FIFO runs out of data (buffer underflow).  In
such a case, the current Ethernet frame cannot be finished (dummy data
and a bogus CRC are appended to make the receiving party discard the frame).

If such a situation occurs, the driver increases the amount of data
required in the FIFO before transmission of the next Ethernet frame
starts.  After multiple underruns, the driver configures the Ethernet
chip to buffer the full contents of each frame (store-and-forward mode)
and gives up on the early transmission start, since the hardware
apparently is not capable of providing data fast enough.

> .. at this point the system simply reboots.  I have attempted to apply a

Then there definitely is a bug somewhere, but not necessarily in the
dc driver.  Instead of switching Ethernet cards, you may want to check
the PCI performance of your mainboard.  The PCI latency timers (the master
latency timer and the individual timers in each device) may play a role.

They determine the maximum "time slice" granted to each bus-master in turn.
Too small a value causes the PCI bus throughput to suffer (because a few
PCI bus clocks are lost each time the next bus-master takes over), while
too high a value may cause a device to starve while waiting to be granted
access to the bus.

A master latency timer value of 32 (0x20) should keep the bus-master
switch overhead down to 20% (i.e. 80% left for data transfers) and
should keep the latency in the range of 1 microsecond per bus-master
(i.e. 5 microseconds if there are 2 Ethernet cards, 2 disk controllers
and one host bridge active at the same time). In that case, each PCI
device could expect to transfer 100 bytes every 5 microseconds. A
buffer of 128 bytes ought to suffice for a fast Ethernet card, in
that case.
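
If you want to see what value the BIOS actually programmed, pciconf(8)
can show it.  On versions of the tool that support reading configuration
space registers, something along these lines should work (the selector
comes from the pciconf -l output and its exact syntax differs between
FreeBSD versions; the latency timer is the byte at configuration offset
0x0d):

  pciconf -l                   # find the selector, e.g. dc0@pci0:10:0
  pciconf -r pci0:10:0 0x0c    # bits 15..8 of this dword are the latency timer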

But this is a simplified view and calculation.  Devices may hold the
PCI bus longer than the granted time slice if slow devices insert (what
used to be called) "wait cycles".  (And some sound cards have been
reported to cause high PCI load even when idle.)

The TX threshold messages issued by the dc driver appear more as an
indication that the PCI bus is under severe load, than as a hint that
the dc driver is causing the reboots, IMHO.

Regards, STefan


Re: NIC card problems....

2005-01-23 Thread Pete French
> You know, I don't really care what NIC I use - I really don't.  I'm not
> so much interested in trying to figure out why this NIC is giving me
> grief as much as I am in finding one that will work.  I would just like
> someone somewhere to tell me what is a stable NIC to use for FreeBSD,

As several other people have mentioned, Intel fxp cards work.  I have
used a wide variety of these on several systems; they all perform
excellently and I have never had a problem.  I am currently running
them in a number of production environments.

-pcf.


Re: NIC card problems....

2005-01-23 Thread Chuck Swiger
Net Virtual Mailing Lists wrote:
> [ ... ]
> [ ... ]  It would be nice if somewhere there was
> some statement of a "fact" that NIC  is known to work well with
> FreeBSD.  I'm aware of all of the FUD out there, about people beating
> their chests saying how wonderful NIC-A is or NIC-B is, and I've tried
> 'em all and had problems with each and every one of them so far.  Surely
> someone out there must use FreeBSD in an environment where the "network
> is the bottleneck"^2 - right?
I have never had a problem with an Intel fxp0 NIC.
My company has probably used over fifty of them.
I have never had a problem with the 3Com 9xx xl0 NICs either;
I've had about ten of those around over the years.

DEC-branded 21x4x Tulip cards have also been good, but three out of four 
Asante cards using a PNIC Tulip lookalike have failed on me, and I'm dubious 
about the last one.  Avoid Tulip clones.


I've not had problems with sis0 (NatSemi DP83815), but I haven't used enough 
of them to generalize.  I've seen mild problems with vr0 (VIA chipsets) and 
Broadcom chips, and the Realtek and NE2000 clones that ISPs give to their 
customers for free may not be worth even that much.

--
-Chuck


Re: NIC card problems....

2005-01-23 Thread Peter Jeremy
On Sun, 2005-Jan-23 05:57:53 -0800, Net Virtual Mailing Lists wrote:
>  It would be nice if somewhere there was
>some statement of a "fact" that NIC  is known to work well with
>FreeBSD.

I recall seeing quite a few such statements about different cards over
the years.  In general, such statements have a limited lifetime because
NIC vendors have an ongoing tendency to "improve" their NICs to the
point of incompatibility.  There are often useful comments about the
NICs at the top of the driver code for that NIC (though there's nothing
about your AN985).
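
If you have the source tree installed, those comments are easy to get at;
on a 4.x system the dc(4) driver lives under sys/pci, so something like

  less /usr/src/sys/pci/if_dc.c   # the chip notes are in the comment block at the top

is all it takes (adjust the path if your sources live elsewhere).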

>...  after several hours of *HEAVY* (I'm probably understating this)
>utilization I get:
>
>dc0: TX underrun -- increasing TX threshold
>dc0: TX underrun -- increasing TX threshold
>.. repeats numerous times..
>dc0: TX underrun -- using store and forward mode

Real DEC Tulip cards do this when running Tru64 as well.  My guess is that
it's a bug in the NIC.  (And it looks like ADMtek have copied it).

>.. at this point the system simply reboots.

This is undesirable.  However, you have not provided any information that
would allow anyone to assist you.  Please have a look at
http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
http://www.freebsd.org/doc/en_US.ISO8859-1/articles/problem-reports/article.html

>  I have attempted to apply a
>patch () which I
>found which patches sys/pci/if_dcreg.h and sys/pci/if_dc.c.

This PR is closed, which means that the patches (or functional equivalents)
should already have been applied.

>  I would just like
>someone somewhere to tell me what is a stable NIC to use for FreeBSD,

I've been using Intel EtherExpress Pro/100+ cards (fxp driver) in some
systems where the network gets hammered and haven't had any problems.

-- 
Peter Jeremy


NIC card problems....

2005-01-23 Thread Net Virtual Mailing Lists
Hello,


I regret that I have never posted to this list before, despite the fact that
I have been using FreeBSD in one form or another for many years now (since
the 2.x era).  I'm a bit cranky, so please do not take any of this the wrong
way.  I have a system running FreeBSD 4.10-STABLE (bites tongue).

In any event, I just have to ask: what the heck is up with support for
network cards?... I mean, seriously, no offense intended to all of the
folks who work so hard on this stuff, but every single FreeBSD system I
have installed has at one time or another encountered *some* problem
relating to its network card.  It would be nice if somewhere there was
some statement of a "fact" that NIC  is known to work well with
FreeBSD.  I'm aware of all of the FUD out there, about people beating
their chests saying how wonderful NIC-A is or NIC-B is, and I've tried
'em all and had problems with each and every one of them so far.  Surely
someone out there must use FreeBSD in an environment where the "network
is the bottleneck"^2 - right?

By this I mean it is a development system which runs 4 simultaneous
processes 24x7 (and they use network bandwidth in spades - most of it
across a 100Mb pipe), a web server, postgres, and an NFS server -- some of
these processes needed to have MAXDSIZ increased to support up to 1GB (I
set MAXDSIZ="(1024*1024*1024)"), and yes, this excessive memory
consumption is beyond my control...  The system has 2GB of RAM with 8GB
of swap (just to make sure!) and is a 600MHz AMD K6... I've ensured that
all of these "memory intensive" processes are forked so that they are
able to release their memory when finished.  I don't really care how slow
it runs -- as long as it runs reliably.  I know that this may not be an
ideal configuration given what I'm asking of this limited hardware, but
I did not experience any reliability problems prior to installing FreeBSD
on this exact same hardware (more about this later), with similar tweaks
made just to make sure everything is able to run (regardless of actual
performance).

My latest problem is with a:

dc0:  port 0xe800-0xe8ff mem 0xe6080000-0xe60803ff irq 11 at device 10.0 on pci0
dc0: Ethernet address: 00:0c:41:ed:ae:85

...  after several hours of *HEAVY* (I'm probably understating this)
utilization I get:

dc0: TX underrun -- increasing TX threshold
dc0: TX underrun -- increasing TX threshold
.. repeats numerous times..
dc0: TX underrun -- using store and forward mode


.. at this point the system simply reboots.  I have attempted to apply a
patch () which I found, which patches sys/pci/if_dcreg.h and
sys/pci/if_dc.c.  The patch file was not correct for my version, and it
looks to me as if sys/pci/if_dcreg.h already had the changes applied, but
I went ahead and added the few minor changes it had for sys/pci/if_dc.c,
just on a whim (I should probably note that my version of if_dc.c before
the patch was:
"$FreeBSD: src/sys/pci/if_dc.c,v 1.9.2.56 2004/04/22 22:03:27 ru Exp").

.. In any event, my pager just went off again, the system rebooted, again
with the exact same symptom.

You know, I don't really care what NIC I use - I really don't.  I'm not
so much interested in trying to figure out why this NIC is giving me
grief as much as I am in finding one that will work.  I would just like
someone somewhere to tell me what is a stable NIC to use for FreeBSD,
minus all the speculation as to what might or might not work correctly. 
Just tell me what to buy and I'll go buy it, really I will!  If I need a
specific chipset with a specific revision, that's fine too; just please
tell me what it is.  Hell, at this point I'll even look for NICs from
specific lot numbers.  I'm getting ready to roll out a production version
of what I've been testing for the last few months, and I simply would like
to know what NIC I can depend on not to give me these sorts of fits.

Thanks and again sorry if I seem frustrated, but you are getting a 10
second summary of 8 years of frustration.  Maybe I just have bad luck and
always get the "bad NIC", I dunno.

Other than this recurring issue with support for NIC cards, I've
found FreeBSD to be absolutely rock solid.  It is a real shame that you
guys don't get more support from the hardware manufacturers.  I truly
wish that I were savvy enough to even know where to begin debugging
something like this or to help with the development effort.  Perhaps
someday I will be, who knows.

For whatever it's worth, this exact system had been running Solaris 2.6
(x86 of course) for the last couple of years and never crashed for any
reason, ever.  I installed FreeBSD 4.10 because it is what I had lying
around and it is what this client wants to use for their project.  Over
the past week these crashes have occurred several times throughout the
day, and obviously I can't release this project until I am satisfied
that it will run reliably.

..God you have no idea how much I hated writing 90% of this, I hope nobody
takes it t