>>> Did some research and found this sometimes occurs when the
>>> speed of the server NIC is so much faster than the client NIC.
>>> Apparently this showed up with the 2.6 kernel.
>       [...]
>> Could not find where I saw the original write-up.  But it has to
>> do with the NIC in the server box being 1 Gb and the NIC in the test
>> laptop being only 100 Mb.
>
>
>I suspect that such mismatches are not the problem, because the
>fact that *any* bits are flowing means that a whole bunch of
>low-level plumbing is working properly.

Not at all!  In my NFS on Tru64 Unix days, I was horrified at the
abysmal buffering on some top-of-the-line Gigabit Ethernet switches.
A typical box could buffer 350 KB, i.e. 2.8 Mb, and that is about 3 msec
of GbE wire time.  Silly me figured a switch could buffer at least a
second's worth of data before discarding anything, maybe 100 msec at
worst.  I would love to get my paws on the marketroid who suggested
people sell GbE for a corporate backbone and servers, with 100 Mb to
the workstations.
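
Spelled out, that wire-time arithmetic looks like this (a minimal
Python sketch, taking the 350 KB as a round decimal figure):

    buffer_bytes = 350_000           # ~350 KB of switch buffering
    line_rate_bps = 1_000_000_000    # 1 Gb/s GbE line rate

    buffer_bits = buffer_bytes * 8   # 2.8 Mb
    wire_time_ms = buffer_bits / line_rate_bps * 1000.0
    print(f"{buffer_bits / 1e6:.1f} Mb of buffer = {wire_time_ms:.1f} ms at GbE")
    # -> 2.8 Mb of buffer = 2.8 ms at GbE, i.e. about 3 msec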

In my case, we typically had eight 48 KB NFS data messages in flight
(UDP) or eight 64 KB messages when using NFS over TCP, i.e. 384-512 KB.
Oops.  (And I used 1 MB TCP window sizes, in part to get around an
interesting problem with BSD's TCP code sharing a socket across
multiple threads, and in part to get a "bandwidth x delay" product
big enough for full GbE on a network with a delay as long as 8 msec.
I couldn't bring myself to make it bigger.)
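
The in-flight and bandwidth x delay sums, in the same style (a sketch
using the figures above):

    udp_in_flight = 8 * 48 * 1024    # eight 48 KB UDP messages = 384 KB
    tcp_in_flight = 8 * 64 * 1024    # eight 64 KB TCP messages = 512 KB

    bandwidth_bps = 1_000_000_000    # 1 Gb/s
    delay_s = 0.008                  # 8 msec of delay

    bdp_bytes = bandwidth_bps * delay_s / 8    # bits -> bytes
    print(f"in flight: {udp_in_flight // 1024}-{tcp_in_flight // 1024} KB; "
          f"BDP at 1 Gb/s x 8 msec: {bdp_bytes / 1e6:.0f} MB")
    # -> in flight: 384-512 KB; BDP at 1 Gb/s x 8 msec: 1 MB
    # hence the 1 MB TCP window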

Given the cost of memory these days, I figured something other than
RAM must have been behind the atrocious buffering, and found a big
clue in a Cisco document that talked about the FIFOs connecting
sections in a GbE switch.  Those are basically large, byte-at-a-time
shift registers and far more expensive than DRAM.  On the smallish
switches I had, I couldn't find statistics in the boxes for the
number of messages discarded due to congestion.  I suspect that
omission was a marketing solution to what would otherwise have been
an obvious customer support issue.

I never tried buying/borrowing consumer grade switches to abuse, but
I'm sure I'd be appalled.

Last I looked, implementations of TFTP never put much data on the
wire, so that may not be the problem.  However, I can imagine a
zillion ways something else could screw up.  At any rate, without
good network traces (at both the 1 GbE and 100 Mb ends), all
we can do is speculate about where things might be failing.
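
The reason TFTP keeps so little on the wire is that classic TFTP
(RFC 1350) is lock-step: a single 512-byte DATA block is outstanding
at a time, so at most about half a KB is ever in flight.  A schematic
Python sketch, where send_block() and wait_for_ack() are hypothetical
stand-ins for the real UDP socket calls:

    BLOCK_SIZE = 512   # classic TFTP data block size

    def tftp_send(data, send_block, wait_for_ack):
        block_num, offset = 1, 0
        while True:
            chunk = data[offset:offset + BLOCK_SIZE]
            send_block(block_num, chunk)   # exactly one block in flight...
            wait_for_ack(block_num)        # ...until the peer ACKs it
            if len(chunk) < BLOCK_SIZE:    # a short final block ends the transfer
                return
            offset += BLOCK_SIZE
            block_num += 1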

Using a homogeneous speed across a network doesn't always work well
either.  One NFS client reading from two servers can clog its
incoming link just fine.
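
The arithmetic there is even simpler (a sketch, assuming GbE
everywhere):

    servers = 2                      # two GbE servers feeding one client
    link_gbps = 1.0                  # every link runs at the same 1 Gb/s
    offered_gbps = servers * link_gbps
    print(f"{offered_gbps:.0f} Gb/s offered to a {link_gbps:.0f} Gb/s link: "
          f"{offered_gbps / link_gbps:.0f}:1 oversubscription")
    # -> 2 Gb/s offered to a 1 Gb/s link: 2:1 oversubscription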

   -Ric Werme