Patched RealTek driver -- please test

1999-04-05 Thread Bill Paul
Okay, today (and over part of the weekend) I ripped the RealTek driver
apart and put it back together again, this time in a hopefully working
form. The temporary patch version is at the following locations:

http://www.freebsd.org/~wpaul/RealTek/test/2.2  source for FreeBSD 2.2.x
http://www.freebsd.org/~wpaul/RealTek/test/3.0  source for FreeBSD 3.x/4.x

If you've been having problems with RealTek 8139 cards, please try this
version and let me know if it makes a difference. All of the main
changes are in the transmit code. I also think I know why the transmitter
was getting wedged. The short answer: I'm a twit. The long answer:
when ifinit() was changed so that it warned about ifq_maxlen not being set
by the driver, I went in and set it to RL_TX_LIST_CNT - 1, which is
approximately what I'd done for the other drivers. However, the RealTek
only has four transmit 'descriptors', which means the ifq_maxlen for the
interface was being set to the ridiculously low value of 3. This causes
transmissions of large packet sequences to quickly fill up the send
queue. (For example, try doing a ping -s 8100 and see if it
actually works. My bet is that it won't, because this will generate a
series of six or seven frames in rapid succession, and after the first
3 or 4, the queue fills up.)
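
To put rough numbers on the ping example, here is a minimal sketch in C. It is
not taken from if_rl.c. It assumes a 1500-byte Ethernet MTU, a 20-byte IP
header with no options, an 8-byte ICMP header, and a worst case in which the
driver dequeues nothing while the burst is being queued. The function names
frag_count() and burst_drops() are hypothetical.

```c
#include <assert.h>

/* Assumed values: plain Ethernet, IPv4 without options. */
#define ETHER_MTU	1500
#define IP_HDR_LEN	20
#define ICMP_HDR_LEN	8

/* Number of IP fragments needed for an ICMP echo carrying icmp_data_len bytes. */
static int
frag_count(int icmp_data_len)
{
	int ip_payload, per_frag;

	assert(icmp_data_len >= 0);
	ip_payload = icmp_data_len + ICMP_HDR_LEN;
	per_frag = ETHER_MTU - IP_HDR_LEN;	/* 1480 bytes per fragment */

	/* Round up: a partial last fragment still needs its own frame. */
	return (ip_payload + per_frag - 1) / per_frag;
}

/*
 * Worst case: the whole burst arrives before anything is dequeued,
 * so every frame beyond maxlen is dropped.
 */
static int
burst_drops(int frames, int maxlen)
{
	assert(frames >= 0 && maxlen >= 0);
	return frames > maxlen ? frames - maxlen : 0;
}
```

Under those assumptions frag_count(8100) is 6 and burst_drops(6, 3) is 3, so
up to half of the burst can be lost when the queue limit is 3.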

In addition to fixing this, I also re-wrote rl_start() and rl_txeof()
to hopefully be a little simpler and less brain damaged. I still need
to fill in rl_txeoc() correctly, but once I know for sure that I've
fixed all the major problems, I can probably do that in an hour or two.

I experimented with this driver version using a FreeBSD 2.2.7 server and
a FreeBSD 3.0 client (sorry, it's all I had) and I couldn't get NFS
to hang. I also bombarded the server with a TCP stream from the client
while the NFS test was running and it didn't lock up.

-Bill

-- 
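=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wp...@ctr.columbia.edu | Center for Telecommunications Research
Home:  wp...@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
"Mulder, toads just fell from the sky!" "I guess their parachutes didn't open."
=============================================================================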


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-current" in the body of the message



Re: Patched RealTek driver -- please test

1999-04-05 Thread mestery

Hi Bill,

Just tried the new RealTek driver with 4.0, and it works MUCH better. I
was seeing weird NFS transmit errors with UDP, and the ping you
suggested did not work with the old driver (it reported an out of
buffer space message and didn't transmit any packets). With the new
driver, the ping works, and NFSv3 with UDP works flawlessly. Also,
FTPing into my machine with the RealTek used to yield a max of 66K per
second. Now I see closer to 2MB per second. Looks good so far!
Thanks!

--
Kyle Mestery
StorageTek's Storage Networking Group
Protect your right to privacy: www.freecrypto.org







Re: Patched RealTek driver -- please test

1999-04-05 Thread Stephen Hocking-Senior Programmer PGS Tensor Perth
Well, I nipped home over my lunch break & gave it a try - some progress, of a
sort. My NFS problems have gone away (at least under light activity), but it
now seems rather sensitive to sending lots of stuff. The symptoms observed are
a hard hang of the whole machine, no response to pings or keyboard action. I
can't even break into DDB. How I reproduced this is as follows - get the
netpipe program off ports, then set up a receiver on the non-RealTek machine
as follows -

NPtcp -s -r

Then on the RealTek machine do this -

NPtcp -s -t -h non-realtek-hostname -P

After about 5 or so lines of throughput stats, it dies in the bum.



Stephen
-- 
  The views expressed above are not those of PGS Tensor.

"We've heard that a million monkeys at a million keyboards could produce
 the Complete Works of Shakespeare; now, thanks to the Internet, we know
 this is not true." - Robert Wilensky, University of California







Re: Patched RealTek driver -- please test

1999-04-05 Thread Bill Paul
Of all the gin joints in all the towns in all the world, Stephen
Hocking-Senior Programmer PGS Tensor Perth had to walk into mine and say:
 
> Well, I nipped home over my lunch break & gave it a try - some progress, of a
> sort. My NFS problems have gone away (at least under light activity), but it
> now seems rather sensitive to sending lots of stuff. The symptoms observed are
> a hard hang of the whole machine, no response to pings or keyboard action. I
> can't even break into DDB. How I reproduced this is as follows - get the
> netpipe program off ports, then set up a receiver on the non-RealTek machine
> as follows -

[chop]

Sorry, I did traffic generation tests. I banged on it as hard as I could.
I didn't have any problems with lockups.

> NPtcp -s -r
> 
> Then on the RealTek machine do this -
> 
> NPtcp -s -t -h non-realtek-hostname -P
> 
> After about 5 or so lines of throughput stats, it dies in the bum.

Don't tell me 'after about 5 lines.' Tell me in minutes. Seconds.
Hours. Weeks. How _LONG_ does it run before it locks up?! And what
do the stats say, anyway?

Alright, now see here: I put up yet another test version. This one
has code in every conceivable place where the driver might get caught
in an infinite loop (which is the only thing that might cause the
system to appear to hang, short of executing a halt instruction).
Same place (http://www.freebsd.org/~wpaul/RealTek/test/3.0). Try _THIS_
version. Tell me if it locks up. Watch the console. Tell me if you
see any messages (i.e. "rl0: looping in "). If you do, report
them to me VERBATIM. (No paraphrasing, no inventing new messages
which you think represent what you saw. VERBATIM. Or else.)
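
A bounded-spin guard of the kind described might look roughly like the sketch
below. This is a hypothetical illustration, not the code in the test driver:
the iteration limit, the function names, and the callback-based shape are all
invented here, and the message text only mimics the "rl0: looping in" prefix
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define LOOP_LIMIT 1000	/* hypothetical cap on iterations */

/*
 * Call ready(arg) until it returns nonzero or LOOP_LIMIT is reached.
 * Returns 0 if the condition was met, -1 if the loop was abandoned
 * (in which case a console message names the place that was spinning).
 */
static int
bounded_wait(int (*ready)(void *), void *arg, const char *where)
{
	int i;

	assert(ready != NULL && where != NULL);
	for (i = 0; i < LOOP_LIMIT; i++) {
		if (ready(arg))
			return 0;
	}
	printf("rl0: looping in %s\n", where);
	return -1;
}

/* Demo condition that never becomes true (stands in for a stuck chip). */
static int
never_ready(void *arg)
{
	(void)arg;
	return 0;
}

/* Demo condition that becomes true after *arg calls. */
static int
countdown_ready(void *arg)
{
	int *remaining = arg;

	return --(*remaining) <= 0;
}
```

With a guard like this, a loop that would otherwise spin forever gives up
after a fixed number of tries and leaves a message on the console instead of
hanging the machine silently.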

If it still locks up and you don't see any errors, then the problem
you're experiencing is either not related to the driver, or is related
in some way that only manifests itself on your hardware and which I
will never be able to reproduce (since your hardware is over there,
and I'm way over here). Maybe it's some peculiar kind of hardware
fault. Maybe your PCI chipset blows. Maybe the RealTek blows when used
in combination with your PCI chipset. Regardless, it's a condition
which I can't reproduce on my own hardware, and if I can't reproduce
the problem, I can't fix it.

-Bill




Re: Patched RealTek driver -- please test

1999-04-06 Thread Stephen Hocking-Senior Programmer PGS Tensor Perth
OK - I've banged on the new version with extra debug messages and it still
locks up, but without any messages! I can only conclude that the 486
motherboard's BIOS is iffy. I haven't tried any other slots in the MB, but
have tried various PCI settings, all to no avail. I have swapped the de0 and
the rl0 between machines, and the rl0 is happy in its new home - hasn't fallen
over, although its netpipe performance sucks with very small packets. I think
we can write this one off as a faulty PCI implementation on the 486
motherboard. Thanks for your patience & time.

Stephen







Re: Patched RealTek driver -- please test

1999-04-06 Thread Bill Paul
Of all the gin joints in all the towns in all the world, Stephen
Hocking-Senior Programmer PGS Tensor Perth had to walk into mine and say:
 
> OK - I've banged on the new version with extra debug messages and it still
> locks up, but without any messages!

Grr.

> I can only conclude that the 486 motherboard's BIOS is
> iffy. I haven't tried any other slots in the MB, but have tried various PCI
> settings, all to no avail. I have swapped the de0 and the rl0 between
> machines, and the rl0 is happy in its new home - hasn't fallen over, although
> its netpipe performance sucks with very small packets. I think we can write
> this one off as a faulty PCI implementation on the 486 motherboard. Thanks for
> your patience & time.

I have one more thing you can try for me (I hope it's not too much trouble
to put the NIC back where it was). This latest test version has a small
change to rl_start() which modifies the transmit behavior: instead of
trying to fill up as many transmit 'descriptors' as possible, it should
never be possible now to have more than one transmission in progress at
any one time. That is, instead of trying to fill up all four TX
'descriptors' and issue four transmissions in rapid succession and then
waiting to clean up the buffers later, it issues a single transmission,
waits for completion, then issues another transmission, waits for
completion, and so on.
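
The difference between the two transmit strategies can be sketched with a toy
model in C. This is only an illustration of the idea, not the actual
rl_start() or rl_txeof() code: the struct, its fields, and the model_*
function names are hypothetical, and RL_TX_LIST_CNT is taken to be 4 because
the chip has four transmit 'descriptors'.

```c
#include <assert.h>

#define RL_TX_LIST_CNT 4	/* assumed: four transmit 'descriptors' */

struct tx_model {
	int pending;	/* frames waiting in the send queue */
	int in_flight;	/* frames handed to the chip, not yet completed */
	int limit;	/* max simultaneous transmissions allowed */
};

/* Hand frames to the chip until the limit is hit; returns how many started. */
static int
model_start(struct tx_model *m)
{
	int started = 0;

	assert(m->limit >= 1 && m->limit <= RL_TX_LIST_CNT);
	while (m->pending > 0 && m->in_flight < m->limit) {
		m->pending--;
		m->in_flight++;
		started++;
	}
	return started;
}

/* One transmission completed: free its slot, then try to start more. */
static int
model_txeof(struct tx_model *m)
{
	if (m->in_flight > 0)
		m->in_flight--;
	return model_start(m);
}
```

With limit set to RL_TX_LIST_CNT, model_start() issues up to four frames back
to back. With limit set to 1 it issues one, and the next frame only goes out
when model_txeof() reports the previous one done.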

This will probably worsen performance at 100Mbps, but it would be
interesting to see if it fixes your problem. Please try it and let me
know what happens. (I left the loop detection code in place just for
giggles.)

-Bill




Re: Patched RealTek driver -- please test

1999-04-06 Thread Bill Paul
Whoops... I just noticed I made a small boo-boo in that last patch,
which I just fixed. When downloading, make sure you get the version of
if_rl.c with the following ID strings:

for 3.0: $Id: if_rl.c,v 1.28 1999/04/06 15:29:01 wpaul Exp $
for 2.2: $Id: if_rl.c,v 1.17 1999/04/06 15:29:26 wpaul Exp $

Sorry about that.

-Bill




Re: Patched RealTek driver -- please test

1999-04-07 Thread Stephen Hocking-Senior Programmer PGS Tensor Perth

This version survived for a little longer, but hung (on the 486 box) whilst
doing a recursive ls of a large directory tree. Again, no messages, except
for one which came up as the box was booting, whilst it was starting squid.
The box was OK for about 4 minutes after this message, which was

"rl0: watchdog timeout"

Hope this helps.


Stephen

