Re: Abysmal RECV network performance
Hi,

I'm guessing that the tulip driver is not setting the chip up correctly.
I've seen this happen with other tulip variants (21143) when the driver
tries to autonegotiate. If you run ifconfig eth1 you will see numerous
carrier and CRC errors.

Set the tulip_debug flag to 2 or 3 in /etc/modules.conf and see what the
driver logs. A newer version of the driver may help you. You might try
the one on SourceForge.

Also, I've only ever seen full 100BaseT speeds with decent adapters, like
21143-based tulips, Intel eepros, and vortex/boomerang 3Com cards. A lot
of the cheaper controllers just won't get there.

skd

On Mon, May 28, 2001 at 03:47:22AM +, John William wrote:
> Can someone please help me troubleshoot this problem - I am getting abysmal
> (see numbers below) network performance on my system, but the poor
> performance seems limited to receiving data. Transmission is OK.
>
> The computer in question is a dual Pentium 90 machine. The machine has
> RedHat 7.0 (kernel 2.2.16-22 from RedHat). I have compiled 2.2.19 (stock)
> and 2.4.3 (stock) for the machine and used those for testing. I had a
> NetGear FA310TX card that I used with the "tulip" driver and a 3Com 3CSOHO
> card (Hurricane chipset) that I used with the "3c59x" driver. I used the
> netperf package to test performance (latest version, but I don't have the
> version number off-hand). The numbers netperf is giving me seem to correlate
> well to FTP statistics I see to the box.
>
> I have a second machine (P2-350) with a NetGear FA311 (running 2.4.3 and the
> "natsemi" driver) that I used to talk with the Pentium 90 machine. The two
> machines are connected through a NetGear FS105 10/100 switch. I also tried
> using a 10BT hub (see below).
>
> When connected, the switch indicated 100 Mbps, full duplex connections to
> both cards. This matches the speed indicator lights on both cards. I have
> run the miidiag program in the past to verify that the cards are actually
> set to full duplex, but I didn't run it again this time (this isn't the
> first time I have tried to chase this problem down).
>
> For the purposes of this message, call the P2-350 machine "A" and the dual
> P-90 machine "B". I ran the following tests:
>
> Machine "A" to localhost        754.74 Mbps
>
> Kernel 2.2.19SMP
> Machine "B" to localhost         80.63 Mbps
> Machine "B" to "A" (tulip)       55.38 Mbps
> Machine "A" to "B" (tulip)       10.60 Mbps
> Machine "A" to "B" (3c59x)       12.10 Mbps
>
> Kernel 2.4.3 SMP
> Machine "B" to localhost         83.87 Mbps
> Machine "B" to "A" (tulip)       68.07 Mbps
> Machine "A" to "B" (tulip)        1.62 Mbps
> Machine "A" to "B" (3c59x)        2.37 Mbps
>
> Kernel 2.2.16-22 (RedHat kernel)
> Machine "B" to localhost         92.29 Mbps
> Machine "B" to "A" (tulip)       57.34 Mbps
> Machine "A" to "B" (tulip)        9.98 Mbps
> Machine "A" to "B" (3c59x)        9.05 Mbps
>
> Now, with both "A" and "B" plugged into a 10BT hub:
>
> Kernel 2.2.19SMP
> Machine "B" to "A" (tulip)        6.96 Mbps
> Machine "A" to "B" (tulip)        6.89 Mbps
>
> At the end of the runs, I do not see any messages in syslog that would
> indicate a problem. Using the switch, there were no collisions but looking
> at /sbin/ifconfig there were a lot of "Frame:" errors on receive. "A lot"
> means ~30% of the total packets received. This happened with both cards and
> all kernels.
>
> The conclusions I draw from this data are:
>
> 1) Both machines connecting to localhost (data not going out over the wire)
> give reasonable numbers and are considerably above what I actually see going
> over the network (as would be expected).
> 2) The P-90 machine seems to have good transmit speed over both cards and
> all kernels. Transmit performance is close to the localhost numbers, so I
> can believe them. In the past, I have compared the performance of the FA310
> to the 3ComSOHO card and there did not seem to be any measurable performance
> difference between the two.
> 3) Both the FA310 and the 3ComSOHO card have similar receive speeds, leading
> me to believe that the problem lies with either the machine or the kernel
> and not the individual cards or drivers.
> 4) Booting the machine as a uni-processor machine (with a non-SMP 2.2.16
> kernel) did not change anything, so it does not appear to be a problem with
> SMP.
> 5) Kernel 2.4.3 receive performance is significantly lower than either 2.2.x
> kernel, so that tends to point to some fundamental problem in the kernel.
> 6) As I understand it, the 3Com card has some hardware acceleration for
> checksumming, and this is a slow machine, so why is the performance almost
> identical to the FA310?
>
> So, my questions are:
>
> What kind of performance should I be seeing with a P-90 on a 100Mbps
> connection? I was expecting something in the range of 40-70 Mbps - certainly
> not 1-2 Mbps.
>
> What can I do to track this problem down? Has anyone else had problems like
> this?
>
> Thanks in advance for any help you can offer.
>
> - John
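For anyone wanting to put a number on the "Frame:" errors mentioned above rather than eyeballing ifconfig, here is a rough Python sketch that reads the receive counters from one line of /proc/net/dev. The sample line and the function name are made up for illustration, and whether errored frames are also counted in the packets column can vary by driver, so treat the ratio as approximate.

```python
def rx_frame_error_ratio(proc_net_dev_line):
    """Return frame errors divided by received packets for one interface line.

    Expects a line in /proc/net/dev layout: "iface: <8 rx counters> <8 tx counters>".
    Receive columns are: bytes packets errs drop fifo frame compressed multicast.
    """
    _iface, _sep, counters = proc_net_dev_line.partition(":")
    fields = counters.split()
    rx_packets = int(fields[1])  # second receive column
    rx_frame = int(fields[5])    # sixth receive column
    if rx_packets == 0:
        return 0.0
    return rx_frame / rx_packets


# Hypothetical counters, chosen only to mirror the ~30% figure in the report.
SAMPLE = "  eth1: 1500000 10000 3000 0 0 3000 0 0 900000 8000 0 0 0 0 0 0"

if __name__ == "__main__":
    print("frame error ratio: %.2f" % rx_frame_error_ratio(SAMPLE))
```

Sampling this before and after a netperf run and comparing the two would show whether the errors track the receive traffic.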
Re: magic device renumbering was -- Re: Linux 2.4.2ac20
Hi,

The solution is not to go down the path_to_inst road; that is full of its
own traps. You want volume labels via a volume manager (do LVM and RAID
already do this?) and/or filesystem labels (see e2label). This won't solve
all of the ills associated with device instance changes, but it will
certainly address the biggest one.

skd

On Wed, Mar 14, 2001 at 10:36:40AM -0500, John Jasen wrote:
>
> The problem:
>
> drivers change their detection schemes; and changes in the kernel can
> change the order in which devices are assigned names.
>
> For example, the DAC960(?) drivers changed their order of
> detecting controllers, and I did _not_ have fun, given that the machine in
> question had about 40 disks to deal with, spread across two controllers.
>
> This can create a lot of problems for people upgrading large, production
> quality systems -- as, in the worst case, the system won't complete the
> boot cycle; or in middle cases, the user/sysadmin is stuck rewriting X
> amount of files and trying again; or in small cases, you find out that
> your SMC and Intel ethernet cards are reversed, and have to go fix things
> ...
>
> Possible solutions(?):
>
> Solaris uses an /etc/path_to_inst file, to keep track of device ordering,
> et al.
>
> Maybe we should consider something similar, where a physical device to
> logical device map is kept and used to keep things consistent on
> kernel/driver changes; device addition/removal, and so forth ...
>
> I am, of course, open to better solutions.
>
> --
> -- John E. Jasen ([EMAIL PROTECTED])
> -- In theory, theory and practise are the same. In practise, they aren't.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
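To make the filesystem-label suggestion concrete, here is a small Python sketch that rewrites fstab-style text so that entries refer to LABEL=name instead of a device node. This is only an illustration of the idea: the function name, the sample fstab, and the device-to-label mapping are all hypothetical, and it assumes the filesystems have already been labelled and that the installed mount understands LABEL= entries.

```python
def use_labels(fstab_text, labels):
    """Replace device names in the first fstab column with LABEL=<name>.

    `labels` maps a device path (e.g. "/dev/sdb1") to the label that was
    already written to that filesystem. Comments and unknown devices are
    left untouched.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and not fields[0].startswith("#") and fields[0] in labels:
            # The device is the first token, so the first match is the one we want.
            line = line.replace(fields[0], "LABEL=" + labels[fields[0]], 1)
        out.append(line)
    return "\n".join(out)


# Hypothetical input, for illustration only.
FSTAB = "/dev/sda1 / ext2 defaults 1 1\n/dev/sdb1 /home ext2 defaults 1 2"
LABELS = {"/dev/sdb1": "home"}

if __name__ == "__main__":
    print(use_labels(FSTAB, LABELS))
```

Once an entry is keyed on the label, a change in controller detection order no longer changes which filesystem gets mounted at /home.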
Re: ARP out the wrong interface
Hi,

What you describe below is having the client mis-addressed to have the
same IP as the server. Is this what you meant?

skd

On Thu, Feb 08, 2001 at 09:09:49PM -0800, dean gaudet wrote:
> this appears to occur with both 2.2.16 and 2.4.1.
>
> server:
>
> eth0 is 192.168.250.11 netmask 255.255.255.0
> eth1 is 192.168.251.11 netmask 255.255.255.0
>
> they're both connected to the same switch.
>
> client:
>
> eth0 is 192.168.251.11 netmask 255.255.255.0
>
> connected to the same switch as both of server's eth.
>
> on client i try "ping 192.168.251.11".
>
> responses come back from both eth0 and eth1, listing each of their
> respective MAC addresses... it's essentially a race condition at this
> point as to whether i'll get the right MAC address. ("right" means the
> MAC for server:eth1).
>
> client# tcpdump -n arp
> Kernel filter, protocol ALL, datagram packet socket
> tcpdump: listening on all devices
> 21:03:05.695089 eth0 > arp who-has 192.168.251.11 tell 192.168.251.25
>     (0:3:47:0:25:80)
> 21:03:05.695405 eth0 < arp reply 192.168.251.11 is-at 0:d0:b7:be:3e:aa
>     (0:3:47:0:25:80)
> 21:03:05.695523 eth0 < arp reply 192.168.251.11 is-at 0:d0:b7:1f:ea:35
>     (0:3:47:0:25:80)
>
>
> server# cat /proc/sys/net/ipv4/ip_forward
> 0
> server# cat /proc/sys/net/ipv4/conf/*/proxy_arp
> 0
> 0
> 0
> 0
> 0
> 0
> 0
>
> is this expected? it seems broken.
>
> thanks
> -dean
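A quick way to spot the symptom in a longer capture is to group the ARP replies by IP and flag any address answered by more than one MAC. The Python sketch below does that for lines in the tcpdump format quoted above; the function name is made up, and the regular expression assumes exactly the "arp reply X is-at Y" wording shown in that capture, which other tcpdump versions may print differently.

```python
import re
from collections import defaultdict

# Matches the "arp reply <ip> is-at <mac>" form seen in the quoted capture.
ARP_REPLY = re.compile(r"arp reply (\S+) is-at (\S+)")


def conflicting_arp_replies(lines):
    """Return {ip: sorted list of MACs} for IPs answered by more than one MAC."""
    seen = defaultdict(set)
    for line in lines:
        match = ARP_REPLY.search(line)
        if match:
            seen[match.group(1)].add(match.group(2))
    return {ip: sorted(macs) for ip, macs in seen.items() if len(macs) > 1}


# The three capture lines from the message, with the wrapped part rejoined.
CAPTURE = [
    "21:03:05.695089 eth0 > arp who-has 192.168.251.11 tell 192.168.251.25 (0:3:47:0:25:80)",
    "21:03:05.695405 eth0 < arp reply 192.168.251.11 is-at 0:d0:b7:be:3e:aa (0:3:47:0:25:80)",
    "21:03:05.695523 eth0 < arp reply 192.168.251.11 is-at 0:d0:b7:1f:ea:35 (0:3:47:0:25:80)",
]

if __name__ == "__main__":
    for ip, macs in conflicting_arp_replies(CAPTURE).items():
        print(ip, "answered by", ", ".join(macs))
```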
tulip autonegotiation patch
Hi,

This one-liner fixes a subtle 21143 autonegotiation problem for me on a
ZNYX quad card. The driver would claim to negotiate 100-FD, but would
report late collisions and bad transmit throughput.

The driver still allows packets to be transmitted during autonegotiation,
but that only drops a few packets.

skd

--- 21142.c.bad	Sun Jan 28 15:26:25 2001
+++ 21142.c	Sun Jan 28 11:51:59 2001
@@ -171,7 +171,7 @@
 		for (i = 0; i < tp->mtable->leafcount; i++)
 			if (tp->mtable->mleaf[i].media == dev->if_port) {
 				tp->cur_index = i;
-				tulip_select_media(dev, 0);
+				tulip_select_media(dev, 1);
 				setup_done = 1;
 				break;
 			}