Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On Mon, Mar 16, 2009 at 4:07 PM, Laurent CARON wrote: > Henning Brauer wrote: > >> this is extremely stupid. >> > > I know, I'm a very stupid guy ;) You are not the first... http://search.gmane.org/?query=stupid&author=henning&group=gmane.os.openbsd.misc&sort=relevance&DEFAULTOP=and&xP=Zstupid&xFILTERS=Gos.openbsd.misc-Ahennings---A
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Henning Brauer wrote: this is extremely stupid. I know, I'm a very stupid guy ;)
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Whilst I can't comment on the foibles of modern Cisco switches, I can certainly say that Cisco switches I've used somewhat more recently than fifteen years ago (but more than five) refused to autonegotiate to some servers. So far they remain the only switches I've had to manually set the speed and duplex on, AFAICR. Given that experience, I can certainly see why people might enforce a configuration for longer than might be necessary. PK - Original Message - From: "Michal" To: Sent: Monday, March 16, 2009 1:51 PM Subject: Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance Sorry but I worked for a very successful company in the UK that didn't use auto neg's on Cisco switches and routers so I wouldn't call it evil AT all, please explain why manual is evil. C -Original Message- From: owner-m...@openbsd.org [mailto:owner-m...@openbsd.org] On Behalf Of Henning Brauer Sent: 16 March 2009 13:29 To: OpenBSD Subject: Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance * Laurent CARON [2009-02-28 21:33]: Steve Shockley wrote: On 2/27/2009 8:43 AM, Laurent CARON wrote: - Forcing speed on switch - Forcing speed on nic Why? This practice made sense when 10baseT gear from different vendors wasn't compatible, but not for the last 15-20 years. This practice still makes sense, at least with broadcom cards. no, it is pure bullshit and the source of many many many errors. just because cisco failed miserably in implementing autoneg for years. even they managed now. so stop spreading this bullshit. autoneg is good. manual is evil. that simple. I always do force the speed on servers. this is extremely stupid. -- Henning Brauer, h...@bsws.de, henn...@openbsd.org BS Web Services, http://bsws.de Full-Service ISP - Secure Hosting, Mail and DNS Services Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On Mon, Mar 16, 2009 at 01:51:19PM -, Michal wrote:
| Sorry but I worked for a very successful company in the UK that didn't use
| auto neg's on Cisco switches and routers so I wouldn't call it evil AT all,
| please explain why manual is evil.

Manual is error-prone. If everything defaults to auto, everything should work and only broken setups (ancient hardware / broken drivers / etc) will give you problems. Broken setups are the things you want to get rid of, not work around.

Configuring speed and duplex is not a solution but a workaround. The workaround used to be commonplace (in my experience, it was mostly between 3com NICs and Cisco switches, YMMV) but that doesn't make it any less of a workaround.

Of course, the proper solution is to fix broken drivers / switches.

Workaround bad, mmmkay?

Paul 'WEiRD' de WEerd

--
>[<++>-]<+++.>+++[<-->-]<.>+++[<+ +++>-]<.>++[<>-]<+.--.[-]
http://www.weirdnet.nl/
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
* Michal [2009-03-16 14:56]:
> Sorry but I worked for a very successful company in the UK that didn't use
> auto neg's on Cisco switches and routers so I wouldn't call it evil AT all,
> please explain why manual is evil.

because it leads to errors, sooner or later, that are hard to debug. And not using autoneg doesn't solve a problem in the first place. Not using autoneg is stupid. period.

--
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Sorry but I worked for a very successful company in the UK that didn't use auto neg's on Cisco switches and routers so I wouldn't call it evil AT all, please explain why manual is evil. C -Original Message- From: owner-m...@openbsd.org [mailto:owner-m...@openbsd.org] On Behalf Of Henning Brauer Sent: 16 March 2009 13:29 To: OpenBSD Subject: Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance * Laurent CARON [2009-02-28 21:33]: > Steve Shockley wrote: >> On 2/27/2009 8:43 AM, Laurent CARON wrote: >>> - Forcing speed on switch >>> - Forcing speed on nic >> >> Why? This practice made sense when 10baseT gear from different vendors >> wasn't compatible, but not for the last 15-20 years. > > This practice still makes sense, at least with broadcom cards. no, it is pure bullshit and the source of many many many errors. just because cisco failed miserably in implementing autoneg for years. even they managed now. so stop spreading this bullshit. autoneg is good. manual is evil. that simple. > I always do force the speed on servers. this is extremely stupid. -- Henning Brauer, h...@bsws.de, henn...@openbsd.org BS Web Services, http://bsws.de Full-Service ISP - Secure Hosting, Mail and DNS Services Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
* Laurent CARON [2009-02-28 21:33]: > Steve Shockley wrote: >> On 2/27/2009 8:43 AM, Laurent CARON wrote: >>> - Forcing speed on switch >>> - Forcing speed on nic >> >> Why? This practice made sense when 10baseT gear from different vendors >> wasn't compatible, but not for the last 15-20 years. > > This practice still makes sense, at least with broadcom cards. no, it is pure bullshit and the source of many many many errors. just because cisco failed miserably in implementing autoneg for years. even they managed now. so stop spreading this bullshit. autoneg is good. manual is evil. that simple. > I always do force the speed on servers. this is extremely stupid. -- Henning Brauer, h...@bsws.de, henn...@openbsd.org BS Web Services, http://bsws.de Full-Service ISP - Secure Hosting, Mail and DNS Services Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On 2/28/2009 4:45 PM, Brian Keefer wrote:
> I've had problems with bge(4)s in IBM xSeries machines that required
> forcing speed/duplex, else they would negotiate to 100/half.

Probably your switch was forced to 100/full... autonegotiation needs to be enabled on both ends of the connection.
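The "forced 100/full on the switch, 100/half on the NIC" symptom described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not a description of any driver or switch: it covers 10/100 links only, the names `Port`, `resolve` and `duplex_mismatch` are made up for this example, and it assumes the usual behaviour that an autonegotiating port facing a forced peer can sense the speed but not the duplex and therefore falls back to half duplex.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one end of a 10/100 link."""
    autoneg: bool             # True: autonegotiate, False: settings are forced
    speed: int = 100          # Mbit/s: forced speed, or best speed advertised
    full_duplex: bool = True  # forced duplex, or whether full duplex is advertised


def resolve(local: Port, peer: Port) -> Tuple[int, bool]:
    """Return (speed, full_duplex) that the local port ends up using."""
    if local.autoneg and peer.autoneg:
        # Both ends negotiate: best mode common to both advertisements.
        return min(local.speed, peer.speed), local.full_duplex and peer.full_duplex
    if local.autoneg:
        # Peer is forced and sends no negotiation pulses. The local port
        # senses the speed from the signal but cannot learn the duplex,
        # so it falls back to half duplex (assumed fallback behaviour).
        return peer.speed, False
    # A forced port simply uses its configured settings.
    return local.speed, local.full_duplex


def duplex_mismatch(a: Port, b: Port) -> bool:
    """True when the two ends settle on different duplex modes."""
    return resolve(a, b)[1] != resolve(b, a)[1]


nic = Port(autoneg=True)                         # server NIC left on autoselect
switch = Port(autoneg=False, full_duplex=True)   # switch port forced to 100/full

print(resolve(nic, switch))          # what the NIC ends up with
print(resolve(switch, nic))          # what the switch port uses
print(duplex_mismatch(nic, switch))  # True means a duplex mismatch
```

In this model the link comes up on both sides, which is why such a mismatch tends to show up as errors and poor throughput rather than as a dead port. Forcing both ends identically, or leaving both on auto, gives a consistent result.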
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On Feb 28, 2009, at 12:28 PM, Laurent CARON wrote: Steve Shockley wrote: On 2/27/2009 8:43 AM, Laurent CARON wrote: - Forcing speed on switch - Forcing speed on nic Why? This practice made sense when 10baseT gear from different vendors wasn't compatible, but not for the last 15-20 years. This practice still makes sense, at least with broadcom cards. I had spurious problems 2 years ago with a Gigabit Ethernet interface showing lots of error while using autoneg (hooked to a 3com switch or to a cisco one). Those problems did instantly disappear after forcing the speed on both, the card AND the switch. I always do force the speed on servers. I don't say it is the only way to go, but my way to handle it. Laurent I've had problems with bge(4)s in IBM xSeries machines that required forcing speed/duplex, else they would negotiate to 100/half. -- bk
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Steve Shockley wrote: On 2/27/2009 8:43 AM, Laurent CARON wrote: - Forcing speed on switch - Forcing speed on nic Why? This practice made sense when 10baseT gear from different vendors wasn't compatible, but not for the last 15-20 years. This practice still makes sense, at least with broadcom cards. I had spurious problems 2 years ago with a Gigabit Ethernet interface showing lots of error while using autoneg (hooked to a 3com switch or to a cisco one). Those problems did instantly disappear after forcing the speed on both, the card AND the switch. I always do force the speed on servers. I don't say it is the only way to go, but my way to handle it. Laurent
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On 2/27/2009 8:43 AM, Laurent CARON wrote:
> - Forcing speed on switch
> - Forcing speed on nic

Why? This practice made sense when 10baseT gear from different vendors wasn't compatible, but not for the last 15-20 years.

http://www.ethermanage.com/ethernet/pdf/dell-auto-neg.pdf

Moreover, gigabit links require autonegotiation (http://en.wikipedia.org/wiki/Autonegotiation#Overview).
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Pete Vickers wrote: The bge driver sucks for these cards - just chuck in an em(4) NIC and you should see instant improvement. Those cards have always been unreliable for me under Linux and OpenBSD.
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On 2009/02/28 12:10, Pereresus ne Vlezaet Buggy wrote:
> On 28 February 2009 ?. 01:58:29 Stuart Henderson wrote:
> > On 2009-02-27, Pete Vickers wrote:
> > > The bge driver sucks for these cards - just chuck in an em(4) NIC
> > > and you should see instant improvement.
> > >
> > > 'netstat -I bge0' will confirm the packet errors
> >
> > this was fixed a year ago.
>
> Maybe not fully fixed, here is some sort of suspicious output from DL120 G5:

maybe you have a faulty cable or switch-port or nic.

> Name    Mtu   Network     Address               Ipkts Ierrs    Opkts Oerrs Colls
> lo0     33160 <Link>                            44550     0    44550     0     0
> lo0     33160 localhost   localhost.corp.ar     44550     0    44550     0     0
> lo0     33160 localhost.c localhost.corp.ar     44550     0    44550     0     0
> lo0     33160 fe80::%lo0/ fe80::1%lo0           44550     0    44550     0     0
> em0     1500  <Link>      00:15:17:93:a1:04   3981794     0  3588281     0     0
> em0     1500  89-235-155- 89-235-155-228.ad   3981794     0  3588281     0     0
> em0     1500  fe80::%em0/ fe80::215:17ff:fe   3981794     0  3588281     0     0
> em1     1500  <Link>      00:15:17:93:a1:05    867952     0   325838     0     0
> em1     1500  213.234.230 213.234.230.206      867952     0   325838     0     0
> em1     1500  fe80::%em1/ fe80::215:17ff:fe    867952     0   325838     0     0
> em2     1500  <Link>      00:1f:29:54:2f:78   1921436     0    16203     0     0
> em2     1500  193.168.1/2 193.168.1.5         1921436     0    16203     0     0
> em2     1500  fe80::%em2/ fe80::21f:29ff:fe   1921436     0    16203     0     0
> em3     1500  <Link>      00:1f:29:54:2f:79  32213605     0    13069     0     0
> em3     1500  192.168.0/2 192.168.0.5        32213605     0    13069     0     0
> em3     1500  fe80::%em3/ fe80::21f:29ff:fe  32213605     0    13069     0     0
> bge0    1500  <Link>      00:1f:29:0e:7b:57   9977060   654  5961231     0     0
> bge0    1500  192.168.1/2 proxy.corp.arbat2   9977060   654  5961231     0     0
> bge0    1500  fe80::%bge0 fe80::21f:29ff:fe   9977060   654  5961231     0     0
> bge0    1500  192.168.200 192.168.200.254     9977060   654  5961231     0     0
> enc0*   1536  <Link>                                0     0        0     0     0
> pflog0  33160 <Link>                                0     0   212721     0     0
> pflog1  33160 <Link>                                0     0        4     0     0
> pflow0  1464  <Link>                                0     0        0     0     0
> pflog2* 33160 <Link>                                0     0     7956     0     0
>
> --
> WBR,
> Pereresus ne Vlezaet Buggy
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On 28 February 2009 G. 01:58:29 Stuart Henderson wrote:
> On 2009-02-27, Pete Vickers wrote:
> > The bge driver sucks for these cards - just chuck in an em(4) NIC
> > and you should see instant improvement.
> >
> > 'netstat -I bge0' will confirm the packet errors
>
> this was fixed a year ago.

Maybe not fully fixed, here is some sort of suspicious output from DL120 G5:

Name    Mtu   Network     Address               Ipkts Ierrs    Opkts Oerrs Colls
lo0     33160 <Link>                            44550     0    44550     0     0
lo0     33160 localhost   localhost.corp.ar     44550     0    44550     0     0
lo0     33160 localhost.c localhost.corp.ar     44550     0    44550     0     0
lo0     33160 fe80::%lo0/ fe80::1%lo0           44550     0    44550     0     0
em0     1500  <Link>      00:15:17:93:a1:04   3981794     0  3588281     0     0
em0     1500  89-235-155- 89-235-155-228.ad   3981794     0  3588281     0     0
em0     1500  fe80::%em0/ fe80::215:17ff:fe   3981794     0  3588281     0     0
em1     1500  <Link>      00:15:17:93:a1:05    867952     0   325838     0     0
em1     1500  213.234.230 213.234.230.206      867952     0   325838     0     0
em1     1500  fe80::%em1/ fe80::215:17ff:fe    867952     0   325838     0     0
em2     1500  <Link>      00:1f:29:54:2f:78   1921436     0    16203     0     0
em2     1500  193.168.1/2 193.168.1.5         1921436     0    16203     0     0
em2     1500  fe80::%em2/ fe80::21f:29ff:fe   1921436     0    16203     0     0
em3     1500  <Link>      00:1f:29:54:2f:79  32213605     0    13069     0     0
em3     1500  192.168.0/2 192.168.0.5        32213605     0    13069     0     0
em3     1500  fe80::%em3/ fe80::21f:29ff:fe  32213605     0    13069     0     0
bge0    1500  <Link>      00:1f:29:0e:7b:57   9977060   654  5961231     0     0
bge0    1500  192.168.1/2 proxy.corp.arbat2   9977060   654  5961231     0     0
bge0    1500  fe80::%bge0 fe80::21f:29ff:fe   9977060   654  5961231     0     0
bge0    1500  192.168.200 192.168.200.254     9977060   654  5961231     0     0
enc0*   1536  <Link>                                0     0        0     0     0
pflog0  33160 <Link>                                0     0   212721     0     0
pflog1  33160 <Link>                                0     0        4     0     0
pflow0  1464  <Link>                                0     0        0     0     0
pflog2* 33160 <Link>                                0     0     7956     0     0

--
WBR,
Pereresus ne Vlezaet Buggy
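To put the 654 input errors on bge0 into proportion, the counters can be turned into a rate. The following is a rough sketch, not a diagnostic tool: `parse_counters` and `input_error_rate` are hypothetical helper names, and the parser simply assumes that the last five whitespace-separated fields of a `netstat -I` row are Ipkts, Ierrs, Opkts, Oerrs and Colls, as in the header shown in this thread.

```python
from typing import Dict

# Column names for the last five fields of a `netstat -I` row (assumed layout).
FIELDS = ("ipkts", "ierrs", "opkts", "oerrs", "colls")


def parse_counters(row: str) -> Dict[str, int]:
    """Extract the five trailing packet counters from one netstat row."""
    values = row.split()[-5:]
    return dict(zip(FIELDS, (int(v) for v in values)))


def input_error_rate(counters: Dict[str, int]) -> float:
    """Fraction of received packets that were counted as input errors."""
    if counters["ipkts"] == 0:
        return 0.0
    return counters["ierrs"] / counters["ipkts"]


# The bge0 link-level row quoted in the message above.
row = "bge0 1500 <Link> 00:1f:29:0e:7b:57 9977060 654 5961231 0 0"
counters = parse_counters(row)
print(counters)
print(f"input error rate: {input_error_rate(counters):.6%}")
```

For this row the rate works out to well under one error per ten thousand received packets. That number alone does not settle whether the cause is the driver, the cable or the switch port; comparing it before and after swapping the cable or the port would say more.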
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On 2009-02-27, Pete Vickers wrote: > The bge driver sucks for these cards - just chuck in an em(4) NIC and > you should see instant improvement. > > 'netstat -I bge0' will confirm the packet errors this was fixed a year ago.
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
On Fri, Feb 27, 2009 at 3:43 PM, Han Boetes wrote:
> cache_dir aufs xxx yyy

Thank you, I've switched to aufs, I hope it works ok on OpenBSD (docs mention "POSIX threads").

The netstat actually doesn't show I/O errors:

afar...@ablprx01:squid> netstat -I bge0
Name    Mtu   Network     Address               Ipkts Ierrs    Opkts Oerrs Colls
bge0    1500  <Link>      00:16:35:5b:39:ae   4057388     0  3557300     0     0
bge0    1500  10.121/16   ablprx01.internal   4057388     0  3557300     0     0
bge0    1500  fe80::%bge0 fe80::216:35ff:fe   4057388     0  3557300     0     0

I'm starting to suspect url_rewrite_program - maybe it blocks for too many requests even though I have "redirector_bypass on".

The http://openports.se/www/squid mentions the squid-2.7.STABLE6 package being available, but I don't understand where to get it (if I don't want to use ports)? It's not at ftp://ftp.de.openbsd.org/pub/OpenBSD/4.4/packages/i386/

I also wonder what do folks use to measure web proxy performance in a simple and reliable way

Thanks
Alex
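On the closing question about measuring proxy performance simply: one crude approach is to time the same large download once directly and once through the proxy, and compare. The sketch below uses only the Python standard library and is an assumption-laden illustration, not a benchmark tool. The function names are made up, the proxy address in the usage note is taken from the `ifconfig` and `http_port` output in the original post, and the test URL is a placeholder that has to be replaced with a real large file.

```python
import time
import urllib.request
from typing import Callable


def fetch_via_proxy(url: str, proxy: str) -> int:
    """Download `url` through the HTTP proxy at `proxy`; return bytes received."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    total = 0
    with opener.open(url, timeout=30) as resp:
        while True:
            chunk = resp.read(65536)
            if not chunk:
                break
            total += len(chunk)
    return total


def throughput_mbit(fetch: Callable[[], int],
                    clock: Callable[[], float] = time.perf_counter) -> float:
    """Run `fetch` once and return the achieved throughput in Mbit/s."""
    start = clock()
    nbytes = fetch()
    elapsed = clock() - start
    return nbytes * 8 / elapsed / 1e6
```

A possible invocation, with a placeholder URL: `throughput_mbit(lambda: fetch_via_proxy("http://example.org/bigfile.iso", "http://10.121.42.32:3128"))`. A single run says little; repeating it several times, and fetching a file that is not yet cached, gives a fairer picture of what the network path and the proxy can actually deliver.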
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
I recommend you switch to: cache_dir aufs xxx yyy # Han
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
The bge driver sucks for these cards - just chuck in an em(4) NIC and you should see instant improvement.

'netstat -I bge0' will confirm the packet errors

/Pete

On 27 Feb 2009, at 14:33, Alexander Farber wrote:

bge0 at pci3 dev 6 function 0 "Broadcom BCM5704C" rev 0x10, BCM5704 B0 (0x2100): apic 6 int 0 (irq 7), address 00:16:35:5b:39:ae
brgphy0 at bge0 phy 1: BCM5704 10/100/1000baseT PHY, rev. 0
bge1 at pci3 dev 6 function 1 "Broadcom BCM5704C" rev 0x10, BCM5704 B0 (0x2100): apic 6 int 1 (irq 10), address 00:16:35:5b:39:ad
Re: HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Alexander Farber wrote:
> Hello,
>
> our web proxy for 400 users (actually at the moment less than 100, but we
> are going to switch the others to use it soon) is slow. It is a HP Proliant
> DL385 running OpenBSD 4.4-stable with the squid-2.7.STABLE3 from packages
> (dmesg below).
>
> Does anybody please have a good advice how to find the reason?

Hi,

I would try:
- Forcing speed on switch
- Forcing speed on nic
- Using an intel PCI-X or PCI-express Gigabit NIC

Laurent
HP Proliant DL385 with Squid at a Gigabit-switch - bad network performance
Hello,

our web proxy for 400 users (actually at the moment less than 100, but we are going to switch the others to use it soon) is slow. It is a HP Proliant DL385 running OpenBSD 4.4-stable with the squid-2.7.STABLE3 from packages (dmesg below).

Does anybody please have a good advice how to find the reason?

A quick big file download test at the other machines (running CentOS Linux) in the same rack, at the same gigabit switch shows download speeds (with lynx) which are 10 times better.

The Procurve switch 2848 (J4904A) displays following errors after I've put the network cable from the OpenBSD machine to another port on it:

Excessive jabbering
Excessive CRC/alignment errors
Excessive broadcasts

(but I'm not sure if it's a result of me changing the cable) and the green lamp is always burning (at the other ports with Linux machines the lamps are blinking).

# ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33204
        groups: lo
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:16:35:5b:39:ae
        groups: egress
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
        inet 10.121.42.32 netmask 0xffff0000 broadcast 10.121.255.255
        inet6 fe80::216:35ff:fe5b:39ae%bge0 prefixlen 64 scopeid 0x1
bge1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:16:35:5b:39:ad
        media: Ethernet autoselect (none)
        status: no carrier
enc0: flags=0<> mtu 1536

(i.e. only 1 NIC is connected)

# cat /etc/hostname.bge0
inet 10.121.42.32 255.255.0.0 NONE
! route add -inet 172.25.0.0/16 10.121.42.1

The machine only runs httpd and squid:

# cat /etc/rc.conf.local
rdate_flags="-n ablwdc01"
httpd_flags=YES
amd=NO
lockd=YES
pf=NO
portmap=YES
check_quotas=NO
yppasswdd_flags=NO

# grep -v ^# /etc/squid/squid.conf |grep -v ^$
acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8
acl localnet src 172.25.0.0/16  # RFC1918 possible internal network
acl localnet src 10.0.0.0/8     # RFC1918 possible internal network
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128
http_port 8080
hierarchy_stoplist cgi-bin ?
cache_dir ufs /var/squid/cache 10 16 256
access_log /var/squid/logs/access.log squid
client_netmask 255.255.255.0
url_rewrite_program /usr/local/sbin/squid_redirect
url_rewrite_children 32
redirector_bypass on
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
cache_mgr i...@xx.com
dns_defnames on
coredump_dir /var/squid/cache

# df -h
Filesystem                           Size    Used   Avail Capacity  Mounted on
/dev/sd0a                            7.9G    2.1G    5.4G    28%    /
/dev/sd0d                            7.9G    494K    7.5G     0%    /var/log
/dev/sd0e                            111G    7.1G   98.3G     7%    /var/squid
ablnas02:/vol/ablhom01/home/afarber  535G    310G    225G    58%    /home/afarber
ablnas02:/vol/ablhom01/home/ldrisis  535G    310G    225G    58%    /home/ldrisis
ablnas02:/vol/ablhom01/home/psauer   535G    310G    225G    58%    /home/psauer

# mount
/dev/sd0a on / type ffs (local, softdep)
/dev/sd0d on /var/log type ffs (local, nodev, nosuid)
/dev/sd0e on /var/squid type ffs (local, nodev, nosuid, softdep)
ablnas02:/vol/ablhom01/home/afarber on /home/afarber type nfs (nodev, nosuid, v3, udp, timeo=100)

# top
63 processes: 62 idle, 1 on processor
CPU0 states:  0.9% user,  0.0% nice,  0.3% system,  0.2% interrupt, 98.6% idle
CPU1 states:  0.3% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.7% idle
Memory: Real: 271M/703M act/tot  Free: 3157M  Swap: 0K/8196M used/tot

  PID USERNAME PRI NICE  SIZE   RES STATE    WAIT     TIME    CPU COMMAND
24833 _squid     2    0   75M   76M sleep/0  poll     0:20  0.24% squid
12497 _squid     2    0 6028K 7688K sleep/0  netio    0:16  0.20% perl
29957 _squid     2    0 6104K 7640K sleep/1  netio    0:02  0.00% perl
17945 _squid     2    0 6132K 7636K sleep/0  netio    0:01  0.00% perl
 2335 _squid     2    0 5940K 7620K idle     netio    0:00  0.00