Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-25 Thread Robin Gruyters

Quoting Jeremy Chadwick [EMAIL PROTECTED]:
> [...]
>
> Set your Cisco configuration to use 100/full, and edit the
> ifconfig_bge0 line in rc.conf on your FreeBSD box to have media
> 100baseTX mediaopt full-duplex, then reboot the FreeBSD box.
> If the problem continues, there may be faulty cabling, but
> usually errors on one direction are a sign of duplex mismatch.
> If after replacing the cabling the issue continues, then there's
> a chance the bge(4) driver may be obtaining statistics wrong for
> the particular chip revision being used (this is hearsay on my
> part; I'm just guessing...)

OK, I have set the Cisco port to 100/full-duplex and updated the bge*
interfaces on the development server, but the problem still exists.

I have also updated the other server, which is connected to another
Cisco switch, with the same results.


Regards,

Robin

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-25 Thread Jeremy Chadwick
On Thu, Jan 25, 2007 at 12:07:22PM +0100, Robin Gruyters wrote:
> Quoting Jeremy Chadwick [EMAIL PROTECTED]:
>> [...]
>>
>> Set your Cisco configuration to use 100/full, and edit the
>> ifconfig_bge0 line in rc.conf on your FreeBSD box to have media
>> 100baseTX mediaopt full-duplex, then reboot the FreeBSD box.
>> If the problem continues, there may be faulty cabling, but
>> usually errors on one direction are a sign of duplex mismatch.
>> If after replacing the cabling the issue continues, then there's
>> a chance the bge(4) driver may be obtaining statistics wrong for
>> the particular chip revision being used (this is hearsay on my
>> part; I'm just guessing...)
>
> OK, I have set the Cisco port to 100/full-duplex and updated the bge*
> interfaces on the development server, but the problem still exists.
>
> I have also updated the other server, which is connected to another
> Cisco switch, with the same results.

Okay, so at least we know it's not specific to your switch, to
auto-neg, or to the cabling (two different switches + boxes with
the same problem probably isn't your fault.  :) ).  That's definitely
evidence that it's a driver problem, probably specific to the 5704
(since I have two machines using 5750s without this problem).

Looks like we'll need someone with the Broadcom data sheet for
the 5704 to help out.  There's also Bill Paul [EMAIL PROTECTED]
and David Christensen [EMAIL PROTECTED] (who worked on bce(4),
but might know of some details here...)

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |



Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-25 Thread Robin Gruyters

Quoting Jeremy Chadwick [EMAIL PROTECTED]:

> On Thu, Jan 25, 2007 at 12:07:22PM +0100, Robin Gruyters wrote:
>> Quoting Jeremy Chadwick [EMAIL PROTECTED]:
>>> [...]
>>>
>>> Set your Cisco configuration to use 100/full, and edit the
>>> ifconfig_bge0 line in rc.conf on your FreeBSD box to have media
>>> 100baseTX mediaopt full-duplex, then reboot the FreeBSD box.
>>> If the problem continues, there may be faulty cabling, but
>>> usually errors on one direction are a sign of duplex mismatch.
>>> If after replacing the cabling the issue continues, then there's
>>> a chance the bge(4) driver may be obtaining statistics wrong for
>>> the particular chip revision being used (this is hearsay on my
>>> part; I'm just guessing...)
>>
>> OK, I have set the Cisco port to 100/full-duplex and updated the bge*
>> interfaces on the development server, but the problem still exists.
>>
>> I have also updated the other server, which is connected to another
>> Cisco switch, with the same results.
>
> Okay, so at least we know it's not specific to your switch, to
> auto-neg, or to the cabling (two different switches + boxes with
> the same problem probably isn't your fault.  :) ).  That's definitely
> evidence that it's a driver problem, probably specific to the 5704
> (since I have two machines using 5750s without this problem).
>
> Looks like we'll need someone with the Broadcom data sheet for
> the 5704 to help out.  There's also Bill Paul [EMAIL PROTECTED]
> and David Christensen [EMAIL PROTECTED] (who worked on bce(4),
> but might know of some details here...)

Hmmm, OK. BTW, I found out there is another thread going on the
freebsd-net mailing list about the same issue(s):

http://marc.theaimsgroup.com/?l=freebsd-net&w=2&r=1&s=bge+ierr&q=b

There are some patches available, but it looks like they are only for
-CURRENT, not for 6.x releases.


Regards,

Robin


Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-25 Thread Jeremy Chadwick
On Thu, Jan 25, 2007 at 03:46:01PM +0100, Robin Gruyters wrote:
> Hmmm, OK. BTW, I found out there is another thread going on the
> freebsd-net mailing list about the same issue(s):
>
> http://marc.theaimsgroup.com/?l=freebsd-net&w=2&r=1&s=bge+ierr&q=b
>
> There are some patches available, but it looks like they are only for
> -CURRENT, not for 6.x releases.

Definitely related, and probably the cause.  I hope someone backports
this to RELENG_6 once everything is tested thoroughly.



Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-24 Thread Robin Gruyters

On 13 Dec 2006, at 19:30, Greg Eden wrote:

> On 13 Dec 2006, at 09:18, Gleb Smirnoff wrote:
>
>>> Greg Eden wrote:
>>> Hello
>>>
>>> I recently updated two production servers from 5.3 to 6.1 via
>>> cvsup and buildworld. Since the upgrade I've seen an increase in
>>> the number of Input packet errors reported on the bge cards in on
>>> both boxes.
>>
>> In 5.3-RELEASE the bge(4) driver did not read the error count from the
>> chip at all. So errors were not accounted for.
>
> Many thanks for clearing up the mystery

I have the same problem here. At the moment I only have two servers  
upgraded from FreeBSD 5.4R to FreeBSD 6.1R and one to FreeBSD 6.2R.


[FreeBSD 6.1R]
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE-p10 #1: Tue Oct 24 10:44:15 CEST 2006
[EMAIL PROTECTED]:/data/obj/data/src_6_1/sys/YIRDIS
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.40GHz (3400.14-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf41  Stepping = 1
  Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0x649dSSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,b14
  AMD Features=0x2000LM
  Logical CPUs per core: 2
real memory  = 2147430400 (2047 MB)
avail memory = 2100654080 (2003 MB)
ACPI APIC Table: HP 0083
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
ioapic3 Version 2.0 irqs 72-95 on motherboard
ioapic4 Version 2.0 irqs 96-119 on motherboard
acpi0: HP P51 on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-safe frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x908-0x90b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 2.0 on pci0
pci2: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 0.0 on pci2
pci3: ACPI PCI bus on pcib2
bge0: Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2100 mem 0xfdef-0xfdef irq 25 at device 1.0 on pci3
miibus0: MII bus on bge0
brgphy0: BCM5704 10/100/1000baseTX PHY on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:12:79:d7:bb:99
bge1: Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2100 mem 0xfdee-0xfdee irq 26 at device 1.1 on pci3
miibus1: MII bus on bge1
brgphy1: BCM5704 10/100/1000baseTX PHY on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:12:79:d7:bb:98
pcib3: ACPI PCI-PCI bridge at device 0.2 on pci2
pci4: ACPI PCI bus on pcib3
ciss0: HP Smart Array 6i port 0x4000-0x40ff mem 0xfdff-0xfdff1fff,0xfdf8-0xfdfb irq 51 at device 3.0 on pci4
ciss0: [GIANT-LOCKED]
pcib4: ACPI PCI-PCI bridge at device 6.0 on pci0
pci5: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge at device 0.0 on pci5
pci6: ACPI PCI bus on pcib5
pcib6: ACPI PCI-PCI bridge at device 0.2 on pci5
pci10: ACPI PCI bus on pcib6
pci0: serial bus, USB at device 29.0 (no driver attached)
pci0: serial bus, USB at device 29.1 (no driver attached)
pci0: serial bus, USB at device 29.2 (no driver attached)
pci0: serial bus, USB at device 29.3 (no driver attached)
pci0: serial bus, USB at device 29.7 (no driver attached)
pcib7: ACPI PCI-PCI bridge at device 30.0 on pci0
pci1: ACPI PCI bus on pcib7
pci1: display, VGA at device 3.0 (no driver attached)
pci1: base peripheral at device 4.0 (no driver attached)
pci1: base peripheral at device 4.2 (no driver attached)
isab0: PCI-ISA bridge at device 31.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel ICH5 UDMA100 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x500-0x50f at device 31.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
acpi_tz0: Thermal Zone on acpi0
atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
sio0: Standard PC COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
fdc0: floppy drive controller (FDE) port 0x3f2-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
pmtimer0 on isa0
orm0: ISA Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xcbfff,0xcc000-0xcd7ff,0xee000-0xe on isa0
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb 

Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-24 Thread Jeremy Chadwick
On Wed, Jan 24, 2007 at 10:57:41AM +0100, Robin Gruyters wrote:
> I have the same problem here. At the moment I only have two servers
> upgraded from FreeBSD 5.4R to FreeBSD 6.1R and one to FreeBSD 6.2R.
>
> And here is the netstat -ni output from our development server:
>
> [netstat -ni]
> NameMtu Ipkts   Ierrs   Opkts   Oerrs   Coll
> ste0*   15000   0   0   0   0
> ste1*   15000   0   0   0   0
> ste2*   15000   0   0   0   0
> ste3*   15000   0   0   0   0
> bge015009866912 2114443 188352090   0
> bge015002004841 -   18833483-   -
> bge015001723393 -   1719554 -   -
> bge0150082  -   66  -   -
> bge0150019036813-   14796159-   -
> bge0150038709278-   35167554-   -
> bge015000   -   0   -   -
> bge01500621 -   0   -   -
> bge015001716-   0   -   -
> bge01500184 -   0   -   -
> bge0150052881   -   2336-   -
> bge1*   15000   0   0   0   0
> pflog   33208   0   0   0   0
> lo0 16384   0   516926240   0
> lo0 16384   6611-   6611-   -
> [...]
>
> Is there a fix for it already, or maybe a workaround?

The problem was that the driver code was not properly obtaining
error statistics from the Broadcom chip, thus errors _were_
(before the fix) not being calculated/accounted for.  Now (after
the fix) errors are being accounted for correctly.

So the errors you see in your netstat output are probably real/
accurate.  I'll vote for a duplex-related problem or some naughty
cabling.
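
To judge whether an Ierrs count is large relative to traffic, it can
help to express it as a percentage of Ipkts per interface. The
following is only a minimal sketch: the function name ierr_pct is made
up for illustration, and it assumes the nine-column "netstat -ni"
layout (Name, Mtu, Network, Address, Ipkts, Ierrs, Opkts, Oerrs, Coll)
seen in this thread. Other releases may print additional columns, which
would shift the field positions.

```shell
# ierr_pct: read `netstat -ni` output on stdin and print, for each
# link-level row, the interface name followed by input errors as a
# percentage of input packets.
# Assumed column order: Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
ierr_pct() {
    awk '
        $3 ~ /Link#/ {
            # Rows with no link-layer address (lo0, for example) have
            # one field fewer, so shift the column positions left.
            off = (NF >= 9) ? 0 : -1
            ipkts = $(5 + off)
            ierrs = $(6 + off)
            # Skip idle interfaces to avoid dividing by zero.
            if (ipkts > 0)
                printf "%s %.2f\n", $1, 100 * ierrs / ipkts
        }'
}

# Usage (on the FreeBSD box):
#   netstat -ni | ierr_pct
```

The counters are cumulative since boot, so comparing two samples taken
some minutes apart gives a better picture of the current error rate
than a single reading.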



Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-24 Thread Robin Gruyters


Quoting Jeremy Chadwick [EMAIL PROTECTED]:

>> Is there a fix for it already, or maybe a workaround?
>
> The problem was that the driver code was not properly obtaining
> error statistics from the Broadcom chip, thus errors _were_
> (before the fix) not being calculated/accounted for.  Now (after
> the fix) errors are being accounted for correctly.
>
> So the errors you see in your netstat output are probably real/
> accurate.  I'll vote for a duplex-related problem or some naughty
> cabling.


Should this not be visible on the switch as well?!?

Here is some output from the interface on the server and from the switch (Cisco):

[development interface]
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        ether 00:12:79:94:ed:12
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
[/development interface]

[switch]
FastEthernet0/3 is up, line protocol is up (connected)
  Hardware is Fast Ethernet, address is 000e.84d0.de03 (bia 000e.84d0.de03)
  Description: development
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
 reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s
  input flow-control is off, output flow-control is off
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 00:00:02, output hang never
  Last clearing of show interface counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 179000 bits/sec, 28 packets/sec
  5 minute output rate 56000 bits/sec, 24 packets/sec
 22823978 packets input, 4067576147 bytes, 0 no buffer
 Received 13138 broadcasts (0 multicast)
 0 runts, 0 giants, 0 throttles
 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
 0 watchdog, 12929 multicast, 0 pause input
 0 input packets with dribble condition detected
 15673035 packets output, 3975127029 bytes, 0 underruns
 0 output errors, 0 collisions, 1 interface resets
 0 babbles, 0 late collision, 0 deferred
 0 lost carrier, 0 no carrier, 0 PAUSE output
 0 output buffer failures, 0 output buffers swapped out
[/switch]

Regards,

Robin


Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-24 Thread Jeremy Chadwick
On Wed, Jan 24, 2007 at 03:25:37PM +0100, Robin Gruyters wrote:
> Should this not be visible on the switch as well?!?
>
> Here is some output from the interface on the server and from the
> switch (Cisco):
>
> [development interface]
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
>         ether 00:12:79:94:ed:12
>         media: Ethernet autoselect (100baseTX full-duplex)
>         status: active
> [/development interface]
>
> [switch]
> FastEthernet0/3 is up, line protocol is up (connected)
>   Hardware is Fast Ethernet, address is 000e.84d0.de03 (bia 000e.84d0.de03)
>   Description: development
>   MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
>  reliability 255/255, txload 1/255, rxload 1/255
>   Encapsulation ARPA, loopback not set
>   Keepalive set (10 sec)
>   Full-duplex, 100Mb/s
>   input flow-control is off, output flow-control is off
>   ARP type: ARPA, ARP Timeout 04:00:00
>   Last input never, output 00:00:02, output hang never
>   Last clearing of show interface counters never
>   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
>   Queueing strategy: fifo
>   Output queue: 0/40 (size/max)
>   5 minute input rate 179000 bits/sec, 28 packets/sec
>   5 minute output rate 56000 bits/sec, 24 packets/sec
>  22823978 packets input, 4067576147 bytes, 0 no buffer
>  Received 13138 broadcasts (0 multicast)
>  0 runts, 0 giants, 0 throttles
>  0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
>  0 watchdog, 12929 multicast, 0 pause input
>  0 input packets with dribble condition detected
>  15673035 packets output, 3975127029 bytes, 0 underruns
>  0 output errors, 0 collisions, 1 interface resets
>  0 babbles, 0 late collision, 0 deferred
>  0 lost carrier, 0 no carrier, 0 PAUSE output
>  0 output buffer failures, 0 output buffers swapped out
> [/switch]

You've got a Cisco involved, so I doubt it.  :-)  Many times in the
past I've seen one end of a link show errors while the other end
(in my experience, Cisco Catalysts) shows none.

When dealing with Cisco-to-other-vendor links, I've never seen
auto-neg work properly.  One always has to hard-set the speed and
duplex on both sides for it to work.

For example: I lease space in two co-location facilities, from
different providers.  Both providers use Cisco switches (different
models), while we use HP ProCurves.  In both facilities, we saw
input errors on our ProCurve, while the providers saw absolutely
no errors.  Both sides were set to auto.  The instant I had the
providers set 100/full on their Ciscos and I set 100/FD on our
ProCurves, the error counts completely disappeared.

Set your Cisco configuration to use 100/full, and edit the
ifconfig_bge0 line in rc.conf on your FreeBSD box to have media
100baseTX mediaopt full-duplex, then reboot the FreeBSD box.
If the problem continues, there may be faulty cabling, but
usually errors on one direction are a sign of duplex mismatch.
If after replacing the cabling the issue continues, then there's
a chance the bge(4) driver may be obtaining statistics wrong for
the particular chip revision being used (this is hearsay on my
part; I'm just guessing...)
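
As a concrete illustration of the rc.conf change described above, here
is a hedged sketch of what the line might look like. The IP address and
netmask are placeholders, not values from this thread; only the media
and mediaopt keywords come from the advice itself.

```shell
# /etc/rc.conf fragment (sketch): force bge0 to 100 Mbit/s full duplex
# instead of autoselect.  The address and netmask are placeholders.
ifconfig_bge0="inet 192.0.2.10 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"

# The same media setting can also be applied to the running interface:
#   ifconfig bge0 media 100baseTX mediaopt full-duplex
# On the Cisco side, the matching interface settings are typically
# "speed 100" and "duplex full".
```

Both ends need to be hard-set together; forcing only one side while the
other stays on auto is a common way to end up with a duplex mismatch.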



Re: bge Ierr rate increase from 5.3R - 6.1R

2007-01-24 Thread sthaug
> When dealing with Cisco-othervendor, I've never seen auto-neg
> work properly.  One has to always hard set the speed and duplex
> on both sides for it to work.

Nowadays auto-neg almost always works for us, even for Fast Ethernet.
We have Cisco-Extreme, Cisco-Juniper, Cisco-Cisco, Cisco-Riverstone
and of course lots of Cisco-various hosts. Basically, things just
work. We always use auto-neg for Gigabit Ethernet.

3-4 years ago the situation was different, and we always used to set
speed/duplex. Not any more.

Maybe you've been unlucky.

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-13 Thread Gleb Smirnoff
> Greg Eden wrote:
> Hello
>
> I recently updated two production servers from 5.3 to 6.1 via
> cvsup and buildworld. Since the upgrade I've seen an increase in
> the number of Input packet errors reported on the bge cards in on
> both boxes.

In 5.3-RELEASE the bge(4) driver did not read the error count from the
chip at all. So errors were not accounted for.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-13 Thread Greg Eden


On 13 Dec 2006, at 09:18, Gleb Smirnoff wrote:

>> Greg Eden wrote:
>> Hello
>>
>> I recently updated two production servers from 5.3 to 6.1 via
>> cvsup and buildworld. Since the upgrade I've seen an increase in
>> the number of Input packet errors reported on the bge cards in on
>> both boxes.
>
> In 5.3-RELEASE the bge(4) driver did not read the error count from the
> chip at all. So errors were not accounted for.

Many thanks for clearing up the mystery

best wishes

greg.


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


Hello Doug

On 11 Dec 2006, at 21:46, Doug Barton wrote:

> Greg Eden wrote:
>> Hello
>>
>> I recently updated two production servers from 5.3 to 6.1 via cvsup
>> and buildworld. Since the upgrade I've seen an increase in the number
>> of Input packet errors reported on the bge cards in on both boxes.
>> One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
>> 2.8GHz Xeons with a SMP kernel.
>
> It would be quite useful at this point if you could update a box or
> two to the RELENG_6_2 code base so that we can see if this problem is
> solved in the latest release candidate. If it is, your problem is
> solved, and if it's not, you're a lot closer to the point where we can
> usefully assist you.

I've just updated a box (HP DL360g4) to RELENG_6_2 as requested.
Problem still appears to be present. sftping over 750MB of log files
produced the characteristic errors.


netstat -i
Name    Mtu Network       Address           Ipkts Ierrs Opkts Oerrs Coll
bge0   1500 Link#1  00:12:79:3b:**:**   31309124 217911 0 0
bge0   1500 192.168.100   192.168.***.*** 312894 - 217905 - -
bge1*  1500 Link#2  00:12:79:3b:**:**0 0 0 0 0
lo0   16384 Link#3   0 0 0 0 0
lo0   16384 your-net  localhost0 - 0 - -


Here's the dmesg output for the box.

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights  
reserved.

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RC1 #0: Tue Dec 12 14:07:47 GMT 2006
:/usr/obj/usr/src/sys/MASAQ
ACPI APIC Table: HP 0083
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf41  Stepping = 1
  Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0x641dSSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14
  AMD Features=0x2010NX,LM
  Logical CPUs per core: 2
real memory  = 1073688576 (1023 MB)
avail memory = 1045893120 (997 MB)
ioapic1: Changing APIC ID to 9
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
ioapic3 Version 2.0 irqs 72-95 on motherboard
kbd1 at kbdmux0
acpi0: HP P52 on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-safe frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x908-0x90b on acpi0
cpu0: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 2.0 on pci0
pci13: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 4.0 on pci0
pci6: ACPI PCI bus on pcib2
pcib3: ACPI PCI-PCI bridge at device 0.0 on pci6
pci7: ACPI PCI bus on pcib3
pcib4: ACPI PCI-PCI bridge at device 0.2 on pci6
pci10: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge at device 6.0 on pci0
pci3: ACPI PCI bus on pcib5
pcib6: ACPI PCI-PCI bridge at device 28.0 on pci0
pci2: ACPI PCI bus on pcib6
ciss0: HP Smart Array 6i port 0x4000-0x40ff mem 0xfdff-0xfdff1fff,0xfdf8-0xfdfb irq 24 at device 1.0 on pci2
ciss0: [GIANT-LOCKED]
bge0: Broadcom BCM5704 B0, ASIC rev. 0x2100 mem 0xfdf7-0xfdf7 irq 25 at device 2.0 on pci2
miibus0: MII bus on bge0
brgphy0: BCM5704 10/100/1000baseTX PHY on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:12:79:3b:83:52
bge1: Broadcom BCM5704 B0, ASIC rev. 0x2100 mem 0xfdf6-0xfdf6 irq 26 at device 2.1 on pci2
miibus1: MII bus on bge1
brgphy1: BCM5704 10/100/1000baseTX PHY on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:12:79:3b:83:51
uhci0: UHCI (generic) USB controller port 0x2000-0x201f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0x2020-0x203f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: base peripheral at device 29.4 (no driver attached)
pci0: base peripheral, interrupt controller at device 29.5 (no driver attached)
ehci0: Intel 6300ESB USB 2.0 controller mem 0xfbee-0xfbee03ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: Intel 6300ESB USB 2.0 controller on ehci0
usb2: USB 

Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Andrew Pantyukhin

On 12/11/06, Greg Eden [EMAIL PROTECTED] wrote:

> I recently updated two production servers from 5.3 to 6.1 via cvsup
> and buildworld. Since the upgrade I've seen an increase in the number
> of Input packet errors reported on the bge cards in on both boxes.
> One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
> 2.8GHz Xeons with a SMP kernel.


Just to be sure, is polling disabled?


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Jeremy Chadwick
On Tue, Dec 12, 2006 at 03:07:28PM +0000, Greg Eden wrote:
> I recently updated two production servers from 5.3 to 6.1 via
> cvsup and buildworld.

Greg,

This may or may not be any help (read: possible red herring).
But from looking at your below dmesg, I don't see any signs
of SMP being used:

> Since the upgrade I've seen an increase in the number of
> Input packet errors reported on the bge cards in on both boxes.
> One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
> 2.8GHz Xeons with a SMP kernel.
                      ^

> CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
>   Origin = GenuineIntel  Id = 0xf41  Stepping = 1
>   Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
>   Features2=0x641dSSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14
>   AMD Features=0x2010NX,LM
>   Logical CPUs per core: 2
> real memory  = 1073688576 (1023 MB)
> avail memory = 1045893120 (997 MB)
--- SMP details are missing from here ---
> ioapic1: Changing APIC ID to 9
> ioapic0 Version 2.0 irqs 0-23 on motherboard
> ioapic1 Version 2.0 irqs 24-47 on motherboard
> ioapic2 Version 2.0 irqs 48-71 on motherboard
> ioapic3 Version 2.0 irqs 72-95 on motherboard
> kbd1 at kbdmux0
> ...

Normally, SMP kernels display something like this:

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20f32  Stepping = 2
  
Features=0x178bfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT
  Features2=0x1SSE3
  AMD Features=0xe2500800SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow
  AMD Features2=0x3LAHF,CMP
  Cores per package: 2
real memory  = 2147418112 (2047 MB)
avail memory = 2096336896 (1999 MB)
ACPI APIC Table: Nvidia AWRDACPI
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 2
ioapic0 Version 1.1 irqs 0-23 on motherboard

Or, for comparison, a 4.11 box:

CPU: Intel Pentium III (933.03-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  
Features=0x383fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
real memory  = 536805376 (524224K bytes)
avail memory = 518811648 (506652K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
Preloaded elf kernel kernel at 0xc0358000.

Additionally, your Email says two 2.8GHz Xeons, but it looks as
if you have one physical 3.0GHz Xeon that has dual cores.



Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


On 12 Dec 2006, at 20:48, Andrew Pantyukhin wrote:

> On 12/11/06, Greg Eden [EMAIL PROTECTED] wrote:
>> I recently updated two production servers from 5.3 to 6.1 via cvsup
>> and buildworld. Since the upgrade I've seen an increase in the number
>> of Input packet errors reported on the bge cards in on both boxes.
>> One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
>> 2.8GHz Xeons with a SMP kernel.
>
> Just to be sure, is polling disabled?

yes. i don't use it on any of the five machines producing the problem.

best.
greg.


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


On 12 Dec 2006, at 21:39, Jeremy Chadwick wrote:

> On Tue, Dec 12, 2006 at 03:07:28PM +0000, Greg Eden wrote:
>> I recently updated two production servers from 5.3 to 6.1 via
>> cvsup and buildworld.
>
> Greg,
>
> This may or may not be any help (read: possible red herring).
> But from looking at your below dmesg, I don't see any signs
> of SMP being used:

good - it's not an SMP box :)

sorry for any confusion. the box I *was* able to upgrade to
RELENG_6_2, and reported in the last email with the dmesg output, is
not SMP. all five HP DL3xx boxes (two of which *are* SMP) show
exactly the same behaviour irrespective of being SMP or UP. none of
them have polling enabled.


best.
greg.


Since the upgrade I've seen an increase in the number of
Input packet errors reported on the bge cards in on both boxes.
One is a
HP DL360g3, the other is a HP DL380g3. Both have a pair of 2.8GHz
Xeons with a SMP kernel.

  ^


CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf41  Stepping = 1

Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR, 
PGE

,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0x641dSSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14
  AMD Features=0x2010NX,LM
  Logical CPUs per core: 2
real memory  = 1073688576 (1023 MB)
avail memory = 1045893120 (997 MB)

--- SMP details are missing from here ---

ioapic1: Changing APIC ID to 9
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
ioapic3 Version 2.0 irqs 72-95 on motherboard
kbd1 at kbdmux0
...


Normally, SMP kernels display something like this:

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20f32  Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
  AMD Features2=0x3<LAHF,CMP>
  Cores per package: 2
real memory  = 2147418112 (2047 MB)
avail memory = 2096336896 (1999 MB)
ACPI APIC Table: Nvidia AWRDACPI
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 2
ioapic0 Version 1.1 irqs 0-23 on motherboard

Or, for comparison, a 4.11 box:

CPU: Intel Pentium III (933.03-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>

real memory  = 536805376 (524224K bytes)
avail memory = 518811648 (506652K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
Preloaded elf kernel kernel at 0xc0358000.
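
The comparison above can also be done mechanically. The following is only a rough Python sketch, assuming the boot messages are available as text (for example the contents of /var/run/dmesg.boot). The function name smp_cpu_count is made up for illustration, and the pattern covers only the two banner forms quoted above, so other releases may word it differently.

```python
import re

# Matches the two SMP banners quoted above (assumption: no other wording):
#   6.x : "FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs"
#   4.11: "FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs"
SMP_BANNER = re.compile(
    r"^FreeBSD/SMP: Multiprocessor (?:System Detected|motherboard): (\d+) CPUs",
    re.MULTILINE,
)


def smp_cpu_count(dmesg_text):
    """Return the CPU count from the SMP banner, or None if no banner is found."""
    match = SMP_BANNER.search(dmesg_text)
    return int(match.group(1)) if match else None
```

A result of None for a given dmesg would correspond to the "SMP details are missing" observation; it does not by itself say why the banner is absent.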

Additionally, your Email says two 2.8GHz Xeons, but it looks as
if you have one physical 3.0GHz Xeon that has dual cores.

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |






bge Ierr rate increase from 5.3R - 6.1R

2006-12-11 Thread Greg Eden

Hello

I recently updated two production servers from 5.3 to 6.1 via cvsup
and buildworld. Since the upgrade I've seen an increase in the number
of Input packet errors reported on the bge cards in both boxes.
One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
2.8GHz Xeons with a SMP kernel.


So - was the old driver previously underreporting, is the new one  
over-reporting/causing the error rate or is it something else? Cables  
and cabling have not changed and the pickup in number of errors is  
quite distinct. I monitor the nightly 'periodic daily' phone homes  
closely.


Example from the DL380g3.

The 5.3-6.1 upgrade was on 8 November. 58,000 errors in 1 month  
compared to 2 errors in 1 year with 5.3.


Network interface status:
Name    Mtu Network       Address                Ipkts Ierrs      Opkts Oerrs  Coll
bge0   1500 Link#1        00:0f:20:f6:**:** 1344182650 58292 2701993948     0     0
bge0   1500 192.168.**/25 **                1344176611     - 2701984851     -     -
bge1*  1500 Link#2        00:0f:20:f6:**:**          0     0          0     0     0
lo0   16384 Link#3                               69549     0      69549     0     0
lo0   16384 your-net      localhost              69549     -      69549     -     -
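
For anyone who wants to track these counters from the nightly reports rather than by eye, here is a rough Python sketch. It assumes the usual netstat -i style layout, where every row ends with the Ipkts, Ierrs, Opkts, Oerrs and Coll columns and each row sits on a single line. The function name link_counters is hypothetical.

```python
def link_counters(report):
    """Map interface name -> (Ipkts, Ierrs) for the link-level rows.

    Fields are read from the right because the Address column can be
    empty (as on the lo0 Link#3 row), which shifts positions counted
    from the left.
    """
    counters = {}
    for line in report.splitlines():
        fields = line.split()
        # Link-level rows carry "Link#N" in the Network column; the
        # header row and the per-address rows are skipped.
        if len(fields) < 8 or "Link#" not in fields[2]:
            continue
        ipkts, ierrs = fields[-5], fields[-4]
        if ipkts.isdigit() and ierrs.isdigit():
            # Some interface names carry a trailing "*" (bge1* above).
            counters[fields[0].rstrip("*")] = (int(ipkts), int(ierrs))
    return counters
```

Subtracting the Ierrs value of one night's report from the next gives the errors accumulated over that day, as long as the box was not rebooted in between (the counters start again from zero at boot).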


A couple of weeks ago I turned off tx and rx checksumming on this
box, as I gathered from googling that it might be contributing. That
had no effect.


Upon further investigation it appears six other boxes with bge ports
(mostly HP DL360g4) started reporting errors when moved to 6.1. As
they carry only a small fraction of the traffic that the above box
does, I hadn't noticed it.


This box (a UP HP DL360g4) is on a completely different network,  
different switch, cabling etc. Again, prior to 6.1 it had never  
reported an error in 18 months of service.


Network interface status:
Name    Mtu Network       Address                Ipkts Ierrs      Opkts Oerrs  Coll
bge0   1500 Link#1        00:12:79:3b:**:**  781001814  1980 1056534485     0     0
bge0   1500 192.168.***   192.168.***.***    783877018     - 1061029115     -     -
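
To put the two reports on the same scale, the error counts can be expressed per million received packets. This is only a back-of-the-envelope Python sketch using the Link# rows of the two reports quoted in this mail. The helper name ierr_ppm is made up, and the figures cover whatever period the counters have been running, not a fixed interval.

```python
def ierr_ppm(ipkts, ierrs):
    """Input errors per million received packets (0.0 if nothing was received)."""
    return 1e6 * ierrs / ipkts if ipkts else 0.0


# (Ipkts, Ierrs) taken from the bge0 Link#1 rows quoted in this mail.
boxes = {
    "DL380g3": (1344182650, 58292),
    "DL360g4": (781001814, 1980),
}

for name, (ipkts, ierrs) in boxes.items():
    print(f"{name}: {ierr_ppm(ipkts, ierrs):.2f} input errors per million packets")
```

That works out to roughly 43 per million on the DL380g3 and about 2.5 per million on the DL360g4.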


I don't have a spare box with a bge interface to test 6.2 for the  
same behaviour, but would be interested if anyone had an explanation.


Best wishes.

Greg.


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-11 Thread Doug Barton
Greg Eden wrote:
 Hello
 
 I recently updated two production servers from 5.3 to 6.1 via cvsup and
 buildworld. Since the upgrade I've seen an increase in the number of
 Input packet errors reported on the bge cards in both boxes. One is a
 HP DL360g3, the other is a HP DL380g3. Both have a pair of 2.8GHz Xeons
 with a SMP kernel.

It would be quite useful at this point if you could update a box or
two to the RELENG_6_2 code base so that we can see if this problem is
solved in the latest release candidate. If it is, your problem is
solved, and if it's not, you're a lot closer to the point where we can
usefully assist you.

Doug

-- 

This .signature sanitized for your protection