Re: Usefull info for a bug report regarding carp/pfsync?

2008-03-31 Thread Johan Fredin

On 08-03-31 10.44, Simon Kammerer wrote:

> Hi!
>
> after several years without any problems, we upgraded the hardware of
> our carp/pfsync gateway about four weeks ago. Two weeks ago, the gateway
> crashed completely: Both nodes were unreachable on all network
> interfaces, we had to reset both machines. Same problem last night. I
> can't find anything strange in the logs.
>
> It's 4.2 from the official CD set, AMD64.


Did you update your system with patch 004 from 
http://www.openbsd.org/errata42.html?


I believe that bug has been known to lock up machines like yours did.

/Johan



Re: Usefull info for a bug report regarding carp/pfsync?

2008-04-01 Thread Preston Kutzner
On Mon, 31 Mar 2008 10:44:28 +0200
Simon Kammerer <[EMAIL PROTECTED]> wrote:

> Hi!
>
> after several years without any problems, we upgraded the hardware of
> our carp/pfsync gateway about four weeks ago. Two weeks ago, the gateway
> crashed completely: Both nodes were unreachable on all network
> interfaces, we had to reset both machines. Same problem last night. I
> can't find anything strange in the logs.
> It's 4.2 from the official CD set, AMD64.
>
> Any hints what to add to a useful bug report in addition to dmesg output?
>
> Thanks
> Simon
>
>

While I'm not having exactly the same issue, I am having a similar
one.  Here's what I've been experiencing and have found no
resolution to:

I'm running a small Shuttle box (AMD64) with an nForce3 chipset.  It's
using the nfe(4) driver.  This box is used as a basic transparent
caching proxy server (squid + squidGuard).  Throughput is fairly
low, as we only have a 1.5Mbit T1/DS1 connection.  The problem
I'm having is that periodically (and seemingly randomly) the TCP/IP
stack will apparently lock up.  All network communication will cease
and a restart is needed to correct the problem.  While the network is
locked up on the machine, I am still able to log in via a local console,
and everything else seems to be working correctly.
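
Since the local console keeps working while the network is dead, one
way to get more material for a bug report might be to log the network
state locally at regular intervals.  The following is only a minimal
sketch of that idea in Python.  The choice of commands (netstat -m,
netstat -i, vmstat -i), the log path and the interval are assumptions
made for illustration, not something established in this thread.

```python
import subprocess
import time
from datetime import datetime

# Assumed set of commands; which ones are actually useful is a guess.
COMMANDS = [["netstat", "-m"], ["netstat", "-i"], ["vmstat", "-i"]]


def run_command(argv):
    """Run one command and return its output, or a note if it fails."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"(could not run: {exc})\n"


def take_snapshot(commands=COMMANDS, runner=run_command, now=None):
    """Return one timestamped text block with the output of every command."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    parts = [f"=== {stamp} ==="]
    for argv in commands:
        parts.append("$ " + " ".join(argv))
        parts.append(runner(argv).rstrip())
    return "\n".join(parts) + "\n"


def log_forever(path, interval=60):
    """Append a snapshot to `path` every `interval` seconds until interrupted."""
    while True:
        with open(path, "a") as log:
            log.write(take_snapshot())
        time.sleep(interval)
```

Calling log_forever("/var/log/netstate.log") (a hypothetical path) from
a console session would leave a record of the last snapshots taken
before and during a lock-up, which could then be attached to the report.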

I'm running OpenBSD 4.2.  Here is my dmesg output, as well as my sysctl
output.  I've tweaked a couple of settings in hopes that it would fix
the network lock-up issue, but so far, it hasn't.

DMESG output:

OpenBSD 4.2 (GENERIC) #1179: Tue Aug 28 10:37:50 MDT 2007
[EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 1073278976 (1023MB)
avail mem = 1030926336 (983MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.2 @ 0xf (39 entries)
bios0: vendor Phoenix Technologies, LTD version "6.00 PG" date 06/28/2005
bios0: Shuttle Inc SN95V30
acpi at mainbus0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: AMD Athlon(tm) 64 Processor 3700+, 2211.01 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: AMD erratum 89 present, BIOS upgrade may be required
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 "NVIDIA nForce3 250 PCI Host" rev 0xa1
pcib0 at pci0 dev 1 function 0 "NVIDIA nForce3 250 ISA" rev 0xa2
nviic0 at pci0 dev 1 function 1 "NVIDIA nForce3 250 SMBus" rev 0xa1
iic0 at nviic0
iic1 at nviic0
adt0 at iic1 addr 0x2e: adm1027 rev 0x6a
iic1: addr 0x4e 03=06 04=06 12=ff 13=0f 28=83 29=12 2a=12 2b=28
ohci0 at pci0 dev 2 function 0 "NVIDIA nForce3 250 USB" rev 0xa1: irq 7, version 1.0, legacy support
ohci1 at pci0 dev 2 function 1 "NVIDIA nForce3 250 USB" rev 0xa1: irq 5, version 1.0, legacy support
ehci0 at pci0 dev 2 function 2 "NVIDIA nForce3 250 USB2" rev 0xa2: irq 10
usb0 at ehci0: USB revision 2.0
uhub0 at usb0: NVIDIA EHCI root hub, rev 2.00/1.00, addr 1
nfe0 at pci0 dev 5 function 0 "NVIDIA nForce3 LAN" rev 0xa2: irq 10, address 00:30:1b:ba:2d:ee
eephy0 at nfe0 phy 1: Marvell 88E Gigabit PHY, rev. 2
auich0 at pci0 dev 6 function 0 "NVIDIA nForce3 250 AC97" rev 0xa1: irq 7, nForce3 AC97
ac97: codec id 0x414c4760 (Avance Logic ALC655 rev 0)
audio0 at auich0
pciide0 at pci0 dev 8 function 0 "NVIDIA nForce3 250 IDE" rev 0xa2: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 4
pciide1 at pci0 dev 10 function 0 "NVIDIA nForce3 250 SATA" rev 0xa2: DMA
pciide1: using irq 11 for native-PCI interrupt
wd0 at pciide1 channel 0 drive 0: 
wd0: 16-sector PIO, LBA48, 238475MB, 488397168 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5
ppb0 at pci0 dev 11 function 0 "NVIDIA nForce3 250 AGP" rev 0xa2
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "ATI Radeon 9200 SE Sec" rev 0x01
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
"ATI Radeon 9200 SE" rev 0x01 at pci1 dev 0 function 1 not configured
ppb1 at pci0 dev 14 function 0 "NVIDIA nForce3 250 PCI-PCI" rev 0xa2
pci2 at ppb1 bus 2
"VIA VT6306 FireWire" rev 0x80 at pci2 dev 7 function 0 not configured
pchb1 at pci0 dev 24 function 0 "AMD AMD64 HyperTransport" rev 0x00
pchb2 at pci0 dev 24 function 1 "AMD AMD64 Address Map" rev 0x00
pchb3 at pci0 dev 24 function 2 "AMD AMD64 DRAM Cfg" rev 0x00
pchb4 at pci0 dev 24 function 3 "AMD AMD64 Misc Cfg" rev 0x00
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd

Re: Usefull info for a bug report regarding carp/pfsync?

2008-04-01 Thread Preston Kutzner
On Tue, 1 Apr 2008 18:16:05 -0400
"Richard Daemon" <[EMAIL PROTECTED]> wrote:

> On Tue, Apr 1, 2008 at 12:12 PM, Preston Kutzner
> <[EMAIL PROTECTED]> wrote:
> <---snip--->
> Could your PF state table be maxed out, by any chance?
>

I'm not using PF on this box, so I wouldn't think it is.  PF is running
on a different piece of hardware and forwarding to this box.  It hasn't
had any problems.  When this machine locks up, I can't even ping out
from this box.  I get nothing in the logs after a lock-up either, which
makes it all the more frustrating.

I've tried making changes like upping kern.maxclusters,
net.inet.tcp.recvspace/sendspace, net.inet.udp.recvspace/sendspace,
net.inet.icmp.errppslimit and net.inet.ip.maxqueue, but those changes
only wound up making it lock up more frequently.

The only changes that seemed to help were upping kern.maxfiles,
kern.maxproc, and kern.somaxconn.  And while increasing values for
these options helped some (in the form of making lock-ups less
frequent), I'm still getting them.
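
When this many tunables have been touched, it may help anyone reading
the bug report to see exactly which values differ from the defaults.
Below is a minimal Python sketch that compares two saved copies of
sysctl output, relying on the name=value line format that sysctl(8)
prints.  The function names and the sample values are made up for
illustration and are not taken from this machine.

```python
def parse_sysctl(text):
    """Parse `name=value` lines, as printed by sysctl(8), into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines that are not name=value pairs
            values[name.strip()] = value.strip()
    return values


def changed_settings(before, after):
    """Return {name: (old, new)} for every setting whose value differs."""
    names = sorted(set(before) | set(after))
    return {
        n: (before.get(n), after.get(n))
        for n in names
        if before.get(n) != after.get(n)
    }


# Illustrative values only; they are not the poster's real settings.
defaults = parse_sysctl("kern.maxfiles=1000\nkern.somaxconn=128\n")
tuned = parse_sysctl("kern.maxfiles=2000\nkern.somaxconn=128\n")
print(changed_settings(defaults, tuned))
```

In practice the two inputs would be the saved output of sysctl from a
fresh install and from the tuned box, so the report lists only the
settings that were actually changed.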




Re: Usefull info for a bug report regarding carp/pfsync?

2008-04-01 Thread Richard Daemon
On Tue, Apr 1, 2008 at 12:12 PM, Preston Kutzner
<[EMAIL PROTECTED]> wrote:
>
> On Mon, 31 Mar 2008 10:44:28 +0200
>  Simon Kammerer <[EMAIL PROTECTED]> wrote:
>
>  > Hi!
>  >
>  > after several years without any problems, we upgraded the hardware of
>  > our carp/pfsync gateway about four weeks ago. Two weeks ago, the gateway
>  > crashed completely: Both nodes were unreachable on all network
>  > interfaces, we had to reset both machines. Same problem last night. I
>  > can't find anything strange in the logs.
>  > It's 4.2 from the official CD set, AMD64.
>  >
>  > Any hints what to add to a useful bug report in addition to dmesg output?
>  >
>  > Thanks
>  > Simon
>  >
>  >
>
>  While I'm not having exactly the same issue, I am having a similar
>  one.  Here's what I've been experiencing and have found no
>  resolution to:
>
>  I'm running a small Shuttle box (AMD64) with an nForce3 chipset.  It's
>  using the nfe(4) driver.  This box is used as a basic transparent
>  caching proxy server (squid + squidGuard).  Throughput is fairly
>  low, as we only have a 1.5Mbit T1/DS1 connection.  The problem
>  I'm having is that periodically (and seemingly randomly) the TCP/IP
>  stack will apparently lock up.  All network communication will cease
>  and a restart is needed to correct the problem.  While the network is
>  locked up on the machine, I am still able to log in via a local console,
>  and everything else seems to be working correctly.
>
>  I'm running OpenBSD 4.2.  Here is my dmesg output, as well as my sysctl
>  output.  I've tweaked a couple of settings in hopes that it would fix
>  the network lock-up issue, but so far, it hasn't.
>
>  DMESG output:
>
>  OpenBSD 4.2 (GENERIC) #1179: Tue Aug 28 10:37:50 MDT 2007
> [EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC
>  real mem = 1073278976 (1023MB)
>  avail mem = 1030926336 (983MB)
>  mainbus0 at root
>  bios0 at mainbus0: SMBIOS rev. 2.2 @ 0xf (39 entries)
>  bios0: vendor Phoenix Technologies, LTD version "6.00 PG" date 06/28/2005
>  bios0: Shuttle Inc SN95V30
>  acpi at mainbus0 not configured
>  cpu0 at mainbus0: (uniprocessor)
>  cpu0: AMD Athlon(tm) 64 Processor 3700+, 2211.01 MHz
>  cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
>  cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 1MB 64b/line 16-way L2 cache
>  cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
>  cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
>  cpu0: AMD erratum 89 present, BIOS upgrade may be required
>  pci0 at mainbus0 bus 0: configuration mode 1
>  pchb0 at pci0 dev 0 function 0 "NVIDIA nForce3 250 PCI Host" rev 0xa1
>  pcib0 at pci0 dev 1 function 0 "NVIDIA nForce3 250 ISA" rev 0xa2
>  nviic0 at pci0 dev 1 function 1 "NVIDIA nForce3 250 SMBus" rev 0xa1
>  iic0 at nviic0
>  iic1 at nviic0
>  adt0 at iic1 addr 0x2e: adm1027 rev 0x6a
>  iic1: addr 0x4e 03=06 04=06 12=ff 13=0f 28=83 29=12 2a=12 2b=28
>  ohci0 at pci0 dev 2 function 0 "NVIDIA nForce3 250 USB" rev 0xa1: irq 7, version 1.0, legacy support
>  ohci1 at pci0 dev 2 function 1 "NVIDIA nForce3 250 USB" rev 0xa1: irq 5, version 1.0, legacy support
>  ehci0 at pci0 dev 2 function 2 "NVIDIA nForce3 250 USB2" rev 0xa2: irq 10
>  usb0 at ehci0: USB revision 2.0
>  uhub0 at usb0: NVIDIA EHCI root hub, rev 2.00/1.00, addr 1
>  nfe0 at pci0 dev 5 function 0 "NVIDIA nForce3 LAN" rev 0xa2: irq 10, address 00:30:1b:ba:2d:ee
>  eephy0 at nfe0 phy 1: Marvell 88E Gigabit PHY, rev. 2
>  auich0 at pci0 dev 6 function 0 "NVIDIA nForce3 250 AC97" rev 0xa1: irq 7, nForce3 AC97
>  ac97: codec id 0x414c4760 (Avance Logic ALC655 rev 0)
>  audio0 at auich0
>  pciide0 at pci0 dev 8 function 0 "NVIDIA nForce3 250 IDE" rev 0xa2: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
>  pciide0: channel 0 disabled (no drives)
>  atapiscsi0 at pciide0 channel 1 drive 1
>  scsibus0 at atapiscsi0: 2 targets
>  cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
>  cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 4
>  pciide1 at pci0 dev 10 function 0 "NVIDIA nForce3 250 SATA" rev 0xa2: DMA
>  pciide1: using irq 11 for native-PCI interrupt
>  wd0 at pciide1 channel 0 drive 0: 
>  wd0: 16-sector PIO, LBA48, 238475MB, 488397168 sectors
>  wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5
>  ppb0 at pci0 dev 11 function 0 "NVIDIA nForce3 250 AGP" rev 0xa2
>  pci1 at ppb0 bus 1
>  vga1 at pci1 dev 0 function 0 "ATI Radeon 9200 SE Sec" rev 0x01
>  wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>  wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>  "ATI Radeon 9200 SE" rev 0x01 at pci1 dev 0 function 1 not configured
>  ppb1 at pci0 dev 14 function 0 "NVIDIA nForce3 250 PCI-PCI" rev 0xa2
>  pci2 at ppb1 bus 2
>  "VIA VT6306 FireWire" rev 0x80 at pci2 dev 7 function 0 not configured
>  pchb1 at pci0 dev 24 funct