Re: Kernel Panic on AMD64 24 June snapshot

2008-07-03 Thread Henning Brauer
* Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
 Hi Misc@,
 I currently caught a kernel panic that says:
 uvm_fault(0x 80b7b0e0, 0x0, 0, 1) -> e
 kernel : page fault trap, code=0
 Stopped at  pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
 ddb {0} trace

this problem has been reported by a few people, but so far we're unable
to track it down or even reproduce it. it would help enormously if we
knew WHEN this was introduced. so if someone who can reproduce this can
compile kernels going backwards day by day (cvs -D) and then ideally even
spot the commit that introduced it, that would help a LOT. yes, it is a
lot of work :(
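
If the panic can be reproduced reliably, the day-by-day walk could be shortened by bisecting over dates instead. What follows is only a rough sketch of that idea, not a tool that exists anywhere: first_good_day, panics_fn and example_panics are made-up names, and the real predicate is the manual step of checking out with cvs -D, building, booting and trying to trigger the panic.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: nonzero if the kernel built from the tree
 * as of `days_back` days ago still panics. In practice this is a
 * manual checkout/build/boot/test cycle, not a function call.
 */
typedef int (*panics_fn)(int days_back, void *ctx);

/*
 * Return the smallest days_back whose kernel is good. The bug was
 * then introduced during the following day. Assumes today (0) is bad,
 * `oldest` is good, and there is a single switch from good to bad.
 */
int
first_good_day(int oldest, panics_fn panics, void *ctx)
{
	int bad = 0;		/* known to panic */
	int good = oldest;	/* known not to panic */

	assert(oldest > 0);
	while (good - bad > 1) {
		int mid = bad + (good - bad) / 2;

		if (panics(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return good;
}

/* Stand-in predicate for illustration: bad for the last *ctx days. */
int
example_panics(int days_back, void *ctx)
{
	return days_back < *(int *)ctx;
}
```

With a 30 day window this needs about five kernel builds instead of up to thirty, but it only works if a kernel that does not panic during testing can be trusted to be good.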

in short, it seems some element of the pf state table (which is an RB 
tree, pf_state_tree) gets freed or overwritten before being removed 
from the RB tree, or something tries to remove it before it was 
inserted. Ryan and I have been reading the code up and down without 
being able to spot such a case yet.
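
One way to hunt for this class of bug is to track each element's lifecycle and complain about illegal transitions. The sketch below is a userland illustration only, not pf code: tracked_node and the track_* helpers are invented names, and nothing is really freed here, since reading a flag out of memory that has actually been returned to the pool would itself be a use after free.

```c
#include <assert.h>
#include <stddef.h>

/* Lifecycle of an element that lives in a tree (hypothetical model). */
enum node_life { NODE_NEW, NODE_LINKED, NODE_UNLINKED, NODE_FREED };

struct tracked_node {
	enum node_life life;
};

/* Each helper returns 0 on a legal transition and -1 on a bogus one. */

int
track_insert(struct tracked_node *n)
{
	assert(n != NULL);
	if (n->life != NODE_NEW && n->life != NODE_UNLINKED)
		return -1;	/* double insert, or insert after free */
	n->life = NODE_LINKED;
	return 0;
}

int
track_remove(struct tracked_node *n)
{
	assert(n != NULL);
	if (n->life != NODE_LINKED)
		return -1;	/* removed before insert, or after free */
	n->life = NODE_UNLINKED;
	return 0;
}

int
track_free(struct tracked_node *n)
{
	assert(n != NULL);
	if (n->life == NODE_LINKED || n->life == NODE_FREED)
		return -1;	/* freed while still in the tree, or twice */
	n->life = NODE_FREED;
	return 0;
}
```

Both cases described above show up as a -1: track_free() on a linked node is the "freed before being removed" case, and track_remove() on a new node is the "removed before it was inserted" case.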

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam



Re: Kernel Panic on AMD64 24 June snapshot

2008-07-03 Thread Wade, Daniel


I hit this with OpenBSD 4.3-current (GENERIC) #935: Sun Jun 15 19:31:26 MDT 2008

So it goes back at least that far.



Re: Kernel Panic on AMD64 24 June snapshot

2008-07-03 Thread Josh
Any chance of giving some info about how your PF is used, that I might
set up a similar box in the hope of reproducing it?


On Thu, 2008-07-03 at 09:20 -0400, Wade, Daniel wrote:
 I hit this with OpenBSD 4.3-current (GENERIC) #935: Sun Jun 15 19:31:26 MDT 2008
 
 So at least that far back



Re: Kernel Panic on AMD64 24 June snapshot

2008-07-03 Thread Josh
Not an AMD64 specific thing then?




Re: Kernel Panic on AMD64 24 June snapshot

2008-07-03 Thread Henning Brauer
* Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
 Stopped at  pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)

sometimes it takes a few reads until things are obvious.
please try this diff.

Index: pf.c
===================================================================
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.604
diff -u -r1.604 pf.c
--- pf.c	3 Jul 2008 15:46:23 -0000	1.604
+++ pf.c	4 Jul 2008 00:04:27 -0000
@@ -687,8 +685,8 @@
 			}
 		pool_put(&pf_state_key_pl, sk);
 		s->key[idx] = cur;
-	}
-	s->key[idx] = sk;
+	} else
+		s->key[idx] = sk;
 
 	if ((si = pool_get(&pf_state_item_pl, PR_NOWAIT)) == NULL) {
 		pf_state_key_detach(s, idx);
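
As far as the diff itself shows, the problem is a missing else: when the key already exists, sk goes back to the pool and the slot is set to cur, but the assignment after the block then overwrites the slot with the released sk. The following is a small userland model of just that control flow, not the pf code. struct key, pool_put_model, attach_before and attach_after are made-up names, and the in_pool flag stands in for the real pool_put().

```c
#include <assert.h>
#include <stddef.h>

/* Model of a state key; in_pool is nonzero once it has been released. */
struct key {
	int in_pool;
};

/* Stand-in for pool_put(): mark the key as released, do not free it. */
static void
pool_put_model(struct key *k)
{
	k->in_pool = 1;
}

/* Control flow before the diff: the last assignment always runs. */
struct key *
attach_before(struct key *sk, struct key *cur)
{
	struct key *slot = NULL;

	if (cur != NULL) {	/* key already exists in the tree */
		pool_put_model(sk);
		slot = cur;
	}
	slot = sk;		/* clobbers cur with the released sk */
	return slot;
}

/* Control flow after the diff: sk is used only if it was inserted. */
struct key *
attach_after(struct key *sk, struct key *cur)
{
	struct key *slot;

	if (cur != NULL) {
		pool_put_model(sk);
		slot = cur;
	} else
		slot = sk;
	return slot;
}
```

In the model, attach_before() hands back a key that has already been released whenever cur is non-NULL. That would fit a state pointing at a key that is not in the tree (or has been reused) by the time the purge thread tries to remove it.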





Kernel Panic on AMD64 24 June snapshot

2008-06-24 Thread Insan Praja SW

Hi Misc@,
I currently caught a kernel panic that says:
uvm_fault(0x 80b7b0e0, 0x0, 0, 1) -> e
kernel : page fault trap, code=0
Stopped at  pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
ddb {0} trace
pf_state_tree_RB_REMOVE_COLOR() at pf_state_tree_RB_REMOVE_COLOR+0x1c0
pf_state_tree_RB_REMOVE() at pf_state_tree_RB_REMOVE+0x4d
pf_state_tree_key_detach() at pf_state_key_detach+0x9d
pf_state_state() at pf_detach_state_key_detach+0x9d
pf_purge_expired_states() at pf_purge_expired_state+0x9d
pf_purge_thread() at pf_purge_thread+0x53
end trace frame : 0x0, count: -6
ddb {0}

and this is the dmesg:
OpenBSD 4.3-current (GENERIC.MP) #7: Tue Jun 24 20:27:50 WIT 2008
[EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2124914688 (2026MB)
avail mem = 2063269888 (1967MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.34 @ 0x7f6ee000 (78 entries)
bios0: vendor FUJITSU SIEMENS // Phoenix Technologies Ltd. version 5.00 R1.10.2151.A1 date 05/08/2006
bios0: FUJITSU SIEMENS D2151-A1
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP ASF! SSDT MCFG HPET APIC BOOT
acpi0: wakeup devices PEXA(S4) PEXB(S4) PEXC(S4) PEXD(S4) PEXE(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) USB5(S4) PCIH(S4) KEYB(S4) PS2M(S4) COM1(S1) COM2(S1)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Pentium(R) D CPU 2.66GHz, 2660.51 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,TM2,CNXT-ID,CX16,xTPR,NXE,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 133MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Pentium(R) D CPU 2.66GHz, 2660.07 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,TM2,CNXT-ID,CX16,xTPR,NXE,LONG
cpu1: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 2 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEXA)
acpiprt2 at acpi0: bus 3 (PEXB)
acpiprt3 at acpi0: bus 5 (PEXC)
acpiprt4 at acpi0: bus 7 (PEXD)
acpiprt5 at acpi0: bus 9 (PEXE)
acpiprt6 at acpi0: bus 11 (PCIH)
acpicpu0 at acpi0: FVS, 2667, 1862 MHz
acpicpu1 at acpi0: FVS, 2667, 1862 MHz
acpibtn0 at acpi0: PWRB
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 Intel 82945G Host rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82945G Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
agp0 at vga1: aperture at 0xe000, size 0x1000
azalia0 at pci0 dev 27 function 0 Intel 82801GB HD Audio rev 0x01: apic 2 int 18 (irq 9)
azalia0: codec[s]: Realtek ALC260
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801GB PCIE rev 0x01: apic 2 int 17 (irq 11)
pci1 at ppb0 bus 3
ppb1 at pci0 dev 28 function 1 Intel 82801GB PCIE rev 0x01: apic 2 int 16 (irq 11)
pci2 at ppb1 bus 5
bge0 at pci2 dev 0 function 0 Broadcom BCM5751 rev 0x01, BCM5750 A1 (0x4001): apic 2 int 17 (irq 11), address 00:30:05:c9:79:df
brgphy0 at bge0 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
ppb2 at pci0 dev 28 function 2 Intel 82801GB PCIE rev 0x01: apic 2 int 18 (irq 9)
pci3 at ppb2 bus 7
ppb3 at pci0 dev 28 function 3 Intel 82801GB PCIE rev 0x01: apic 2 int 19 (irq 9)
pci4 at ppb3 bus 9
uhci0 at pci0 dev 29 function 0 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 11)
uhci1 at pci0 dev 29 function 1 Intel 82801GB USB rev 0x01: apic 2 int 22 (irq 10)
uhci2 at pci0 dev 29 function 2 Intel 82801GB USB rev 0x01: apic 2 int 21 (irq 5)
uhci3 at pci0 dev 29 function 3 Intel 82801GB USB rev 0x01: apic 2 int 20 (irq 9)
ehci0 at pci0 dev 29 function 7 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 11)
ehci0: timed out waiting for BIOS
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0xe1
pci5 at ppb4 bus 11
em0 at pci5 dev 5 function 0 Intel PRO/1000MT (82540EM) rev 0x02: apic 2 int 22 (irq 10), address 00:07:e9:0f:44:37
rl0 at pci5 dev 7 function 0 D-Link Systems 530TX+ rev 0x10: apic 2 int 21 (irq 5), address 00:11:95:63:48:63
rlphy0 at rl0 phy 0: RTL internal PHY
pcib0 at pci0 dev 31 function 0 Intel 82801GB LPC rev 0x01
pciide0 at pci0 dev 31 function 2 Intel 82801GB SATA rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: ST3160211AS
wd0: 16-sector PIO, LBA48, 152627MB, 312581808 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets, initiator 7
cd0 at scsibus0 targ 0 lun 0: TSSTcorp, DVD-ROM SH-D162D, SB00 ATAPI 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 

Re: Kernel Panic on AMD64 24 June snapshot

2008-06-24 Thread Insan Praja SW
On Tue, 24 Jun 2008 23:55:47 +0700, Stuart Henderson [EMAIL PROTECTED] wrote:
 In gmane.os.openbsd.misc, you wrote:
  Stopped at  pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
  OpenBSD 4.3-current (GENERIC.MP) #7: Tue Jun 24 20:27:50 WIT 2008

 We can't tell which files are in your build. If sys/net/pf_ioctl.c
 is between 1.203-1.207, you need to update to 1.208.

 If it's already at 1.208 please post back on the misc@ thread with
 that information.


Hi Stuart and Misc@,
Well, it obviously says:
/*	$OpenBSD: pf_ioctl.c,v 1.208 2008/06/22 13:01:33 mcbride Exp $ */
in /usr/src/sys/net/pf_ioctl.c
Thanks,

Insan

--
insandotpraja(at)gmaildotcom