Re: Kernel Panic on AMD64 24 June snapshot
* Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
> Hi Misc@,
> I currently caught a kernel panic that says:
> uvm_fault(0x 80b7b0e0, 0x0, 0, 1) -> e
> kernel : page fault trap, code=0
> Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
> ddb {0}> trace

this problem has been reported by a few people, but so far we're
unable to track it down or even reproduce.

it would help enormously if we knew WHEN this was introduced. so if
someone who can reproduce this can compile kernels going backwards
day by day (cvs -D) and then ideally even spot the commit that
introduced it, that would help a LOT. yes, it is a lot of work :(

in short, it seems some element of the pf state table (which is an RB
tree, pf_state_tree) gets freed or overwritten before being removed
from the RB tree, or something tries to remove it before it was
inserted. Ryan and I have been reading the code up and down without
being able to spot such a case yet.

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg Amsterdam
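The day-by-day search suggested above can be shortened by bisecting over dates instead of stepping back one day at a time. This is a swapped-in technique, not what was asked for literally, and it only works if the panic reproduces reliably and was introduced at a single point. The following is a minimal sketch under those assumptions. The names first_bad_day, day_to_date, is_bad_fn and demo_is_bad are hypothetical, and so are any dates fed to it; the predicate stands in for a person building and testing a kernel checked out with something like cvs update -D "YYYY-MM-DD".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical callback: returns 1 if a kernel built from the tree as of
 * "day" (days after the known-good date) panics, 0 if it does not.
 * In practice this is a manual build, boot and test cycle.
 */
typedef int (*is_bad_fn)(int day);

/*
 * Binary search for the first bad day in (good, bad].
 * Assumes day "good" is known good, day "bad" is known bad, and every
 * day after the first bad one is bad as well.
 */
static int
first_bad_day(int good, int bad, is_bad_fn is_bad)
{
	assert(good < bad);
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;	/* breakage is at mid or earlier */
		else
			good = mid;	/* breakage is after mid */
	}
	return bad;
}

/*
 * Format the date "day" days after the base date as YYYY-MM-DD, the
 * form one could pass to cvs -D.
 */
static void
day_to_date(int year, int mon, int mday, int day, char *buf, size_t len)
{
	struct tm tm;

	memset(&tm, 0, sizeof(tm));
	tm.tm_year = year - 1900;
	tm.tm_mon = mon - 1;
	tm.tm_mday = mday + day;	/* mktime() normalizes the overflow */
	tm.tm_hour = 12;		/* midday, away from DST transitions */
	tm.tm_isdst = -1;
	mktime(&tm);
	strftime(buf, len, "%Y-%m-%d", &tm);
}

/* Stand-in predicate for demonstration only: pretend day 15 broke it. */
static int
demo_is_bad(int day)
{
	return day >= 15;
}
```

With a 40-day window this needs about six kernel builds instead of up to forty. Once the first bad day is known, the commits of that single day still have to be checked by hand.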
Re: Kernel Panic on AMD64 24 June snapshot
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Henning Brauer
Sent: Thursday, July 03, 2008 9:04 AM
To: misc@openbsd.org
Subject: Re: Kernel Panic on AMD64 24 June snapshot

> * Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
> > Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
>
> this problem has been reported by a few people, but so far we're
> unable to track it down or even reproduce.
>
> it would help enormously if we knew WHEN this was introduced.

I hit this with
OpenBSD 4.3-current (GENERIC) #935: Sun Jun 15 19:31:26 MDT 2008
So at least that far back.
Re: Kernel Panic on AMD64 24 June snapshot
Any chance of giving some info about how your PF is used, that I might
set up a similar box in the hope of reproducing it?

On Thu, 2008-07-03 at 09:20 -0400, Wade, Daniel wrote:
> I hit this with
> OpenBSD 4.3-current (GENERIC) #935: Sun Jun 15 19:31:26 MDT 2008
> So at least that far back
Re: Kernel Panic on AMD64 24 June snapshot
Not an AMD64 specific thing then?

On Thu, 2008-07-03 at 15:03 +0200, Henning Brauer wrote:
> * Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
> > Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
>
> this problem has been reported by a few people, but so far we're
> unable to track it down or even reproduce.
Re: Kernel Panic on AMD64 24 June snapshot
* Insan Praja SW [EMAIL PROTECTED] [2008-06-24 18:32]:
> Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)

sometimes it takes a few reads until things are obvious.
please try this diff.

Index: pf.c
===================================================================
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.604
diff -u -r1.604 pf.c
--- pf.c	3 Jul 2008 15:46:23 -	1.604
+++ pf.c	4 Jul 2008 00:04:27 -
@@ -687,8 +685,8 @@
 			}
 		pool_put(&pf_state_key_pl, sk);
 		s->key[idx] = cur;
-	}
-	s->key[idx] = sk;
+	} else
+		s->key[idx] = sk;
 
 	if ((si = pool_get(&pf_state_item_pl, PR_NOWAIT)) == NULL) {
 		pf_state_key_detach(s, idx);

-- 
Henning Brauer, [EMAIL PROTECTED], [EMAIL PROTECTED]
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg Amsterdam
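To make the effect of the two changed lines easier to see, here is a small stand-alone sketch of the control flow before and after the diff. It is not the pf.c code. The structures are cut down to what the hunk touches, key_release is a hypothetical stand-in for pool_put that marks the key instead of freeing it, and the condition cur != NULL is an assumption, because the hunk does not show the enclosing if.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real structures in pf.c hold much more. */
struct state_key {
	int released;	/* set once the key went back to the pool */
};

struct state {
	struct state_key *key[2];
};

/* Models pool_put(): mark instead of freeing, so the sketch stays defined. */
static void
key_release(struct state_key *sk)
{
	sk->released = 1;
}

/*
 * Pre-patch shape: the assignment after the block always runs. When an
 * existing key "cur" was found, s->key[idx] ends up pointing at "sk",
 * which has already been released.
 */
static void
attach_before(struct state *s, int idx, struct state_key *sk,
    struct state_key *cur)
{
	assert(idx == 0 || idx == 1);
	if (cur != NULL) {		/* assumed condition, see above */
		key_release(sk);
		s->key[idx] = cur;
	}
	s->key[idx] = sk;
}

/* Post-patch shape: "sk" is stored only when no existing key was found. */
static void
attach_after(struct state *s, int idx, struct state_key *sk,
    struct state_key *cur)
{
	assert(idx == 0 || idx == 1);
	if (cur != NULL) {		/* assumed condition, see above */
		key_release(sk);
		s->key[idx] = cur;
	} else
		s->key[idx] = sk;
}
```

In the pre-patch shape the state keeps a pointer to a key that was handed back to the pool and never made it into the tree, which would fit the suspicion earlier in the thread that something is removed from the RB tree without having been inserted.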
Kernel Panic on AMD64 24 June snapshot
Hi Misc@,
I currently caught a kernel panic that says:

uvm_fault(0x 80b7b0e0, 0x0, 0, 1) -> e
kernel : page fault trap, code=0
Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
ddb {0}> trace
pf_state_tree_RB_REMOVE_COLOR() at pf_state_tree_RB_REMOVE_COLOR+0x1c0
pf_state_tree_RB_REMOVE() at pf_state_tree_RB_REMOVE+0x4d
pf_state_key_detach() at pf_state_key_detach+0x9d
pf_detach_state() at pf_detach_state+0x9d
pf_purge_expired_states() at pf_purge_expired_states+0x9d
pf_purge_thread() at pf_purge_thread+0x53
end trace frame : 0x0, count: -6
ddb {0}>

and this is the dmesg:

OpenBSD 4.3-current (GENERIC.MP) #7: Tue Jun 24 20:27:50 WIT 2008
    [EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2124914688 (2026MB)
avail mem = 2063269888 (1967MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.34 @ 0x7f6ee000 (78 entries)
bios0: vendor FUJITSU SIEMENS // Phoenix Technologies Ltd. version 5.00 R1.10.2151.A1 date 05/08/2006
bios0: FUJITSU SIEMENS D2151-A1
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP ASF! SSDT MCFG HPET APIC BOOT
acpi0: wakeup devices PEXA(S4) PEXB(S4) PEXC(S4) PEXD(S4) PEXE(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) USB5(S4) PCIH(S4) KEYB(S4) PS2M(S4) COM1(S1) COM2(S1)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Pentium(R) D CPU 2.66GHz, 2660.51 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,TM2,CNXT-ID,CX16,xTPR,NXE,LONG
cpu0: 1MB 64b/line 8-way L2 cache
cpu0: apic clock running at 133MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Pentium(R) D CPU 2.66GHz, 2660.07 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,TM2,CNXT-ID,CX16,xTPR,NXE,LONG
cpu1: 1MB 64b/line 8-way L2 cache
ioapic0 at mainbus0 apid 2 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEXA)
acpiprt2 at acpi0: bus 3 (PEXB)
acpiprt3 at acpi0: bus 5 (PEXC)
acpiprt4 at acpi0: bus 7 (PEXD)
acpiprt5 at acpi0: bus 9 (PEXE)
acpiprt6 at acpi0: bus 11 (PCIH)
acpicpu0 at acpi0: FVS, 2667, 1862 MHz
acpicpu1 at acpi0: FVS, 2667, 1862 MHz
acpibtn0 at acpi0: PWRB
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 Intel 82945G Host rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82945G Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
agp0 at vga1: aperture at 0xe000, size 0x1000
azalia0 at pci0 dev 27 function 0 Intel 82801GB HD Audio rev 0x01: apic 2 int 18 (irq 9)
azalia0: codec[s]: Realtek ALC260
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801GB PCIE rev 0x01: apic 2 int 17 (irq 11)
pci1 at ppb0 bus 3
ppb1 at pci0 dev 28 function 1 Intel 82801GB PCIE rev 0x01: apic 2 int 16 (irq 11)
pci2 at ppb1 bus 5
bge0 at pci2 dev 0 function 0 Broadcom BCM5751 rev 0x01, BCM5750 A1 (0x4001): apic 2 int 17 (irq 11), address 00:30:05:c9:79:df
brgphy0 at bge0 phy 1: BCM5750 10/100/1000baseT PHY, rev. 0
ppb2 at pci0 dev 28 function 2 Intel 82801GB PCIE rev 0x01: apic 2 int 18 (irq 9)
pci3 at ppb2 bus 7
ppb3 at pci0 dev 28 function 3 Intel 82801GB PCIE rev 0x01: apic 2 int 19 (irq 9)
pci4 at ppb3 bus 9
uhci0 at pci0 dev 29 function 0 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 11)
uhci1 at pci0 dev 29 function 1 Intel 82801GB USB rev 0x01: apic 2 int 22 (irq 10)
uhci2 at pci0 dev 29 function 2 Intel 82801GB USB rev 0x01: apic 2 int 21 (irq 5)
uhci3 at pci0 dev 29 function 3 Intel 82801GB USB rev 0x01: apic 2 int 20 (irq 9)
ehci0 at pci0 dev 29 function 7 Intel 82801GB USB rev 0x01: apic 2 int 23 (irq 11)
ehci0: timed out waiting for BIOS
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0xe1
pci5 at ppb4 bus 11
em0 at pci5 dev 5 function 0 Intel PRO/1000MT (82540EM) rev 0x02: apic 2 int 22 (irq 10), address 00:07:e9:0f:44:37
rl0 at pci5 dev 7 function 0 D-Link Systems 530TX+ rev 0x10: apic 2 int 21 (irq 5), address 00:11:95:63:48:63
rlphy0 at rl0 phy 0: RTL internal PHY
pcib0 at pci0 dev 31 function 0 Intel 82801GB LPC rev 0x01
pciide0 at pci0 dev 31 function 2 Intel 82801GB SATA rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: ST3160211AS
wd0: 16-sector PIO, LBA48, 152627MB, 312581808 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets, initiator 7
cd0 at scsibus0 targ 0 lun 0: TSSTcorp, DVD-ROM SH-D162D, SB00 ATAPI 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode
Re: Kernel Panic on AMD64 24 June snapshot
On Tue, 24 Jun 2008 23:55:47 +0700, Stuart Henderson [EMAIL PROTECTED] wrote:
> In gmane.os.openbsd.misc, you wrote:
> > Stopped at pf_state_tree_RB_REMOVE_COLOR + 0x1C0: cmpl $0x1,0x40(%rsi)
> > OpenBSD 4.3-current (GENERIC.MP) #7: Tue Jun 24 20:27:50 WIT 2008
>
> We can't tell which files are in your build. If sys/net/pf_ioctl.c
> is between 1.203-1.207, you need to update to 1.208. If it's already
> at 1.208 please post back on the misc@ thread with that information.

Hi Stuart and Misc@,
Well, it obviously says:
$OpenBSD: pf_ioctl.c,v 1.208 2008/06/22 13:01:33 mcbride Exp $ */
on the /usr/src/sys/net/pf_ioctl.c

Thanks,
Insan
-- 
insandotpraja(at)gmaildotcom