Re: 4.6 hang

2009-11-01 Thread Nicholas Marriott
Hi

On Fri, Oct 30, 2009 at 05:09:21PM -0500, Matthew Young wrote:
 I am very curious about your problem; we all fear encountering
 something similar in the future...
 
 Why wasn't anybody able to help based on your DDB trace? If I
 ever hit such an event, what is the best information to
 post to receive help? (Provided, of course, that people want to help.)

http://www.openbsd.org/report.html

 Just curious! I want to be prepared to troubleshoot this better in the 
 future..
 
 
 Thanks
 
 --Matt
 
 
 On Thu, Oct 29, 2009 at 9:09 PM, Steve Shockley
 steve.shock...@shockley.net wrote:
  Just as another update, I replaced the fiber em card with a bge, and the
  problems went away.



Re: 4.6 hang

2009-10-30 Thread Matthew Young
I am very curious about your problem; we all fear encountering
something similar in the future...

Why wasn't anybody able to help based on your DDB trace? If I
ever hit such an event, what is the best information to
post to receive help? (Provided, of course, that people want to help.)

Just curious! I want to be prepared to troubleshoot this better in the future..


Thanks

--Matt


On Thu, Oct 29, 2009 at 9:09 PM, Steve Shockley
steve.shock...@shockley.net wrote:
 Just as another update, I replaced the fiber em card with a bge, and the
 problems went away.



Re: 4.6 hang

2009-10-29 Thread Steve Shockley
Just as another update, I replaced the fiber em card with a bge, and the 
problems went away.




Re: 4.6 hang

2009-10-27 Thread Gregory Edigarov
On Tue, 27 Oct 2009 07:10:24 -0400
Steve Shockley steve.shock...@shockley.net wrote:

 I recently upgraded my firewall box from 4.4 to 4.6.  At first it was 
 running well (about a week), but yesterday I started getting
 occasional hangs where the screen would be blank and it'd stop
 responding to ping (and passing traffic).  Figuring it was a hardware
 failure, I swapped the drive into another box.  I still seem to be
 getting occasional hangs; I even turned off screen blanking, and when
 it hangs there's nothing on the screen (monitor goes to power save).
 The only shared hardware between the two machines is a Compaq fiber
 em NIC (which I'll replace tonight) and the hard drive (which isn't
 showing any errors). Assuming it is a software problem, how can I
 diagnose it?  I'll paste the dmesg below.  I'm running 4.6 with patch
 001 and 002 applied, and I've tried both the sp and mp kernels.

Although that may not be the problem, try turning off ACPI in the kernel.
That fixes about 90% of the sporadic hangs or reboots I see.
I have even made it routine: when I have new hardware to test, I first
run it with ACPI on; if it hangs or shows a speed regression, I just turn
ACPI off, and 90% of the time I am happy. For the remaining 10%, I change
my hardware.

 OpenBSD 4.6-stable (GENERIC) #1: Tue Oct  6 05:40:03 EDT 2009
  r...@build46.localdomain:/usr/src/sys/arch/i386/compile/GENERIC
 cpu0: Intel(R) Pentium(R) 4 CPU 3.06GHz (GenuineIntel 686-class) 3.07 GHz
 cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID,xTPR
 real mem  = 3220668416 (3071MB)
 avail mem = 3120185344 (2975MB)
 mainbus0 at root
 bios0 at mainbus0: AT/286+ BIOS, date 10/14/04, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev. 2.3 @ 0xfae10 (77 entries)
 bios0: vendor Dell Computer Corporation version A05 date 10/14/2004
 bios0: Dell Computer Corporation PowerEdge 650
 acpi0 at bios0: rev 0
 acpi0: tables DSDT FACP APIC SPCR
 acpi0: wakeup devices PCI0(S5) PCI1(S5) PCI2(S5)
 acpitimer0 at acpi0: 3579545 Hz, 32 bits
 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
 cpu0 at mainbus0: apid 0 (boot processor)
 cpu0: apic clock running at 133MHz
 cpu at mainbus0: not configured
 ioapic0 at mainbus0: apid 2 pa 0xfec0, version 11, 16 pins
 ioapic0: misconfigured as apic 0, remapped to apid 2
 ioapic1 at mainbus0: apid 3 pa 0xfec01000, version 11, 16 pins
 ioapic1: misconfigured as apic 0, remapped to apid 3
 ioapic2 at mainbus0: apid 4 pa 0xfec02000, version 11, 16 pins
 ioapic2: misconfigured as apic 0, remapped to apid 4
 acpiprt0 at acpi0: bus 0 (PCI0)
 acpiprt1 at acpi0: bus 1 (PCI1)
 acpiprt2 at acpi0: bus 2 (PCI2)
 acpicpu0 at acpi0
 bios0: ROM list: 0xc/0x8000 0xc8000/0x4800 0xec000/0x4000!
 pci0 at mainbus0 bus 0: configuration mode 1 (bios)
 pchb0 at pci0 dev 0 function 0 ServerWorks GCNB-LE Host rev 0x32
 pchb1 at pci0 dev 0 function 1 ServerWorks GCNB-LE Host rev 0x00
 pci1 at pchb1 bus 1
 em0 at pci1 dev 3 function 0 Intel PRO/1000MT (82546EB) rev 0x01: apic 3 int 3 (irq 7), address 00:04:23:a5:c8:6e
 em1 at pci1 dev 3 function 1 Intel PRO/1000MT (82546EB) rev 0x01: apic 3 int 4 (irq 5), address 00:04:23:a5:c8:6f
 em2 at pci0 dev 3 function 0 Intel PRO/1000 (82542) rev 0x03: apic 3 int 1 (irq 15), address 00:08:c7:86:39:f5
 vga1 at pci0 dev 4 function 0 ATI Rage XL rev 0x27
 wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
 wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
 pciide0 at pci0 dev 5 function 0 CMD Technology PCI0680 rev 0x02
 pciide0: bus-master DMA support present
 pciide0: channel 0 wired to native-PCI mode
 pciide0: using apic 3 int 7 (irq 11) for native-PCI interrupt
 wd0 at pciide0 channel 0 drive 0: ST340014A
 wd0: 16-sector PIO, LBA48, 38166MB, 78165360 sectors
 wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
 pciide0: channel 1 wired to native-PCI mode
 piixpm0 at pci0 dev 15 function 0 ServerWorks CSB6 rev 0xa0: SMBus disabled
 pciide1 at pci0 dev 15 function 1 ServerWorks CSB6 RAID/IDE rev 0xa0: DMA
 atapiscsi0 at pciide1 channel 0 drive 0
 scsibus0 at atapiscsi0: 2 targets
 cd0 at scsibus0 targ 0 lun 0: TEAC, CD-224E, K.9A ATAPI 5/cdrom removable
 cd0(pciide1:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
 ohci0 at pci0 dev 15 function 2 ServerWorks CSB6 USB rev 0x05: apic 2 int 10 (irq 10), version 1.0, legacy support
 pcib0 at pci0 dev 15 function 3 ServerWorks GCLE-2 Host rev 0x00
 pchb2 at pci0 dev 16 function 0 ServerWorks CIOB-E rev 0x12
 pchb3 at pci0 dev 16 function 2 ServerWorks CIOB-E rev 0x12
 pci2 at pchb3 bus 2
 usb0 at ohci0: USB revision 1.0
 uhub0 at usb0 ServerWorks OHCI root hub rev 1.00/1.00 addr 1
 isa0 at pcib0
 isadma0 at isa0
 com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
 pckbc0 at isa0 port 0x60/5
 pckbd0 at pckbc0 (kbd slot)
 pckbc0: using irq 1 for kbd slot
 wskbd0 at pckbd0: console keyboard, using wsdisplay0
 pms0 at pckbc0 (aux slot)
 pckbc0: using irq 12 for aux slot
 

Re: 4.6 hang

2009-10-27 Thread Steve Shockley

On 10/27/2009 7:44 AM, Gregory Edigarov wrote:

Although that may not be the problem, try turning off ACPI in the kernel.
That fixes about 90% of the sporadic hangs or reboots I see.


Thanks for the reply.  I'm trying with ACPI disabled now, but during the 
day today I did get a panic, details below.


panic: pool_do_get(mcl2k): free list modified: page 0xd99dd000; item addr 0xd99dd800; offset 0x0=0x800aabb

Stopped at  Debugger+0x4:   leave

Trace:
Debugger(d9695800,d0894098,df670e30,d99dd800,d0894020) at Debugger+0x4
panic(d0716100,d08470a0,d99dd000,d99dd800,0) at panic+0x55
pool_do_get(d0894020,0,df670ea0,df670e50,d0363faf,d0894020) at pool_do_get+0x2e3
pool_get(d0894020,0,df670ea0,d039afee,0) at pool_get+0x46
m_clget(d977c500,1,d3acb830,800) at m_clget+0x74
em_get_buf(d3acb800,d,200e0a0,d3acb830) at em_get_buf+0x64
em_rxfill(d3acb800,fffe,c0,0) at em_rxfill+0x3a
em_intr(d3acb800) at em_intr+0x9e
Xintr_ioapic() at Xintr_ioapic1+0x68
--- interrupt ---
cpu_idle_cycle(d09408e0) at cpu_idle_cycle+0xf
Bad frame pointer: 0xd09e9e78

ps on request, since I'm typing by hand from a digital photo.
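
For anyone summarizing a hand-copied trace like the one above, a short Python sketch can pull out the call chain and check the address arithmetic in the panic line. This is only an illustration: the helper names `parse_panic` and `call_chain` are made up here, and the regular expressions assume the exact line shapes shown in this message, not DDB output in general.

```python
import re

# Panic line and trace copied from the message above (wrapped lines joined).
PANIC = ("panic: pool_do_get(mcl2k): free list modified: page 0xd99dd000; "
         "item addr 0xd99dd800; offset 0x0=0x800aabb")

TRACE = """\
Debugger(d9695800,d0894098,df670e30,d99dd800,d0894020) at Debugger+0x4
panic(d0716100,d08470a0,d99dd000,d99dd800,0) at panic+0x55
pool_do_get(d0894020,0,df670ea0,df670e50,d0363faf,d0894020) at pool_do_get+0x2e3
pool_get(d0894020,0,df670ea0,d039afee,0) at pool_get+0x46
m_clget(d977c500,1,d3acb830,800) at m_clget+0x74
em_get_buf(d3acb800,d,200e0a0,d3acb830) at em_get_buf+0x64
em_rxfill(d3acb800,fffe,c0,0) at em_rxfill+0x3a
em_intr(d3acb800) at em_intr+0x9e
Xintr_ioapic() at Xintr_ioapic1+0x68
--- interrupt ---
cpu_idle_cycle(d09408e0) at cpu_idle_cycle+0xf
"""


def parse_panic(line):
    """Return (pool name, page address, item address, item offset in page)."""
    m = re.search(
        r"pool_do_get\((\w+)\).*page (0x[0-9a-f]+); item addr (0x[0-9a-f]+)",
        line,
    )
    if m is None:
        raise ValueError("not a pool_do_get panic line in the assumed format")
    pool = m.group(1)
    page = int(m.group(2), 16)
    item = int(m.group(3), 16)
    return pool, page, item, item - page


def call_chain(trace):
    """Return the function names from 'name(args) at name+off' lines, in order."""
    frames = []
    for line in trace.splitlines():
        m = re.match(r"(\w+)\(.*\) at ", line)
        if m:  # separator lines such as '--- interrupt ---' are skipped
            frames.append(m.group(1))
    return frames


if __name__ == "__main__":
    pool, page, item, offset = parse_panic(PANIC)
    print(f"pool={pool} page={page:#x} item={item:#x} offset_in_page={offset:#x}")
    print(" <- ".join(call_chain(TRACE)))
```

The item sits 0x800 (2048) bytes into its page, which is at least consistent with the "2k" in the pool name, though that reading is an assumption and not something stated in the thread. The call chain shows the panic was reached from `em_intr` through `em_rxfill` and `m_clget`, that is, while the em driver was refilling its receive ring.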



Re: 4.6 hang

2009-10-27 Thread Steve Shockley
Just as an update, I've replaced the one NIC, so the only thing carried 
over from the other machine is the hard drive, and I'm still getting the 
exact same issue.