Re: amd64 -current kernel hang

2008-03-29 Thread RD Thrush
 rd == RD Thrush [EMAIL PROTECTED] writes:
rd I have experienced kernel hangs w/ -current snapshots on Athlon 64 X2
rd and Sempron boxes.  Both GENERIC and GENERIC.MP snapshots exhibit the
rd hang.  Once hung, the boxes don't respond to pings; however, keyboard
rd LEDs toggle as expected and I can enter ddb from the keyboard.
rd kernel/5777 [1] has the full problem report including dmesgs and ddb
rd logs.  The hang is reproducible by building the eclipse-sdk port.

rd This problem has developed fairly recently (within the past month or
rd so).

rd A similar box (w/ Opteron 165) runs -current i386 GENERIC.MP snapshots
rd and is able to do the same port builds without hanging.  It seems the
rd aforementioned kernel hang doesn't affect the i386 variant.

rd I would be happy to provide additional information that would help
rd analyze and resolve the problem.


rd [1] 
http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&textonly=yes&numbers=5777


I've now had the kernel hang occur on 3 different amd64 boxes.  (no
ping response.  top freezes.  Only kb LEDs work.  ddb can be entered
from the keyboard.)

Unfortunately, this last hang occurred on a box without a serial port
so I have no ddb info to add to the above problem report.  I do have
crash dumps from this last hang.

The hang is reproducible by building the eclipse-sdk port on -current.
It does appear to require enough memory load to swap.  On this last
box, it wouldn't hang until I artificially added another 750MB load
(see below).

From the associated ddb ps listings (several sessions), I see several
processes WAITing on flt_noram[135].  Amongst those sessions, javadoc
usually shows up WAITing on flt_noram5.  I have no idea whether that
is relevant to the hang but it does seem to form a pattern.
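
To make that pattern easier to check across several sessions, the
captured ddb output can be tallied per wait channel.  This is only a
sketch: it assumes the ps listings were saved to a text file (the log
name is whatever you pass in; none is given in this report) and that
each wait channel appears as flt_noram followed by one digit.  It does
not depend on the rest of the ddb column layout.

```sh
#!/bin/sh
# Tally flt_noram wait channels in a saved ddb "ps" capture.
# The log file is hypothetical; pass your own as the first argument.
count_noram() {
    # -o prints only the matching token, one per line;
    # sort + uniq -c then counts how often each channel appears.
    grep -o 'flt_noram[0-9]' "$1" | sort | uniq -c
}

# Run only when a log file is given on the command line.
if [ $# -gt 0 ]; then
    count_noram "$1"
fi
```

Comparing the counts between sessions would show whether flt_noram5
really dominates or whether the channels are spread evenly.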

I'd appreciate help to further analyze the problem report and correct
the problem.

FWIW, it seems odd that no other reports of (or replies about)
this problem have occurred.  Are my 3 boxes the only ones that hang?

Here's the simpleminded memory loader:
perl -e '$n=1024*1024;for($i=0;$i<750;$i++) { $mem[$i]="z" x $n; }; ;'

As previously mentioned, kernel/5777 contains the full report
including dmesgs.  I've appended the 3 dmesgs here as well.

### X2 #
OpenBSD 4.3-current (GENERIC.MP) #1589: Sat Mar 22 02:24:49 MDT 2008
[EMAIL PROTECTED]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1060524032 (1011MB)
avail mem = 1029472256 (981MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf0530 (67 entries)
bios0: vendor American Megatrends Inc. version 0221 date 12/06/2005
bios0: ASUSTeK Computer Inc. A8V
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP APIC OEMB
acpi0: wakeup devices PCI0(S4) PS2K(S4) PS2M(S4) UAR1(S4) AC97(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) EHCI(S4) PWRB(S4) SLPB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+, 2002.89 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+, 2002.56 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
ioapic0 at mainbus0 apid 2 pa 0xfec0, version 3, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
acpibtn0 at acpi0: PWRB
acpibtn1 at acpi0: SLPB
cpu0: Cool'n'Quiet K8 2002 MHz: speeds: 2000 1800 1000 MHz
pci0 at mainbus0 bus 0: configuration mode 1
pchb0 at pci0 dev 0 function 0 VIA K8HTB Host rev 0x00
pchb1 at pci0 dev 0 function 1 VIA K8HTB Host rev 0x00
pchb2 at pci0 dev 0 function 2 VIA K8HTB Host rev 0x00
pchb3 at pci0 dev 0 function 3 VIA K8HTB Host rev 0x00
pchb4 at pci0 dev 0 function 4 VIA K8HTB Host rev 0x00
pchb5 at pci0 dev 0 function 7 VIA K8HTB Host rev 0x00
ppb0 at pci0 dev 1 function 0 VIA K8HTB AGP rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 ATI Radeon 9200 PRO rev 0x01
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
agp at vga1 not configured
ATI Radeon 9200 PRO Sec rev 0x01 at pci1 dev 0 function 1 not configured
skc0 at pci0 dev 10 function 0 

Re: amd64 -current kernel hang

2008-03-28 Thread Peter_APIIT
Thanks. This will help OpenBSD to diagnose the problem. 



amd64 -current kernel hang

2008-03-27 Thread RD Thrush
I have experienced kernel hangs w/ -current snapshots on Athlon 64 X2
and Sempron boxes.  Both GENERIC and GENERIC.MP snapshots exhibit the
hang.  Once hung, the boxes don't respond to pings; however, keyboard
LEDs toggle as expected and I can enter ddb from the keyboard.
kernel/5777 [1] has the full problem report including dmesgs and ddb
logs.  The hang is reproducible by building the eclipse-sdk port.

This problem has developed fairly recently (within the past month or
so).

A similar box (w/ Opteron 165) runs -current i386 GENERIC.MP snapshots
and is able to do the same port builds without hanging.  It seems the
aforementioned kernel hang doesn't affect the i386 variant.

I would be happy to provide additional information that would help
analyze and resolve the problem.


[1] 
http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&textonly=yes&numbers=5777