Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)
On Thu, Mar 30, 2006 at 12:54:16AM -0500, jared r r spiegel wrote:
> On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:
> > i forgot 'show panic' and 'show registers' these three times.

this looks totally from outa space!
can you please 'x /i' around the softclock+0x22c ?
if dna indeed comes from there then there have to be some sort of fpu/xmm/blah instruction there.

10x
cu

> ddb{0} show panic
> the kernel did not panic
> ddb{0} show registers
> ds          0x10
> es          0x10
> fs          0x58
> gs          0x10
> edi         0xd06e7660  cpu_info_primary
> esi         0x20
> ebp         0xe7d2be68
> ebx         0
> edx         0x2
> ecx         0
> eax         0
> eip         0xd0491475  npxdna_xmm+0x71
> cs          0x8
> eflags      0x10246
> esp         0xe7d2be40
> ss          0xe7d20010
> npxdna_xmm+0x71:        movl    0x12c(%ebx),%eax
> ddb{0} trace
> npxdna_xmm(d06e7660) at npxdna_xmm+0x71
> Xdna(d0657b2c,e7d2bef8,d02537f7,2000,0) at Xdna+0x39
> softclock(0,58,10,10,10) at softclock+0x22c
> Xintrsoftclock() at Xintrsoftclock+0x56
> --- interrupt ---
> Xdoreti() at Xdoreti+0x23
> --- interrupt ---
> apm_cpu_idle(0,0,0,0,0) at apm_cpu_idle+0x4a
>
> have the machine running on uniprocessor kernel now and it's been stable for past 2 days ( previous max uptime on .mp was always 1d )
>
> we're looking at moving it to 3.9, but trying to root around cvs{@,web} to see if we can find a commit that smells like it might be a fixing winner before going back to an MP kernel again.
>
> -- jared [ openbsd 3.9-current GENERIC ( mar 15 ) // i386 ]

-- paranoic mickey (my employers have changed but, the name has remained)
Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)
On Thu, Mar 30, 2006 at 10:40:24AM +0200, mickey wrote:
> On Thu, Mar 30, 2006 at 12:54:16AM -0500, jared r r spiegel wrote:
> > On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:
> > > i forgot 'show panic' and 'show registers' these three times.
>
> this looks totally from outa space!
> can you please 'x /i' around the softclock+0x22c ?

sure, will totally grab a bunch of examines if it happens again. my thanks for the suggestion ( i'm clearly not the ddb professional )

also received a few recommendations off-list that this all might be clear skies in 3.9 wrt this bug, so we're going to keep on the 3.8 uniproc kernel until sometime this weekend (hope) for a move to 3.9.mp -- depending on the timeline, there might be an opportunity to put the 3.8.mp back on, safeten up the HDs as much as i can, and see if we can tickle that fault to get the examine output

jared
Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)
On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:
> i forgot 'show panic' and 'show registers' these three times.

ddb{0} show panic
the kernel did not panic
ddb{0} show registers
ds          0x10
es          0x10
fs          0x58
gs          0x10
edi         0xd06e7660  cpu_info_primary
esi         0x20
ebp         0xe7d2be68
ebx         0
edx         0x2
ecx         0
eax         0
eip         0xd0491475  npxdna_xmm+0x71
cs          0x8
eflags      0x10246
esp         0xe7d2be40
ss          0xe7d20010
npxdna_xmm+0x71:        movl    0x12c(%ebx),%eax
ddb{0} trace
npxdna_xmm(d06e7660) at npxdna_xmm+0x71
Xdna(d0657b2c,e7d2bef8,d02537f7,2000,0) at Xdna+0x39
softclock(0,58,10,10,10) at softclock+0x22c
Xintrsoftclock() at Xintrsoftclock+0x56
--- interrupt ---
Xdoreti() at Xdoreti+0x23
--- interrupt ---
apm_cpu_idle(0,0,0,0,0) at apm_cpu_idle+0x4a

have the machine running on uniprocessor kernel now and it's been stable for past 2 days ( previous max uptime on .mp was always 1d )

we're looking at moving it to 3.9, but trying to root around cvs{@,web} to see if we can find a commit that smells like it might be a fixing winner before going back to an MP kernel again.

-- jared [ openbsd 3.9-current GENERIC ( mar 15 ) // i386 ]
some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)
OPENBSD_3_8 from sources grabbed mar.2.

kernel config:
==
$ diff -u GENERIC.MP GENERIC.MP.RAID
--- GENERIC.MP Sun May 1 03:54:20 2005
+++ GENERIC.MP.RAID Sun Mar 26 21:45:32 2006
@@ -9,3 +9,6 @@
 cpu* at mainbus?
 ioapic* at mainbus?
+
+option RAID_AUTOCONFIG
+pseudo-device raid 4 # RAIDframe disk driver
==

motherboard is VIA VT 310DP:
http://www.via.com.tw/en/products/mainboards/mini_itx/vt_310dp/

~3-8 users on the system, mostly used for email/MUD.

i've got all the APM stuff in the bios that i could find set to not sleep or Disabled wrt HDD/CPU shutdowns for powersaving.

$ sysctl hw.setperf
sysctl: hw.setperf: value is not available

has dropped down to ddb three times (about 1x/day) since installing this m/b. usually it's sometime during evening, but this morning it happened during the morning hours when most ppl are on the box with activity.

if it helps(?), when the bios shows its summary screen before the obsd bootloader, it says:
===
CPU Type : VIA C3
CPU ID/ucode ID : 069A
CPU Clock : 1.0A GHz
===

i believe the 'panic' is at:
---
kernel: page fault trap, code=0
Stopped at npxdna_xmm+0x71: movl 0x12c(%ebx),%eax
---

i forgot 'show panic' and 'show registers' these three times.

i tried 'boot dump' but that fails with:
===
ddb{0} boot dump
syncing disks... panic: TLB IPI rendezvous failed (mask 2)
Stopped at Debugger+0x4: leave
===
( /var and /home are the raid(4) partitions )

here are three different instances of ddb.
outside of 'ps' and 'trace' i am somewhat at a loss as to what i should be trying:

=[1]===
ddb{0} ps
  PID   PPID   PGRP   UID  S      FLAGS  WAIT       COMMAND
26144  11622  26144  1020  3  0x2004086  ttyin      bash
11622   2133   2133  1020  7      0x104             sshd
 2133  26799   2133     0  3  0x2004184  netio      sshd
26486  22928  26486  1008  3  0x280508e  poll       mutt
16973    525    525    67  3  0x2000184  select     httpd
22928  20772  22928  1008  3  0x2004086  pause      ksh
20772  14451  14451  1008  3  0x2000184  select     sshd
14451  26799  14451     0  3  0x2004184  netio      sshd
28418   3192  28418   515  3  0x2004084  piperd     unlinkd
 5372      1   5372     0  3  0x2004086  ttyin      getty
 2515      1   2515     0  3  0x2004086  ttyin      getty
27726      1  27726     0  3  0x2004086  ttyin      getty
 8820      1   8820     0  3  0x2040184  select     sendmail
14397      1  14397     0  3      0x284  select     cron
 2297      1   5573  1042  2  0x2400586             icecast
 1668      1   1668     0  3      0x284  poll       systrace
13668  27525  27525     0  3      0x285  lockf      saslauthd
 8217  27525  27525     0  3      0x285  netcon     saslauthd
16227  27525  27525     0  3      0x285  lockf      saslauthd
25578  27525  27525     0  3      0x285  lockf      saslauthd
27525      1  27525     0  3      0x285  lockf      saslauthd
 3192  25974  25974   515  3  0x2004184  poll       squid
25974      1  25974     0  3      0x284  wait       squid
13797      1  11575  1007  3      0x286  piperd     moo
 9844      1  11575  1007  3      0x286  piperd     moo
11575      1  11575  1007  3  0x2004086  select     moo
18138    525    525    67  3  0x2000184  semwait    httpd
20413    525    525    67  3  0x2000184  semwait    httpd
14911    525    525    67  3  0x2000184  semwait    httpd
 8609    525    525    67  3  0x2000184  semwait    httpd
18236    525    525    67  3  0x2000184  semwait    httpd
26799      1  26799     0  3      0x284  select     sshd
19880      1  19880     0  3  0x2000184  select     inetd
  525      1    525    67  3  0x2000184  select     httpd
26156   1998   1998    75  3  0x2000184  poll       bgpd
28553   1998   1998    75  3  0x2000184  poll       bgpd
 1998      1   1998     0  3      0x284  poll       bgpd
12208  13324  13324    83  3  0x2000184  poll       ntpd
13324      1  13324     0  3      0x284  poll       ntpd
 3200  21340  21340    68  3  0x2000184  select     isakmpd
21340      1  21340     0  3      0x284  netio      isakmpd
28710  19800  19800    70  3  0x2000184  select     named
19800      1  19800     0  3  0x2000184  netio      named
10433  31475  31475    74  3  0x2000184  bpf        pflogd
31475      1  31475     0  3      0x284  netio      pflogd
23555  15566  15566    73  3  0x2000184  poll       syslogd
15566      1  15566     0  3      0x284  netio      syslogd
 6947      1   6947     0  3      0x284  mfsidl     mount_mfs
28151      0      0     0  3  0x2100204  rfwcond    raid0
   15      0      0     0  3  0x2100204  crypto_wa  crypto
   14      0      0     0  3  0x2100204