Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)

2006-03-30 Thread mickey
On Thu, Mar 30, 2006 at 12:54:16AM -0500, jared r r spiegel wrote:
 On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:
 
i forgot 'show panic' and 'show registers' these three times.

this looks totally from outa space!
can you please 'x /i' around the softclock+0x22c ?
if dna indeed comes from there then there has to
be some sort of fpu/xmm/blah instruction there.
10x
cu

 ddb{0} show panic
 the kernel did not panic
 ddb{0} show registers
 ds            0x10
 es            0x10
 fs            0x58
 gs            0x10
 edi     0xd06e7660  cpu_info_primary
 esi           0x20
 ebp     0xe7d2be68
 ebx              0
 edx            0x2
 ecx              0
 eax              0
 eip     0xd0491475  npxdna_xmm+0x71
 cs             0x8
 eflags     0x10246
 esp     0xe7d2be40
 ss      0xe7d20010
 npxdna_xmm+0x71:   movl    0x12c(%ebx),%eax
 ddb{0} trace
 npxdna_xmm(d06e7660) at npxdna_xmm+0x71
 Xdna(d0657b2c,e7d2bef8,d02537f7,2000,0) at Xdna+0x39
 softclock(0,58,10,10,10) at softclock+0x22c
 Xintrsoftclock() at Xintrsoftclock+0x56
 --- interrupt ---
 Xdoreti() at Xdoreti+0x23
 --- interrupt ---
 apm_cpu_idle(0,0,0,0,0) at apm_cpu_idle+0x4a
 
   have the machine running on uniprocessor kernel
   now and it's been stable for past 2 days ( previous
   max uptime on .mp was always  1d )
 
   we're looking at moving it to 3.9, but trying to root
   around cvs{@,web} to see if we can find a commit that
   smells like it might be a fixing winner before going
   back to an MP kernel again.
 
 -- 
 
   jared
 
 [ openbsd 3.9-current GENERIC ( mar 15 ) // i386 ]
 

-- 
paranoic mickey   (my employers have changed but, the name has remained)



Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)

2006-03-30 Thread jared r r spiegel
On Thu, Mar 30, 2006 at 10:40:24AM +0200, mickey wrote:
 On Thu, Mar 30, 2006 at 12:54:16AM -0500, jared r r spiegel wrote:
  On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:
  
 i forgot 'show panic' and 'show registers' these three times.
 
 this looks totally from outa space!
 can you please 'x /i' around the softclock+0x22c ?

  sure, will totally grab a bunch of examines if it happens again.
  my thanks for the suggestion ( i'm clearly not the ddb professional )
  
  also received a few recommendations off-list that this all might
  be clear skies in 3.9 wrt this bug, so we're going to keep on the 
  3.8 uniproc kernel until sometime this weekend (hope) for a move
  to 3.9.mp -- depending on the timeline, there might be an opportunity
  to put the 3.8.mp back on, safeten up the HDs as much as i can, and
  see if we can tickle that fault to get the examine output

  jared 



Re: some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)

2006-03-29 Thread jared r r spiegel
On Mon, Mar 27, 2006 at 03:11:49PM -0500, jared r r spiegel wrote:

   i forgot 'show panic' and 'show registers' these three times.

ddb{0} show panic
the kernel did not panic
ddb{0} show registers
ds            0x10
es            0x10
fs            0x58
gs            0x10
edi     0xd06e7660  cpu_info_primary
esi           0x20
ebp     0xe7d2be68
ebx              0
edx            0x2
ecx              0
eax              0
eip     0xd0491475  npxdna_xmm+0x71
cs             0x8
eflags     0x10246
esp     0xe7d2be40
ss      0xe7d20010
npxdna_xmm+0x71:   movl    0x12c(%ebx),%eax
ddb{0} trace
npxdna_xmm(d06e7660) at npxdna_xmm+0x71
Xdna(d0657b2c,e7d2bef8,d02537f7,2000,0) at Xdna+0x39
softclock(0,58,10,10,10) at softclock+0x22c
Xintrsoftclock() at Xintrsoftclock+0x56
--- interrupt ---
Xdoreti() at Xdoreti+0x23
--- interrupt ---
apm_cpu_idle(0,0,0,0,0) at apm_cpu_idle+0x4a
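
  A note on reading the dump above: the faulting instruction loads
  from 0x12c(%ebx), and 'show registers' has ebx at 0, so the address
  being read works out to 0x12c. An access that low is what a field
  read through a NULL structure pointer typically looks like; which
  pointer that is cannot be told from this output alone. A minimal
  sketch of the arithmetic in Python follows. The function names and
  the "inside the first page" heuristic are assumptions made for
  illustration, not anything taken from the kernel source.

```python
PAGE_SIZE = 0x1000  # i386 page size (4 KiB)


def effective_address(base: int, disp: int) -> int:
    """Address touched by an AT&T-syntax operand disp(%base), wrapped to 32 bits."""
    return (base + disp) & 0xFFFFFFFF


def looks_like_null_deref(addr: int) -> bool:
    """Heuristic (an assumption): an access inside the first page
    usually means a field was read through a NULL pointer."""
    return addr < PAGE_SIZE


if __name__ == "__main__":
    ebx = 0x0      # from 'show registers'
    disp = 0x12c   # from 'movl 0x12c(%ebx),%eax'
    addr = effective_address(ebx, disp)
    print(hex(addr), looks_like_null_deref(addr))
```

  By the same heuristic, a kernel address such as the edi value
  0xd06e7660 would not be flagged.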

  have the machine running on uniprocessor kernel
  now and it's been stable for past 2 days ( previous
  max uptime on .mp was always  1d )

  we're looking at moving it to 3.9, but trying to root
  around cvs{@,web} to see if we can find a commit that
  smells like it might be a fixing winner before going
  back to an MP kernel again.

-- 

  jared

[ openbsd 3.9-current GENERIC ( mar 15 ) // i386 ]



some crashes with VIA VT-310DP (npxdna_xmm(d06e7660) at npxdna_xmm+0x71)

2006-03-27 Thread jared r r spiegel
  OPENBSD_3_8 from sources grabbed mar.2.  

  kernel config:

==
$ diff -u GENERIC.MP GENERIC.MP.RAID
--- GENERIC.MP  Sun May  1 03:54:20 2005
+++ GENERIC.MP.RAID Sun Mar 26 21:45:32 2006
@@ -9,3 +9,6 @@

 cpu*    at mainbus?
 ioapic* at mainbus?
+
+option RAID_AUTOCONFIG
+pseudo-device  raid4   # RAIDframe disk driver
==

  motherboard is VIA VT 310DP:
http://www.via.com.tw/en/products/mainboards/mini_itx/vt_310dp/

  ~3-8 users on the system, mostly used for email/MUD.

  i've got all the APM stuff in the bios that i could find
  set to not sleep or Disabled wrt HDD/CPU shutdowns for
  powersaving.

$ sysctl hw.setperf
sysctl: hw.setperf: value is not available

  has dropped down to ddb three times (about 1x/day) since
  installing this m/b.  usually it's sometime during evening,
  but this morning it happened during the morning hours when
  most ppl are on the box with activity.

  if it helps(?), when the bios shows its summary screen
  before the obsd bootloader, it says:

===
CPU Type  : VIA C3  
CPU ID/ucode ID   : 069A
CPU Clock : 1.0A GHz
===

  i believe the 'panic' is at:

---
 kernel: page fault trap, code=0 
 Stopped at  npxdna_xmm+0x71:   movl    0x12c(%ebx),%eax
---

  i forgot 'show panic' and 'show registers' these three times.

  i tried 'boot dump' but that fails with:

===
 ddb{0} boot dump 
 syncing disks... panic: TLB IPI rendezvous failed (mask 2) 
 Stopped at  Debugger+0x4:   leave 
===

  ( /var and /home are the raid(4) partitions )

  here are three different instances of ddb.  outside of
  'ps' and 'trace' i am somewhat at a loss as to what i should
  be trying:

=[1]===
ddb{0} ps
   PID   PPID   PGRP    UID  S      FLAGS  WAIT       COMMAND
 26144  11622  26144   1020  3  0x2004086  ttyin      bash
 11622   2133   2133   1020  7      0x104             sshd
  2133  26799   2133      0  3  0x2004184  netio      sshd
 26486  22928  26486   1008  3  0x280508e  poll       mutt
 16973    525    525     67  3  0x2000184  select     httpd
 22928  20772  22928   1008  3  0x2004086  pause      ksh
 20772  14451  14451   1008  3  0x2000184  select     sshd
 14451  26799  14451      0  3  0x2004184  netio      sshd
 28418   3192  28418    515  3  0x2004084  piperd     unlinkd
  5372      1   5372      0  3  0x2004086  ttyin      getty
  2515      1   2515      0  3  0x2004086  ttyin      getty
 27726      1  27726      0  3  0x2004086  ttyin      getty
  8820      1   8820      0  3  0x2040184  select     sendmail
 14397      1  14397      0  3      0x284  select     cron
  2297      1   5573   1042  2  0x2400586             icecast
  1668      1   1668      0  3      0x284  poll       systrace
 13668  27525  27525      0  3      0x285  lockf      saslauthd
  8217  27525  27525      0  3      0x285  netcon     saslauthd
 16227  27525  27525      0  3      0x285  lockf      saslauthd
 25578  27525  27525      0  3      0x285  lockf      saslauthd
 27525      1  27525      0  3      0x285  lockf      saslauthd
  3192  25974  25974    515  3  0x2004184  poll       squid
 25974      1  25974      0  3      0x284  wait       squid
 13797      1  11575   1007  3      0x286  piperd     moo
  9844      1  11575   1007  3      0x286  piperd     moo
 11575      1  11575   1007  3  0x2004086  select     moo
 18138    525    525     67  3  0x2000184  semwait    httpd
 20413    525    525     67  3  0x2000184  semwait    httpd
 14911    525    525     67  3  0x2000184  semwait    httpd
  8609    525    525     67  3  0x2000184  semwait    httpd
 18236    525    525     67  3  0x2000184  semwait    httpd
 26799      1  26799      0  3      0x284  select     sshd
 19880      1  19880      0  3  0x2000184  select     inetd
   525      1    525     67  3  0x2000184  select     httpd
 26156   1998   1998     75  3  0x2000184  poll       bgpd
 28553   1998   1998     75  3  0x2000184  poll       bgpd
  1998      1   1998      0  3      0x284  poll       bgpd
 12208  13324  13324     83  3  0x2000184  poll       ntpd
 13324      1  13324      0  3      0x284  poll       ntpd
  3200  21340  21340     68  3  0x2000184  select     isakmpd
 21340      1  21340      0  3      0x284  netio      isakmpd
 28710  19800  19800     70  3  0x2000184  select     named
 19800      1  19800      0  3  0x2000184  netio      named
 10433  31475  31475     74  3  0x2000184  bpf        pflogd
 31475      1  31475      0  3      0x284  netio      pflogd
 23555  15566  15566     73  3  0x2000184  poll       syslogd
 15566      1  15566      0  3      0x284  netio      syslogd
  6947      1   6947      0  3      0x284  mfsidl     mount_mfs
 28151      0      0      0  3  0x2100204  rfwcond    raid0
    15      0      0      0  3  0x2100204  crypto_wa  crypto
    14      0      0      0  3  0x2100204