[zfs-discuss] pool causes kernel panic, recursive mutex enter, 134

2010-03-15 Thread Mark
Hi,
I've been using OpenSolaris for about 2 years with a mirrored rpool and a data 
pool made of 3 x 2 (mirrored) drives.
The data pool drives are connected to SIL PCI Express cards.

Yesterday I updated from 130 to 134. Everything seemed to be fine, and I also 
replaced one pair of mirrored drives with larger disks.
Still no problems: I ran some tests, rebooted a few times, and checked the logs. 
Nothing unusual.

Today I started copying a larger amount of data. While copying, at about 40 GB, 
OpenSolaris gave me the first kernel panic I have ever seen on this system. The 
system rebooted and, while mounting the data pool, you may have guessed it, 
panicked again.

What I have done so far to try to get it up again:

Booted without the data drives, then tried to mount the pool manually, also with 
-F -n (non-destructive, as the manual says).
Tried to mount normally with different combinations of mirrors taken offline, so 
that there is only a single drive for each slice. Same panic.
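
As a sketch of the dry run mentioned above, assuming "mount with -F -n" refers to `zpool import -F -n`: the snippet below only assembles and prints that command line and does not run it. The helper name `recovery_dry_run_cmd` and the pool name `data` are assumptions for illustration.

```python
import shlex


def recovery_dry_run_cmd(pool):
    """Return the argument list for a recovery-mode import dry run."""
    # -F requests recovery mode for a pool that cannot be imported normally.
    # -n, used together with -F, only reports whether recovery would work;
    # it does not import or modify the pool.
    return ["zpool", "import", "-F", "-n", pool]


if __name__ == "__main__":
    # "data" is an assumed pool name; nothing is executed here.
    print(shlex.join(recovery_dry_run_cmd("data")))
```

Printing the command instead of executing it keeps the sketch harmless on a machine where importing the pool triggers the panic.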

I still have the drives that I replaced with the newer ones, but I believe 
they are useless since the structure has changed?

The kernel panic I get is cpu(0) recursive mutex enter, along with several lines 
of SIL driver errors. 
I also tried booting the previous BE (130, from before the update), where the 
pools never had an error. Same panic.

ANY ideas for rescuing the volume are welcome. If I missed some important 
information, please tell me.
Regards, Mark
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] pool causes kernel panic, recursive mutex enter, 134

2010-03-15 Thread Mark
Some screenshots that may help:

  pool: tank
    id: 5649976080828524375
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        data         ONLINE
          mirror-0   ONLINE
            c27t2d0  ONLINE
            c27t0d0  ONLINE
          mirror-1   ONLINE
            c27t3d0  ONLINE
            c29t1d0  ONLINE
          mirror-2   ONLINE
            c27t1d0  ONLINE
            c29t0d0  ONLINE
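
The panic messages that follow are easier to read once the stack frames are pulled out on their own. A minimal sketch, assuming the log is available as plain text and that each frame has the form `module:function+offset (args)` after the `kern.notice]` tag; the names `FRAME_RE` and `extract_frames` are hypothetical, and the sample lines are copied from the log in this message.

```python
import re

# Frame pointer, then module:function+hex_offset, then the argument list.
FRAME_RE = re.compile(
    r"kern\.notice\]\s+(?P<fp>[0-9a-f]+)\s+"
    r"(?P<module>\w+):(?P<func>\w+)\+(?P<offset>[0-9a-f]+)\s+\("
)


def extract_frames(log_text):
    """Return (module, function, offset) tuples in the order they appear."""
    # Collapse all whitespace first so frames wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        (m["module"], m["func"], int(m["offset"], 16))
        for m in FRAME_RE.finditer(flat)
    ]


SAMPLE = """\
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3658
unix:cmntrap+7c (d76d01b0, 0, cb0160)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36a4
zfs:vdev_is_dead+6 (0, 0, cb36a7, e31ad)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36c4
zfs:vdev_readable+e (0, 1, 0, fe96c13d)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3704
zfs:vdev_mirror_child_select+55 (dedc6560, 1, 0, f92)
"""

frames = extract_frames(SAMPLE)
zfs_frames = [f for f in frames if f[0] == "zfs"]

for module, func, offset in frames:
    print(f"{module}:{func}+0x{offset:x}")
```

In the frames shown, `vdev_mirror_child_select` calls `vdev_readable`, which calls `vdev_is_dead` with 0 as its first argument, which fits the "NULL pointer dereference" at addr=34 reported in the BAD TRAP line.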



Mar 15 21:42:50 solaris1.local ^Mpanic[cpu0]/thread=d6792f00:
Mar 15 21:42:50 solaris1.local genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=d76d3658 addr=34 occurred in module zfs due to a NULL pointer dereference
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local unix: [ID 839527 kern.notice] syseventd:
Mar 15 21:42:50 solaris1.local unix: [ID 753105 kern.notice] #pf Page fault
Mar 15 21:42:50 solaris1.local unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x34
Mar 15 21:42:50 solaris1.local unix: [ID 243837 kern.notice] pid=93, pc=0xf924b97e, sp=0xd76d36c4, eflags=0x10282
Mar 15 21:42:50 solaris1.local unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
Mar 15 21:42:50 solaris1.local unix: [ID 624947 kern.notice] cr2: 34
Mar 15 21:42:50 solaris1.local unix: [ID 625075 kern.notice] cr3: 2ead020
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice]  gs: d76d01b0  fs:        0  es:   cb0160  ds: e31a0160
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] edi:        0 esi: de581350 ebp: d76d36a4 esp: d76d3690
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] ebx:        0 edx:        b ecx:        0 eax:        0
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] trp:        e err:        0 eip: f924b97e  cs:      158
Mar 15 21:42:50 solaris1.local unix: [ID 717149 kern.notice] efl:    10282 usp: d76d36c4  ss: f924b9c6
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3594 unix:die+93 (e, d76d3658, 34, 0)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3644 unix:trap+1449 (d76d3658, 34, 0)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3658 unix:cmntrap+7c (d76d01b0, 0, cb0160)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36a4 zfs:vdev_is_dead+6 (0, 0, cb36a7, e31ad)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36c4 zfs:vdev_readable+e (0, 1, 0, fe96c13d)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3704 zfs:vdev_mirror_child_select+55 (dedc6560, 1, 0, f92)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice]