David,

Thanks a lot for your support. I was able to get both of my zpools up again 
by identifying which ZFS file system was causing the problems.
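
For the archive, the fix boiled down to roughly this (a sketch; the file 
system name is a placeholder for the one the crash dump pointed at):

# zpool import -o ro obelixData
# zfs set readonly=on obelixData/<offending-fs>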

And... today I also learned at least a bit about zpool troubleshooting.

Thanks
Stephan



--
Sent from my iPhone (iOS 4).

> In this function, the second argument is a pointer to the osname (the name 
> of the file system being mounted). You can dump out that string to see 
> which one it is.
> ffffff0023b7db50 zfs_domount+0x17c(ffffff0588aaf698, ffffff0580cb3d80)
> 
> # mdb unix.0 vmcore.0
> 
> > ffffff0580cb3d80/S
> 
> That should print out the offending FS. You could then try to import the 
> pool read-only (-o ro) and set the file system's readonly property (zfs set 
> readonly=on <fs>).
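> 
> Put together, a minimal sketch of the whole sequence ($q just exits mdb; 
> the pool and file system names below are placeholders):
> 
> # mdb unix.0 vmcore.0
> > ffffff0580cb3d80/S
> > $q
> # zpool import -o ro <pool>
> # zfs set readonly=on <fs>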
> 
> Dave
> 
> On 11/11/10 09:37, Stephan Budach wrote:
>> David,
>> 
>> Thanks so much (and of course to all other helpful souls here as well) for 
>> providing such great guidance!
>> Here we go:
>> 
>> On 11.11.10 16:17, David Blasingame Oracle wrote:
>>> The vmdump.0 file is a compressed crash dump. You will need to convert it 
>>> to a format that can be read.
>>> 
>>> #  savecore -f ./vmdump.0  ./
>>> 
>>> This will create a couple of files; the ones you will need next are 
>>> unix.0 and vmcore.0. Use mdb to print out the stack.
>>> 
>>> #  mdb unix.0 vmcore.0
>>> 
>>> Run the following to print the stack. It will at least tell you which 
>>> function the system is panicking in; you could then do a SunSolve or 
>>> Google search on it.
>>> 
>>> $C
>> > $C
>> ffffff0023b7d450 zap_leaf_lookup_closest+0x40(ffffff0588c61750, 0, 0, ffffff0023b7d470)
>> ffffff0023b7d4e0 fzap_cursor_retrieve+0xc9(ffffff0588c61750, ffffff0023b7d5c0, ffffff0023b7d600)
>> ffffff0023b7d5a0 zap_cursor_retrieve+0x19a(ffffff0023b7d5c0, ffffff0023b7d600)
>> ffffff0023b7d780 zfs_purgedir+0x4c(ffffff0581079260)
>> ffffff0023b7d7d0 zfs_rmnode+0x52(ffffff0581079260)
>> ffffff0023b7d810 zfs_zinactive+0xb5(ffffff0581079260)
>> ffffff0023b7d860 zfs_inactive+0xee(ffffff058118ae00, ffffff056ac3c108, 0)
>> ffffff0023b7d8b0 fop_inactive+0xaf(ffffff058118ae00, ffffff056ac3c108, 0)
>> ffffff0023b7d8d0 vn_rele+0x5f(ffffff058118ae00)
>> ffffff0023b7dac0 zfs_unlinked_drain+0xaf(ffffff05874c8b00)
>> ffffff0023b7daf0 zfsvfs_setup+0xfb(ffffff05874c8b00, 1)
>> ffffff0023b7db50 zfs_domount+0x17c(ffffff0588aaf698, ffffff0580cb3d80)
>> ffffff0023b7dc70 zfs_mount+0x1e4(ffffff0588aaf698, ffffff0588a9f100, ffffff0023b7de20, ffffff056ac3c108)
>> ffffff0023b7dca0 fsop_mount+0x21(ffffff0588aaf698, ffffff0588a9f100, ffffff0023b7de20, ffffff056ac3c108)
>> ffffff0023b7de00 domount+0xae3(0, ffffff0023b7de20, ffffff0588a9f100, ffffff056ac3c108, ffffff0023b7de18)
>> ffffff0023b7de80 mount+0x121(ffffff0580c7e548, ffffff0023b7de98)
>> ffffff0023b7dec0 syscall_ap+0x8c()
>> ffffff0023b7df10 _sys_sysenter_post_swapgs+0x149()
>> 
>> 
>>> 
>>> 
>>> And gather the zio_state data:
>>> 
>>> ::zio_state
>> > ::zio_state
>> ADDRESS                                  TYPE  STAGE            WAITER
>> ffffff0584be2348                         NULL  OPEN             -
>> ffffff0570ebcc88                         NULL  OPEN             -
>> 
>> 
>>> 
>>> And check the msgbuf to see if there are any hardware problems.
>>> 
>>> ::msgbuf
>> 
>> > ::msgbuf
>> MESSAGE
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> WARNING: pool 'obelixData' could not be loaded as it was last accessed by another system (host: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> WARNING: pool 'obelixData' could not be loaded as it was last accessed by another system (host: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> WARNING: pool 'obelixData' could not be loaded as it was last accessed by another system (host: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
>> pseudo-device: devinfo0
>> devinfo0 is /pseudo/devi...@0
>> pcplusmp: asy (asy) instance 0 irq 0x4 vector 0xb0 ioapic 0x0 intin 0x4 is bound to cpu 2
>> ISA-device: asy0
>> asy0 is /p...@0,0/i...@1f/a...@1,3f8
>> pcplusmp: asy (asy) instance 1 irq 0x3 vector 0xb1 ioapic 0x0 intin 0x3 is bound to cpu 3
>> ISA-device: asy1
>> asy1 is /p...@0,0/i...@1f/a...@1,2f8
>> pseudo-device: ucode0
>> ucode0 is /pseudo/uc...@0
>> sgen0 at ata0: target 0 lun 0
>> sgen0 is /p...@0,0/pci-...@1f,2/i...@0/s...@0,0
>> sgen2 at mega_sas0: target 0 lun 1
>> sgen2 is /p...@0,0/pci8086,2...@1c/pci1028,1...@0/s...@0,1
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 20h  0h  0h  0h  0h  0h
>> sd4 at fp0: unit-address w210000d02305ff42,0: 50925
>> sd4 is /p...@0,0/pci8086,3...@7/pci1077,1...@0/f...@0,0/d...@w210000d02305ff42,0
>> pseudo-device: llc10
>> llc10 is /pseudo/l...@0
>> pseudo-device: lofi0
>> lofi0 is /pseudo/l...@0
>> pseudo-device: ramdisk1024
>> ramdisk1024 is /pseudo/ramd...@1024
>> pseudo-device: dcpc0
>> dcpc0 is /pseudo/d...@0
>> pseudo-device: fasttrap0
>> fasttrap0 is /pseudo/fastt...@0
>> pseudo-device: fbt0
>> fbt0 is /pseudo/f...@0
>> pseudo-device: profile0
>> profile0 is /pseudo/prof...@0
>> pseudo-device: lockstat0
>> lockstat0 is /pseudo/locks...@0
>> pseudo-device: sdt0
>> sdt0 is /pseudo/s...@0
>> pseudo-device: systrace0
>> systrace0 is /pseudo/systr...@0
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 20h  0h  0h  0h  0h  0h
>> sd132 at fp0: unit-address w210000d023038fa8,0: 50aef
>> sd132 is /p...@0,0/pci8086,3...@7/pci1077,1...@0/f...@0,0/d...@w210000d023038fa8,0
>> QEL qlc(1,0): ql_get_device, failed, tq=NULL
>> QEL qlc(1,0): ql_get_device, failed, tq=NULL
>> QEL qlc(1,0): ql_get_device, failed, tq=NULL
>> QEL qlc(1,0): ql_get_device, failed, tq=NULL
>> NOTICE: fcsm(2): attached to path /p...@0,0/pci8086,3...@7/pci1077,1...@0,1/f...@0,0
>> pseudo-device: fcsm0
>> fcsm0 is /pseudo/f...@0
>> NOTICE: fcsm(0): attached to path /p...@0,0/pci8086,3...@7/pci1077,1...@0/f...@0,0
>> pseudo-device: fssnap0
>> fssnap0 is /pseudo/fss...@0
>> pseudo-device: bpf0
>> bpf0 is /pseudo/b...@0
>> pseudo-device: lx_systrace0
>> lx_systrace0 is /pseudo/lx_systr...@0
>> pseudo-device: pm0
>> pm0 is /pseudo/p...@0
>> pseudo-device: nsmb0
>> nsmb0 is /pseudo/n...@0
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50925h, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> QEL qlc(0,0): ql_status_error, check condition sense data, d_id=50aefh, lun=0h
>> 70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
>> imported version 22 pool obelixData using 22
>> 
>> panic[cpu4]/thread=ffffff0572c3fc60:
>> BAD TRAP: type=e (#pf Page fault) rp=ffffff0023b7d310 addr=20 occurred in module "zfs" due to a NULL pointer dereference
>> 
>> 
>> zpool:
>> #pf Page fault
>> Bad kernel fault at addr=0x20
>> pid=601, pc=0xfffffffff79a7588, sp=0xffffff0023b7d408, eflags=0x10213
>> cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
>> cr2: 20
>> cr3: 22656e000
>> cr8: c
>> 
>>        rdi: ffffff0588c61750 rsi:                0 rdx:                0
>>        rcx:         88e52e63  r8:         771ad198  r9: ffffff0023b7d470
>>        rax:                7 rbx:                0 rbp: ffffff0023b7d450
>>        r10:                7 r11:                0 r12: ffffff0588c61750
>>        r13: ffffff0588c61750 r14: ffffff0023b7d5c0 r15: ffffff0023b7d600
>>        fsb:                0 gsb: ffffff05720f5000  ds:               4b
>>         es:               4b  fs:                0  gs:              1c3
>>        trp:                e err:                0 rip: fffffffff79a7588
>>         cs:               30 rfl:            10213 rsp: ffffff0023b7d408
>>         ss:               38
>> 
>> ffffff0023b7d1f0 unix:die+dd ()
>> ffffff0023b7d300 unix:trap+177b ()
>> ffffff0023b7d310 unix:cmntrap+e6 ()
>> ffffff0023b7d450 zfs:zap_leaf_lookup_closest+40 ()
>> ffffff0023b7d4e0 zfs:fzap_cursor_retrieve+c9 ()
>> ffffff0023b7d5a0 zfs:zap_cursor_retrieve+19a ()
>> ffffff0023b7d780 zfs:zfs_purgedir+4c ()
>> ffffff0023b7d7d0 zfs:zfs_rmnode+52 ()
>> ffffff0023b7d810 zfs:zfs_zinactive+b5 ()
>> ffffff0023b7d860 zfs:zfs_inactive+ee ()
>> ffffff0023b7d8b0 genunix:fop_inactive+af ()
>> ffffff0023b7d8d0 genunix:vn_rele+5f ()
>> ffffff0023b7dac0 zfs:zfs_unlinked_drain+af ()
>> ffffff0023b7daf0 zfs:zfsvfs_setup+fb ()
>> ffffff0023b7db50 zfs:zfs_domount+17c ()
>> ffffff0023b7dc70 zfs:zfs_mount+1e4 ()
>> ffffff0023b7dca0 genunix:fsop_mount+21 ()
>> ffffff0023b7de00 genunix:domount+ae3 ()
>> ffffff0023b7de80 genunix:mount+121 ()
>> ffffff0023b7dec0 genunix:syscall_ap+8c ()
>> ffffff0023b7df10 unix:brand_sys_sysenter+1e0 ()
>> 
>> syncing file systems...
>> 1
>> done
>> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
>> 
>> Thank you
>> 