Re: [zfs-discuss] ZFS panic when trying to import pool

2007-09-21 Thread Geoffroy Doucet
Ok, I found the problem with 0x06: one disk was missing. But now I have all my
disks and I still get a panic, this time with 0x05:
Sep 21 10:25:53 unknown panic[cpu0]/thread=ff0001e12c80:
Sep 21 10:25:53 unknown genunix: [ID 603766 kern.notice] assertion failed:
dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 == 0x0), file:
../../common/fs/zfs/space_map.c, line: 339
Sep 21 10:25:53 unknown unix: [ID 10 kern.notice]
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e124f0 
genunix:assfail3+b9 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12590 
zfs:space_map_load+2ef ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e125d0 
zfs:metaslab_activate+66 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12690 
zfs:metaslab_group_alloc+24e ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12760 
zfs:metaslab_alloc_dva+192 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12800 
zfs:metaslab_alloc+82 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12850 
zfs:zio_dva_allocate+68 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12870 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e128a0 
zfs:zio_checksum_generate+6e ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e128c0 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12930 
zfs:zio_write_compress+239 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12950 
zfs:zio_next_stage+b3 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129a0 
zfs:zio_wait_for_children+5d ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129c0 
zfs:zio_wait_children_ready+20 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e129e0 
zfs:zio_next_stage_async+bb ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12a00 
zfs:zio_nowait+11 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12a80 
zfs:dmu_objset_sync+196 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12ad0 
zfs:dsl_dataset_sync+5d ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12b40 
zfs:dsl_pool_sync+b5 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12bd0 
zfs:spa_sync+1c5 ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12c60 
zfs:txg_sync_thread+19a ()
Sep 21 10:25:53 unknown genunix: [ID 655072 kern.notice] ff0001e12c70 
unix:thread_start+8 ()

There are no SCSI errors on the disks, because those are virtual disks. Also, for
anyone who is interested, I wrote a little program to show the properties of the vdev:

http://www.projectvolcano.org/zfs/list_vdev.c

Here is a sample output:
bash-3.00# ./list_vdev -d /dev/dsk/c1t12d0s0 
Vdev properties for /dev/dsk/c1t12d0s0:
    version: 0x0003
    name: share02
    state: 0x0001
    txg: 0x003fd0e4
    pool_guid: 0x88f93fc54c215cfa
    top_guid: 0x65400f2e7db0c2a5
    guid: 0xfc3b9af2d3b6fd46
    vdev_tree:
        type: raidz
        id: 0x
        guid: 0x65400f2e7db0c2a5
        nparity: 0x0001
        metaslab_array: 0x000d
        metaslab_shift: 0x001e
        ashift: 0x0009
        asize: 0x00196e0c
        children: [
            [0]
                type: disk
                id: 0x
                guid: 0xfc3b9af2d3b6fd46
                path: /dev/dsk/c1t12d0s0
                devid: id1,[EMAIL PROTECTED]/a
                whole_disk: 0x0001
                DTL: 0x004e
            [1]
                type: disk
                id: 0x0001
                guid: 0x377cc1a2beb3c985
                path: /dev/dsk/c1t13d0s0
                devid: id1,[EMAIL PROTECTED]/a
                whole_disk: 0x0001
                DTL: 0x004d
            [2]
                type: disk
                id: 0x0002
                guid: 0xe97db62ad7fe325d
                path: /dev/dsk/c1t14d0s0
                devid: id1,[EMAIL PROTECTED]/a
                whole_disk: 0x0001
                DTL: 0x0091
        ]
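
For anyone who would rather roll their own, here is a minimal sketch of the same
idea (this is not the list_vdev.c linked above, just an illustration). It assumes
the on-disk label layout of this ZFS version, with the packed config nvlist
starting 16 KB into the first 256 KB vdev label on the slice and 112 KB reserved
for it, and it lets libnvpair do the printing; the file name and offsets are my
assumptions, so check them against the real headers before relying on them.

/*
 * dump_label.c: quick-and-dirty vdev label dumper (sketch only).
 * Assumes the config nvlist sits 16 KB into the first 256 KB label
 * and occupies at most 112 KB; adjust if your bits differ.
 * Build on Solaris with: cc -o dump_label dump_label.c -lnvpair
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <libnvpair.h>

#define LABEL_NVLIST_OFFSET (16 * 1024)   /* skip blank area + boot block header */
#define LABEL_NVLIST_SIZE   (112 * 1024)  /* packed nvlist region in label 0 */

int
main(int argc, char **argv)
{
        char *buf;
        nvlist_t *config = NULL;
        int fd;

        if (argc != 2) {
                (void) fprintf(stderr, "usage: %s /dev/dsk/cXtXdXs0\n", argv[0]);
                return (1);
        }
        if ((fd = open(argv[1], O_RDONLY)) == -1) {
                perror("open");
                return (1);
        }
        if ((buf = malloc(LABEL_NVLIST_SIZE)) == NULL) {
                perror("malloc");
                return (1);
        }

        /* Pull the packed nvlist out of label 0. */
        if (pread(fd, buf, LABEL_NVLIST_SIZE, LABEL_NVLIST_OFFSET) !=
            LABEL_NVLIST_SIZE) {
                perror("pread");
                return (1);
        }

        /* Unpack and pretty-print it (pool name, guids, vdev_tree, ...). */
        if (nvlist_unpack(buf, LABEL_NVLIST_SIZE, &config, 0) != 0) {
                (void) fprintf(stderr, "no valid label found on %s\n", argv[1]);
                return (1);
        }
        nvlist_print(stdout, config);

        nvlist_free(config);
        free(buf);
        (void) close(fd);
        return (0);
}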


So my question: is there a way to really know why I got EIO (0x05)? Is there a
way to find out in the debugger? How can I access it?
 
 


Re: [zfs-discuss] ZFS panic when trying to import pool

2007-09-18 Thread Geoffroy Doucet
Actually, here is the first panic message:
Sep 13 23:33:22 netra2 unix: [ID 603766 kern.notice] assertion failed: 
dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 == 0x0), file: 
../../common/fs/zfs/space_map.c, line: 307
Sep 13 23:33:22 netra2 unix: [ID 10 kern.notice]
Sep 13 23:33:22 netra2 genunix: [ID 723222 kern.notice] 02a103e6b000 
genunix:assfail3+94 (7b7706d0, 5, 7b770710, 0, 7b770718, 133)
Sep 13 23:33:22 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
2000 0133  0186f800
Sep 13 23:33:22 netra2   %l4-7:  0183d400 
011eb400 
Sep 13 23:33:22 netra2 genunix: [ID 723222 kern.notice] 02a103e6b0c0 
zfs:space_map_load+1a4 (30007cc2c38, 70450058, 1000, 30007cc2908, 38000, 1)
Sep 13 23:33:22 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
1a60 03000ce3b000  7b73ead0
Sep 13 23:33:22 netra2   %l4-7: 7b73e86c 7fff 
7fff 1000
Sep 13 23:33:22 netra2 genunix: [ID 723222 kern.notice] 02a103e6b190 
zfs:metaslab_activate+3c (30007cc2900, 8000, c000, 
e75efe6c, 30007cc2900, c000)
Sep 13 23:33:23 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
02a103e6b308 0003 0002 006dd004
Sep 13 23:33:23 netra2   %l4-7: 7045 030010834940 
0300080eba40 0300106c9748
Sep 13 23:33:23 netra2 genunix: [ID 723222 kern.notice] 02a103e6b240 
zfs:metaslab_group_alloc+1bc (3fff, 400, 8000, 
32dc18000, 30003387d88, )
Sep 13 23:33:23 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
 0300106c9750 0001 030007cc2900
Sep 13 23:33:23 netra2   %l4-7: 8000  
000196e0c000 4000
Sep 13 23:33:23 netra2 genunix: [ID 723222 kern.notice] 02a103e6b320 
zfs:metaslab_alloc_dva+114 (0, 32dc18000, 30003387d88, 400, 300080eba40, 3fd0f1)
Sep 13 23:33:23 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
0001  0003 030011c068e0
Sep 13 23:33:23 netra2   %l4-7:  0300106c9748 
 0300106c9748
Sep 13 23:33:23 netra2 genunix: [ID 723222 kern.notice] 02a103e6b3f0 
zfs:metaslab_alloc+2c (30010834940, 200, 30003387d88, 3, 3fd0f1, 0)
Sep 13 23:33:23 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
030003387de8 0300139e1800 704506a0 
Sep 13 23:33:23 netra2   %l4-7: 030013fca7be  
030010834940 0001
Sep 13 23:33:24 netra2 genunix: [ID 723222 kern.notice] 02a103e6b4a0 
zfs:zio_dva_allocate+4c (30010eafcc0, 7b7515a8, 30003387d88, 70450508, 
70450400, 20001)
Sep 13 23:33:24 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
70450400 07030001 07030001 
Sep 13 23:33:24 netra2   %l4-7:  018a5c00 
0003 0007
Sep 13 23:33:24 netra2 genunix: [ID 723222 kern.notice] 02a103e6b550 
zfs:zio_write_compress+1ec (30010eafcc0, 23e20b, 23e000, 10001, 3, 30003387d88)
Sep 13 23:33:24 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
  0001 0200
Sep 13 23:33:24 netra2   %l4-7:  0001 
fc00 0001
Sep 13 23:33:24 netra2 genunix: [ID 723222 kern.notice] 02a103e6b620 
zfs:zio_wait+c (30010eafcc0, 30010834940, 7, 30010eaff20, 3, 3fd0f1)
Sep 13 23:33:24 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
 7b7297d0 030003387d40 03000be9edf8
Sep 13 23:33:24 netra2   %l4-7: 02a103e6b7c0 0002 
0002 03000a799920
Sep 13 23:33:24 netra2 genunix: [ID 723222 kern.notice] 02a103e6b6d0 
zfs:dmu_objset_sync+12c (30003387d40, 3000a762c80, 1, 1, 3000be9edf8, 0)
Sep 13 23:33:24 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
030003387d88  0002 003be93a
Sep 13 23:33:24 netra2   %l4-7: 030003387e40 0020 
030003387e20 030003387ea0
Sep 13 23:33:25 netra2 genunix: [ID 723222 kern.notice] 02a103e6b7e0 
zfs:dsl_dataset_sync+c (30007609480, 3000a762c80, 30007609510, 30005c475b8, 
30005c475b8, 30007609480)
Sep 13 23:33:25 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
0001 0007 030005c47638 0001
Sep 13 23:33:25 netra2   %l4-7: 030007609508  
030005c4caa8 
Sep 13 23:33:25 netra2 genunix: [ID 723222 kern.notice] 02a103e6b890 
zfs:dsl_pool_sync+64 (30005c47500, 3fd0f1, 30007609480, 3000f904380, 
300032bb7c0, 300032bb7e8)
Sep 13 23:33:25 netra2 genunix: [ID 179002 kern.notice]   %l0-3: 
 030010834d00 03000a762c80 030005c47698
Sep 13 23:33:25 netra2   %l4-7: 030005c47668 030005c47638 
03000

Re: [zfs-discuss] ZFS panic when trying to import pool

2007-09-18 Thread Jeff Bonwick
Basically, it is complaining that there aren't enough disks to read
the pool metadata.  This would suggest that in your 3-disk RAID-Z
config, either two disks are missing, or one disk is missing *and*
another disk is damaged -- due to prior failed writes, perhaps.

(I know there's at least one disk missing because the failure mode
is errno 6, which is ENXIO.)
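
(For reference, on Solaris errno 5 is EIO and errno 6 is ENXIO; a throwaway
check like the one below, straight from <errno.h>, prints both, which helps when
reading the "(0x5 == 0x0)" / "(0x6 == 0x0)" assertion text in these panics.)

#include <stdio.h>
#include <string.h>
#include <errno.h>

/* Print the two errno values seen in this thread. */
int
main(void)
{
        (void) printf("EIO   = %d (%s)\n", EIO, strerror(EIO));     /* 5 */
        (void) printf("ENXIO = %d (%s)\n", ENXIO, strerror(ENXIO)); /* 6 */
        return (0);
}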

Can you tell from /var/adm/messages or fmdump whether there were write
errors to multiple disks, or to just one?

Jeff

On Tue, Sep 18, 2007 at 05:26:16PM -0700, Geoffroy Doucet wrote:
> I have a raid-z ZFS pool with 3 disks. The disks were starting to have read
> and write errors.
> 
> The disks got so bad that I started to get trans_err. The server locked up
> and was reset. Now, when trying to import the pool, the system panics.
> 
> I installed the latest Recommended patch cluster on my Solaris U3 and also
> installed the latest kernel patch (120011-14).
> 
> But it still panics when trying to do zpool import.
> 
> I also dd'd the disks and tested them on another server with OpenSolaris b72,
> and it is still the same thing. Here is the panic backtrace:
> 
> Stack Backtrace
> -
> vpanic()
> assfail3+0xb9(f7dde5f0, 6, f7dde840, 0, f7dde820, 153)
> space_map_load+0x2ef(ff008f1290b8, c00fc5b0, 1, ff008f128d88,
> ff008dd58ab0)
> metaslab_activate+0x66(ff008f128d80, 8000)
> metaslab_group_alloc+0x24e(ff008f46bcc0, 400, 3fd0f1, 32dc18000,
> ff008fbeaa80, 0)
> metaslab_alloc_dva+0x192(ff008f2d1a80, ff008f235730, 200,
> ff008fbeaa80, 0, 0)
> metaslab_alloc+0x82(ff008f2d1a80, ff008f235730, 200, 
> ff008fbeaa80, 2
> , 3fd0f1)
> zio_dva_allocate+0x68(ff008f722790)
> zio_next_stage+0xb3(ff008f722790)
> zio_checksum_generate+0x6e(ff008f722790)
> zio_next_stage+0xb3(ff008f722790)
> zio_write_compress+0x239(ff008f722790)
> zio_next_stage+0xb3(ff008f722790)
> zio_wait_for_children+0x5d(ff008f722790, 1, ff008f7229e0)
> zio_wait_children_ready+0x20(ff008f722790)
> zio_next_stage_async+0xbb(ff008f722790)
> zio_nowait+0x11(ff008f722790)
> dmu_objset_sync+0x196(ff008e4e5000, ff008f722a10, ff008f260a80)
> dsl_dataset_sync+0x5d(ff008df47e00, ff008f722a10, ff008f260a80)
> dsl_pool_sync+0xb5(ff00882fb800, 3fd0f1)
> spa_sync+0x1c5(ff008f2d1a80, 3fd0f1)
> txg_sync_thread+0x19a(ff00882fb800)
> thread_start+8()
> 
> 
> 
> And here is the panic message buf:
> panic[cpu0]/thread=ff0001ba2c80:
> assertion failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0
> (0x6 == 0x0), file: ../../common/fs/zfs/space_map.c, line: 339
> 
> 
> ff0001ba24f0 genunix:assfail3+b9 ()
> ff0001ba2590 zfs:space_map_load+2ef ()
> ff0001ba25d0 zfs:metaslab_activate+66 ()
> ff0001ba2690 zfs:metaslab_group_alloc+24e ()
> ff0001ba2760 zfs:metaslab_alloc_dva+192 ()
> ff0001ba2800 zfs:metaslab_alloc+82 ()
> ff0001ba2850 zfs:zio_dva_allocate+68 ()
> ff0001ba2870 zfs:zio_next_stage+b3 ()
> ff0001ba28a0 zfs:zio_checksum_generate+6e ()
> ff0001ba28c0 zfs:zio_next_stage+b3 ()
> ff0001ba2930 zfs:zio_write_compress+239 ()
> ff0001ba2950 zfs:zio_next_stage+b3 ()
> ff0001ba29a0 zfs:zio_wait_for_children+5d ()
> ff0001ba29c0 zfs:zio_wait_children_ready+20 ()
> ff0001ba29e0 zfs:zio_next_stage_async+bb ()
> ff0001ba2a00 zfs:zio_nowait+11 ()
> ff0001ba2a80 zfs:dmu_objset_sync+196 ()
> ff0001ba2ad0 zfs:dsl_dataset_sync+5d ()
> ff0001ba2b40 zfs:dsl_pool_sync+b5 ()
> ff0001ba2bd0 zfs:spa_sync+1c5 ()
> ff0001ba2c60 zfs:txg_sync_thread+19a ()
> ff0001ba2c70 unix:thread_start+8 ()
> 
> syncing file systems...
> 
> 
> Is there a way to restore the data? Is there a way to "fsck" the zpool, and 
> correct the error manually?
>  
>  


[zfs-discuss] ZFS panic when trying to import pool

2007-09-18 Thread Geoffroy Doucet
I have a raid-z ZFS pool with 3 disks. The disks were starting to have read
and write errors.

The disks got so bad that I started to get trans_err. The server locked up and
was reset. Now, when trying to import the pool, the system panics.

I installed the latest Recommended patch cluster on my Solaris U3 and also
installed the latest kernel patch (120011-14).

But it still panics when trying to do zpool import.

I also dd'd the disks and tested them on another server with OpenSolaris b72,
and it is still the same thing. Here is the panic backtrace:

Stack Backtrace
-
vpanic()
assfail3+0xb9(f7dde5f0, 6, f7dde840, 0, f7dde820, 153)
space_map_load+0x2ef(ff008f1290b8, c00fc5b0, 1, ff008f128d88,
ff008dd58ab0)
metaslab_activate+0x66(ff008f128d80, 8000)
metaslab_group_alloc+0x24e(ff008f46bcc0, 400, 3fd0f1, 32dc18000,
ff008fbeaa80, 0)
metaslab_alloc_dva+0x192(ff008f2d1a80, ff008f235730, 200,
ff008fbeaa80, 0, 0)
metaslab_alloc+0x82(ff008f2d1a80, ff008f235730, 200, ff008fbeaa80, 2
, 3fd0f1)
zio_dva_allocate+0x68(ff008f722790)
zio_next_stage+0xb3(ff008f722790)
zio_checksum_generate+0x6e(ff008f722790)
zio_next_stage+0xb3(ff008f722790)
zio_write_compress+0x239(ff008f722790)
zio_next_stage+0xb3(ff008f722790)
zio_wait_for_children+0x5d(ff008f722790, 1, ff008f7229e0)
zio_wait_children_ready+0x20(ff008f722790)
zio_next_stage_async+0xbb(ff008f722790)
zio_nowait+0x11(ff008f722790)
dmu_objset_sync+0x196(ff008e4e5000, ff008f722a10, ff008f260a80)
dsl_dataset_sync+0x5d(ff008df47e00, ff008f722a10, ff008f260a80)
dsl_pool_sync+0xb5(ff00882fb800, 3fd0f1)
spa_sync+0x1c5(ff008f2d1a80, 3fd0f1)
txg_sync_thread+0x19a(ff00882fb800)
thread_start+8()



And here is the panic message buf:
panic[cpu0]/thread=ff0001ba2c80:
assertion failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0
(0x6 == 0x0), file: ../../common/fs/zfs/space_map.c, line: 339


ff0001ba24f0 genunix:assfail3+b9 ()
ff0001ba2590 zfs:space_map_load+2ef ()
ff0001ba25d0 zfs:metaslab_activate+66 ()
ff0001ba2690 zfs:metaslab_group_alloc+24e ()
ff0001ba2760 zfs:metaslab_alloc_dva+192 ()
ff0001ba2800 zfs:metaslab_alloc+82 ()
ff0001ba2850 zfs:zio_dva_allocate+68 ()
ff0001ba2870 zfs:zio_next_stage+b3 ()
ff0001ba28a0 zfs:zio_checksum_generate+6e ()
ff0001ba28c0 zfs:zio_next_stage+b3 ()
ff0001ba2930 zfs:zio_write_compress+239 ()
ff0001ba2950 zfs:zio_next_stage+b3 ()
ff0001ba29a0 zfs:zio_wait_for_children+5d ()
ff0001ba29c0 zfs:zio_wait_children_ready+20 ()
ff0001ba29e0 zfs:zio_next_stage_async+bb ()
ff0001ba2a00 zfs:zio_nowait+11 ()
ff0001ba2a80 zfs:dmu_objset_sync+196 ()
ff0001ba2ad0 zfs:dsl_dataset_sync+5d ()
ff0001ba2b40 zfs:dsl_pool_sync+b5 ()
ff0001ba2bd0 zfs:spa_sync+1c5 ()
ff0001ba2c60 zfs:txg_sync_thread+19a ()
ff0001ba2c70 unix:thread_start+8 ()

syncing file systems...


Is there a way to restore the data? Is there a way to "fsck" the zpool, and 
correct the error manually?
 
 