Re: [zfs-discuss] pool metadata has duplicate children
On 2013-Jan-08 21:30:57 -0800, John Giannandrea j...@meer.net wrote:
> Notice that in the absence of the faulted da2 the OS has assigned da3 to
> da2 etc.  I suspect this was part of the original problem in creating a
> label with two da2s

The primary vdev identifier is the guid.  The path is of secondary
importance (ZFS should automatically recover from juggled disks without
an issue - and has for me).

Try running "zdb -l" on each of your pool disks and verify that each has
4 identical labels, and that the 5 guids (one on each disk) are unique
and match the vdev_tree you got from zdb.  My suspicion is that you've
somehow lost the disk with the guid 3419704811362497180.

> twa0: 3ware 9000 series Storage Controller
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9500S-8, 8 ports,
>       Firmware FE9X 2.08.00.006
> da0 at twa0 bus 0 scbus0 target 0 lun 0
> da1 at twa0 bus 0 scbus0 target 1 lun 0
> da2 at twa0 bus 0 scbus0 target 2 lun 0
> da3 at twa0 bus 0 scbus0 target 3 lun 0
> da4 at twa0 bus 0 scbus0 target 4 lun 0

Are these all JBOD devices?

--
Peter Jeremy

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
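The label check described above can be sketched as a short script. Against live disks you would feed it the output of `zdb -l /dev/daN`; here it parses a fabricated excerpt (the guid is the one reported for da1 in this thread), so the logic can be shown standalone. This is a sketch of the checking idea, not the actual `zdb -l` output format on every platform.

```shell
# Hedged sketch: each pool member should carry 4 identical labels, all
# bearing the same vdev guid.  The heredoc is a fabricated stand-in for
# "zdb -l /dev/da1" output, not real data from the poster's system.
label=$(cat <<'EOF'
version: 28
guid: 13697627234083630557
version: 28
guid: 13697627234083630557
version: 28
guid: 13697627234083630557
version: 28
guid: 13697627234083630557
EOF
)
# One "version:" line per label copy; expect 4.
nlabels=$(printf '%s\n' "$label" | grep -c '^version:')
# All four labels should agree on the guid; expect exactly 1 unique value.
nguids=$(printf '%s\n' "$label" | awk '/^guid:/ {print $2}' | sort -u | grep -c .)
echo "labels=$nlabels unique_guids=$nguids"   # prints labels=4 unique_guids=1
```

On a real pool you would loop this over each member (da1..da5) and also confirm the five guids differ from disk to disk and match the vdev_tree.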
[zfs-discuss] pool metadata has duplicate children
I seem to have managed to end up with a pool that is confused about its
children disks.  The pool is faulted with corrupt metadata:

  pool: d
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
   see: http://illumos.org/msg/ZFS-8000-72
  scan: none requested
config:

	NAME                     STATE     READ WRITE CKSUM
	d                        FAULTED       0     0     1
	  raidz1-0               FAULTED       0     0     6
	    da1                  ONLINE        0     0     0
	    3419704811362497180  OFFLINE       0     0     0  was /dev/da2
	    da3                  ONLINE        0     0     0
	    da4                  ONLINE        0     0     0
	    da5                  ONLINE        0     0     0

But if I look at the labels on all the online disks I see this:

# zdb -ul /dev/da1 | egrep '(children|path)'
        children[0]:
                path: '/dev/da1'
        children[1]:
                path: '/dev/da2'
        children[2]:
                path: '/dev/da2'
        children[3]:
                path: '/dev/da3'
        children[4]:
                path: '/dev/da4'
...

But the offline disk (da2) shows the older correct label:

        children[0]:
                path: '/dev/da1'
        children[1]:
                path: '/dev/da2'
        children[2]:
                path: '/dev/da3'
        children[3]:
                path: '/dev/da4'
        children[4]:
                path: '/dev/da5'

zpool import -F doesn't help, because none of the labels on the unfaulted
disks seem to be the right label.  And unless I can import the pool, I
can't replace the bad drive.  Also, zpool seems to really not want to
import a raidz1 pool with one faulted drive, even though that should be
readable.  I have read about the undocumented -V option but don't know if
that would help.

I got into this state when I noticed the pool was DEGRADED and was trying
to replace the bad disk.  I am debugging it under FreeBSD 9.1.

Suggestions of things to try are welcome; I'm more interested in learning
what went wrong than in restoring the pool.  I don't think I should have
been able to go from one offline drive to an unrecoverable pool this
easily.

-jg
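The expectation that a raidz1 pool with one faulted member "should be readable" rests on single parity: the missing member's data can be reconstructed by XORing the survivors. A toy demonstration of the idea (real raidz striping and allocation are considerably more involved; this only shows why one missing device is survivable):

```shell
# Toy single-parity reconstruction: three "data" values and one parity
# value, as raidz1 conceptually stores per stripe.  Values are invented
# for illustration.
d1=5; d2=9; d3=12
p=$(( d1 ^ d2 ^ d3 ))       # parity computed when the stripe is written
# Suppose the disk holding d2 fails: XOR parity with the survivors.
lost=$(( p ^ d1 ^ d3 ))
echo "reconstructed=$lost"   # prints reconstructed=9
```

So the data itself is recoverable with one member gone; the import failure here is about the pool *metadata* (the duplicated children in the label) rather than missing redundancy.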
Re: [zfs-discuss] pool metadata has duplicate children
Have you tried importing the pool with that drive completely unplugged?
Which HBA are you using?  How many of these disks are on the same or
separate HBAs?

Gregg Wonderly

On Jan 8, 2013, at 12:05 PM, John Giannandrea j...@meer.net wrote:
> I seem to have managed to end up with a pool that is confused about its
> children disks.  The pool is faulted with corrupt metadata:
> [...]
> -jg
Re: [zfs-discuss] pool metadata has duplicate children
Gregg Wonderly gregg...@gmail.com wrote:
> Have you tried importing the pool with that drive completely unplugged?

Thanks for your reply.  I just tried that.  zpool import now says:

   pool: d
     id: 13178956075737687211
  state: FAULTED
 status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
	The pool may be active on another system, but can be imported using
	the '-f' flag.
    see: http://illumos.org/msg/ZFS-8000-72
 config:

	d                        FAULTED  corrupted data
	  raidz1-0               FAULTED  corrupted data
	    da1                  ONLINE
	    3419704811362497180  OFFLINE
	    da2                  ONLINE
	    da3                  ONLINE
	    da4                  ONLINE

Notice that in the absence of the faulted da2 the OS has assigned da3 to
da2, etc.  I suspect this was part of the original problem in creating a
label with two da2s.

zdb still reports that the label has two da2 children:

    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 11828532517066189487
        nparity: 1
        metaslab_array: 23
        metaslab_shift: 36
        ashift: 9
        asize: 920660480
        is_log: 0
        children[0]:
            type: 'disk'
            id: 0
            guid: 13697627234083630557
            path: '/dev/da1'
            whole_disk: 0
            DTL: 78
        children[1]:
            type: 'disk'
            id: 1
            guid: 3419704811362497180
            path: '/dev/da2'
            whole_disk: 0
            DTL: 71
            offline: 1
        children[2]:
            type: 'disk'
            id: 2
            guid: 6790266178760006782
            path: '/dev/da2'
            whole_disk: 0
            DTL: 77
        children[3]:
            type: 'disk'
            id: 3
            guid: 2883571222332651955
            path: '/dev/da3'
            whole_disk: 0
            DTL: 76
        children[4]:
            type: 'disk'
            id: 4
            guid: 16640597255468768296
            path: '/dev/da4'
            whole_disk: 0
            DTL: 75

> Which HBA are you using?  How many of these disks are on same or
> separate HBAs?
All the disks are on the same HBA:

twa0: 3ware 9000 series Storage Controller
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9500S-8, 8 ports, Firmware FE9X 2.08.00.006
da0 at twa0 bus 0 scbus0 target 0 lun 0
da1 at twa0 bus 0 scbus0 target 1 lun 0
da2 at twa0 bus 0 scbus0 target 2 lun 0
da3 at twa0 bus 0 scbus0 target 3 lun 0
da4 at twa0 bus 0 scbus0 target 4 lun 0

-jg
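The corruption in the vdev_tree above can be spotted mechanically: a healthy label never lists the same device path under two different child guids. A small sketch of that check, run against a fabricated excerpt mirroring the paths in the posted label rather than live `zdb` output:

```shell
# Detect duplicate child paths in a vdev_tree listing.  The heredoc
# mirrors the paths from the label posted in this thread; on a real
# system you would extract the "path:" lines from "zdb -l" instead.
paths=$(cat <<'EOF'
path: '/dev/da1'
path: '/dev/da2'
path: '/dev/da2'
path: '/dev/da3'
path: '/dev/da4'
EOF
)
# sort + uniq -d prints only lines that occur more than once.
dups=$(printf '%s\n' "$paths" | sort | uniq -d)
echo "duplicate children: ${dups:-none}"   # prints: duplicate children: path: '/dev/da2'
```

This is why, per the earlier reply, the guid is the identifier that matters: the two da2 children have distinct guids (3419704811362497180 and 6790266178760006782), so by guid the tree is unambiguous even though the paths collide.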