Re: [zfs-discuss] raidz faulted with only one unavailable disk
Hi Hans-Christian,

Can you provide the commands you used to create this pool? Are the pool
devices actually files? If so, I don't see how you have a pool device whose
name starts without a leading slash. I tried to create one and it failed;
see the example below. By default, zpool import looks in the /dev/dsk
directory, so you would need to include the -d /dir option to look in an
alternative directory.

I'm curious how you faulted the device, because when I fault a device in a
similar raidz1 configuration, my pool is only degraded. See below. Your
pool is corrupted at a higher level.

In general, RAID-Z redundancy works like this:

  raidz1 can withstand 1 device failure
  raidz2 can withstand 2 device failures
  raidz3 can withstand 3 device failures

Thanks,
Cindy

# mkdir /files
# mkfile 200m /files/file.1
# mkfile 200m /files/file.2
# mkfile 200m /files/file.3
# mkfile 200m /files/file.4
# cd /files
# zpool create tank-1 raidz1 file.1 file.2 file.3 file.4
cannot open 'file.1': no such device in /dev/dsk
must be a full path or shorthand device name
# zpool create tank-1 raidz1 /files/file.1 /files/file.2 /files/file.3 /files/file.4

Fault a disk in tank-1:

# zpool status tank-1
  pool: tank-1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h0m with 0 errors on Fri Oct  8 09:20:36 2010
config:

        NAME               STATE     READ WRITE CKSUM
        tank-1             DEGRADED     0     0     0
          raidz1-0         DEGRADED     0     0     0
            /files/file.1  UNAVAIL      0     0     0  cannot open
            /files/file.2  ONLINE       0     0     0
            /files/file.3  ONLINE       0     0     0
            /files/file.4  ONLINE       0     0     0

# zpool export tank-1
# zpool import tank-1
cannot import 'tank-1': no such pool available
# zpool import -d /files tank-1

On 10/07/10 17:41, Hans-Christian Otto wrote:
> Hi, I've been playing around with zfs for a few days now, and have ended
> up with a faulted raidz (4 disks) with 3 disks still marked as online.
> Let's start with the output of zpool import:
>
>    pool: tank-1
>      id: 15108774693087697468
>   state: FAULTED
>  status: One or more devices contains corrupted data.
>  action: The pool cannot be imported due to damaged devices or data.
>          The pool may be active on another system, but can be imported
>          using the '-f' flag.
>     see: http://www.sun.com/msg/ZFS-8000-5E
>  config:
>
>         tank-1                           FAULTED  corrupted data
>           raidz1-0                       ONLINE
>             disk/by-id/dm-name-tank-1-1  UNAVAIL  corrupted data
>             disk/by-id/dm-name-tank-1-2  ONLINE
>             disk/by-id/dm-name-tank-1-3  ONLINE
>             disk/by-id/dm-name-tank-1-4  ONLINE
>
> After some Google searches and reading http://www.sun.com/msg/ZFS-8000-5E,
> it seems to me as if some metadata is lost and thus the pool cannot be
> restored anymore. I've tried zpool import -F tank-1 as well as zpool
> import -f tank-1, both resulting in the following message:
>
> cannot import 'tank-1': I/O error
>         Destroy and re-create the pool from a backup source.
>
> What I'm wondering about right now are the following things: Is there
> some way to recover the data? I thought raidz would require two failed
> disks to lose data? And as I think that the data is lost - why did this
> happen in the first place? Which situation can cause a faulted raidz
> that has only one broken drive?
>
> Greetings,
> Christian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
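Cindy's redundancy table can be checked the same way with file-backed vdevs.
A minimal sketch, assuming a scratch pool named tank-2 under /var/tmp (both
names are made up for the example; mkfile is the Solaris tool, substitute
truncate -s 200m on Linux):

```shell
# Create a raidz2 pool from six file-backed vdevs, then remove two of them;
# raidz2 tolerates two device failures, so the pool should re-import in a
# DEGRADED rather than FAULTED state.
for i in 1 2 3 4 5 6; do
  mkfile 200m /var/tmp/rz2.$i          # Linux: truncate -s 200m /var/tmp/rz2.$i
done
zpool create tank-2 raidz2 /var/tmp/rz2.1 /var/tmp/rz2.2 /var/tmp/rz2.3 \
    /var/tmp/rz2.4 /var/tmp/rz2.5 /var/tmp/rz2.6
zpool export tank-2
rm /var/tmp/rz2.1 /var/tmp/rz2.2       # simulate two failed devices
zpool import -d /var/tmp tank-2        # -d is needed for non-/dev/dsk vdevs
zpool status tank-2
```

Removing a third file before the import should push the raidz2 vdev past its
redundancy and leave the pool unimportable.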
Re: [zfs-discuss] raidz faulted with only one unavailable disk
Hi Cindy,

> Can you provide the commands you used to create this pool?

I don't have them anymore, no. But they were pretty much like what you
wrote below.

> Are the pool devices actually files? If so, I don't see how you have a
> pool device that starts without a leading slash. I tried to create one
> and it failed. See the example below. By default, zpool import looks in
> the /dev/dsk directory so you would need to include the -d /dir option
> to look in an alternative directory.

The pool devices are real devices. The naming scheme might be a bit…
different, don't bother. Importing the pool did work with these names.

> I'm curious how you faulted the device because when I fault a device in
> a similar raidz1 configuration, my pool is only degraded. See below.
> Your pool is corrupted at a higher level.
>
> In general, RAID-Z redundancy works like this:
> raidz1 can withstand 1 device failure
> raidz2 can withstand 2 device failures
> raidz3 can withstand 3 device failures

That's what I understood, and that's the reason for my mail to this list.
No important data is lost, as I was just playing around with raidz. But I
really want to know what happened.

After thinking about what I did, one thing came to my mind: might exporting
a degraded pool cause this issue?

Greetings,
Christian
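Christian's hypothesis (export while degraded, then re-import) is cheap to
test with throwaway file vdevs. A sketch, assuming a scratch pool tank-t and
/var/tmp paths (both illustrative; the state after re-import is exactly the
open question, so no outcome is claimed):

```shell
# Degrade a raidz1 pool, export it while degraded, then re-import it.
for i in 1 2 3 4; do
  mkfile 200m /var/tmp/rz1.$i          # Linux: truncate -s 200m /var/tmp/rz1.$i
done
zpool create tank-t raidz1 /var/tmp/rz1.1 /var/tmp/rz1.2 \
    /var/tmp/rz1.3 /var/tmp/rz1.4
rm /var/tmp/rz1.1                      # fault one device
zpool scrub tank-t                     # give ZFS a chance to notice
zpool status tank-t                    # should show DEGRADED
zpool export tank-t                    # export in the degraded state
zpool import -d /var/tmp tank-t        # DEGRADED again, or FAULTED?
```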
Re: [zfs-discuss] raidz faulted with only one unavailable disk
Hi Christian,

Yes, with non-standard disks you will need to provide the path to zpool
import.

I don't think the forced import of a degraded pool would cause the pool to
be faulted. In general, the I/O error is raised when ZFS can't access the
underlying devices. In this case, your non-standard device names might have
caused that message.

You might be able to find out what happened by reviewing the fmdump -eV
output to see which device errors occurred to cause the faulted pool. You
can review the ZFS hardware diagnostics info here:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide
(see "Resolving Hardware Problems")

Thanks,
Cindy

On 10/08/10 14:45, Hans-Christian Otto wrote:
> Hi Cindy,
>
>> Can you provide the commands you used to create this pool?
>
> I don't have them anymore, no. But they were pretty much like what you
> wrote below.
>
>> Are the pool devices actually files? If so, I don't see how you have a
>> pool device that starts without a leading slash. I tried to create one
>> and it failed. See the example below. By default, zpool import looks
>> in the /dev/dsk directory so you would need to include the -d /dir
>> option to look in an alternative directory.
>
> The pool devices are real devices. The naming scheme might be a bit…
> different, don't bother. Importing the pool did work with these names.
>
>> I'm curious how you faulted the device because when I fault a device
>> in a similar raidz1 configuration, my pool is only degraded. See
>> below. Your pool is corrupted at a higher level.
>>
>> In general, RAID-Z redundancy works like this:
>> raidz1 can withstand 1 device failure
>> raidz2 can withstand 2 device failures
>> raidz3 can withstand 3 device failures
>
> That's what I understood, and that's the reason for my mail to this
> list. No important data is lost, as I was just playing around with
> raidz. But I really want to know what happened.
>
> After thinking about what I did, one thing came to my mind: might
> exporting a degraded pool cause this issue?
>
> Greetings,
> Christian
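For reference, a minimal sketch of the diagnostics Cindy points at; both are
standard Solaris FMA commands, run as root:

```shell
# Dump all recorded error reports (ereports) in full detail; ZFS ereports
# include the vdev path and the error class (I/O, checksum, probe failure).
fmdump -eV | less

# List the faults FMA has actually diagnosed, with the ZFS-8000-* message
# IDs that the zpool status output links to.
fmadm faulty
```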
Re: [zfs-discuss] raidz faulted with only one unavailable disk
Hi Cindy,

> I don't think the forced import of a degraded pool would cause the pool
> to be faulted. In general, the I/O error is raised when ZFS can't access
> the underlying devices. In this case, your non-standard device names
> might have caused that message.

As I wrote in my first mail, zpool import (without any parameters) shows
the three non-corrupted disks as ONLINE - from my understanding, this
should not happen if ZFS can't access the underlying devices?

I will walk through the rest of your mail later.

Greetings,
Christian
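One way to separate "device unreadable" from "corrupted at a higher level"
is to dump the vdev labels directly with zdb, which reads the on-disk label
copies and does not need the pool to be imported. A sketch using the device
names from Christian's config:

```shell
# Each vdev carries four copies of its label. If zdb can print them, the
# device itself is readable and the damage is in the pool metadata, not in
# ZFS's ability to reach the disk.
zdb -l /dev/disk/by-id/dm-name-tank-1-1   # the UNAVAIL vdev
zdb -l /dev/disk/by-id/dm-name-tank-1-2   # a reportedly ONLINE vdev
```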
[zfs-discuss] raidz faulted with only one unavailable disk
Hi,

I've been playing around with zfs for a few days now, and have ended up
with a faulted raidz (4 disks) with 3 disks still marked as online. Let's
start with the output of zpool import:

   pool: tank-1
     id: 15108774693087697468
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
         The pool may be active on another system, but can be imported
         using the '-f' flag.
    see: http://www.sun.com/msg/ZFS-8000-5E
 config:

        tank-1                           FAULTED  corrupted data
          raidz1-0                       ONLINE
            disk/by-id/dm-name-tank-1-1  UNAVAIL  corrupted data
            disk/by-id/dm-name-tank-1-2  ONLINE
            disk/by-id/dm-name-tank-1-3  ONLINE
            disk/by-id/dm-name-tank-1-4  ONLINE

After some Google searches and reading http://www.sun.com/msg/ZFS-8000-5E,
it seems to me as if some metadata is lost and thus the pool cannot be
restored anymore. I've tried zpool import -F tank-1 as well as zpool import
-f tank-1, both resulting in the following message:

cannot import 'tank-1': I/O error
        Destroy and re-create the pool from a backup source.

What I'm wondering about right now are the following things: Is there some
way to recover the data? I thought raidz would require two failed disks to
lose data? And as I think that the data is lost - why did this happen in
the first place? Which situation can cause a faulted raidz that has only
one broken drive?

Greetings,
Christian
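For a pool in this state, the usual escalation path looks roughly like the
following. This is a sketch, not a guaranteed recovery recipe: -F rewinds to
an earlier transaction group and discards the most recent writes, and
whether a read-only import (-o readonly=on) is supported depends on the ZFS
version in use.

```shell
# 1. Point import at the right device directory; -n makes -F a dry run that
#    only reports whether a rewind would succeed, without changing anything.
zpool import -d /dev/disk/by-id -f -F -n tank-1

# 2. If the dry run looks good, perform the real rewind import; this
#    discards the last few transaction groups.
zpool import -d /dev/disk/by-id -f -F tank-1

# 3. On ZFS versions that support it, a read-only import avoids any further
#    writes while data is copied off the pool.
zpool import -d /dev/disk/by-id -o readonly=on -f tank-1
```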