Re: [zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors

2010-05-25 Thread Rob Levy
Roy,

Thanks for your reply. 

I did get a new drive and attempted the approach (much as you suggested, before
I saw your reply); however, once booted from the OpenSolaris Live CD (or from
the rebuilt new drive), I was not able to import the rpool (the pool on the
slice which, I had established, has sector errors). I would have expected some
success if the vdev labels were intact (I currently suspect some critical boot
files are affected by bad sectors, which is why boot attempts from that
partition slice fail). Unfortunately, I didn't keep a copy of the error messages
(if there were any - I have tried many permutations since).
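
For reference, the import attempts from the Live CD were along these lines
(reconstructed from memory, so the options and names may not be exact):

  # list any pools the live environment can see
  zpool import
  # then attempt a forced import under an alternate root
  zpool import -f -R /mnt rpool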

At my last attempt, I installed Knoppix (Debian) on one of the partitions
(which also gave me access to smartctl and hdparm - I was hoping to reduce the
drive's read timeout to speed up the exercise), then added zfs-fuse (to access
the space I will use to stage the recovery file) along with the dd_rescue and
GNU ddrescue packages. smartctl appears unable to manage the disk while it is
attached over USB (but that is a guess - I don't have much experience with it).
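
In case it is useful to anyone else, smartmontools can often talk through a
USB-SATA bridge if the bridge type is given explicitly, e.g. with -d sat (the
Linux device name below is a guess):

  # ask smartctl to drive the USB bridge as a SAT (SCSI-to-ATA translation) device
  smartctl -d sat -a /dev/sdb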

At this point I used dd_rescue to create an image of the partition with bad
sectors (hoping it would be more efficient than plain dd), but it had only
reached 5.6GB after 36 hours, so again I had to abort. It does log the blocks
attempted so far, though, so hopefully I can skip past them when I next get an
opportunity. It now appears that GNU ddrescue is the preferred of the two
utilities, so I may opt to use it to create an image of the partition before
attempting recovery of the slice (rpool).
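
A minimal GNU ddrescue sketch of what I have in mind (the partition device and
file names are placeholders); the mapfile is what lets an interrupted run resume
and lets a later pass retry only the problem areas:

  # first pass: grab the easy sectors, skip the slow/bad areas
  ddrescue -n /dev/sdb2 p2image.dd p2image.map
  # second pass: go back and retry the remaining bad areas a few times
  ddrescue -d -r3 /dev/sdb2 p2image.dd p2image.map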

As an aside, I noticed that under Knoppix, 'dmesg | grep sd' shows the primary
partition devices but no longer appears to show the Solaris partition (p2)
slice devices (the way it shows the logical partition devices configured within
the extended p4 partition). I suspect that, because of this, the rpool (which
lives on one of the Solaris partition slices) is not detected by the Knoppix
zfs-fuse 'zpool import' (although I can access the zpool which exists on
partition p3). I wonder whether this is related to the transition from UFS to ZFS?
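
One workaround I may try: zpool import accepts -d to point it at a directory to
scan, so once I have staged an image file of the slice somewhere, zfs-fuse
should be able to find the pool inside the file itself (paths below are
placeholders):

  # scan a directory containing device nodes or image files for pools
  zpool import -d /recovery
  # if the pool shows up, import it under an alternate root
  zpool import -d /recovery -f -R /mnt rpool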


[zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors

2010-05-20 Thread Rob Levy
Folks, I posted this question on (OpenSolaris - Help) without any replies
(http://opensolaris.org/jive/thread.jspa?threadID=129436&tstart=0) and am
re-posting here in the hope someone can help. I have also updated the wording a
little (in an attempt to clarify).

I currently use OpenSolaris on a Toshiba M10 laptop.

One morning the system wouldn't boot OpenSolaris 2009.06 (it was simply unable
to progress to the second-stage GRUB). On further investigation I discovered
that the hdd partition slice containing rpool appeared to have bad sectors.

Faced with a choice between a rebuild and an attempt at recovery, I decided to
try recovering the slice before rebuilding.

The c7t0d0 HDD (p0) was divided into p1 (NTFS 24GB), p2 (OpenSolaris 24GB), p3 
(OpenSolaris zfs pool for data 160GB) and p4 (50GB extended with 32GB pcfs, 
12GB linux and linux swap) partitions (or something close to that). On the 
first Solaris partition (p2), slice 0 was the OpenSolaris rpool zpool.

To attempt recovery I booted the OpenSolaris 2009.06 live CD and was able to
import the ZFS pool configured on p3. Against the p2 device (the Solaris boot
partition which wouldn't boot) I then ran:

  dd if=/dev/rdsk/c7t0d0s2 bs=512 conv=sync,noerror of=/p0/s2image.dd

Due to sector read-error timeouts, this took longer than my maintenance window
allowed, and I ended up aborting the attempt with a significant number of
sectors already captured.

On examining the blocks of the image captured so far, I noticed that the first
two s0 vdev labels appeared to be intact. I then skipped the expected number of
s2 sectors to get to the start of s0 and copied blocks out to attempt to
reconstruct the s0 rpool (running zdb -l against this reported the first two
labels), which gave me the encouragement I needed to continue the exercise.
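
Roughly, the carving step looked like this (the geometry values are
placeholders; the real first-sector and sector-count figures come from the
prtvtoc output):

  # show the slice layout within the Solaris partition
  prtvtoc /dev/rdsk/c7t0d0s2
  # carve slice 0 out of the captured s2 image
  dd if=/p0/s2image.dd of=/p0/s0image_start.dd bs=512 skip=FIRST_SECTOR count=SECTOR_COUNT
  # confirm the vdev labels are visible
  zdb -l /p0/s0image_start.dd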

At the next opportunity I ran the command again, using the skip directive to
capture the balance of the slice. The result was two files (images) comprising
the good c7t0d0s0 sectors (with, I expect, the bad sectors padded), i.e.
s0image_start.dd and s0image_end.dd.

As mentioned, at this stage I was able to run 'zdb -l s0image_start.dd' and see
the first two vdev labels, and 'zdb -l s0image_end.dd' and see the last two vdev
labels.

I then combined the two files (I tried various approaches, e.g. cat, and dd
with the append directive); however, only the first two vdev labels appear to
be readable in the resulting s0image_s0.dd. The resulting file size, which I
expect is largely good sectors plus padding for the bad sectors, matches the
prtvtoc sector count for s0 multiplied by 512.

Can anyone advise why I am unable to read the third and fourth vdev labels once
the start and end files are combined?

Is there another approach that may prove more fruitful?
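
For reference, since ZFS keeps labels L0/L1 in the first 512KB of a vdev and
L2/L3 in the last 512KB, one reassembly I am considering is to lay both pieces
into a file of exactly the slice size, so that the tail piece ends up at its
original offset (the numbers below are made-up placeholders):

  # placeholders: s0 sector count from prtvtoc, and sectors captured in the tail image
  TOTAL_SECTORS=41945712   # example value only
  TAIL_SECTORS=2097152     # example value only
  # create an empty (sparse) file of exactly the slice size
  dd if=/dev/zero of=s0image_s0.dd bs=512 count=0 seek=$TOTAL_SECTORS
  # write the start piece at offset 0 without truncating the file
  dd if=s0image_start.dd of=s0image_s0.dd bs=512 conv=notrunc
  # write the tail piece so it ends exactly at the end of the slice
  dd if=s0image_end.dd of=s0image_s0.dd bs=512 conv=notrunc seek=$(( TOTAL_SECTORS - TAIL_SECTORS ))
  # all four labels should then be visible
  zdb -l s0image_s0.dd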

Once I have the file (with the labels in the correct places), I intend to
attempt to import the pool as rpool2, or to attempt whatever repair procedures
I can locate (as far as is possible anyway), to see what data can be recovered
(besides, it is an opportunity to get another close look at ZFS).
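
Assuming the reconstructed image ends up in, say, /recovery (a placeholder
path), the import-under-a-new-name step would look something like this:

  # scan the directory holding the image file for importable pools
  zpool import -d /recovery
  # import the pool found there under a new name, mounted beneath /mnt
  zpool import -d /recovery -f -R /mnt rpool rpool2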

Incidentally, *only* the c7t0d0s0 slice appeared to have bad sectors (I do
wonder what the significance of this is).


[zfs-discuss] zpool create over old pool recovery

2009-08-24 Thread Rob Levy
Folks,

Need help with ZFS recovery following an inadvertent 'zpool create' ...

We recently received new laptops (hardware refresh) and I simply transferred
the multiboot hdd (using OpenSolaris 2008.11 as the primary production OS) from
the old laptop to the new one (I used the live DVD to do the zpool import,
updated the boot archive and ran devfsadm) and worked away as usual.

I then wanted to use the WinXP distribution which was shipped with the new 
laptop and discovered the existing partition was too small. I bought a new hdd 
and proceeded to partition it as required (1.WinXP 24GB, 2.Solaris 24GB, 
3.Solaris 130GB, 4.extended with logical 5.FAT32 30GB, 6.Linux 20GB, 7.Linux 
swap 6GB).

I would consider myself inexperienced with ZFS (one of the reasons I opted for 
OpenSolaris was to get more familiar with it and the other features before they 
were adopted by customers).

So although I'm sure there are more elegant ways to do this, I stuck with what
I know.

I connected the old hdd (functional, but with inappropriate partition sizes) to
the USB port.

I 'dd'ed the first Solaris partition (the OpenSolaris rpool) across to the new
drive (I used the live DVD to do the zpool import, update GRUB and run
bootadm), and then all was as it should be.
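
For anyone following along, the live-DVD fix-up was roughly of this shape (the
boot environment name and device are guesses, and the exact commands may have
differed):

  # import the copied rpool under an alternate root and mount the boot environment
  zpool import -f -R /a rpool
  zfs mount rpool/ROOT/opensolaris   # boot environment name is a guess
  # reinstall the GRUB stages onto the new drive's Solaris slice (device is a guess)
  installgrub /a/boot/grub/stage1 /a/boot/grub/stage2 /dev/rdsk/c1t0d0s0
  # rebuild the boot archive for the mounted root
  bootadm update-archive -R /a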

It appears I might have been able to do this from the running OpenSolaris
install, but I wasn't aware of how to deal with two pools of the same name,
i.e. two rpools.
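
(For what it's worth, it appears zpool import can rename a pool as it imports
it, which would have dealt with the name clash - something along these lines,
with the numeric pool id taken from the bare 'zpool import' listing:)

  # list importable pools; note the numeric id of the USB disk's rpool
  zpool import
  # import that specific pool under a new name and an alternate root
  # (the id below is made up for illustration)
  zpool import -f -R /mnt 1234567890123456789 rpool2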

I subsequently copied the Linux OS across (booted Linux, created an ext3
filesystem and copied the OS files across using tar). Also from Linux, I
created the FAT32 filesystem and copied that data across with tar - all okay
and functional.

At this point all I needed to do was copy the second 130GB Solaris partition
(the zfs filesystem) across, so I proceeded to create the new pool and zfs file
system. My intention was to mount the two and simply copy the data across.

What I actually did was run zpool create against the device where the existing
pool (with the valid data) lived, and then created a zfs filesystem there. When
I ran zpool status I realised what I had done and promptly disconnected the hdd
from the USB port.

I have googled, without success, to see if anyone has recovered data following
an inadvertent 'zpool create'. The Sun URL also says the data cannot be
recovered and should be restored from a backup.

I don't have a recent backup (and, worse yet, I don't know what I will have
lost by going back to an 8-month-old backup).

So I guess what I'm hoping is that, as with other filesystems, if the ZFS
'superstructures' are removed, the data itself is still in place, and that with
some of the cached detail it can perhaps be pieced back together?
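
(In case it helps anyone advise: a first check might be whether any of the old
pool's labels survived. As I understand it, zpool create rewrites all four vdev
labels on the device, so they will now describe the new pool, but the bulk of
the old data blocks should still be sitting on disk. The device name below is a
guess:)

  # dump the vdev labels on the overwritten device; the pool name and guid shown
  # will indicate whether they belong to the old pool or to the new one
  zdb -l /dev/rdsk/c8t0d0p3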

I know it's a long shot but as I don't know ZFS well enough, I must ask the 
question.

Some documentation seemed to suggest ZFS would warn if a healthy pool existed
before blowing it away (perhaps only if it is mounted? this pool wasn't
imported). As there is very obviously a risk here, it would be a good time to
add any possible checks to zpool create.

Does anyone have any recovery advice?