Re: Replace a failed block device with null PV in an LVM VG

2021-03-24 Thread Reco
On Wed, Mar 24, 2021 at 05:17:57PM +, David Pottage wrote:
> On 2021-03-24 12:37, Reco wrote:
> > Hi.
> > 
> > On Wed, Mar 24, 2021 at 10:26:49AM +, David Pottage wrote:
> > > Is there a way to assemble the VG and mount those ext4 filesystems in
> > > such a way that read attempts from the missing PV will return zeros,
> > > but the rest of the filesystem will work?
> > 
> > Try this:
> > 
> > vgchange --activationmode partial -ay
> > lvs
> > # immediately dump logical volume in question somewhere with cat/dd
> > cat /dev// > lv.img
> > vgchange -an 
> > # run fsck -f on a copy of logical volume
> > fsck -f lv.img
> > # try mounting it
> > mount -o loop lv.img /
> 
> 
> Thanks, that partly worked. It was an older version of LVM2, so I had to 
> modify the command line syntax to "vgchange --partial -ay "
> 
> I was then able to mount the damaged volumes and get back nearly half of the
> lost files. I had a separate record of SHA1 checksums of all the lost files,
> and all the recovered files have been checked and are undamaged.
> 
> Thanks for your help.

You're welcome.

Reco



Re: Replace a failed block device with null PV in an LVM VG

2021-03-24 Thread David Pottage

On 2021-03-24 12:37, Reco wrote:

Hi.

On Wed, Mar 24, 2021 at 10:26:49AM +, David Pottage wrote:

Is there a way to assemble the VG and mount those ext4 filesystems in
such a way that read attempts from the missing PV will return zeros,
but the rest of the filesystem will work?


Try this:

vgchange --activationmode partial -ay
lvs
# immediately dump logical volume in question somewhere with cat/dd
cat /dev// > lv.img
vgchange -an 
# run fsck -f on a copy of logical volume
fsck -f lv.img
# try mounting it
mount -o loop lv.img /



Thanks, that partly worked. It was an older version of LVM2, so I had to 
modify the command line syntax to "vgchange --partial -ay "


I was then able to mount the damaged volumes and get back nearly half of 
the lost files. I had a separate record of SHA1 checksums of all the 
lost files, and all the recovered files have been checked and are 
undamaged.
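That verification step can be sketched as a tiny shell session. The file and manifest names below are invented for illustration; in the real recovery the manifest would be the separately kept SHA1 record:

```shell
# Hypothetical sketch of checking recovered files against a saved
# SHA1 manifest; all names and paths are invented for illustration.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a recovered file and the separately kept checksum
# record (one "<sha1>  <path>" line per file, as sha1sum emits).
printf 'important data\n' > report.txt
sha1sum report.txt > manifest.sha1

# -c re-hashes every listed file and compares it to the manifest;
# an altered or truncated file would be reported as FAILED.
sha1sum -c manifest.sha1
```

On a real recovery one would run `sha1sum -c manifest.sha1` from the root of the recovered tree and grep the output for `FAILED`.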


Thanks for your help.

--
David Pottage



Re: Replace a failed block device with null PV in an LVM VG

2021-03-24 Thread songbird
Dan Ritter wrote:
...
> Next time: verify that you are backing up everything that you
> need to back up. I know it's boring.

  it seems like it should be common sense for anyone worried
about redundancy to the extent of having raid to also think
about making sure there is more than one controller/device/
cabinet involved in the redundancy.

  however, not knowing how those devices were set up, perhaps
moving the blinking-red drives to another cabinet might be
worth a try.


  songbird



Re: Replace a failed block device with null PV in an LVM VG

2021-03-24 Thread Dan Ritter
David Pottage wrote: 
> At work, there is a fileserver with a failed external drive enclosure. I am
> attempting to recover some data that is probably not on the failed drives.
> 
> This file server started out with 36 internal drives (in three RAID-6
> arrays) that formed the initial 3 physical volumes to an LVM volume group.
> In that VG I created 16 logical volumes (each 5 TB in size),
> formatted each with ext4, and stored a huge number of smallish files. That
> was about 9 years ago.
> 
> Over the years, as more disc space was needed, 5 external RAID drive
> enclosures of varying capacity, between 30 and 90 TB, were added. Each was
> added to the LVM VG as another physical volume, the existing logical
> volumes were expanded to fill the additional available space, and the ext4
> filesystems were resized to use the expanded volumes until they were each
> 15 TB in size.
> 
> Now one of those external drive enclosures, with 42 TB of capacity, has failed.
> The data centre technician tells me that it has 28 drive bays, and 15 of
> those drives are flashing angry red lights, so recovery is unlikely.
> 
> Most of the data on that file server is already backed up elsewhere, but I
> have been told that there are about 50,000 files that were somehow not
> backed up, and asked if I could try to get them back.
> 
> The missing files are the oldest, and would have been on the original good
> drives, but I can't mount the ext4 filesystems because LVM cannot
> assemble the VG with one of the physical volumes missing.
> 
> Is there a way to assemble the VG and mount those ext4 filesystems in such a
> way that read attempts from the missing PV will return zeros, but the rest
> of the filesystem will work? Perhaps by creating a virtual block device with
> the correct capacity and UUID, or by special LVM commands?
> 
> How will ext4 behave if some parts of its underlying block device return
> zeros on read?
> 
> I know that this is a last ditch data recovery effort. Is it likely to work?

If I understand your description correctly, the RAID6 was the
only redundancy in the system. That being the case, you will not
be able to have LVM assemble the VG, and you will not be able to
mount the ext4 filesystems.

You can try forensic recovery with PhotoRec
https://www.cgsecurity.org/wiki/PhotoRec

or ddrescue:

https://www.gnu.org/software/ddrescue/
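A typical two-pass ddrescue run might look like the following; the device name and output paths are placeholders, and this is a sketch rather than a tested recipe:

```shell
# Hypothetical ddrescue sketch (device and file names are placeholders).
# Image the whole failing disk onto good storage, recording progress in
# a map file so the run can be interrupted and resumed.

# Pass 1: copy everything readable, skipping bad areas quickly (-n).
ddrescue -n /dev/sdX /mnt/good/disk.img /mnt/good/disk.map

# Pass 2: go back and retry the bad areas a few times (-r3), using
# direct disc access (-d) to bypass the kernel cache.
ddrescue -d -r3 /dev/sdX /mnt/good/disk.img /mnt/good/disk.map
```

The image can then be examined with PhotoRec or loop-mounted without putting further stress on the failing drive.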

But I wouldn't bet on getting much of anything off of those disks.

Next time: verify that you are backing up everything that you
need to back up. I know it's boring.


-dsr-



Re: Replace a failed block device with null PV in an LVM VG

2021-03-24 Thread Reco
Hi.

On Wed, Mar 24, 2021 at 10:26:49AM +, David Pottage wrote:
> Is there a way to assemble the VG and mount those ext4 filesystems in
> such a way that read attempts from the missing PV will return zeros,
> but the rest of the filesystem will work?

Try this:

vgchange --activationmode partial -ay
lvs
# immediately dump logical volume in question somewhere with cat/dd
cat /dev// > lv.img
vgchange -an 
# run fsck -f on a copy of logical volume
fsck -f lv.img
# try mounting it
mount -o loop lv.img /
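Written out with explicit placeholder names (the VG and LV names below are hypothetical; substitute whatever `vgs` and `lvs` report on the real system), the recipe above might look like:

```shell
# Sketch of the recovery steps above; requires root and lvm2.
# "vg0" and "data01" are placeholder names, not real ones.
set -eu
VG=vg0
LV=data01

# Activate the VG despite the missing PV; reads that land on the
# missing PV return zeros. (Older LVM2 releases use
# "vgchange --partial -ay" instead of --activationmode.)
vgchange --activationmode partial -ay "$VG"
lvs "$VG"

# Dump the LV immediately and work only on the copy from here on.
dd if="/dev/$VG/$LV" of=lv.img bs=4M conv=noerror,sync

# Deactivate the VG again, then repair and mount the copy,
# never the original.
vgchange -an "$VG"
fsck.ext4 -f lv.img
mkdir -p /mnt/recovered
mount -o loop lv.img /mnt/recovered
```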

Reco
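For the record, David's original idea of standing in a virtual block device with the correct capacity and UUID can also be sketched with device-mapper's `zero` target, which returns zeros on read and discards writes. This is an untested outline; the VG name, UUID, and size are all placeholders:

```shell
# Untested sketch: substitute a zero-backed device for the failed PV.
# All names, UUIDs and sizes here are placeholders. Requires root.

# Size of the failed PV in 512-byte sectors (placeholder value);
# it must match the original device exactly.
SECTORS=82031250000

# The missing PV's UUID, as recorded under "id" in the LVM metadata
# backup (placeholder value).
PV_UUID=AAAAAA-BBBB-CCCC-DDDD-EEEE-FFFF-GGGGGG

# Create a device-mapper device whose every read returns zeros.
dmsetup create nullpv --table "0 $SECTORS zero"

# Re-label it as the missing PV and restore the VG metadata onto it.
pvcreate --uuid "$PV_UUID" --restorefile /etc/lvm/backup/vg0 /dev/mapper/nullpv
vgcfgrestore vg0
vgchange -ay vg0
```

Anything that lived on the failed PV reads back as zeros, so the filesystems should be fscked and mounted read-only before trusting them.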