Beso wrote:
> well, i think that the lvm2 layer is still good even when used on a
> single disk, especially when you don't know what the partitions will
> look like. i've saved a lot of time by resizing an lvm2 volume
> rather than copying files off, removing partitions, recreating them,
> and then copying the files back onto the new ones.

I tend to agree, but once bitten, twice shy.  :(

Some details for the curious:

I was running lvm2 on top of several raid-5 devices (that is, the raid-5 devices were the lvm2 physical volumes). I created the logical volumes on particular pvs to try to optimize disk seeking, so generally speaking a given partition resided on only one set of disks. However, some partitions did cross both arrays. (When creating lvs you can tell lvm2 to try to put them on a particular pv, and I believe you can use pvmove to move particular lvs afterward.)
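For anyone who wants to try the same layout, it goes roughly like this (the vg/lv/device names here are just examples, not my actual setup):

    # Each raid-5 device becomes a pv in one vg:
    pvcreate /dev/md0 /dev/md1
    vgcreate vg0 /dev/md0 /dev/md1

    # Naming a pv at the end of lvcreate asks lvm2 to allocate the
    # lv's extents from that pv, keeping the lv on one array:
    lvcreate -L 200G -n mythtv vg0 /dev/md0

    # pvmove can later migrate just that lv's extents to the other pv:
    pvmove -n mythtv /dev/md0 /dev/md1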

I was running ext3 on my lvs (and swap).
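Nothing fancy there; roughly this, with made-up lv names:

    mkfs.ext3 /dev/vg0/mythtv    # ext3 directly on the lv
    mkswap /dev/vg0/swap         # one lv used as swap
    swapon /dev/vg0/swap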

The problem was that some kind of glitch was causing my computer to reset (I traced it to one of my drives), and when it happened the array would sometimes come up with one of the drives missing. If the glitch happened again while the array was still degraded it could cause data loss (no worse than not having RAID at all, but no better either).
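For reference, this is roughly how I'd check the damage after one of those resets (device names made up for illustration):

    # See whether the array came back degraded:
    cat /proc/mdstat
    mdadm --detail /dev/md0

    # If a disk was only kicked out by the reset, re-add it and resync:
    mdadm /dev/md0 --re-add /dev/sdc1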

When I finally got the bad drive replaced (which largely fixed the resets), I rebuilt my arrays. At that point mdadm was happy with the state of affairs, but fsck was showing loads of errors on some of my filesystems. When I went ahead and let fsck do its job, I immediately started noticing corrupt files all over the place. The majority of the data volume was mpg files from mythtv, and I'd find hour-long TV episodes with a minute of some other show spliced in. It seemed obvious that files were somehow getting cross-linked (I'm not intimately familiar with ext3 internals, but I could see how this could happen in FAT). Oh - and these errors were on a partition that WASN'T fsck'ed (in the command-line-utility sense of the word only, I suppose).

I also started getting lots of errors in dmesg about attempts to seek past the end of the md devices. Some googling showed that others had seen this too - but it is obviously very rare.

Fortunately all my most critical data is backed up weekly (the last backup was only a day or two before the final crash), and I didn't care about the TV too much (I saved what I could and re-recorded anything that was truncated or unwatchable). I did find that some of my DVD backups of digital photos were unreadable, which has taught me a valuable lesson. Fortunately only some of the photos actually had errors in them, and most were successfully backed up.

I'm no longer using lvm2. If I need to expand my RAID I can potentially reshape it in place (after making backups where possible). I miss some of the flexibility, but when I need a few GB of scratch space to test out a filesystem upgrade or something I just use losetup - I don't care about performance in those cases.
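The reshape would be mdadm --grow, and the losetup trick is just this (sizes and paths for illustration only):

    # Growing the array onto an added disk (reshape needs a backup file):
    mdadm /dev/md0 --add /dev/sde1
    mdadm --grow /dev/md0 --raid-devices=5 --backup-file=/root/md0.bak

    # Scratch space via a loopback file:
    dd if=/dev/zero of=/tmp/scratch.img bs=1M count=1 seek=4095
    losetup -f --show /tmp/scratch.img   # prints the loop device, e.g. /dev/loop0
    mkfs.ext3 /dev/loop0
    mount /dev/loop0 /mnt/scratch
    # ...test away, then tear it down:
    umount /mnt/scratch
    losetup -d /dev/loop0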

I would say that lvm2 is probably safe if you have more reliable hardware. My problem was that a failing drive not only made that drive inaccessible, it took down the whole system (hardware on a typical desktop isn't well isolated). On a decent server a single drive failure shouldn't cause errors that bring down the whole box. So, I didn't get the full benefit of RAID.
