thanks for the help. here's what I did: I booted single-user with init=/bin/sh, and md0 mounted read-only. everything works so far, I get to the shell without any errors.

at this point, md1 is not started, but I can start it with

    mdadm -A --auto=yes /dev/md1

mdadm -D /dev/md{0,1} shows state: clean for both arrays, and state: active sync for the disks, so I assume the raid arrays are doing well.

vgdisplay --ignorelockingfailure volg1 and lvdisplay --ignorelockingfailure volg1 display the correct information about the volume group and all volumes (although this takes pretty long...?). I can make the volumes available using vgchange and lvchange. however, fsck then shows tons of 'illegal block #... in inode ... ' messages.

I am simply at a complete loss as to why my file system should suddenly be so corrupt. I've never had any problems like this before. and, more importantly, is there a safe (i.e. no data loss) way of fixing it?

thanks,
- Dave.

On 5/6/07, Douglas Allan Tutty <[EMAIL PROTECTED]> wrote:
On Sun, May 06, 2007 at 03:25:02PM +0200, David Fuchs wrote:
> I have just upgraded my sarge system to etch, following exactly the upgrade
> instructions at http://www.us.debian.org/releases/etch/i386/release-notes/.
>
> now my system does not boot correctly anymore... I'm using RAID1 with two
> disks, / is on md0 and all other mounts (/home/, /var, /usr etc) are on md1
> using LVM.
>
> the first problem is that during boot, only md0 gets started. I can get
> around this by specifying break=mount on the kernel boot line and manually
> starting md1, but where need I change what so that md1 gets started at this
> point as well?
>
> after manually starting md1 and continuing to boot, I get errors like
>
> Inode 184326 has illegal block(s)
> /var: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY (i.e. without the -a or -o
> options)
>
> ... same for all other partitions on that volume group
>
> fsck died with exit status 4
> A log is being saved in /var/log/fsck/checkfs if that location is
> writable.(it is not)
>
> at this point I get dropped to a maintenance shell. when I select to
> continue the boot process:

What happens if instead of forcing a boot you do what it says: run fsck
without the -a or -o options?

>
> EXT3-fs warning: mounting fs with errors. running e2fsck is recommended
> EXT3 FS on dm-4, internal journal
> EXT3-FS: mounted filesystem with ordered data mode.
> ... same for all mounts (same for dm-3, dm-2, dm-1, dm-0)
>
> EXT3-fs error (device dm-1) in ext3_reserve_inode_write: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_orphan)write: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_orphan_del: Journal has aborted
> EXT3-fs error (device dm-1) in ext3_truncate_write: Journal has aborted
> ext3_abort called.
> EXT3-fs error (device dm-1): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
>
> and finally I get tons of these:
>
> dm-0: rw-9, want=6447188432, limit=10485760
> attempt to access beyond end of device
>
> the system then stops for a long time (~5 minutes) at "starting systlog
> service" but eventually the login prompt comes up, and I can log in, see all
> my data, and even (to my surprise) write to the partitions on md1...
> ...which probably corrupts the fs even more.
> what the hell is going on here? thanks a lot in advance for any help!
>

What is going on is that you started with a simple booting error that has
propagated into filesystem errors. Those errors are compounded by forcing a
mount of a filesystem with errors.

Remember that the system that starts LVM and raid itself exists on the
disks.... What you need is a shell with the root fs either totally unmounted
or mounted ro. Does booting single-user work? What about telling the kernel
init=/bin/sh?

From there, you can check the status of the mds with:

    # /sbin/mdadm -D /dev/md0
    # /sbin/mdadm -D /dev/md1

... check the status of the logical volumes:

    # /sbin/lvdisplay [lvname]

and then check the filesystems with:

    # /sbin/e2fsck -f -c -c /dev/...

Only once you get the filesystems fully functional should you attempt to
boot further.

Doug.
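
A side note on the "attempt to access beyond end of device" message quoted above: the want= and limit= values can be turned into sizes to see how far off the request is. The snippet below is only a rough sketch. It assumes both values are counts of 512-byte sectors (the usual unit for this kernel message, but not stated anywhere in the thread), and sectors_to_gib is a made-up helper name.

```shell
#!/bin/sh
# Convert a sector count to whole GiB.
# ASSUMPTION: sectors are 512 bytes, so 1 GiB = 2097152 sectors.
sectors_to_gib() {
    echo $(( $1 / 2097152 ))
}

limit=10485760     # limit= from the dm-0 message (size of the device)
want=6447188432    # want= from the dm-0 message (sector that was requested)

echo "dm-0 size:        $(sectors_to_gib "$limit") GiB"
echo "requested sector: at about $(sectors_to_gib "$want") GiB"
```

Under that assumption dm-0 is 5 GiB, while the requested sector lies at roughly 3074 GiB, i.e. about 3 TiB into a 5 GiB volume. That is far too large to be an off-by-a-little sizing problem and would fit with the block numbers being read from the inodes being garbage, though it does not by itself say why.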