Hi,

I just had some massive filesystem corruption on the root filesystem of a Debian potato system that was installed about a month ago and was working fine until now, running a self-compiled 2.2.13 kernel. Before that, the box ran GNU/Linux (Red Hat) for about a year and a half with no trouble either.

After an fsck -y, I found that all of /bin and nearly all of the contents of /etc had been moved to lost+found, losing their filenames, so I ended up reinstalling the system. Is there any way to restore the files on a destroyed root filesystem (/var, /usr and /home are all separate, undamaged filesystems)? Or is a full reinstall the only way to really fix this, other than restoring from tape backups (which I do not have and cannot create)?

After reinstalling and spending several hours reconfiguring things, I ran fsck every so often and it found nothing wrong. Then, while I was doing some more work, a cron job ran and caused a lot of disk activity (building the locate database or something like it), so I checked the filesystems again, and the root filesystem was ruined just like before! This time I had made backups of /bin, /sbin, /etc, /boot, /lib, /root and such into tar archives on the /home partition. (/ is the ONLY partition that gets damaged, other than the filetype errors that are always there on /var, which I have mentioned before but never got a reply on.)

I ran e2fsck -c and redirected all its output to a floppy so I would have a reference. It `fixed' everything, mostly by moving it all into lost+found under inode numbers, and after it finished I restored the tar archives, since everything was ruined anyway. After most of them were restored, the kernel spat out a few errors about blocks or some such and the filesystem was totally hosed again. No commands worked any more, apparently because the kernel could not read /sbin and /bin (though I could still get listings using sash's built-in ls). Rebooting resulted in a kernel panic: no init found.
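
For anyone who wants to take the same precaution, here is a minimal sketch of that kind of backup: one tar archive per system directory, written to a partition that is not the damaged one. It assumes Python's standard tarfile module; the function name backup_dirs and the destination path in the usage note are made up for illustration and are not what I actually ran.

```python
import os
import tarfile


def backup_dirs(dirs, dest_dir):
    """Write one gzip-compressed tar archive per directory into dest_dir.

    Returns the list of archive paths that were created.
    """
    archives = []
    for d in dirs:
        # "/etc" -> "etc", "/usr/local" -> "usr_local" (archive file name only)
        name = d.strip("/").replace("/", "_") or "root"
        archive = os.path.join(dest_dir, name + ".tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            # Store paths relative to / so the archive can be unpacked anywhere.
            tar.add(d, arcname=d.lstrip("/"))
        archives.append(archive)
    return archives
```

It would be called along the lines of backup_dirs(["/bin", "/sbin", "/etc", "/boot", "/lib", "/root"], "/home/backup"), where /home/backup is a hypothetical directory on the undamaged /home partition.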

I am beginning to think that the disk may be going bad, but badblocks, e2fsck -c and mke2fs -c all report no bad blocks. This is an IDE disk. Is there any other way to find out whether the disk is at fault, or is this the kernel? I have been running 2.2.13 on a different machine since it came out and have had no problems with it, except for the constant filetype errors that show up on the /var filesystem, and those seem to be minor and do not hurt anything that I can see.
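
To be clear about what kind of check has come up clean so far: as I understand it, a read-only bad-block scan just reads the device block by block and notes which reads fail. The sketch below illustrates that idea only and is not a replacement for badblocks. The function name scan_readable and the 4096-byte block size are my own assumptions, and since reads may be served from the kernel's cache, a clean result does not prove the disk is healthy.

```python
import os


def scan_readable(path, block_size=4096):
    """Read `path` block by block.

    Returns (total_blocks, bad_blocks), where bad_blocks lists the
    indexes of blocks whose read raised an I/O error.
    """
    bad_blocks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        total_blocks = (size + block_size - 1) // block_size
        for index in range(total_blocks):
            os.lseek(fd, index * block_size, os.SEEK_SET)
            try:
                os.read(fd, block_size)
            except OSError:
                # Record the failing block and keep scanning.
                bad_blocks.append(index)
    finally:
        os.close(fd)
    return total_blocks, bad_blocks
```

Run as root against the whole disk or a partition (for example a device node under /dev), an empty bad_blocks list corresponds to the "no bad blocks" result I keep getting.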

Would the fsck output be useful to anyone more knowledgeable about the filesystem than I am in determining whether this is hardware or software related? (I can send it to anyone who is interested, if only to admire the completeness of the ruination. The log file is about 64000 bytes.)

I have read that there are still reports of filesystem corruption in the stable kernels, but that it has mostly been unreproducible. If that is what is happening here, it looks like I can reproduce it just fine :| I tend to think this is not a kernel problem, though, since the 6 other mounted filesystems are completely undamaged after / gets hosed...

Any other pointers on where I should go from here in troubleshooting this problem?

Ethan
