Errors on UFS Partitions
Hi,

I am sorry if I am asking a question that has been brought up before. I have attempted to research my issue, but it has many angles it might be listed under, so please bear with me.

We have had ongoing problems with UFS errors on our root partition (and on any additional partition that did not have soft-updates enabled by default). We recently had a problem where a secondary drive that housed home directories filled up completely, and then everything locked up due to huge CPU and memory usage because nothing was able to write to the drive. When the server was rebooted, it failed to boot because of critical errors on the root partition.

We have /etc and /usr on the root partition, and our /home and /var partitions mistakenly do not have the soft-updates flag set.

::dmesg::
http://the-irc.com/dmesg

::mount::
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ad4s1d on /home (ufs, local, with quotas)
/dev/ad4s1e on /tmp (ufs, local, noexec, nosuid, soft-updates)
/dev/ad4s1f on /var (ufs, local)
devfs on /var/named/dev (devfs, local, multilabel)
procfs on /proc (procfs, local)
/dev/ad0s1e on /Backups (ufs, local, soft-updates)
/dev/ad0s1d on /root (ufs, local, soft-updates)

::fsck /::
** /dev/ad4s1a (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=361477 OWNER=root MODE=100666 SIZE=144464 MTIME=Jan 1 03:59 2010 CLEAR? no
UNREF FILE I=966786 OWNER=root MODE=100644 SIZE=0 MTIME=Jan 15 23:02 2010 CLEAR? no
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION BAD SALVAGE? no
BLK(S) MISSING IN BIT MAPS SALVAGE? no
549534 files, 4784719 used, 2830920 free (47200 frags, 347965 blocks, 0.6% fragmentation)

::fsck /home::
** /dev/ad4s1d (NO WRITE)
** Last Mounted on /home
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=1957573 (4 should be 0) CORRECT? no
INCORRECT BLOCK COUNT I=10270973 (300 should be 0) CORRECT? no
INCORRECT BLOCK COUNT I=10270976 (44 should be 0) CORRECT? no
INCORRECT BLOCK COUNT I=10271040 (48 should be 0) CORRECT? no
INCORRECT BLOCK COUNT I=11871624 (4 should be 0) CORRECT? no
** Phase 2 - Check Pathnames
UNALLOCATED I=732010 OWNER=agrippas MODE=100600 SIZE=33868 MTIME=Jan 16 19:05 2010 FILE=/agrippas/services/lib/akill.db REMOVE? no
UNALLOCATED I=4545818 OWNER=port1080 MODE=100600 SIZE=2052 MTIME=Jan 16 19:06 2010 FILE=/port1080/services/nick.db REMOVE? no
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=730879 OWNER=agrippas MODE=100664 SIZE=3020510 MTIME=Jan 16 18:54 2010 CLEAR? no
LINK COUNT FILE I=732011 OWNER=agrippas MODE=0 SIZE=0 MTIME=Jan 16 19:05 2010 COUNT 0 SHOULD BE -1 ADJUST? no
UNREF FILE I=2359889 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2359928 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2359930 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2359931 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2359932 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2359934 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360094 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360101 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360103 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360104 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360118 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360121 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360122 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360123 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 10 17:20 2010 CLEAR? no
UNREF FILE I=2360124 OWNER=killjoyr MODE=100600 SIZE=0 MTIME=Jan 11 00:02 2010 CLEAR? no
UNREF FILE I=2920477 OWNER=marianus MODE=100644 SIZE=6 MTIME=Jan 2 20:27 2010 CLEAR? no
UNREF FILE I=2920480 OWNER=marianus MODE=100644 SIZE=6 MTIME=Jan 2 20:27 2010 CLEAR? no
LINK COUNT FILE I=4545817 OWNER=port1080 MODE=0 SIZE=0 MTIME=Jan 16 19:06 2010 COUNT 0 SHOULD BE -1 ADJUST? no
UNREF FILE I=6267525 OWNER=chijiru MODE=100644 SIZE=5 MTIME=Jan 2 10:05 2010 CLEAR? no
UNREF FILE I=6760292 OWNER=jibbanet MODE=100644 SIZE=6 MTIME=Jan 10 20:21 2010 CLEAR? no
UNREF FILE I=7089454 OWNER=talkingi MODE=100600 SIZE=0 MTIME=Jan 10 22:22 2010 CLEAR? no
UNREF FILE I=8668793 OWNER=mutrcom MODE=100660 SIZE=1074 MTIME=Jan 8 14:32 2010 CLEAR? no
UNREF FILE I=9752529 OWNER=gigircco MODE=100600 SIZE=0 MTIME=Jan 11 00:25 2010 CLEAR? no
UNREF FILE I=9752883 OWNER=gigircco MODE=100600 SIZE=18 MTIME=Jan 12 00:04 2010 CLEAR?
Re: Errors on UFS Partitions
In the last episode (Jan 16), The-IRC FreeBSD said:

> I am sorry if I am asking a question that has been brought up before. I have attempted to research my issue, but it has many angles it might be listed under, so please bear with me.
>
> We have had ongoing problems with UFS errors on our root partition (and on any additional partition that did not have soft-updates enabled by default). We recently had a problem where a secondary drive that housed home directories filled up completely, and then everything locked up due to huge CPU and memory usage because nothing was able to write to the drive. When the server was rebooted, it failed to boot because of critical errors on the root partition.
>
> We have /etc and /usr on the root partition, and our /home and /var partitions mistakenly do not have the soft-updates flag set.
>
> ::dmesg::
> http://the-irc.com/dmesg
>
> ::mount::
> /dev/ad4s1a on / (ufs, local)
> devfs on /dev (devfs, local, multilabel)
> /dev/ad4s1d on /home (ufs, local, with quotas)
> /dev/ad4s1e on /tmp (ufs, local, noexec, nosuid, soft-updates)
> /dev/ad4s1f on /var (ufs, local)
> devfs on /var/named/dev (devfs, local, multilabel)
> procfs on /proc (procfs, local)
> /dev/ad0s1e on /Backups (ufs, local, soft-updates)
> /dev/ad0s1d on /root (ufs, local, soft-updates)
>
> ::fsck /::
> ** /dev/ad4s1a (NO WRITE)

fsck'ing a filesystem that is currently mounted read-write will always produce errors. Boot in single-user mode if you want to check the root filesystem or other fs'es that you can't dismount in multi-user mode.

--
Dan Nelson
dnel...@allantgroup.com
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
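Dan's point can be checked before running fsck at all. The following is a minimal Python sketch, not part of the original thread, that parses `mount` output in the format posted above and lists the UFS filesystems that are mounted read-write, which are the ones an in-place `fsck` cannot check reliably. The names `parse_mount` and `read_write_ufs` are hypothetical, and the sketch assumes FreeBSD's convention of listing `read-only` among the options of a read-only mount.

```python
import re

# Sample copied from the mount output posted in this thread.
MOUNT_OUTPUT = """\
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ad4s1d on /home (ufs, local, with quotas)
/dev/ad4s1e on /tmp (ufs, local, noexec, nosuid, soft-updates)
/dev/ad4s1f on /var (ufs, local)
devfs on /var/named/dev (devfs, local, multilabel)
procfs on /proc (procfs, local)
/dev/ad0s1e on /Backups (ufs, local, soft-updates)
/dev/ad0s1d on /root (ufs, local, soft-updates)
"""

# Expected line shape: "<device> on <mount point> (<opt>, <opt>, ...)"
LINE_RE = re.compile(r"^(\S+) on (\S+) \((.*)\)$")


def parse_mount(text):
    """Return a list of (device, mount_point, options) tuples."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            device, mount_point, opts = match.groups()
            options = [opt.strip() for opt in opts.split(",")]
            entries.append((device, mount_point, options))
    return entries


def read_write_ufs(text):
    """Mount points of UFS filesystems that are not mounted read-only."""
    return [
        mount_point
        for _device, mount_point, options in parse_mount(text)
        if options[0] == "ufs" and "read-only" not in options
    ]


if __name__ == "__main__":
    for mount_point in read_write_ufs(MOUNT_OUTPUT):
        print(f"{mount_point}: mounted read-write, fsck output is not reliable")
```

Run against the posted output, every UFS filesystem on the machine is reported, including / and /home, which is consistent with the errors shown by the two fsck runs in the original message.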
Re: Errors on UFS Partitions
The-IRC FreeBSD wrote:

> Hi,
>
> I am sorry if I am asking a question that has been brought up before. I have attempted to research my issue, but it has many angles it might be listed under, so please bear with me.
>
> We have had ongoing problems with UFS errors on our root partition (and on any additional partition that did not have soft-updates enabled by default). We recently had a problem where a secondary drive that housed home directories filled up completely, and then everything locked up due to huge CPU and memory usage because nothing was able to write to the drive. When the server was rebooted, it failed to boot because of critical errors on the root partition.

A healthy system does not get UFS errors during normal operation.

> We have /etc and /usr on the root partition, and our /home and /var partitions mistakenly do not have the soft-updates flag set.
>
> ::dmesg::
> http://the-irc.com/dmesg
>
> ::mount::
> /dev/ad4s1a on / (ufs, local)
> devfs on /dev (devfs, local, multilabel)
> /dev/ad4s1d on /home (ufs, local, with quotas)
> /dev/ad4s1e on /tmp (ufs, local, noexec, nosuid, soft-updates)
> /dev/ad4s1f on /var (ufs, local)
> devfs on /var/named/dev (devfs, local, multilabel)
> procfs on /proc (procfs, local)
> /dev/ad0s1e on /Backups (ufs, local, soft-updates)
> /dev/ad0s1d on /root (ufs, local, soft-updates)

[snip]

> To keep these errors from getting out of control, given that we cannot fix the root partition errors without going into single-user mode (or the other partitions' errors without mounting them with the soft-updates flag), does anyone advise removing everything from the root partition and leaving only the bootloader, thus moving /etc and /usr (or, most of all, just /usr) to its own partition? Or do you guys have a better solution?

No. Proceeding in directions such as this is a waste of time.

> Every partition gets errors over time, but if you are unable to correct them without downtime, how are you to correct them before they get out of control?

Probably by not looking for a software solution to a hardware problem.

It is not normal for a file system to behave as you describe. Moving partitions around and other such avenues of approach are doomed to failure, as they do not address the underlying problem.

Real server hardware with a sophisticated ECC subsystem usually has some BIOS counters which you can check for stats on memory errors. Hard drives fail the most often, but either bad memory or a bad drive controller can readily corrupt data. If you have a RAID controller with a RAM cache, that RAM could be defective.

Hardware failure is going to mean downtime. But I'd be looking for a hardware problem, get it fixed, then worry about how to proceed. If you have decent backups from before the system was corrupted, you can get back to where you need to be in relatively short order.

Not fixing a hardware defect will result in you never getting your server back to normal operation.

-Mike
Re: Errors on UFS Partitions
Thanks, everyone, for your input; it has helped greatly.

Does anyone know a way to toggle soft-updates on a non-root UFS partition while the system is live, or at least without having to recreate the partition?
Re: Errors on UFS Partitions
On Sun, Jan 17, 2010 at 12:30:09AM -0500, The-IRC FreeBSD wrote:

> Thanks, everyone, for your input; it has helped greatly.
>
> Does anyone know a way to toggle soft-updates on a non-root UFS partition while the system is live, or at least without having to recreate the partition?

Sure. Use the tunefs(8) utility for this. (Note that it cannot be used on a filesystem which is mounted read-write.)

--
Insert your favourite quote here.
Erik Trulsson
ertr1...@student.uu.se
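As a rough illustration of Erik's suggestion, the Python sketch below (an editorial addition, not from the thread) finds the UFS filesystems in the posted mount output that lack the soft-updates flag and prints the commands one might run for each. It only prints; it executes nothing. `tunefs -n enable` is the tunefs(8) switch for soft updates. The unmount and remount steps around it are an assumption: they presume the filesystem can be taken offline and has an /etc/fstab entry. The helper names are hypothetical. The root filesystem is skipped because it cannot be unmounted on a live system and would have to be handled from single-user mode.

```python
# Sample copied from the mount output posted in this thread.
MOUNT_OUTPUT = """\
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ad4s1d on /home (ufs, local, with quotas)
/dev/ad4s1e on /tmp (ufs, local, noexec, nosuid, soft-updates)
/dev/ad4s1f on /var (ufs, local)
devfs on /var/named/dev (devfs, local, multilabel)
procfs on /proc (procfs, local)
/dev/ad0s1e on /Backups (ufs, local, soft-updates)
/dev/ad0s1d on /root (ufs, local, soft-updates)
"""


def ufs_without_soft_updates(text):
    """Return (device, mount_point) pairs for UFS mounts lacking soft-updates."""
    result = []
    for line in text.splitlines():
        # Expected shape: "<device> on <mount point> (<opt>, <opt>, ...)"
        head, sep, tail = line.partition(" (")
        if not sep or " on " not in head:
            continue
        device, mount_point = head.split(" on ", 1)
        options = [opt.strip() for opt in tail.rstrip(")").split(",")]
        if options[0] == "ufs" and "soft-updates" not in options:
            result.append((device, mount_point))
    return result


def enable_plan(device, mount_point):
    """Commands to enable soft updates; assumes the filesystem can go offline."""
    return [
        f"umount {mount_point}",        # tunefs needs it unmounted or read-only
        f"tunefs -n enable {device}",   # -n enable turns soft updates on
        f"mount {mount_point}",         # assumes an /etc/fstab entry exists
    ]


if __name__ == "__main__":
    for device, mount_point in ufs_without_soft_updates(MOUNT_OUTPUT):
        if mount_point == "/":
            print(f"# {device}: root filesystem, handle from single-user mode")
            continue
        print("\n".join(enable_plan(device, mount_point)))
```

For the posted system this selects /, /home and /var, matching the partitions the original poster said were missing the flag. Anything holding files open on /home or /var would have to be stopped before the unmount step can succeed.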