Re: Large discrepancy in reported disk usage on USR partition
On Friday 31 October 2008 02:20:39 Brendan Hart wrote:
> > Is it possible that nfs directory got written to /usr at some point in
> > time?
> >
> > You would only notice this with du if the nfs directory is unmounted.
> > Unmount it and ls -al /usr/mountpoint should only give you an empty dir
>
> Bingo!! That is exactly the problem. An NFS mount was hiding a 17G local
> dir which had an old copy of the entire NFS-mounted dir. I guess it must
> have been written incorrectly to this standby server by rsync before the
> NFS mount was put in place. I will add an exclusion to rsync to make sure
> it does not happen again even if the NFS dir is not mounted.

I used to NFS-mount /usr/ports and run a cron job on the local machine. I
made a sentinel file on the local machine:

  echo 'This is a mountpoint' > /usr/ports/KEEP_ME_EMPTY

The script would do:

  if [ -e /usr/ports/KEEP_ME_EMPTY ]; then
      do_nfs_mount
      if [ -e /usr/ports/KEEP_ME_EMPTY ]; then
          give_up_or_wait
      fi
  fi

Of course it's fragile, but it works for not-so-critical issues.

--
Mel

Problem with today's modular software: they start with the modules and
never get to the software part.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
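Mel's sentinel check can be fleshed out into a small, self-contained guard. This is only a sketch of the idea from the post above: `nfs_guard` and its arguments are made-up names, and the real cron job would pass something like a `mount_nfs server:/usr/ports` command in place of the second argument.

```shell
#!/bin/sh
# Sketch of the sentinel-file guard described above.  nfs_guard is an
# illustrative name, not part of the original script.
#
# nfs_guard MOUNTPOINT MOUNT_CMD
#   returns 0 if the sentinel is hidden afterwards (mount is live),
#   1 if the sentinel is still visible (mount failed; do not write there).
nfs_guard() {
    mp=$1
    mount_cmd=$2
    sentinel="$mp/KEEP_ME_EMPTY"
    [ -e "$sentinel" ] || return 0    # sentinel already hidden: mounted
    $mount_cmd "$mp" || return 1      # the mount command itself failed
    if [ -e "$sentinel" ]; then       # still visible: mount did not take
        return 1
    fi
    return 0
}
```

In the cron job this wraps the real mount invocation; the point is that the local directory under the mountpoint stays empty except for the sentinel, so nothing ever writes 17G into it by mistake.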
RE: Large discrepancy in reported disk usage on USR partition
Now that you mention it, it *is* strange that the NFS mount was not listed
by "df". Trying again after a fresh reboot:

#: df -h
Filesystem                      Size  Used  Avail  Capacity  Mounted on
/dev/aacd0s1a                   496M  176M   280M       39%  /
devfs                           1.0K  1.0K     0B      100%  /dev
/dev/aacd0s1e                   496M   15M   441M        3%  /tmp
/dev/aacd0s1f                    28G  4.8G    21G       19%  /usr
/dev/aacd0s1d                   1.9G  430M   1.3G       24%  /var
server2:/storage/blah/foo/data/ 397G  103G   262G       28%  /usr/home/development/mount/foobar

I guess I must have missed the final line when copying the output for my
first post to the mailing list. And by the time I replied to Mel, I had
already unmounted the NFS dir while attempting the suggested fix, so it
did not show when I ran "df" again to double-check, and I did not realize
what had happened. I apologise for any confusion caused.

Best Regards,
Brendan Hart

-
Brendan Hart, Development Manager
Strategic Ecommerce Division
Securepay Pty Ltd
Phone: 08-8274-4000 Fax: 08-8274-1400

-Original Message-
From: Jeremy Chadwick [mailto:[EMAIL PROTECTED]
Sent: Friday, 31 October 2008 12:02 PM
To: Brendan Hart
Cc: 'Mel'; freebsd-questions@freebsd.org
Subject: Re: Large discrepancy in reported disk usage on USR partition

On Fri, Oct 31, 2008 at 11:50:39AM +1030, Brendan Hart wrote:
> >> #: df -h
> >> Filesystem     Size  Used  Avail  Capacity  Mounted on
> >> /dev/aacd0s1a  496M  163M   293M       36%  /
> >> devfs          1.0K  1.0K     0B      100%  /dev
> >> /dev/aacd0s1e  496M   15M   441M        3%  /tmp
> >> /dev/aacd0s1f   28G   25G   1.2G       96%  /usr
> >> /dev/aacd0s1d  1.9G  429M   1.3G       24%  /var
>
> > Is this output untruncated? Is df really df or an alias to 'df -t nonfs'?
>
> Yes, it really is the untruncated output of "df -h". I also tried "df -t
> nonfs" and it gives exactly the same output as "df". What are you
> expecting that is not present in the output?
>
> > Is it possible that nfs directory got written to /usr at some point in
> > time?
> > You would only notice this with du if the nfs directory is unmounted.
> > Unmount it and ls -al /usr/mountpoint should only give you an empty dir
>
> Bingo!!
> That is exactly the problem. An NFS mount was hiding a 17G
> local dir which had an old copy of the entire NFS-mounted dir. I guess
> it must have been written incorrectly to this standby server by rsync
> before the NFS mount was put in place. I will add an exclusion to
> rsync to make sure it does not happen again even if the NFS dir is not
> mounted.
>
> Thank you for your help, you have saved me much time rebuilding this
> server.

Can either of you outline what exactly happened here? I'm trying to
figure out how an "NFS mount was hiding a 17G local dir", when there are
no NFS mounts shown in the above df output. This is purely an ignorant
question on my part, but I'm not able to piece together what happened.

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
Re: Large discrepancy in reported disk usage on USR partition
Jeremy Chadwick wrote:
> On Fri, Oct 31, 2008 at 11:50:39AM +1030, Brendan Hart wrote:
>>>> #: df -h
>>>> Filesystem     Size  Used  Avail  Capacity  Mounted on
>>>> /dev/aacd0s1a  496M  163M   293M       36%  /
>>>> devfs          1.0K  1.0K     0B      100%  /dev
>>>> /dev/aacd0s1e  496M   15M   441M        3%  /tmp
>>>> /dev/aacd0s1f   28G   25G   1.2G       96%  /usr
>>>> /dev/aacd0s1d  1.9G  429M   1.3G       24%  /var
>>>
>>> Is this output untruncated? Is df really df or an alias to
>>> 'df -t nonfs'?
>>
>> Yes, it really is the untruncated output of "df -h". I also tried the
>> "df -t nonfs" and it gives exactly the same output as "df". What are
>> you expecting that is not present in the output?

I would have to assume he's looking for an NFS mount ;-)

>>> Is it possible that nfs directory got written to /usr at some point
>>> in time? You would only notice this with du if the nfs directory is
>>> unmounted. Unmount it and ls -al /usr/mountpoint should only give
>>> you an empty dir
>>
>> Bingo!! That is exactly the problem. An NFS mount was hiding a 17G
>> local dir which had an old copy of the entire NFS-mounted dir. I guess
>> it must have been written incorrectly to this standby server by rsync
>> before the NFS mount was put in place. I will add an exclusion to
>> rsync to make sure it does not happen again even if the NFS dir is not
>> mounted.
>>
>> Thank you for your help, you have saved me much time rebuilding this
>> server.
>
> Can either of you outline what exactly happened here? I'm trying to
> figure out how an "NFS mount was hiding a 17G local dir", when there's
> no NFS mounts shown in the above df output. This is purely an ignorant
> question on my part, but I'm not able to piece together what happened.

Well, it would appear that perhaps Mel also guessed right about df being
aliased? Just my guess, but, as you mention, no nfs mounts appear. I may
be mistaken, but I think it's also possible to get into this sort of
situation by mounting a local partition on a non-empty mountpoint --- at
least, it happened to me recently.

Kevin Kinsey

--
A triangle which has an angle of 135 degrees is called an obscene
triangle.
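For the record, one way to look underneath an active mount without disturbing it is a nullfs loopback of the parent filesystem: to my knowledge the nullfs view does not cross the sub-mounts of the original tree, so files shadowed by a mount become visible through the new path. This is a hedged sketch (run as root on FreeBSD; the /mnt/peek path is made up, and the mountpoint path is the one from this thread), not something anyone in the thread ran:

```shell
# Peek under an active mount (FreeBSD, as root) via a nullfs loopback.
# /mnt/peek is an arbitrary scratch mountpoint for illustration.
mkdir -p /mnt/peek
mount -t nullfs /usr /mnt/peek

# The shadowed local directory should now be reachable here:
ls -al /mnt/peek/home/development/mount/foobar
du -sh /mnt/peek/home/development/mount/foobar

umount /mnt/peek
```

The alternative, as Mel suggested, is simply to unmount the NFS share and look at the bare mountpoint.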
Re: Large discrepancy in reported disk usage on USR partition
On Fri, Oct 31, 2008 at 11:50:39AM +1030, Brendan Hart wrote:
> >> #: df -h
> >> Filesystem     Size  Used  Avail  Capacity  Mounted on
> >> /dev/aacd0s1a  496M  163M   293M       36%  /
> >> devfs          1.0K  1.0K     0B      100%  /dev
> >> /dev/aacd0s1e  496M   15M   441M        3%  /tmp
> >> /dev/aacd0s1f   28G   25G   1.2G       96%  /usr
> >> /dev/aacd0s1d  1.9G  429M   1.3G       24%  /var
>
> > Is this output untruncated? Is df really df or an alias to 'df -t nonfs'?
>
> Yes, it really is the untruncated output of "df -h". I also tried "df -t
> nonfs" and it gives exactly the same output as "df". What are you
> expecting that is not present in the output?
>
> > Is it possible that nfs directory got written to /usr at some point in
> > time?
> > You would only notice this with du if the nfs directory is unmounted.
> > Unmount it and ls -al /usr/mountpoint should only give you an empty dir
>
> Bingo!! That is exactly the problem. An NFS mount was hiding a 17G local
> dir which had an old copy of the entire NFS-mounted dir. I guess it must
> have been written incorrectly to this standby server by rsync before the
> NFS mount was put in place. I will add an exclusion to rsync to make sure
> it does not happen again even if the NFS dir is not mounted.
>
> Thank you for your help, you have saved me much time rebuilding this
> server.

Can either of you outline what exactly happened here? I'm trying to
figure out how an "NFS mount was hiding a 17G local dir", when there are
no NFS mounts shown in the above df output. This is purely an ignorant
question on my part, but I'm not able to piece together what happened.

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
RE: Large discrepancy in reported disk usage on USR partition
>> #: df -h
>> Filesystem     Size  Used  Avail  Capacity  Mounted on
>> /dev/aacd0s1a  496M  163M   293M       36%  /
>> devfs          1.0K  1.0K     0B      100%  /dev
>> /dev/aacd0s1e  496M   15M   441M        3%  /tmp
>> /dev/aacd0s1f   28G   25G   1.2G       96%  /usr
>> /dev/aacd0s1d  1.9G  429M   1.3G       24%  /var

> Is this output untruncated? Is df really df or an alias to 'df -t nonfs'?

Yes, it really is the untruncated output of "df -h". I also tried "df -t
nonfs" and it gives exactly the same output as "df". What are you
expecting that is not present in the output?

> Is it possible that nfs directory got written to /usr at some point in
> time?
> You would only notice this with du if the nfs directory is unmounted.
> Unmount it and ls -al /usr/mountpoint should only give you an empty dir

Bingo!! That is exactly the problem. An NFS mount was hiding a 17G local
dir which had an old copy of the entire NFS-mounted dir. I guess it must
have been written incorrectly to this standby server by rsync before the
NFS mount was put in place. I will add an exclusion to rsync to make sure
it does not happen again even if the NFS dir is not mounted.

Thank you for your help, you have saved me much time rebuilding this
server.

Best Regards,
Brendan Hart

-
Brendan Hart, Development Manager
Strategic Ecommerce Division
Securepay Pty Ltd
Phone: 08-8274-4000 Fax: 08-8274-1400
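The exclusion Brendan mentions might look something like the following. The `--exclude` and `--one-file-system` flags are rsync's own; the source/destination layout and hostname are assumptions based on the df output in this thread, not anything Brendan posted:

```shell
# Hypothetical sketch of the rsync exclusion described above.  The
# excluded path is the NFS mountpoint from this thread, relative to the
# /usr transfer root; "standby" is an assumed destination host.
rsync -a --delete \
    --exclude '/home/development/mount/foobar/' \
    /usr/ standby:/usr/

# rsync's -x / --one-file-system flag is a related guard: it stops the
# copy from descending into other filesystems mounted below the source
# tree.  Note it does not help here when the share is unmounted on the
# source, since the local directory is then part of the same filesystem.
```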
Re: Large discrepancy in reported disk usage on USR partition
On Fri, Oct 31, 2008 at 11:15:15AM +1030, Brendan Hart wrote:
> > What you showed tells me nothing about SMART, other than the remote
> > possibility it's basing some of its decisions on the "general SMART
> > health status", which means jack squat. I can explain why this is if
> > need be, but it's not related to the problem you're having.
>
> Thanks for this additional information. I hadn't understood that there
> was far more information behind the simple SMART ok/not-ok reported by
> the PERC controller.

Here's an example of some attributes:

ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f  200   200   051   Pre-fail Always      -       0
  3 Spin_Up_Time            0x0003  178   175   021   Pre-fail Always      -       6066
  4 Start_Stop_Count        0x0032  100   100   000   Old_age  Always      -       50
  5 Reallocated_Sector_Ct   0x0033  200   200   140   Pre-fail Always      -       0
  7 Seek_Error_Rate         0x000e  200   200   051   Old_age  Always      -       0
  9 Power_On_Hours          0x0032  085   085   000   Old_age  Always      -       11429
 10 Spin_Retry_Count        0x0012  100   253   051   Old_age  Always      -       0
 11 Calibration_Retry_Count 0x0012  100   253   051   Old_age  Always      -       0
 12 Power_Cycle_Count       0x0032  100   100   000   Old_age  Always      -       48
192 Power-Off_Retract_Count 0x0032  200   200   000   Old_age  Always      -       33
193 Load_Cycle_Count        0x0032  200   200   000   Old_age  Always      -       50
194 Temperature_Celsius     0x0022  117   100   000   Old_age  Always      -       33
196 Reallocated_Event_Count 0x0032  200   200   000   Old_age  Always      -       0
197 Current_Pending_Sector  0x0012  200   200   000   Old_age  Always      -       0
198 Offline_Uncorrectable   0x0010  200   200   000   Old_age  Offline     -       0
199 UDMA_CRC_Error_Count    0x003e  200   200   000   Old_age  Always      -       0
200 Multi_Zone_Error_Rate   0x0008  200   200   051   Old_age  Offline     -       0

You probably now understand why having access to this information is
useful. :-) It's very disappointing that so many RAID controllers don't
provide a way to get at this information; the ones which do, I am very
thankful for!

> > Either way, this is just one of many reasons to avoid hardware RAID
> > controllers if given the choice.
> I have seen some mentions of using gvinum and/or gmirror to achieve the
> goal of protection from a single point of failure in a single disk,
> which I believe is the reason that most people, myself included, have
> specified hardware RAID in their servers. Is this what you mean by
> avoiding hardware RAID?

More or less. Hardware RAID has some advantages (I can dig up a mail of
mine from long ago outlining what the advantages were), but a lot of the
time the controller acts as more of a hindrance than a benefit. I
personally feel the negatives outweigh the positives, but each person has
different needs and requirements. There are some controllers which work
very well and provide great degrees of insight (at a disk level) under
FreeBSD, and those are often what I recommend if someone wants to go that
route.

I make it sound like I'm the authoritative voice for what a person should
or should not buy -- I'm not. I predominantly rely on Intel ICHx on-board
controllers with SATA disks, because ICHx works quite well under FreeBSD
(especially with AHCI). I personally have no experience with gmirror or
gvinum, but I do have experience with ZFS. (I'll have a little more
experience with gmirror once I have the time to test some reported
problems with gmirror and high interrupt counts when a disk is
hot-swapped.)

> > I hope these are SCSI disks you're showing here, otherwise I'm not
> > sure how the controller is able to get the primary defect count of a
> > SATA or SAS disk. So, assuming the numbers shown are accurate, then
> > yes, I don't think there's any disk-level problem.
>
> Yes, they are SCSI disks. Not particularly relevant to this topic, but
> interesting: I would have thought that SAS would make the same
> information available as SCSI does, as it is a serial-bus evolution of
> SCSI. Is this thinking incorrect?

I don't have any experience with SAS, so I can't comment on what features
are available on SAS.
Specifically with regard to SMART: historically, SCSI does not provide
the amount of granularity/detail in attributes that ATA/SATA does. I do
not consider this a negative against SCSI (in case there's any doubt, I
very much like SCSI). SAS might provide these details, but I don't know,
as I don't have any SAS disks.

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator
RE: Large discrepancy in reported disk usage on USR partition
>> I took a look at using the smart tools as you suggested, but have now
>> found that the disk in question is a RAID1 set on a DELL PERC 3/Di
>> controller and smartctl does not appear to be the correct tool to
>> access the SMART data for the individual disks. After a little
>> research, I have found the aaccli tool and used it to get the
>> following information:

> Sadly, that controller does not show you SMART attributes. This is one
> of the biggest problems with the majority (but not all) of hardware
> RAID controllers -- they give you no access to disk-level things like
> SMART. FreeBSD has support for such (using CAM's pass(4)), but the
> driver has to support/use it, *and* the card firmware has to support
> it. At present, Areca, 3Ware, and Promise controllers support such;
> HighPoint might, but I haven't confirmed it. Adaptec does not.
>
> What you showed tells me nothing about SMART, other than the remote
> possibility it's basing some of its decisions on the "general SMART
> health status", which means jack squat. I can explain why this is if
> need be, but it's not related to the problem you're having.

Thanks for this additional information. I hadn't understood that there
was far more information behind the simple SMART ok/not-ok reported by
the PERC controller.

> Either way, this is just one of many reasons to avoid hardware RAID
> controllers if given the choice.

I have seen some mentions of using gvinum and/or gmirror to achieve the
goal of protection from a single point of failure in a single disk, which
I believe is the reason that most people, myself included, have specified
hardware RAID in their servers. Is this what you mean by avoiding
hardware RAID?

> I hope these are SCSI disks you're showing here, otherwise I'm not sure
> how the controller is able to get the primary defect count of a SATA or
> SAS disk. So, assuming the numbers shown are accurate, then yes, I
> don't think there's any disk-level problem.

Yes, they are SCSI disks.
Not particularly relevant to this topic, but interesting: I would have
thought that SAS would make the same information available as SCSI does,
as it is a serial-bus evolution of SCSI. Is this thinking incorrect?

> I understand at this point you're running around with your arms in the
> air, but you've already confirmed one thing: none of your other systems
> exhibit this problem. If this is a production environment, step back a
> moment and ask yourself: "just how much time is this worth?" It might
> be better to just newfs the filesystem and be done with it, especially
> if this is a one-time, never-seen-before thing.

>> I will wait and see if any other list member has any suggestions for
>> me to try, but I am now leaning toward scrubbing the system. Oh well.

> When you say scrubbing, are you referring to actually formatting/wiping
> the system, or are you referring to disk scrubbing?

I meant reformatting and reinstalling, as a way to escape the issue
without spending too much more time on it. I would of course like to
understand the problem so as to know what to avoid in the future, but as
you make the point above, time is money and it is rapidly approaching the
point where it isn't worth any more effort.

Thanks for all your help.

Best Regards,
Brendan Hart
Re: Large discrepancy in reported disk usage on USR partition
On Thursday 30 October 2008 01:42:32 Brendan Hart wrote:
> Hi,
>
> I have inherited some servers running various releases of FreeBSD and I
> am having some trouble with the /usr partition on one of these boxen.
>
> The problem is that there appears to be far more space used on the /usr
> partition than there are actual files on the partition. The utility
> "df -h" reports 25GB used (i.e. nearly the whole partition), but
> "du -x /usr" reports only 7.6GB of files.
>
> I have reviewed the FAQ, particularly item 9.24, "The du and df
> commands show different amounts of disk space available. What is going
> on?". However, the suggested cause of the discrepancy (large files
> already unlinked but still held open by active processes) does not
> appear to apply in this case, as the problem is present even after
> rebooting into single-user mode.
>
> #: uname -a
> FreeBSD ibisweb4spare.strategicecommerce.com.au 6.1-RELEASE FreeBSD
> 6.1-RELEASE #0: Sun May 7 04:42:56 UTC 2006
> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP i386
>
> #: df -h
> Filesystem     Size  Used  Avail  Capacity  Mounted on
> /dev/aacd0s1a  496M  163M   293M       36%  /
> devfs          1.0K  1.0K     0B      100%  /dev
> /dev/aacd0s1e  496M   15M   441M        3%  /tmp
> /dev/aacd0s1f   28G   25G   1.2G       96%  /usr
> /dev/aacd0s1d  1.9G  429M   1.3G       24%  /var

Is this output untruncated? Is df really df or an alias to 'df -t nonfs'?

> #: du -x -h /usr
> 2.0K    /usr/.snap
>  24M    /usr/bin
> [...]
> 584M    /usr/ports
> 140K    /usr/lost+found
> 7.6G    /usr

Is it possible that nfs directory got written to /usr at some point in
time? You would only notice this with du if the nfs directory is
unmounted. Unmount it and ls -al /usr/mountpoint should only give you an
empty dir.

--
Mel

Problem with today's modular software: they start with the modules and
never get to the software part.
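The du-versus-df comparison Mel is reasoning from can be scripted. This is a generic sketch, not a tool from the thread; the default path and wording are made up for illustration (on the server in question one would pass /usr):

```shell
#!/bin/sh
# Compare the space df accounts for against what du can actually reach
# on one filesystem.  A large gap, with no deleted-but-open files and no
# snapshots, hints at data shadowed under a mountpoint.
fs=${1:-.}                        # default path is illustrative; use /usr here
df_kb=$(df -k "$fs" | awk 'NR==2 { print $3 }')
du_kb=$(du -skx "$fs" 2>/dev/null | awk '{ print $1 }')
echo "df says ${df_kb} KB used; du can see ${du_kb} KB"
```

On the problem server this would have printed roughly 25G of df usage against 7.6G visible to du, the 17G being the shadowed local copy.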
Re: Large discrepancy in reported disk usage on USR partition
On Thu, Oct 30, 2008 at 02:04:36PM +1030, Brendan Hart wrote:
> On Thu 30/10/2008 12:25 PM, Jeremy Chadwick wrote:
> >> Could the "missing" space be an indication of hardware disk issues,
> >> i.e. physical blocks marked as bad?
>
> > The simple answer is no, bad blocks would not cause what you're
> > seeing. smartctl -a /dev/disk will help you determine if there's
> > evidence the disk is in bad shape. I can help you with reading SMART
> > stats if need be.
>
> I took a look at using the smart tools as you suggested, but have now
> found that the disk in question is a RAID1 set on a DELL PERC 3/Di
> controller and smartctl does not appear to be the correct tool to
> access the SMART data for the individual disks. After a little
> research, I have found the aaccli tool and used it to get the following
> information:

Sadly, that controller does not show you SMART attributes. This is one of
the biggest problems with the majority (but not all) of hardware RAID
controllers -- they give you no access to disk-level things like SMART.
FreeBSD has support for such (using CAM's pass(4)), but the driver has to
support/use it, *and* the card firmware has to support it. At present,
Areca, 3Ware, and Promise controllers support such; HighPoint might, but
I haven't confirmed it. Adaptec does not.

What you showed tells me nothing about SMART, other than the remote
possibility it's basing some of its decisions on the "general SMART
health status", which means jack squat. I can explain why this is if need
be, but it's not related to the problem you're having.

Either way, this is just one of many reasons to avoid hardware RAID
controllers if given the choice.
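For cards that do pass SMART through, smartmontools can address the member disks directly with the `-d` device-type flag. The flag syntax below is smartctl's documented form, but the device paths are illustrative and vary by driver and OS; consult smartctl(8) for the exact spelling for a given card:

```shell
# Addressing disks behind supported RAID controllers with smartctl.
# Device paths are examples only:
smartctl -a -d 3ware,0 /dev/twa0      # disk 0 behind a 3ware controller
smartctl -a -d areca,1 /dev/arcmsr0   # disk 1 behind an Areca controller
```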
> AAC0> disk show defects 00
> Executing: disk show defects (ID=0)
> Number of PRIMARY defects on drive: 285
> Number of GROWN defects on drive: 0
>
> AAC0> disk show defects 01
> Executing: disk show defects (ID=1)
> Number of PRIMARY defects on drive: 193
> Number of GROWN defects on drive: 0
>
> This output doesn't seem to indicate existing physical issues on the
> disks.

I hope these are SCSI disks you're showing here, otherwise I'm not sure
how the controller is able to get the primary defect count of a SATA or
SAS disk. So, assuming the numbers shown are accurate, then yes, I don't
think there's any disk-level problem.

> I have done some additional digging and noticed that there is a
> /usr/.snap folder present. "ls -al" shows no content however. Some
> quick searching shows this could possibly be part of a UFS snapshot...

Correct; the .snap directory is used for UFS2 snapshots and
mksnap_ffs(8) (which is also the program dump -L uses).

> I wonder if partition snapshots might be the cause of my major disk
> space "loss".

Your /usr/.snap directory is empty; there are no snapshots. That said,
are you actually making filesystem snapshots using dump or mksnap_ffs? If
not, then you're barking up the wrong tree. :-)

> I also took a look to see if the issue could be something like running
> out of inodes, but this doesn't seem to be the case:
>
> #: df -ih /usr
> Filesystem     Size  Used  Avail  Capacity   iused    ifree  %iused  Mounted on
> /dev/aacd0s1f   28G   25G   1.1G       96%  708181  3107241     19%  /usr

inodes != disk space, but I'm pretty sure you know that.

I understand at this point you're running around with your arms in the
air, but you've already confirmed one thing: none of your other systems
exhibit this problem. If this is a production environment, step back a
moment and ask yourself: "just how much time is this worth?" It might be
better to just newfs the filesystem and be done with it, especially if
this is a one-time, never-seen-before thing.
> I will wait and see if any other list member has any suggestions for
> me to try, but I am now leaning toward scrubbing the system. Oh well.

When you say scrubbing, are you referring to actually formatting/wiping
the system, or are you referring to disk scrubbing?

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
RE: Large discrepancy in reported disk usage on USR partition
On Thu 30/10/2008 12:25 PM, Jeremy Chadwick wrote:
>> Could the "missing" space be an indication of hardware disk issues,
>> i.e. physical blocks marked as bad?
>
> The simple answer is no, bad blocks would not cause what you're seeing.
> smartctl -a /dev/disk will help you determine if there's evidence the
> disk is in bad shape. I can help you with reading SMART stats if need
> be.

I took a look at using the smart tools as you suggested, but have now
found that the disk in question is a RAID1 set on a DELL PERC 3/Di
controller and smartctl does not appear to be the correct tool to access
the SMART data for the individual disks. After a little research, I have
found the aaccli tool and used it to get the following information:

AAC0> disk show smart
Executing: disk show smart

         Smart    Method of         Informational  Performance
         Capable  Exceptions(MRIE)  Exception      Enable        Error
B:ID:L   Device                     Control        Enabled       Count
------   -------  ----------------  -------------  -----------   -----
0:00:0   Y        6                 Y              N             0
0:01:0   Y        6                 Y              N             0

AAC0> disk show defects 00
Executing: disk show defects (ID=0)
Number of PRIMARY defects on drive: 285
Number of GROWN defects on drive: 0

AAC0> disk show defects 01
Executing: disk show defects (ID=1)
Number of PRIMARY defects on drive: 193
Number of GROWN defects on drive: 0

This output doesn't seem to indicate existing physical issues on the
disks.

> Since you booted single-user and presumably ran fsck -f /usr, and
> nothing came back, I'm left to believe this isn't filesystem
> corruption.

Yes, this is the command I tried when I went into the data centre
yesterday, and yes, nothing came back.

I have done some additional digging and noticed that there is a
/usr/.snap folder present. "ls -al" shows no content however. Some quick
searching shows this could possibly be part of a UFS snapshot... I wonder
if partition snapshots might be the cause of my major disk space "loss".
Some old message-group posts suggest that UFS snapshots were dangerously
flakey on Release 6.1, so I would hope that my predecessors were not
using them, however...
Do you know anything about snapshots, and how I could see what (if any)
space is used by snapshots?

I also took a look to see if the issue could be something like running
out of inodes, but this doesn't seem to be the case:

#: df -ih /usr
Filesystem     Size  Used  Avail  Capacity   iused    ifree  %iused  Mounted on
/dev/aacd0s1f   28G   25G   1.1G       96%  708181  3107241     19%  /usr

BTW Jeremy, thanks for your help thus far. I will wait and see if any
other list member has any suggestions for me to try, but I am now leaning
toward scrubbing the system. Oh well.

Best Regards,
Brendan Hart

-
Brendan Hart, Development Manager
Strategic Ecommerce Division
Securepay Pty Ltd
Phone: 08-8274-4000 Fax: 08-8274-1400
Re: Large discrepancy in reported disk usage on USR partition
On Thu, Oct 30, 2008 at 12:11:58PM +1030, Brendan Hart wrote:
> The space reserved as minfree does not appear to have been changed from
> the default setting of 8%.

Okay, then that's likely not the problem.

> Is your suggestion that I should change it to a larger value?

That would just make your problem worse. :-)

> I don't understand how modifying it now could fix the situation, but I
> could be missing something.

Well, the feature I described isn't what's causing your problem, but to
clarify: if you change the percentage, it applies immediately. I read "I
don't understand how modifying it now could fix ..." to mean "isn't this
option applied during newfs?"

> I have not observed the problem on any of the other ~dozen FreeBSD
> servers in our data centre.

Unless someone more clueful chimes in with better hints, the obvious
choice here is going to be "recreate the filesystem". I'd tell you
something like "try using ffsinfo(8)?", but I've never used the tool, so
very little of the output will make sense to me.

> Could the "missing" space be an indication of hardware disk issues,
> i.e. physical blocks marked as bad?

The simple answer is no, bad blocks would not cause what you're seeing.
smartctl -a /dev/disk will help you determine if there's evidence the
disk is in bad shape. I can help you with reading SMART stats if need be.

Since you booted single-user and presumably ran fsck -f /usr, and nothing
came back, I'm left to believe this isn't filesystem corruption.

> Is it possible on UFS2 for disk space to be allocated but hidden
> somehow? (although I have been running commands such as "du -x" as the
> superuser)

That's exactly what the above tunefs parameter describes.

> Similarly, is it possible on UFS2 for disk space to be allocated in
> "lost cluster chains"?

I don't know what this means. Someone more clueful will have to answer.
--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
RE: Large discrepancy in reported disk usage on USR partition
Hi,

The space reserved as minfree does not appear to have been changed from
the default setting of 8%. Is your suggestion that I should change it to
a larger value? I don't understand how modifying it now could fix the
situation, but I could be missing something.

The output of "tunefs -p /usr" is as follows:

#: tunefs -p /usr
tunefs: ACLs:                                        (-a)  disabled
tunefs: MAC multilabel:                              (-l)  disabled
tunefs: soft updates:                                (-n)  enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  2048
tunefs: average file size:                           (-f)  16384
tunefs: average number of files in a directory:      (-s)  64
tunefs: minimum percentage of free space:            (-m)  8%
tunefs: optimization preference:                     (-o)  time
tunefs: volume label:                                (-L)

I have not observed the problem on any of the other ~dozen FreeBSD
servers in our data centre.

Could the "missing" space be an indication of hardware disk issues, i.e.
physical blocks marked as bad?

Is it possible on UFS2 for disk space to be allocated but hidden somehow?
(although I have been running commands such as "du -x" as the superuser)

Similarly, is it possible on UFS2 for disk space to be allocated in "lost
cluster chains"?

Best Regards,
Brendan Hart

-Original Message-
From: Jeremy Chadwick [mailto:[EMAIL PROTECTED]
Sent: Thursday, 30 October 2008 11:50 AM
To: Brendan Hart
Cc: freebsd-questions@freebsd.org
Subject: Re: Large discrepancy in reported disk usage on USR partition

On Thu, Oct 30, 2008 at 11:12:32AM +1030, Brendan Hart wrote:
> I have inherited some servers running various releases of FreeBSD and I
> am having some trouble with the /usr partition on one of these boxen.
>
> The problem is that there appears to be far more space used on the /usr
> partition than there are actual files on the partition. The utility
> "df -h" reports 25GB used (i.e. nearly the whole partition), but
> "du -x /usr" reports only 7.6GB of files.

Have you tried playing with tunefs(8), -m flag? I can't reproduce this
behaviour on any of our systems.
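For scale, the 8% minfree reserve shown in the tunefs output above only holds back about 2 GB on this 28 GB /usr, which df hides from "Avail" but which cannot account for a ~17 GB gap between df's Used and du's total. A back-of-envelope check:

```shell
#!/bin/sh
# Rough size of the tunefs minfree reserve on this /usr (integer GB).
size_gb=28
minfree_pct=8
reserve_gb=$(( size_gb * minfree_pct / 100 ))
echo "$reserve_gb"   # prints 2
```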
icarus# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad12s1f    167879968  1973344  152476228        1%  /usr
icarus# du -sx /usr
1973344 /usr

eos# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad0s1f      32494668  2261670   27633426        8%  /usr
eos# du -sx /usr
2261670 /usr

anubis# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad4s1f      80010344  1809620   71799898        2%  /usr
anubis# du -sx /usr
1809620 /usr

horus# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad4s1f      32494668  1608458   28286638        5%  /usr
horus# du -sx /usr
1608458 /usr

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
Re: Large discrepancy in reported disk usage on USR partition
On Thu, Oct 30, 2008 at 11:12:32AM +1030, Brendan Hart wrote:
> I have inherited some servers running various releases of FreeBSD and I
> am having some trouble with the /usr partition on one of these boxen.
>
> The problem is that there appears to be far more space used on the /usr
> partition than there are actual files on the partition. The utility
> "df -h" reports 25GB used (i.e. nearly the whole partition), but
> "du -x /usr" reports only 7.6GB of files.

Have you tried playing with tunefs(8), -m flag? I can't reproduce this
behaviour on any of our systems.

icarus# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad12s1f    167879968  1973344  152476228        1%  /usr
icarus# du -sx /usr
1973344 /usr

eos# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad0s1f      32494668  2261670   27633426        8%  /usr
eos# du -sx /usr
2261670 /usr

anubis# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad4s1f      80010344  1809620   71799898        2%  /usr
anubis# du -sx /usr
1809620 /usr

horus# df -k /usr
Filesystem    1024-blocks     Used      Avail  Capacity  Mounted on
/dev/ad4s1f      32494668  1608458   28286638        5%  /usr
horus# du -sx /usr
1608458 /usr

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |