tune2fs: last write time is weeks or even months ago, moreover before last reboot

David Guyot Thu, 24 Jul 2014 06:45:40 -0700

Hello, there.

I recently noticed that one of our servers is horribly slow when
running aptitude upgrades, to the point that installing a simple MySQL
update took a minute or so, although this server is in no way
overloaded: twenty or so lightly used mail accounts, an intranet with
its MySQL daemon, a front web page, and under 1 Mbps on the network
interface. I forced a check of the mdadm arrays: they are all clean. I
then checked the RAID disks' SMART attributes: one of the two disks
showed a full-scale reading for the read error rate (a value of 2^16, so
I assumed this was an unsigned integer counter that had reached its
maximum), but when I tried to investigate, the value dropped back to
zero. That is already a problem because, as far as I know, this value
can only increase, never decrease. Am I right to suspect a faulty hard
disk?
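
For reference, here is a minimal Python sketch of how two attribute
snapshots could be compared to catch a counter that went backwards. It
assumes the attributes are SMART attributes printed in the usual
`smartctl -A` table layout (the message does not name the tool). The
sample values and the helper names are hypothetical, not readings from
this server.

```python
def parse_smart_attributes(text):
    """Parse `smartctl -A` style text into {name: (normalized, raw)}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10+ columns;
        # the header row starts with "ID#" and is skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            # Column 4 is the normalized value, column 10 the raw value.
            attrs[fields[1]] = (int(fields[3]), fields[9])
    return attrs


def decreased_counters(before, after):
    """Names of attributes whose numeric raw value dropped."""
    dropped = []
    for name, (_, raw_before) in before.items():
        if name not in after:
            continue
        raw_after = after[name][1]
        if (raw_before.isdigit() and raw_after.isdigit()
                and int(raw_after) < int(raw_before)):
            dropped.append(name)
    return dropped


HEADER = ("ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH "
          "TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n")
# Hypothetical snapshots: 65535 is the largest unsigned 16-bit value.
SAMPLE_BEFORE = HEADER + (
    "  1 Raw_Read_Error_Rate     0x000f   100   100   051    "
    "Pre-fail  Always       -       65535\n"
    "  5 Reallocated_Sector_Ct   0x0033   100   100   010    "
    "Pre-fail  Always       -       0\n")
SAMPLE_AFTER = SAMPLE_BEFORE.replace("65535", "0")

if __name__ == "__main__":
    before = parse_smart_attributes(SAMPLE_BEFORE)
    after = parse_smart_attributes(SAMPLE_AFTER)
    print(decreased_counters(before, after))
```

Keeping the raw value as a string is deliberate: some drives print
annotated raw values, and only purely numeric ones are compared here.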


Besides that, I listed the parameters of the filesystems (all of them
ext3) with tune2fs, and I saw strange values for the last mount and
last write dates: every filesystem reports dates a few weeks old,
matching a moment when I restarted the server (cleanly, I mean). Worse,
the / filesystem says that it has not been written to since the 7th of
December, 2013. That is more than seven months ago! To be clear, this
server's clock is NTP-synchronised; I just checked it and the date and
time are correct. In addition, inode counts are fine, with plenty of
them free, and the filesystems are less than 10% full. In fact, I see
nothing else wrong with these filesystems: apart from the strange change
in the disk attribute values, the inconsistent last write dates are the
only anomaly. I noticed that our other servers also show a last write
date some weeks old, so I assume such values are normal, but seven
months, on a server that has been running continuously since December
apart from at least one reboot? I can't imagine a logical reason for
such a gap. Do you know whether such a long gap is normal? If so, why?
If not, does that mean a hard disk has to be replaced? The one whose
attribute values are so erratic? By the way, how can such a value drop
from a full-scale reading to zero in a matter of minutes (during an
extended self-test, I should add)?
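
As an illustration of the date comparison described above, the
following Python sketch extracts the "Last mount time" and "Last write
time" fields from `tune2fs -l` output and reports their age in days. The
sample text is hypothetical: only the 7 December 2013 write date comes
from this message, while the time of day, the mount date, and the
function names are assumptions.

```python
import subprocess
from datetime import datetime

# Timestamp fields of interest in `tune2fs -l` output.
TIME_FIELDS = ("Last mount time", "Last write time")
# tune2fs prints ctime-style dates, e.g. "Sat Dec  7 10:15:00 2013".
CTIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_times(text):
    """Return {field: datetime} for the timestamp fields found in text."""
    times = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in TIME_FIELDS:
            try:
                times[key] = datetime.strptime(value.strip(), CTIME_FORMAT)
            except ValueError:
                pass  # e.g. "n/a" on a never-mounted filesystem
    return times


def age_in_days(times, now):
    """Whole days elapsed between each timestamp and `now`."""
    return {key: (now - when).days for key, when in times.items()}


def read_tune2fs(device):
    """Run `tune2fs -l DEVICE` (usually needs root) and return its output."""
    result = subprocess.run(["tune2fs", "-l", device],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical excerpt of `tune2fs -l` output for the / filesystem.
SAMPLE = """\
Filesystem volume name:   <none>
Last mount time:          Sat Jul  5 09:30:00 2014
Last write time:          Sat Dec  7 10:15:00 2013
"""

if __name__ == "__main__":
    reference = datetime(2014, 7, 24, 12, 0, 0)  # date of this message
    print(age_in_days(parse_times(SAMPLE), reference))
```

On a live system, `parse_times(read_tune2fs(device))` with the real
device path and `datetime.now()` would replace the sample and the fixed
reference date.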

I was considering running the sync command to force a flush of the disk
caches, but the filesystems are so slow that removing a single file with
rm took around a minute and slowed I/O to the point that half the CPU
cores were stuck in I/O wait, so I'm not sure launching a sync is a good
idea. In fact, would a sync even be effective? fsck, maybe? What else
could be effective?
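
To put a number on the I/O wait mentioned above, one possible approach
is to sample the aggregate "cpu" line of /proc/stat twice and compute
the share of time spent in iowait in between. This is only a sketch
under that assumption; the sample counter values and the function name
are hypothetical.

```python
import time


def iowait_fraction(before, after):
    """Share of CPU time spent in iowait between two 'cpu' lines.

    Each argument is the aggregate 'cpu' line of /proc/stat, whose
    columns are: user nice system idle iowait irq softirq steal ...
    """
    def counters(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    # Sum user..steal only: guest time is already counted in user/nice.
    total = sum(deltas[:8])
    return deltas[4] / total if total else 0.0


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat (the aggregate 'cpu' line)."""
    with open(path) as stat:
        return stat.readline()


# Hypothetical samples taken a few seconds apart.
SAMPLE_BEFORE = "cpu  1000 0 500 8000 500 0 0 0 0 0"
SAMPLE_AFTER = "cpu  1100 0 550 8350 1000 0 0 0 0 0"

if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)
    print(f"iowait share: {iowait_fraction(first, read_cpu_line()):.0%}")
```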

Thank you in advance for your answers.

Regards.

PS: I will of course provide any additional information needed, as long
as it is not critical to our server's security.
-- 
David Guyot
System, network and telecommunications administrator / Sysadmin
Europe Camions Interactive / Stockway
Moulin Collot
F-88500 Ambacourt
Tel: +33 (0)3 29 30 47 85
Fax: +33 (0)3 29 31 31 31

