On 05/07/2021 14:37, Stefan Esser wrote:
> Hi Pete,
> have you checked the drive state and statistics with smartctl?
Hi, thanks for the reply - yes, I did check the statistics, and they
don't make a lot of sense. I was just looking at them again, in fact.
So, here is one of the machines that we changed a drive on when this
first started, which was 4 weeks ago:
root@telehouse04:/home/webadmin # smartctl -a /dev/ada0 | grep Perc
169 Remaining_Lifetime_Perc 0x0000 082 082 000 Old_age Offline - 82
root@telehouse04:/home/webadmin # smartctl -a /dev/ada1 | grep Perc
202 Percent_Lifetime_Remain 0x0030 100 100 001 Old_age Offline - 0
Now, from that you might think the 2nd drive was the one changed, but
no. It's the first one, which is now at 82% lifetime remaining! The other
drive, still at 100%, has been in there a year. The drives are from
different manufacturers, which makes comparing most of the numbers
tricky, unfortunately.
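
To line the two drives up side by side, something like the following could pull out just the fields that matter. This is only a rough sketch: `lifetime_fields` is a made-up helper name, and it assumes the usual smartctl attribute table layout (ID, name, flag, VALUE, WORST, THRESH, type, updated, when-failed, raw) and that the attribute name contains "Perc", as both of the ones above do.

```shell
#!/bin/sh
# Hypothetical helper: read smartctl attribute lines on stdin and print
# the attribute ID, its name, the normalized VALUE (column 4) and the
# RAW_VALUE (last column) for any attribute whose name contains "Perc".
lifetime_fields() {
    awk '$2 ~ /Perc/ { printf "%s %s value=%d raw=%s\n", $1, $2, $4, $NF }'
}
```

It would be used as `smartctl -A /dev/ada0 | lifetime_fields`. Whether the raw column means "percent remaining" or "percent used" still depends on the vendor, so this only makes the difference easier to see; it does not resolve it.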
Am now even more worried than when I sent the first email - if that 18%
is accurate then I am going to be doing this again in another 4 months,
and that's not sustainable. It also looks as if this problem has got a
lot worse recently, though I wasn't looking at the numbers before, only
noticing the failures. If I look at 'Percentage Used Endurance
Indicator' instead of the 'Percent_Lifetime_Remain' value then I see
some of those well over 200%. That value is, on the newer drives, 100
minus the 'Percent_Lifetime_Remain' value, so I guess they have the same
underlying metric.
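
For what it's worth, the "another 4 months" figure comes from a straight-line extrapolation, which can be sketched as below. The helper name `weeks_left` is hypothetical, and the big assumption is that wear continues at the same constant rate as over the period observed so far, which may well not hold.

```shell
#!/bin/sh
# Hypothetical helper: weeks_left REMAINING_PCT WEEKS_IN_SERVICE
# Assumes wear is linear: the percentage used so far, divided by the
# weeks in service, gives a rate that is then applied to what remains.
weeks_left() {
    awk -v rem="$1" -v weeks="$2" 'BEGIN {
        used = 100 - rem                  # percent of rated life consumed
        if (used <= 0 || weeks <= 0) {    # no measurable wear yet
            print "unknown"
            exit
        }
        rate = used / weeks               # percent consumed per week
        printf "%.1f\n", rem / rate       # weeks until 0% remaining
    }'
}
```

With the numbers above, `weeks_left 82 4` works out to 18% used in 4 weeks, so 4.5% per week and roughly 18 weeks left for ada0, which is where the 4 months comes from.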
I didn't mention it in my original email, but I am encrypting these with
geli. Does geli do any write amplification at all? That might explain
the high write volumes...
-pete.