Stroller writes:

> On 21 Aug 2010, at 14:25, Alex Schuster wrote:
> > ...
> > I want to monitor the power status of my hard drives, so I wrote a
> > little script that gives me this output:
> > 
> > sda: standby
> > sdb: standby
> > sdc: active/idle 32°C
> > sdd: active/idle 37°C
> > 
> > This script is called every minute via an fcron entry, output goes
> > into a log file, and I use the file monitor plasmoid to watch this log
> > file in KDE.
> > 
> > It's working fine, but I also monitor my syslog in another file
> > monitor plasmoid, and now I get lots of these entries:
> > 
> > Aug 21 14:21:06 [fcron] pam_unix(fcron:session): session opened for user root by (uid=0)
> > Aug 21 14:21:06 [fcron] Job /usr/local/sbin/hdstate >> /var/log/hdstate started for user root (pid 24483)
> > Aug 21 14:21:08 [fcron] Job /usr/local/sbin/hdstate >> /var/log/hdstate completed
> > Aug 21 14:21:08 [fcron] pam_unix(fcron:session): session closed for user root
> 
> #!/bin/bash
> while true
> do
>    for drive in a b c d
>    do
>       /usr/sbin/smartctl /dev/sd$drive --whatever >> /var/log/hdstate
>    done
>    sleep 60
> done

I use hdparm and hddtemp:

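for hd in sda sdb sdc sdd
do
        # hdparm -C prints "drive state is:  <state>"; keep only the state
        str=$( /sbin/hdparm -C /dev/$hd )
        state=${str##*is:  }
        # query the temperature only for sdc, and only if it is spinning
        if [[ $state == active/idle ]] && [[ $hd =~ sd[c] ]]
        then
                temp=$( /usr/sbin/hddtemp -q /dev/$hd )
                # strip a trailing " or ..." part, then keep the last word
                temp=${temp% or *}
                temp=${temp##* }
        else
                temp=
        fi
        echo "$hd: $state $temp"
done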

Unfortunately, reading the temperature makes a drive in standby spin up,
and it prevents the automatic spindown after a period of idle time. So now
I ask for the temperature only on my system drive; the others should sleep
most of the time anyway.
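
The decision described above can be pulled apart into small functions that
work on plain strings, which makes it easy to try without touching a drive.
This is only a sketch: the function names are made up for illustration, and
the sample strings in the usage note are assumed output formats, not
captured from my machines.

```shell
#!/bin/bash
# Sketch only: parse_state, parse_temp and should_read_temp are
# hypothetical helper names, not part of hdparm or hddtemp.

# Keep what follows "is:  " in the output of `hdparm -C`.
parse_state() {
    echo "${1##*is:  }"
}

# Strip an optional trailing " or ..." part from `hddtemp -q` output,
# then keep the last whitespace-separated word.
parse_temp() {
    local out=${1% or *}
    echo "${out##* }"
}

# Succeed only if the drive is already spinning AND is on the allow-list,
# so that a sleeping drive is never woken up just to read its temperature.
# $1 = drive name, $2 = state, $3 = space-separated allow-list (assumed)
should_read_temp() {
    [[ $2 == active/idle && " $3 " == *" $1 "* ]]
}
```

For example, parse_state on a string ending in "drive state is:  standby"
prints "standby", and should_read_temp sda standby "sdc" fails, so hddtemp
would not be called for that drive.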


> I would personally update more often than this, and my concern would
> be that if the process fails then your plasmoid isn't showing the
> correct data.
> 
> I presume this is the same with your current setup: if cron dies then
> the current temperature will not be read to file, and the plasmoid
> will continue reading the last lines in /var/log/hdstate - the drive
> can overheat without you knowing about it.

Nah, it's really not that important for me. I show the temperature just 
for the fun of it, and for extreme temperatures I have smartd running, see 
below.
I'm more interested in the active/standby state. I just added two old
IDE drives for additional backups, and I want them to be silent most of
the time. So I wrote a little script to show the status so I see when they
spin up again (and they do this sometimes), and used fcron to get the data
into a log file that the plasmoid shows.

The problem with cron is that I get those cron log entries I do not like,
and that the update interval of 60 seconds is a little long. Running the
script in a loop, started from .kde4/Autostart, would be better, but as a
user I have no permission to call hdparm or hddtemp. I do not want to be
part of the disk group, and with sudo I would get log entries from sudo,
which is what I wanted to avoid. So now I made hdparm and hddtemp SUID,
changed their group to wheel and disabled execution for others. The cron
problem is not solved, but worked around.
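
In shell terms, the permission change amounts to roughly the following. It
is a sketch, not a transcript of what I typed: restrict_to_group is a
made-up helper name, and it assumes the binaries are owned by root, since
the SUID bit only grants root rights in that case.

```shell
#!/bin/bash
# Sketch only: restrict_to_group is a hypothetical helper name.
# Makes a file setuid and executable by its owner and one group only.
# $1 = group name or gid, $2 = file
restrict_to_group() {
    local group=$1 file=$2
    # Change the group first: chgrp may clear the setuid bit.
    chgrp "$group" "$file" || return
    # 4750 = setuid, rwx for the owner, r-x for the group, nothing for others.
    chmod 4750 "$file"
}
```

Run as root, this would be "restrict_to_group wheel /sbin/hdparm" and the
same for /usr/sbin/hddtemp. Keep in mind that every member of wheel can
then run these tools with root rights, including the hdparm options that
change drive settings.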


> So I would expect there to be a better "plasmid" for this task. I'm
> completely unfamiliar with plasmids, but what you really want is a
> plasmid that itself runs a script and displays the stdout on your
> screen. That way if there's no data, or an error, then _you see that
> in the plasmid_, instead of silently ignoring it (as you may be at
> present).
> 
> The easiest (but dumb) way to handle this is to add the date to your
> plasmid's display so that at least you can see that something's wrong
> if it doesn't match the clock. A better way is not to have to watch a
> status monitor at all, and just have a script running that emails you
> if the temperature is above a specified range.

I have smartd running, which should send me mails about such things. For 
each drive, I have a line like this in /etc/smartd.conf:

/dev/sdc -a -n standby -o on -S on -W 5,40,45 \
         -s (S/../.././12|L/../../7/06) -m r...@wonkology.org

This does some regular health checks on the drive when it is not in
standby mode. Temperature changes of 5 degrees or more, new minimum or
maximum temperatures, and temperatures of 40 degrees or more are logged.
I will receive an email when the temperature reaches 45 degrees. The
minimum and maximum values are preserved across boot cycles if smartd's
state persistence is enabled (the daemon's command-line option -s, which
is not the same as the -S directive above). Every day at 12:00, a short
self-test is scheduled, and a long self-test every Sunday at 06:00.
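
To check a self-test schedule without waiting for it, the matching can be
imitated in bash. As far as I understand it, smartd compares the regular
expression with a string of the form T/MM/DD/d/HH, where T is the test
type and d is the day of the week (1 = Monday, 7 = Sunday). The sketch
below assumes a single-digit day field with 7 for Sunday, and it anchors
the expression at both ends, which may be stricter than what smartd does.
due_tests is a made-up name.

```shell
#!/bin/bash
# Sketch only: imitates how a smartd "-s" schedule expression is matched.
# Assumption: day of week is one digit, 1 = Monday ... 7 = Sunday.
schedule='(S/../.././12|L/../../7/06)'

# Print the test types (S = short, L = long) that are due at a given time.
# $1 = "MM/DD/d/HH", for example from: date +%m/%d/%u/%H
due_tests() {
    local stamp=$1 type
    local re="^${schedule}\$"
    for type in S L; do
        if [[ "$type/$stamp" =~ $re ]]; then
            echo "$type"
        fi
    done
    return 0
}
```

With this, due_tests 08/21/6/12 (a Saturday at noon) prints S, and
due_tests 08/22/7/06 (a Sunday at 06:00) prints L. For the current time,
call it as due_tests "$(date +%m/%d/%u/%H)".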

        Wonko
