On 05.02.2012 16:34, Russell Coker wrote:
> Package: mdadm
> Version: 3.2.3-2
> Severity: important
> 
> Feb  5 22:55:09 xev mdadm[20730]: RebuildFinished event detected on md device 
> /dev/md0, component device  mismatches found: 20608 (on raid level 1)
> 
> When a check initiated by /etc/cron.d/mdadm finds an error mdadm will discover
> this and log an error such as the above with facility DAEMON.  But it doesn't
> send an email.

This is the same issue as discussed in #599821 and #588516.  I'll think about
merging all three together.

> I believe that this is a serious bug, it seems to me that one of the most
> significant conditions it can encounter that should be immediately reported to
> the sysadmin is the fact that the contents of disks are changing and breaking
> RAID consistency!

Yes, that is indeed the condition it may encounter.  The question is WHY: under
normal conditions there should be no such errors.

There are two points here.

First, a formal one.  Would it be a serious issue if such a check weren't done
at all?  I think that in that case this bug report wouldn't exist to start with.

And second, more to the point, Neil gave a very good writeup of these checks and
repairs of raid arrays, about deciding which part/component of the array is
"more right".  Unfortunately I can't find it right now.

> 
> For a 3-disk mirror or a RAID-6 such an error can be reliably corrected as long
> as all the other disks are fine.  If you have an array with double-redundancy
> and one disk fails entirely while another returns dodgey data then you lose,
> and obviously anyone who creates a doubly-redundant array wants protection
> against that sort of thing.
> 
> With a RAID-1 or RAID-5 array every mismatch is an indication of real data
> corruption and is very important.
> 
> The following patch makes mdadm send email about such events.
> 
> --- /tmp/Monitor.c    2012-02-05 23:28:41.873079816 +1100
> +++ ./Monitor.c       2012-02-05 23:32:03.961132380 +1100
> @@ -364,6 +364,7 @@
>           (strncmp(event, "Fail", 4)==0 ||
>            strncmp(event, "Test", 4)==0 ||
>            strncmp(event, "Spares", 6)==0 ||
> +          (strncmp(event, "RebuildFinished", 15)==0 && disc) ||
>            strncmp(event, "Degrade", 7)==0)) {
>               FILE *mp = popen(Sendmail, "w");
>               if (mp) {
> 

This might be a more interesting approach than those already offered in the
two other patches mentioned.
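
For illustration, the condition from the quoted patch can be written as a
standalone predicate.  This is only a sketch: event_wants_mail is a
hypothetical name, not a function in mdadm, and it assumes that disc is
non-NULL only when the event carries extra detail, such as the mismatch
count seen in the log line quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mdadm): returns true when an event
 * would trigger a mail under the condition shown in the quoted patch.
 *
 * Assumption: disc is NULL for a clean RebuildFinished and points to a
 * detail string (e.g. " mismatches found: ...") otherwise.
 */
static bool event_wants_mail(const char *event, const char *disc)
{
    return strncmp(event, "Fail", 4) == 0 ||
           strncmp(event, "Test", 4) == 0 ||
           strncmp(event, "Spares", 6) == 0 ||
           /* New in the patch: mail only when detail accompanies the event. */
           (strncmp(event, "RebuildFinished", 15) == 0 && disc != NULL) ||
           strncmp(event, "Degrade", 7) == 0;
}
```

Under that assumption a RebuildFinished without any detail string stays
silent, so a clean check would not generate mail, while one reporting
mismatches would.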

/mjt
