Re: what does md do if it finds an inconsistency?

2007-05-08 Thread Bill Davidsen

martin f krafft wrote:
>> The first time it reports that it found (and repaired) 128 items.
>> It does not mean that you now *have* 128 mismatches.
>>
>> The next run ('repair' or 'check') will find none (hopefully...)
>> and report zero.
>
> Oh, this makes perfect sense, thanks for the explanation.
>
> As the mdadm maintainer for Debian, I would like to come up with a way to
> handle mismatches somewhat intelligently. I already have the check
> sync_action run once a month on all machines by default (can be turned
> on/off via debconf), and now I would like to find a good way to react when
> mismatch_count is non-zero. I don't want to write to the components
> without the admin's consent though.

That sounds right. Some arrays have persistent mismatches while they are
in use, and for those you are unlikely to want to even attempt corrective
action. You might want to have a config file and run a program that
regularly reads the config and acts on what it finds.
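
A minimal sketch of such a program, assuming a hypothetical config format (one array name per line, '#' starting a comment) that lists the arrays whose mismatches should be ignored. The function names are made up for illustration; only the /sys/block/<array>/md/mismatch_cnt path comes from this thread. It only reads the counters and never writes to the components.

```python
import os


def parse_ignore_list(text):
    """Parse the hypothetical config: one array name per line, '#' starts a comment."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line)
    return names


def read_mismatch_counts(sysfs_root="/sys/block", ignore=()):
    """Return {array_name: mismatch_cnt} for every md array not listed in `ignore`."""
    counts = {}
    for name in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, name, "md", "mismatch_cnt")
        # Skip ignored arrays and block devices that are not md arrays.
        if name in ignore or not os.path.isfile(path):
            continue
        with open(path) as f:
            counts[name] = int(f.read().strip())
    return counts


def arrays_needing_attention(counts):
    """Names of arrays whose last check or repair found at least one mismatch."""
    return [name for name, cnt in sorted(counts.items()) if cnt > 0]
```

Run from cron, the result of arrays_needing_attention() could feed a reminder mail; what to do about a non-zero count is left to the admin.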


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: what does md do if it finds an inconsistency?

2007-05-06 Thread Neil Brown
On Sunday May 6, [EMAIL PROTECTED] wrote:
> On Sun, 06 May 2007, martin f krafft wrote:
> 
> > Maybe the ideal way would be to have mdadm --monitor send an email on
> > mismatch_count>0 or a cronjob that regularly sends reminders, until the
> > admin logs in and runs e.g. /usr/share/mdadm/repairarray.

You could certainly do that.  If you configure mdadm to run a program
for each 'monitor' event, you can detect the mismatch count from
argv[3] when argv[1] == "RebuildFinished".
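
As a rough sketch of such a handler: the exact text mdadm puts in argv[3] is not spelled out here, so this assumes the count is the first run of digits in that argument. The function names are hypothetical, and argv[2] is taken to be the array device.

```python
import re


def mismatches_from_event(argv):
    """Return the mismatch count for a RebuildFinished event, else None.

    argv is laid out like sys.argv: [program, event, array, extra].
    Assumption: the count is the first integer found in argv[3].
    """
    if len(argv) < 4 or argv[1] != "RebuildFinished":
        return None
    match = re.search(r"\d+", argv[3])
    return int(match.group()) if match else None


def describe_event(argv):
    """Build a one-line report when mismatches were found, else return None."""
    count = mismatches_from_event(argv)
    if not count:
        return None
    return f"{argv[2]}: {count} mismatches found by the last check"
```

A script installed as the monitor program would call describe_event(sys.argv) and mail the result when it is not None.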

Though I suspect many people would be happy with running the 'repair'
every month rather than just a 'check'.  Maybe that should be a config
option.

> > 
> > Also, if a mismatch is found on a RAID1, how does md decide which copy is
> > mismatched and which is correct? What about RAID 5/6/10?
> 
> I think it just picks one at random.  After all, how could you reliably
> know which is right in a raid1 array?  With raid5, I understand it just
> updates the parity.

I prefer to say "arbitrary" rather than "random".
I think the current implementation uses the first readable device as
the 'correct' one.
Otherwise, this is correct.
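
To make the two policies concrete, here is a toy model in Python. It is not md's code: blocks are plain byte strings, the "first readable device" is simply the first list entry, and the function names are invented for the example.

```python
from functools import reduce


def repair_mirror(copies):
    """Toy RAID1 repair: treat the first copy as correct and overwrite the rest.

    Returns the repaired copies and how many copies disagreed with the first.
    """
    reference = copies[0]
    mismatches = sum(1 for c in copies[1:] if c != reference)
    return [reference] * len(copies), mismatches


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_parity(data_blocks, parity):
    """Toy RAID5 repair: keep the data and recompute parity from it.

    Returns the correct parity and whether the stored parity was wrong.
    """
    expected = xor_blocks(data_blocks)
    return expected, expected != parity
```

In both cases the choice is arbitrary with respect to which side actually holds the good data: the mirror case trusts one copy, the parity case trusts the data blocks.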

> 
> I had an idea to write an interactive userspace program which ran through
> each block on each disk device to figure out which ones didn't match up and
> then figure out whether it's within allocated filesystem space and if so,
> which file or filesystem data was affected.  This would hopefully enable a
> user to figure out which block is wrong and correct things.
> 

That would be awfully difficult as doing a reverse mapping (block ->
file) is non-trivial in almost any filesystem, and you would want to
(ultimately) do it for every filesystem...

Might be educational though :-)

NeilBrown


Re: what does md do if it finds an inconsistency?

2007-05-06 Thread Gavin McCullagh
On Sun, 06 May 2007, martin f krafft wrote:

> Maybe the ideal way would be to have mdadm --monitor send an email on
> mismatch_count>0 or a cronjob that regularly sends reminders, until the
> admin logs in and runs e.g. /usr/share/mdadm/repairarray.
> 
> Also, if a mismatch is found on a RAID1, how does md decide which copy is
> mismatched and which is correct? What about RAID 5/6/10?

I think it just picks one at random.  After all, how could you reliably
know which is right in a raid1 array?  With raid5, I understand it just
updates the parity.

I had an idea to write an interactive userspace program which ran through
each block on each disk device to figure out which ones didn't match up and
then figure out whether it's within allocated filesystem space and if so,
which file or filesystem data was affected.  This would hopefully enable a
user to figure out which block is wrong and correct things.
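
The first step of that idea might look like the sketch below. It assumes two RAID1 components (or images of them) whose data starts at the same offset, and it does nothing about md metadata areas, which will legitimately differ. The function name and the 4096-byte default block size are arbitrary choices for illustration; mapping an offset back to a file is not attempted.

```python
def mismatched_blocks(path_a, path_b, block_size=4096):
    """Return the byte offsets of blocks that differ between two components."""
    offsets = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        offset = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both inputs exhausted
            if block_a != block_b:
                offsets.append(offset)
            offset += block_size
    return offsets
```

Reading the raw components of an array that is in use can show transient differences, so a comparison like this is only meaningful on a quiescent array.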

Gavin



Re: what does md do if it finds an inconsistency?

2007-05-06 Thread martin f krafft
> The first time it reports that it found (and repaired) 128 items.
> It does not mean that you now *have* 128 mismatches.
>
> The next run ('repair' or 'check') will find none (hopefully...)
> and report zero.

Oh, this makes perfect sense, thanks for the explanation.

As the mdadm maintainer for Debian, I would like to come up with a way to
handle mismatches somewhat intelligently. I already have the check
sync_action run once a month on all machines by default (can be turned
on/off via debconf), and now I would like to find a good way to react when
mismatch_count is non-zero. I don't want to write to the components
without the admin's consent though.

Maybe the ideal way would be to have mdadm --monitor send an email on
mismatch_count>0 or a cronjob that regularly sends reminders, until the
admin logs in and runs e.g. /usr/share/mdadm/repairarray.

Thoughts?

Also, if a mismatch is found on a RAID1, how does md decide which copy is
mismatched and which is correct? What about RAID 5/6/10?

Thanks for your time!
-martin




Re: what does md do if it finds an inconsistency?

2007-05-06 Thread Eyal Lebedinsky
martin f krafft wrote:
> also sprach martin f krafft <[EMAIL PROTECTED]> [2007.05.06.0245 +0200]:
> 
>>With the check feature in recent md, the question popped up of
>>what happens when an inconsistency is found. Does md fix it? If
>>so, which disk does it assume to be wrong?
> 
> 
> What I meant was of course
> 
>   echo repair > sync_action
> 
> I am unsure what happens:
> 
>   piper:/sys/block/md7/md# cat mismatch_cnt
>   128
>   piper:/sys/block/md7/md# echo repair > sync_action
>   piper:/sys/block/md7/md# cat sync_action
>   idle
>   piper:/sys/block/md7/md# cat mismatch_cnt
>   128 
> 
> If I do this again, then mismatch_cnt goes to 0. Not the first time.
> 
> md7 : active raid10 sda2[0] sdc2[2] sdb2[1]
>   1373376 blocks 64K chunks 2 near-copies [3/3] [UUU]

The first time it reports that it found (and repaired) 128 items.
It does not mean that you now *have* 128 mismatches.

The next run ('repair' or 'check') will find none (hopefully...)
and report zero.
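
A toy model of that behaviour, assuming the counter simply records how many mismatches the most recent pass found. The class and its names are invented for illustration and are not md code.

```python
class ToyArray:
    """Mirror-style array whose counter reflects only the most recent pass."""

    def __init__(self, stripes):
        # Each stripe is a list of copies that should all be identical.
        self.stripes = [list(s) for s in stripes]
        self.mismatch_cnt = 0

    def _scan(self, fix):
        found = 0
        for copies in self.stripes:
            if any(c != copies[0] for c in copies[1:]):
                found += 1
                if fix:
                    # Overwrite every copy with the first one.
                    copies[:] = [copies[0]] * len(copies)
        # The counter is replaced, not decremented, at the end of a pass.
        self.mismatch_cnt = found

    def check(self):
        self._scan(fix=False)

    def repair(self):
        self._scan(fix=True)
```

With one bad stripe, check() leaves the counter at 1, the first repair() also leaves it at 1 because that pass found (and fixed) one mismatch, and only the second repair() brings it to 0.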

-- 
Eyal Lebedinsky ([EMAIL PROTECTED]) 
attach .zip as .dat


Re: what does md do if it finds an inconsistency?

2007-05-06 Thread martin f krafft
also sprach martin f krafft <[EMAIL PROTECTED]> [2007.05.06.0245 +0200]:
> With the check feature in recent md, the question popped up of
> what happens when an inconsistency is found. Does md fix it? If
> so, which disk does it assume to be wrong?

What I meant was of course

  echo repair > sync_action

I am unsure what happens:

  piper:/sys/block/md7/md# cat mismatch_cnt
  128
  piper:/sys/block/md7/md# echo repair > sync_action
  piper:/sys/block/md7/md# cat sync_action
  idle
  piper:/sys/block/md7/md# cat mismatch_cnt
  128 

If I do this again, then mismatch_cnt goes to 0. Not the first time.

md7 : active raid10 sda2[0] sdc2[2] sdb2[1]
  1373376 blocks 64K chunks 2 near-copies [3/3] [UUU]

-- 
martin;  (greetings from the heart of the sun.)
  \ echo mailto: !#^."<*>"|tr "<*> mailto:"; [EMAIL PROTECTED]
 
spamtraps: [EMAIL PROTECTED]
 
"the thought of suicide is a great consolation: by means of it one
 gets successfully through many a bad night."
 - friedrich nietzsche




what does md do if it finds an inconsistency?

2007-05-05 Thread martin f krafft
Neil,

With the check feature in recent md, the question popped up of
what happens when an inconsistency is found. Does md fix it? If
so, which disk does it assume to be wrong?

Cheers,

-- 
martin;  (greetings from the heart of the sun.)
  \ echo mailto: !#^."<*>"|tr "<*> mailto:"; [EMAIL PROTECTED]
 
spamtraps: [EMAIL PROTECTED]
 
"frank harris has been received
 in all the great houses -- once!"
-- oscar wilde

