RAID6 issue: md_do_sync() got signal ... exiting

2007-11-30 Thread thomas62186218

Hi all,

I am having trouble creating a RAID 6 md device on a home-grown 
Linux 2.6.20.11 SMP 64-bit build.


I first create the RAID6 without problems, and see the following 
successful dump in /var/log/messages. If I check /proc/mdstat, the 
RAID6 is doing the initial syncing as expected.


Nov 30 01:26:39 testsystem kernel: md: bind
Nov 30 01:26:39 testsystem kernel: md: bind
Nov 30 01:26:39 testsystem kernel: md: bind
Nov 30 01:26:39 testsystem kernel: md: bind
Nov 30 01:26:39 testsystem kernel: md: md0: raid array is not clean -- 
starting background reconstruction
Nov 30 01:26:39 testsystem kernel: raid5: device dm-0 operational as 
raid disk 3
Nov 30 01:26:39 testsystem kernel: raid5: device dm-1 operational as 
raid disk 2
Nov 30 01:26:39 testsystem kernel: raid5: device dm-2 operational as 
raid disk 1
Nov 30 01:26:39 testsystem kernel: raid5: device dm-3 operational as 
raid disk 0

Nov 30 01:26:39 testsystem kernel: raid5: allocated 4268kB for md0
Nov 30 01:26:39 testsystem kernel: raid5: raid level 6 set md0 active 
with 4 out of 4 devices, algorithm 0

Nov 30 01:26:39 testsystem kernel: RAID5 conf printout:
Nov 30 01:26:39 testsystem kernel:  --- rd:4 wd:4
Nov 30 01:26:42 testsystem kernel:  disk 0, o:1, dev:dm-3
Nov 30 01:26:42 testsystem kernel:  disk 1, o:1, dev:dm-2
Nov 30 01:26:42 testsystem kernel:  disk 2, o:1, dev:dm-1
Nov 30 01:26:42 testsystem kernel:  disk 3, o:1, dev:dm-0
Nov 30 01:26:42 testsystem kernel: md: resync of RAID array md0
Nov 30 01:26:42 testsystem kernel: md: minimum _guaranteed_  speed: 0 
KB/sec/disk.
Nov 30 01:26:42 testsystem kernel: md: using maximum available idle IO 
bandwidth (but not more than 0 KB/sec) for resync.
Nov 30 01:26:42 testsystem kernel: md: using 128k window, over a total 
of 143371968 blocks.



If I then delete the RAID6 and try to create the exact same RAID6 
again, it sometimes fails, with the following appearing in 
/var/log/messages. Note the md_do_sync() got signal ... exiting line:


Nov 30 01:28:03 testsystem kernel: md: bind
Nov 30 01:28:03 testsystem kernel: md: bind
Nov 30 01:28:03 testsystem kernel: md: bind
Nov 30 01:28:03 testsystem kernel: md: bind
Nov 30 01:28:03 testsystem kernel: md: md0: raid array is not clean -- 
starting background reconstruction
Nov 30 01:28:03 testsystem kernel: raid5: device dm-0 operational as 
raid disk 3
Nov 30 01:28:03 testsystem kernel: raid5: device dm-1 operational as 
raid disk 2
Nov 30 01:28:03 testsystem kernel: raid5: device dm-2 operational as 
raid disk 1
Nov 30 01:28:03 testsystem kernel: raid5: device dm-3 operational as 
raid disk 0

Nov 30 01:28:03 testsystem kernel: raid5: allocated 4268kB for md0
Nov 30 01:28:03 testsystem kernel: raid5: raid level 6 set md0 active 
with 4 out of 4 devices, algorithm 0

Nov 30 01:28:03 testsystem kernel: RAID5 conf printout:
Nov 30 01:28:03 testsystem kernel:  --- rd:4 wd:4
Nov 30 01:28:04 testsystem kernel:  disk 0, o:1, dev:dm-3
Nov 30 01:28:05 testsystem kernel:  disk 1, o:1, dev:dm-2
Nov 30 01:28:05 testsystem kernel:  disk 2, o:1, dev:dm-1
Nov 30 01:28:05 testsystem kernel:  disk 3, o:1, dev:dm-0
Nov 30 01:28:05 testsystem kernel: md: resync of RAID array md0
Nov 30 01:28:05 testsystem kernel: md: minimum _guaranteed_  speed: 0 
KB/sec/disk.
Nov 30 01:28:05 testsystem kernel: md: using maximum available idle IO 
bandwidth (but not more than 0 KB/sec) for resync.
Nov 30 01:28:05 testsystem kernel: md: using 128k window, over a total 
of 143368192 blocks.
Nov 30 01:28:05 testsystem kernel: md: md_do_sync() got signal ... 
exiting

Nov 30 01:28:05 testsystem kernel: md: checkpointing resync of md0.
Nov 30 01:28:05 testsystem kernel: md: md0 stopped.
Nov 30 01:28:05 testsystem kernel: md: unbind
Nov 30 01:28:05 testsystem kernel: md: export_rdev(dm-0)
Nov 30 01:28:05 testsystem kernel: md: unbind
Nov 30 01:28:05 testsystem kernel: md: export_rdev(dm-1)
Nov 30 01:28:05 testsystem kernel: md: unbind
Nov 30 01:28:05 testsystem kernel: md: export_rdev(dm-2)
Nov 30 01:28:05 testsystem kernel: md: unbind
Nov 30 01:28:05 testsystem kernel: md: export_rdev(dm-3)


The failure is VERY intermittent: sometimes it fails, sometimes it 
succeeds, with the exact same creation procedure. Any ideas on what 
may be causing this issue? Thank you very much in advance for your 
assistance!
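
One thing that might be worth ruling out -- this is only a guess, not a 
diagnosis -- is a race between the teardown of the previous md0 and the 
creation of the new one, since the "got signal" message simply means the 
fresh resync thread was told to stop. A minimal sketch of a delete/re-create 
cycle that waits for the old array to be fully released first (device names 
and mdadm options are illustrative):

mdadm --stop /dev/md0

# wait until md0 has disappeared from /proc/mdstat before re-creating it
while grep -q '^md0' /proc/mdstat; do
    sleep 1
done

mdadm --create /dev/md0 --level=6 --raid-devices=4 \
    /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3
cat /proc/mdstat    # the new resync should now be running

If the failure still shows up with an explicit wait like this in place, that 
would at least narrow things down.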



Best regards,
Thomas












Re: raid6 check/repair

2007-11-30 Thread Thiemo Nagel

Dear Neil,


The point I'm trying to make is that there does exist a specific
case in which recovery is possible, and that implementing recovery for
that case will not hurt in any way.


Assuming that is true (maybe hpa got it wrong), what specific
conditions would lead to one drive having corrupt data, and would
correcting it on an occasional 'repair' pass be an appropriate
response?


The use case for the proposed 'repair' would be occasional,
low-frequency corruption, for which many sources can be imagined:

Any piece of hardware has a certain failure rate, which may depend on
things like age, temperature, stability of operating voltage, cosmic
rays, etc., but also on variations in the production process.  Therefore,
hardware may suffer from infrequent glitches which are rare enough to be
impossible to trace back to a particular piece of equipment.  It would be
nice to recover gracefully from that.

Kernel bugs or just plain administrator mistakes are another thing.

The case of power loss during writing that you mentioned could also
benefit from this 'repair':  with heterogeneous hardware, blocks may be
written in unpredictable order, so graceful recovery would be possible
in more cases with 'repair' than with simply recalculating parity.
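
For reference, the scrub pass that this 'repair' would extend is already 
driven through sysfs; a minimal sketch of how it is invoked (array name 
illustrative, and the comments describe the current recompute-parity 
behaviour discussed in this thread):

# start a read-only scrub; mismatches are only counted, nothing is rewritten
echo check > /sys/block/md0/md/sync_action

# once it finishes, the number of mismatched sectors that were found
cat /sys/block/md0/md/mismatch_cnt

# a 'repair' pass rewrites parity where mismatches are found -- this is the
# step the proposed raid6 logic would make smarter
echo repair > /sys/block/md0/md/sync_action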


Does the value justify the cost of extra code complexity?


In the case of protecting data integrity, I'd say 'yes'.


Everything costs extra.  Code uses bytes of memory, requires
maintenance, and possibly introduces new bugs.


Of course, you are right.  However, in my other email I tried to sketch
a piece of code which is very lean, as it makes use of functions which I
assume to already exist.  (Sorry, I didn't look at the md code yet, so
please correct me if I'm wrong.)  Therefore I expect the costs in memory,
maintenance and bugs to be rather low.

Kind regards,

Thiemo



Re: assemble vs create an array.......

2007-11-30 Thread Bryce

Dragos wrote:

Hello,
I had created a RAID 5 array on three 232 GB SATA drives, with one 
partition (for /home) formatted with either xfs or reiserfs (I do not 
recall which).
Last week I reinstalled my box from scratch with Ubuntu 7.10, which 
ships mdadm v. 2.6.2-1ubuntu2.
Then I made a rookie mistake: I ran --create instead of --assemble. The 
recovery completed. I then stopped the array once I realized the mistake.


1. Please make the warning more descriptive: ALL DATA WILL BE LOST, 
when attempting to create an array over an existing one.
2. Do you know of any way to recover from this mistake? Or at least to 
tell what filesystem it was formatted with?


Any help would be greatly appreciated. I have hundreds of family 
digital pictures and videos that are irreplaceable.

Thank you in advance,
Dragos



Meh... I do that all the time for testing.
The RAID metadata is separate from the FS, in that you can trash it as 
much as you like and the FS it refers to will be fine, as long as you 
don't decide to mkfs over it.
If you've an old /var/log/messages kicking around from when the RAID was 
correct, you should be able to extract the order, e.g.:


RAID5 conf printout:
--- rd:5 wd:5
disk 0, o:1, dev:sdf1
disk 1, o:1, dev:sde1
disk 2, o:1, dev:sdg1
disk 3, o:1, dev:sdc1
disk 4, o:1, dev:sdd1
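
If the logs have been rotated, something like this should pull the printout 
back out (paths and the amount of trailing context are illustrative):

# scan current and rotated logs for the last known-good layout
zgrep -A 7 'RAID5 conf printout' /var/log/messages*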

Unfortunately, there is no point looking at mdadm -E <disk>, as you've 
already trashed the information there.

Anyway, from the above, the recreation of the array would be:

mdadm -C -l5 -n5 -c128  /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 
/dev/sdd1
(where -l5 = RAID 5, -n5 = the number of participating drives, and 
-c128 = a chunk size of 128K)


If you don't have the configuration printout, then you're left with an 
exhaustive brute-force search over the possible orderings of the disks. 
Unfortunately the number of orderings grows factorially, and going 
beyond 8 disks is a suicidally *bad* idea:


2=2
3=6
4=24
5=120
6=720
7=5040
8=40320

You only have 3 drives, so there are only 6 possible orderings to try 
(unlike myself, with 5).


So just write yourself a small script with all 6 combinations and run 
them through a piece of shell similar to this pseudo-script:


lvchange -an /dev/VolGroup01/LogVol00   # if you use lvm at all (change as 
appropriate or discard)

mdadm --stop --scan
yes | mdadm -C -l5 -n3 /dev/md0 /dev/sdd1 /dev/sde1 /dev/sdf1   # (replace 
with each combination in turn)

lvchange -ay /dev/VolGroup01/LogVol00   # if you use lvm (or discard)
mount /dev/md0 /mnt
# Let's use mount's success return code to indicate we're able to mount the
# FS again, and bail out (man mount)

if [ $? -eq 0 ] ; then
    exit 0
fi





Re: raid6 check/repair

2007-11-30 Thread Thiemo Nagel

Dear Neil and Eyal,

Eyal Lebedinsky wrote:
> Neil Brown wrote:
>> It would seem that either you or Peter Anvin is mistaken.
>>
>> On page 9 of
>> http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>> at the end of section 4 it says:
>>
>> Finally, as a word of caution it should be noted that RAID-6 by
>> itself cannot even detect, never mind recover from, dual-disk
>> corruption. If two disks are corrupt in the same byte positions,
>> the above algorithm will in general introduce additional data
>> corruption by corrupting a third drive.
>
> The above a/b/c cases are not correct for raid6. While we can detect
> 0, 1 or 2 errors, any higher number of errors will be misidentified as
> one of these.
>
> The cases we will always see are:
> a) no  errors - nothing to do
> b) one error - correct it
> c) two errors -report? take the raid down? recalc syndromes?
> and any other case will always appear as *one* of these (not as [c]).

I still don't agree.  I'll explain the algorithm for error handling that
I have in mind; maybe you can point out where I'm mistaken.

We have n data blocks D1...Dn and two parities P (XOR) and Q
(Reed-Solomon).  I assume the existence of two functions to calculate
the parities
P = calc_P(D1, ..., Dn)
Q = calc_Q(D1, ..., Dn)
and two functions to recover a missing data block Dx using either parity
Dx = recover_P(x, D1, ..., Dx-1, Dx+1, ..., Dn, P)
Dx = recover_Q(x, D1, ..., Dx-1, Dx+1, ..., Dn, Q)

This pseudo-code should distinguish between a), b) and c) and properly
repair case b):

P' = calc_P(D1, ..., Dn);
Q' = calc_Q(D1, ..., Dn);
if (P' == P && Q' == Q) {
  /* case a): zero errors */
  return;
}
if (P' == P && Q' != Q) {
  /* case b1): Q is bad, can be fixed */
  Q = Q';
  return;
}
if (P' != P && Q' == Q) {
  /* case b2): P is bad, can be fixed */
  P = P';
  return;
}
/* both parities are bad, so we try whether the problem can
   be fixed by repairing data blocks */
for (i = 1; i <= n; i++) {
  /* assume only Di is bad, use P parity to repair */
  D' = recover_P(i, D1, ..., Di-1, Di+1, ..., Dn, P);
  /* use Q parity to check assumption */
  Q' = calc_Q(D1, ..., Di-1, D', Di+1, ..., Dn);
  if (Q == Q') {
    /* case b3): Q parity is ok, that means the assumption was
       correct and we can fix the problem */
    Di = D';
    return;
  }
}
/* case c): when we get here, we have excluded cases a) and b),
   so now we really have a problem */
report_unrecoverable_error();
return;


Concerning misidentification:  a situation can be imagined in which two 
or more simultaneous corruptions have occurred in a very special way, so 
that case b3) is diagnosed accidentally.  While that is not impossible, 
I'd assume the probability of it to be negligible, comparable to that of 
undetectable corruption in a RAID 5 setup.


Kind regards,

Thiemo


Re: assemble vs create an array.......

2007-11-30 Thread Michael Tokarev
Bryce wrote:
[]
> mdadm -C -l5 -n5 -c128  /dev/md0 /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 
> /dev/sdd1
...
> IF you don't have the configuration printout, then you're left with
> exhaustive brute force searching of the combinations

You're missing a very important point -- the --assume-clean option.
For experiments like this (trying to figure out the order of disks),
you'd better ensure the data on the disks isn't being changed while
you try different combinations.  But on each build, md always
destroys one drive by re-calculating parity.  You have to stop
it from doing so, to avoid trashing your data.
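
That is, something along these lines (same illustrative devices and chunk 
size as the earlier example):

# --assume-clean tells md the parity is already consistent, so no resync is
# started and the data blocks are left alone while experimenting
mdadm -C -l5 -n5 -c128 --assume-clean /dev/md0 \
    /dev/sdf1 /dev/sde1 /dev/sdg1 /dev/sdc1 /dev/sdd1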

Another option is to use one missing drive always, i.e.,

 mdadm -C -l5 -n5 -c128  /dev/md0 /dev/sdf1 missing /dev/sdg1 /dev/sdc1 
/dev/sdd1

so that the array will be degraded and there is no way to resync anything --
this also prevents md from trashing data.

/mjt


Re: assemble vs create an array.......

2007-11-30 Thread David Greaves
Neil Brown wrote:
> On Thursday November 29, [EMAIL PROTECTED] wrote:
>> 2. Do you know of any way to recover from this mistake? Or at least what 
>> filesystem it was formatted with.
It may not have been lost - yet.


> If you created the same array with the same devices and layout etc,
> the data will still be there, untouched.
> Try to assemble the array and use "fsck" on it.
To be safe I'd use fsck -n (check the man page as this is odd for reiserfs)
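
Something like the following keeps every check strictly read-only (device 
name illustrative; use whichever checker matches the filesystem that turns 
up):

# read-only check, no changes are written
fsck -n /dev/md0

# if it turns out to be reiserfs, its checker has an explicit check-only mode
reiserfsck --check /dev/md0

# if it turns out to be xfs, use the xfs checker instead
xfs_check /dev/md0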


> When you create a RAID5 array, all that is changed is the metadata (at
> the end of the device) and one drive is changed to be the xor of all
> the others.
In other words, one of your 3 drives has just been erased.
Unless you know the *exact* command you used and have the dmesg output to hand,
we won't know which one.

Now what you need to do is try all the permutations of creating a degraded
array using 2 of the drives, specifying the 3rd as 'missing':

So something like:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdc1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdc1 missing
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing /dev/sdd1
mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdd1 missing
etc etc

It is important to create the array using a 'missing' device so the xor data
isn't written.

There is a program here: http://linux-raid.osdl.org/index.php/Permute_array.pl
that may help...
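
A rough sketch of what such a permutation loop could look like for three 
partitions (device names, mount point and the read-only mount test are all 
illustrative; keep chunk size and other options matching whatever the 
original create used):

#!/bin/sh
# Try every ordering of two real partitions plus one 'missing' slot.
# Because one member is always 'missing', md never rewrites parity.
DEVS="/dev/sdb1 /dev/sdc1 /dev/sdd1"

for a in $DEVS; do
  for b in $DEVS; do
    [ "$a" = "$b" ] && continue
    for layout in "missing $a $b" "$a missing $b" "$a $b missing"; do
      mdadm --stop /dev/md0 2>/dev/null
      yes | mdadm --create /dev/md0 --level=5 --raid-devices=3 $layout
      if mount -o ro /dev/md0 /mnt 2>/dev/null; then
        echo "candidate order: $layout"
        umount /mnt
        exit 0
      fi
    done
  done
done

echo "no ordering mounted cleanly" >&2
exit 1

Mounting read-only means even a wrong guess shouldn't write anything to the 
filesystem.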

David




Re: assemble vs create an array.......

2007-11-30 Thread Dragos

Thank you for your very fast answers.

First I tried 'fsck -n' on the existing array. The answer was that if I 
wanted to check an XFS partition I should use 'xfs_check'. That seems to 
say that my array was formatted with xfs, not reiserfs. Am I correct?


Then I tried the different permutations:
mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1 
/dev/sdb1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing 
/dev/sdb1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 /dev/sdb1 
missing

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sda1 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 missing 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sda1 /dev/sdc1 
missing

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 missing /dev/sdb1 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 missing 
/dev/sdc1

mount /dev/md0 temp
mdadm --stop --scan

mdadm --create /dev/md0 --raid-devices=3 --level=5 /dev/sdb1 /dev/sdc1 
missing

mount /dev/md0 temp
mdadm --stop --scan

With some arrays mount reported:
   mount: you must specify the filesystem type
and with others:
   mount: Structure needs cleaning

None of the combinations seems to have been successful.
Please let me know if you have other ideas.

Thank you again,
Dragos




Re: assemble vs create an array.......

2007-11-30 Thread Dragos

I forgot one thing.
After the array re-creation that deleted my data in the first place, 
'mount' gave me this answer:

  mount: Structure needs cleaning

Thank you for your time,
Dragos