Your message dated Wed, 15 Nov 2006 18:50:55 +0100
with message-id <[EMAIL PROTECTED]>
and subject line bugs closed with 2.5.6-5 upload
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: mdadm
Version: 2.5.4-1
Severity: important
Owner: [EMAIL PROTECTED]
Tags: patch

----- Forwarded message from Dan Pascu <[EMAIL PROTECTED]> -----

[...]

The second patch fixes a more serious problem. If the system boots with a
degraded array, it hangs in the boot process forever. I traced the
problem to mdadm --assemble --scan --auto=yes.
If an array is degraded, it is assembled, but for some reason it is not
properly recognized as assembled, so when mdadm scans the available
devices it tries to reassemble the array again and again.

The output of trying to assemble 2 RAID1 arrays (one being degraded) is 
this:

debian:~# mdadm --assemble --scan --auto=yes
mdadm: /dev/md0 has been started with 2 drives.
mdadm: /dev/md1 has been started with 1 drive (out of 2).
mdadm: /dev/md/1 is already active.
mdadm: /dev/md/1 is already active.
mdadm: /dev/md/1 is already active.
mdadm: /dev/md/1 is already active.
mdadm: /dev/md/1 is already active.
mdadm: /dev/md/1 is already active.
[... repeated forever or until ^C ...]

The issue (as far as I was able to understand it) is this:

1. mdadm tries to first assemble all the arrays it finds in the mdadm.conf 
file.

2. it then tries to find other arrays by scanning all the available 
devices in the system and repeatedly assembling the ones found until none 
is left. Now if at least 1 array is degraded, for some reason it won't 
see it as assembled and it tries to assemble it again (this doesn't 
happen when the arrays are not degraded).

At the 2nd step it calls the Assemble() function repeatedly, but it 
doesn't pass a device list to it. This has the consequence that 
Assemble() will request its own device list and while parsing it to 
detect unassembled arrays it will mark the devices that are already used 
by other arrays. However, since it gets a new device list every time it
is called, the next time it won't know which devices are already used and
has to start over marking them.

My patch tries to fix the issue by passing a device list to Assemble()
while autoassembling, which means that over multiple calls Assemble()
will use the same device list and will know which devices were already 
used by other arrays without redetecting this. This fixes the problem and 
now mdadm no longer locks in an infinite loop when assembling degraded 
arrays, and booting in such a condition no longer locks the machine. Also 
arrays are assembled correctly.
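
The bookkeeping described above can be illustrated with a small toy model
in C. This is only a sketch and not mdadm's code: toy_dev, toy_assemble()
and scan_loop() are made-up names, and the assumption that a fresh scan
does not recognise the degraded array's member as in use simply encodes
the behaviour observed in this report rather than explaining it.

```c
#include <string.h>

/* Toy model only: none of these names or types come from mdadm. */
struct toy_dev {
    const char *name;
    int array_id;  /* array this member belongs to */
    int used;      /* non-zero once the device is known to be in use */
};

#define NDEVS 3

/* Stand-in for an auto-assembling Assemble() call: claim the first array
 * that still has an unmarked member and mark all of its members as used.
 * Returns 0 if an array was "assembled", 1 if nothing is left to do. */
static int toy_assemble(struct toy_dev *devs, int n)
{
    for (int i = 0; i < n; i++) {
        if (devs[i].used)
            continue;
        int id = devs[i].array_id;
        for (int j = 0; j < n; j++)
            if (devs[j].array_id == id)
                devs[j].used = 1;
        return 0;
    }
    return 1;
}

/* Run the scan loop and count successful "assemblies", capped at limit.
 * share_list == 0: every call gets a freshly built list (marks are lost).
 * share_list != 0: every call gets the same list (marks persist). */
static int scan_loop(int share_list, int limit)
{
    /* Assumption: a fresh scan sees md0's members as in use, but not the
     * single member of the degraded md1. */
    static const struct toy_dev initial[NDEVS] = {
        {"sda1", 0, 1}, {"sdb1", 0, 1}, {"sda2", 1, 0},
    };
    struct toy_dev shared[NDEVS];
    memcpy(shared, initial, sizeof shared);

    int cnt = 0;
    while (cnt < limit) {
        struct toy_dev fresh[NDEVS];
        struct toy_dev *list = shared;
        if (!share_list) {
            memcpy(fresh, initial, sizeof fresh);
            list = fresh;
        }
        if (toy_assemble(list, NDEVS) != 0)
            break;
        cnt++;
    }
    return cnt;
}
```

In this model scan_loop(0, 10) only stops because of the cap of 10, while
scan_loop(1, 10) stops by itself after one assembly, which is the effect
of passing the same list on every call.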

Now I have only studied the code for a couple of hours and I'm sure I
didn't get all its subtleties, so this patch may not be the best way to
fix the issue. However, it makes sense to me to pass the same list of
devices while autoassembling, so it knows which ones are already used in
other arrays and doesn't try to reuse them.
I would recommend that you take this to the upstream author so that he
can decide whether this is the right fix or whether there is a better way
to fix the problem.

The patch applies cleanly on both mdadm 2.5.4 and 2.5.5

-- 
Dan
--- mdadm.c.orig	2006-10-23 08:49:00.000000000 +0300
+++ mdadm.c	2006-10-31 12:01:41.000000000 +0200
@@ -1062,7 +1062,7 @@
 					do {
 						rv2 = Assemble(ss, NULL, -1,
 							       &ident,
-							       NULL, NULL,
+							       devlist, NULL,
 							       readonly, runstop, NULL, homehost, verbose-quiet, force);
 						if (rv2==0) {
 							cnt++;

--- End Message ---
--- Begin Message ---
Version: 2.5.6-5

Hi,

the bug(s) you reported against mdadm have been closed by the recent
upload of 2.5.6-5. Because I do too many things at once, I failed to
let the upload close them. Changelog is here:

  http://svn.debian.org/wsvn/pkg-mdadm/mdadm/trunk/debian/changelog?op=file&rev=0&sc=0

-- 
 .''`.   martin f. krafft <[EMAIL PROTECTED]>
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems


--- End Message ---