Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-07-22 Thread Neil Brown
On Thu, 22 Jul 2010 12:36:47 +0200
Jean-Luc Coulon  wrote:

> On 22/07/2010 10:16, Neil Brown wrote:
> > 
> > On Thu, 22 Jul 2010 09:30:17 +0200
> > Jean-Luc Coulon  wrote:
> 
> [ ... ]
> 
> > Debian doesn't add any significant patches to mainline mdadm, so you could 
> > 
> >   git clone git://neil.brown.name/mdadm
> >   cd mdadm
> > Then
> > 
> >   git bisect start mdadm-3.1.2 mdadm-3.1.1
> > 
> >   make ; make install ; mkinitramfs ; reboot
> >   git bisect good  OR git bisect bad
> > 
> > and see where you end up.
> > 
> > There are only 104 commits, so it should only take 7 iterations.
> 
> Ok, done, attached my log.

Great, thanks for doing that.

Had you run one more "git bisect good" it would have said:
319767b85c2b16d80c235195329470c46d4547b3 is the first bad commit
commit 319767b85c2b16d80c235195329470c46d4547b3
Author: NeilBrown 
Date:   Mon Feb 8 14:33:31 2010 +1100

mapfile: use ALT_RUN as alternate place to store mapfile

This gives better consistency and fewer hidden '.' files.

Signed-off-by: NeilBrown 

:100644 100644 552df2914f3fc0b0fc128512d92e818e6627 366ebe332299c92fceaa4d3c9fa2e8c644b27801 M  mapfile.c

to make it explicit.
That patch changed the location where the mapfile is stored during initramfs
time from "/dev/.mdadm.map" to "/lib/init/rw/map", which:
  a/ isn't exactly what I wanted (I wanted /lib/init/rw/mdadm/map) and
  b/ doesn't exist - damn.
I thought that /lib/init/rw/map existed during early boot, but it seems not.
I guess I'm going to have to leave it in /dev - which I don't like at all, but
there doesn't seem to be another option (OK Doug, you can say "I told you so"
now).

Why that would cause infinite loops I'm not sure.  It would stop udev from
creating a symlink from /dev/md/whatever to /dev/mdXX - maybe that is enough
to upset some part of the boot process.

I wonder how that relates to the kernel... you say it only breaks with 2.6.34.

I'll try experimenting with 2.6.34 and see if I can break it ... but not
today.
Meanwhile I'll revert that change for mdadm-3.1.3.

Thanks for your help.
NeilBrown
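
The failure mode described above (a mapfile directory that does not exist yet
during early boot) can be illustrated with a small shell sketch. This is not
mdadm's code, which is written in C; pick_map_path is a hypothetical helper,
and the candidate directories in the usage comment are simply the ones named
in this thread.

```shell
# Sketch only: pick_map_path is a hypothetical helper, not part of mdadm.
# Usage: pick_map_path FILE DIR...
# Prints DIR/FILE for the first DIR that exists and is writable.
# Returns 1 if none of the candidate directories is usable.
pick_map_path() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example: prefer the tmpfs directories, fall back to the old /dev location.
# pick_map_path map /lib/init/rw/mdadm /lib/init/rw || pick_map_path .mdadm.map /dev
```

Checking for the directory before writing, with /dev as a last resort, is one
way to avoid silently losing the mapfile when the preferred location is not
mounted yet.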



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-07-22 Thread Jean-Luc Coulon
On 22/07/2010 10:16, Neil Brown wrote:
> 
> On Thu, 22 Jul 2010 09:30:17 +0200
> Jean-Luc Coulon  wrote:

[ ... ]

> Debian doesn't add any significant patches to mainline mdadm, so you could 
> 
>   git clone git://neil.brown.name/mdadm
>   cd mdadm
> Then
> 
>   git bisect start mdadm-3.1.2 mdadm-3.1.1
> 
>   make ; make install ; mkinitramfs ; reboot
>   git bisect good  OR git bisect bad
> 
> and see where you end up.
> 
> There are only 104 commits, so it should only take 7 iterations.

Ok, done, attached my log.

Regards

Jean-Luc
git clone git://neil.brown.name/mdadm
  cd mdadm
git bisect start mdadm-3.1.2 mdadm-3.1.1

[jean-...@tangerine] % git bisect start mdadm-3.1.2 mdadm-3.1.1
Bisecting: 51 revisions left to test after this (roughly 6 steps)
[d998adc316299efc44cb6e70ecc2e04bffb76d17] Detail:  Report state of FAILED when
an array has too few devices to work.
[jean-...@tangerine] % make
[jean-...@tangerine] % sudo make install
[jean-...@tangerine] % sudo mkinitramfs -o /boot/initrd.img-2.6.35.new

reboot --> good

[jean-...@tangerine] % git bisect good
[24af7a8744d947b5c3f062af55312c044ca12a95] Assemble: clean up properly if we
cannot add the bitmap file.

reboot --> failed 

[jean-...@tangerine] % git bisect bad  /usr/local/src/tmp/mdadm
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[24f6f99b3630b1a89aaa57930c5c9de8a3df9ded] Having single function to read mdmon
pid file.

[jean-...@tangerine] % make
[jean-...@tangerine] % sudo make install
[jean-...@tangerine] % sudo mkinitramfs -o /boot/initrd.img-2.6.35.new

reboot --> good

[jean-...@tangerine] % git bisect good /usr/local/src/tmp/mdadm
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[319767b85c2b16d80c235195329470c46d4547b3] mapfile: use ALT_RUN as alternate
place to store mapfile

[jean-...@tangerine] % make
[jean-...@tangerine] % sudo make install
[jean-...@tangerine] % sudo mkinitramfs -o /boot/initrd.img-2.6.35.new

reboot --> failed

[jean-...@tangerine] % git bisect bad  /usr/local/src/tmp/mdadm
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[b5c727dc1a55323f02e5f60a50bcecb866dd51ea] mdmon: remove switch-root
functionality.

[jean-...@tangerine] % make
[jean-...@tangerine] % sudo make install
[jean-...@tangerine] % sudo mkinitramfs -o /boot/initrd.img-2.6.35.new
 
reboot --> good

[jean-...@tangerine] % git bisect good /usr/local/src/tmp/mdadm
Bisecting: 0 revisions left to test after this (roughly 1 step)
[fa716c83c5be8093e663e260e46e73ea9ad6facf] mdmon: insist on creating .pid file
at startup.

reboot --> good
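
Each iteration in the log above repeats the same three commands before the
reboot. A minimal sketch of wrapping them is shown below; the names run and
bisect_iteration and the DRY_RUN variable are hypothetical, and the reboot and
the good/bad verdict still have to be done by hand, since the test is whether
the machine boots.

```shell
# Sketch only: run, bisect_iteration and DRY_RUN are hypothetical names.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build and install the currently checked-out mdadm, then regenerate the
# initramfs at the path given as the first argument. Stops at the first
# command that fails.
bisect_iteration() {
    initrd=$1
    run make &&
    run sudo make install &&
    run sudo mkinitramfs -o "$initrd"
}

# Example (then reboot, and record the result with
# "git bisect good" or "git bisect bad"):
# bisect_iteration /boot/initrd.img-2.6.35.new
```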




Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-07-22 Thread Neil Brown
On Thu, 22 Jul 2010 09:30:17 +0200
Jean-Luc Coulon  wrote:

> On 22/07/2010 09:04, Neil Brown wrote:
> > 
> 
> > 
> > I wonder if it could be
> >commit b179246f4f519082158279b2f45e5fd51842cc42
> > causing this.
> > 
> > Can you report the contents of /etc/mdadm/mdadm.conf ??
> 
> Please find it attached

Thanks.  Unfortunately it didn't help.

> 
> > 
> > Can you boot off a live CDROM and see if
> >   mdadm -As
> > 
> > works, or spins or does something else bad?
> 
> Well: the latest version of mdadm works on kernel versions older than
> 2.6.34… and I've not found a live CD with a recent enough kernel (2.6.34 or
> 2.6.35-rc)…
> Does the kernel version on the live CD matter?

Yes, you would need a kernel which causes problems - scratch that idea.

> 
> BTW, I tried leaving the system for a couple of hours, waiting for something
> to happen. Finally, I got a udev message: "no space left on device". The
> device was not indicated, and none of my disks is short of room...

Sounds like udev creating lots of things in a ramdisk...

How helpful are you feeling???

It would be really great if you could use 'git bisect' to isolate which
change causes the problem.

Debian doesn't add any significant patches to mainline mdadm, so you could 

  git clone git://neil.brown.name/mdadm
  cd mdadm
Then

  git bisect start mdadm-3.1.2 mdadm-3.1.1

  make ; make install ; mkinitramfs ; reboot
  git bisect good  OR git bisect bad

and see where you end up.

There are only 104 commits, so it should only take 7 iterations.

Thanks,
NeilBrown
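
The "7 iterations" estimate follows from halving the range on every step:
the number of steps is the smallest k with 2^k >= the number of commits. A
small sketch of that arithmetic is shown below; bisect_steps is a hypothetical
helper name, and git's own "roughly N steps" estimate may differ by one
depending on the shape of the history.

```shell
# Sketch only: bisect_steps is a hypothetical helper.
# Prints the smallest k such that 2^k >= N, i.e. the worst-case number of
# halvings needed to narrow N candidate commits down to one.
bisect_steps() {
    n=$1
    k=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        k=$((k + 1))
    done
    echo "$k"
}

bisect_steps 104   # prints 7, since 2^6 = 64 < 104 <= 128 = 2^7
```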






Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-07-22 Thread Jean-Luc Coulon
On 22/07/2010 09:04, Neil Brown wrote:
> 

> 
> I wonder if it could be
>commit b179246f4f519082158279b2f45e5fd51842cc42
> causing this.
> 
> Can you report the contents of /etc/mdadm/mdadm.conf ??

Please find it attached

> 
> Can you boot off a live CDROM and see if
>   mdadm -As
> 
> works, or spins or does something else bad?

Well: the latest version of mdadm works on kernel versions older than
2.6.34… and I've not found a live CD with a recent enough kernel (2.6.34 or
2.6.35-rc)…
Does the kernel version on the live CD matter?

BTW, I tried leaving the system for a couple of hours, waiting for something
to happen. Finally, I got a udev message: "no space left on device". The
device was not indicated, and none of my disks is short of room...

Regards

Jean-Luc
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md3 UUID=a14b6e84:366c9fff:fd4061fc:100c5d56
ARRAY /dev/md4 UUID=3075c036:528fe022:fd4061fc:100c5d56
ARRAY /dev/md5 UUID=4e8929f4:f9b54516:fd4061fc:100c5d56

# This configuration was auto-generated on Wed, 16 Sep 2009 19:09:44 +0200
# by mkconf 3.0-2




Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-07-22 Thread Neil Brown
On Tue, 08 Jun 2010 13:25:13 +0200
"Jean-Luc Coulon (f5ibh)"  wrote:


> I've a system with lvm2 over raid1 and some filesystems encrypted.
> When I updated mdadm from 3.1.1-1 to 3.1.2-1 the system failed to boot kernel
> 2.6.34 (from experimental).
> I tried 3.1.2-2 when it was released, I got the same problem.
> 
> The system boots from grub and then waits for the password.
> 
> When I enter the password, there is huge / endless disk activity, but the
> boot process seems to be frozen.
> 
> The system is still alive: if I plug/unplug a USB disk, it is reported on
> the console.
> 
> I then rebooted with 2.6.33 without any problem.
> 
> Reverting to 3.1.1-1 also solved the problem.

I wonder if it could be
   commit b179246f4f519082158279b2f45e5fd51842cc42
causing this.

Can you report the contents of /etc/mdadm/mdadm.conf ??

Can you boot off a live CDROM and see if
  mdadm -As

works, or spins or does something else bad?

I'd love to know the cause of this before I release 3.1.3, but
there is so little hard information to go on, it is hard to make progress.

Thanks,
NeilBrown
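
To tell "works" apart from "spins" without watching the console, the command
can be run under a time limit. The sketch below uses the coreutils timeout
utility, which exits with status 124 when the limit is reached; probe is a
hypothetical helper name, and the 60-second limit in the usage comment is an
arbitrary assumption.

```shell
# Sketch only: probe is a hypothetical helper.
# Usage: probe SECONDS COMMAND [ARGS...]
# Runs COMMAND under a time limit and classifies the outcome.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    case "$status" in
        0)   echo "works" ;;
        124) echo "spins (killed after ${limit}s)" ;;
        *)   echo "failed with status $status" ;;
    esac
}

# Example, from a live system with the suspect kernel and mdadm:
# probe 60 mdadm -As
```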






Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-06-08 Thread Jean-Luc Coulon
On 08/06/2010 14:24, martin f krafft wrote:
> Jean-Luc Coulon (f5ibh) wrote
> [2010.06.08.1325 +0200]:
>> When I enter the password, there is a huge / endless disk activity
>> but the boot process seems to be frozen.
> 
> Can you compare to #583917 and see if waiting for a few hours makes
> it boot?
> 

The main difference is that a 2.6.33 kernel works with both versions of
mdadm and 2.6.34 doesn't.

I will try leaving it for a while ;)

J-L









Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-06-08 Thread martin f krafft
Jean-Luc Coulon (f5ibh) wrote
[2010.06.08.1325 +0200]:
> When I enter the password, there is a huge / endless disk activity
> but the boot process seems to be frozen.

Can you compare to #583917 and see if waiting for a few hours makes
it boot?

-- 
 .''`.   martin f. krafft   Related projects:
: :'  :  proud Debian developer   http://debiansystem.info
`. `'`   http://people.debian.org/~madduck    http://vcs-pkg.org
  `-  Debian - when you have better things to do than fixing systems
 
beauty, brains, availability, personality; pick any two.




Bug#585015: kernel 2.6.34 fails to boot normally with mdadm > 3.1.1-1

2010-06-08 Thread Jean-Luc Coulon (f5ibh)
Package: mdadm
Version: 3.1.1-1
Severity: normal

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I've a system with lvm2 over raid1 and some filesystems encrypted.
When I updated mdadm from 3.1.1-1 to 3.1.2-1 the system failed to boot kernel
2.6.34 (from experimental).
I tried 3.1.2-2 when it was released, I got the same problem.

The system boots from grub and then waits for the password.

When I enter the password, there is huge / endless disk activity, but the boot
process seems to be frozen.

The system is still alive: if I plug/unplug a USB disk, it is reported on the
console.

I then rebooted with 2.6.33 without any problem.

Reverting to 3.1.1-1 also solved the problem.

Regards

Jean-Luc


- -- Package-specific info:
- --- mount output
/dev/mapper/vg00-root_lv on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/md3 on /boot type ext3 (rw)
/dev/mapper/vg00-usr_lv on /usr type ext3 (rw)
/dev/mapper/vg00-opt_lv on /opt type xfs (rw)
/dev/mapper/vg00-var_lv on /var type ext3 (rw)
/dev/mapper/vg00-local_lv on /usr/local type xfs (rw)
/dev/mapper/cryptvg00-home_lv on /home type xfs (rw)
/dev/mapper/cryptvg00-tmp_lv on /tmp type xfs (rw)
/dev/mapper/cryptvg00-mail_lv on /usr/mail type xfs (rw)
/dev/mapper/cryptvg00-photos_lv on /photos type xfs (rw)
/dev/mapper/cryptvg00-music_lv on /music type ext3 (rw)
/dev/sdb1 on /extra type ext4 (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
/dev/sr0 on /media/cdrom0 type udf (ro,noexec,nosuid,nodev,user=jean-luc)
/dev/sdd1 on /media/rn int(rand type vfat (rw,nosuid,nodev,uhelper=udisks,uid=1000,gid=1000,shortname=mixed,dmask=0077,utf8=1,showexec,flush)

- --- mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md3 UUID=a14b6e84:366c9fff:fd4061fc:100c5d56
ARRAY /dev/md4 UUID=3075c036:528fe022:fd4061fc:100c5d56
ARRAY /dev/md5 UUID=4e8929f4:f9b54516:fd4061fc:100c5d56

# This configuration was auto-generated on Wed, 16 Sep 2009 19:09:44 +0200
# by mkconf 3.0-2

- --- /proc/mdstat:
Personalities : [raid1] 
md5 : active raid1 sda3[0] sdc3[1]
  97450176 blocks [2/2] [UU]
  
md4 : active raid1 sda2[0] sdc2[1]
  58596992 blocks [2/2] [UU]
  
md3 : active raid1 sda1[0] sdc1[1]
  240832 blocks [2/2] [UU]
  
unused devices: <none>

- --- /proc/partitions:
major minor  #blocks  name

   8        0  156290904 sda
   8        1     240943 sda1
   8        2   58597087 sda2
   8        3   97450290 sda3
   8       16   80043264 sdb
   8       17   80035798 sdb1
   8       32  156290904 sdc
   8       33     240943 sdc1
   8       34   58597087 sdc2
   8       35   97450290 sdc3
   9        3     240832 md3
   9        4   58596992 md4
   9        5   97450176 md5
 253        0     487424 dm-0
   8       48   78150744 sdd
   8       49   19535008 sdd1
   8       50   58613152 sdd2
 253        1   97449148 dm-1
 253        2    9277440 dm-2
 253        3    5857280 dm-3
 253        4    7811072 dm-4
 253        5     974848 dm-5
 253        6   15728640 dm-6
 253        7    5242880 dm-7
 253        8   35577856 dm-8
 253        9   14680064 dm-9
 253       10    2768896 dm-10
 253       11    3903488 dm-11
 253       12    9437184 dm-12
 253       13    8773632 dm-13

- --- initrd.img-2.6.34-k8-2:
46277 blocks
d989e6c5bc1ed7cea311c0bc22f4d665  ./etc/mdadm/mdadm.conf
c6ee45b773edd23768bab8ab4067121b  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-mod.ko
c8698139164a8ddf4ec5d791c23d2cab  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-crypt.ko
a939c477d261470e80c87545d697b8c1  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-snapshot.ko
b711c380f2e0afed2b8814645d3cb657  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-log.ko
c43586c1225ed92d8cb8420d5915f7f4  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-region-hash.ko
9c53fd2cd0b67eac709c56b2740345d5  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/dm-mirror.ko
5cb80047c86e1e4576f8c6f04d167489  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/md-mod.ko
ff34331e8c5c7b090d230d72a9831aa4  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/linear.ko
1287f4b136d5b01eaa0c0bf51e14df17  ./lib/modules/2.6.34-k8-2/kernel/drivers/md/raid0.ko
096ccd8f1