Hi Michael,

Thanks very much for helping me with this.
On Apr 21, 2015, at 11:17 AM, Michael Biebl <bi...@debian.org> wrote:

> control: tags -1 moreinfo unreproducible
>
> On 18.04.2015 at 02:02, Rick Thomas wrote:
>>
>> On Apr 17, 2015, at 3:44 PM, Michael Biebl <bi...@debian.org> wrote:
>>>
>>> Thanks for the data.
>>> Looks like an lvm issue to me:
>>>
>>> root@cube:~# lvscan
>>>   inactive  '/dev/vg1/backup' [87.29 GiB] inherit
>>>
>>> and as a result, /dev/disk/by-label/BACKUP is missing.
>>
>> Yes, that’s true, of course. But the question is: what is keeping lvm
>> from activating the volume?
>>
>> It works fine for a logical volume on a single physical disk. And
>> /proc/mdstat shows that the RAID device, /dev/md127, _is_ active. Or at
>> least it is by the time we reach emergency mode; I don’t know whether
>> it’s still inactive at the moment the fsck times out. If you know how to
>> figure that out from the systemd journal I attached to the original bug
>> report, or any other way that I can try, I’d appreciate any assistance
>> you can give!
>
> FWIW, I tried to reproduce the problem in a VM with two additional disks
> attached and a setup like the following:
>
>   ext4 on RAID1 (via mdadm)
>   ext4 on LVM on RAID1 (mdadm)
>   ext4 on LVM
>   ext4 on a DOS partition
>
> All partitions were correctly mounted during boot without any issues.
>
> Is this a fresh jessie installation or an upgraded system?
> Do you have any custom udev rules in /lib/udev/rules.d or /etc/udev/rules.d?
>
> If it's an upgraded system and you have the sysvinit package installed,
> you can try booting with sysvinit temporarily via
> init=/lib/sysvinit/init on the kernel command line.
>
> Does that work?

My physical setup is this: the hardware is a quad-core armv7 Cubox i4pro
( https://wiki.debian.org/ArmHardFloatPort/CuBox-i ). With some help from
Karsten Merker, I got a plain-vanilla, unmodified Jessie installed on it to
use for experimenting. I wanted experience with the Cubox hardware and with
using Jessie in a “real life” situation.
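As an aside, when I get around to replicating your VM test, I assume the
RAID1-plus-LVM part of your layout can be recreated with something like the
following (a sketch on my part; the device names /dev/vdb and /dev/vdc and
the vgtest/lvtest names are my guesses, not taken from your mail):

###################################
root@vm:~# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/vdb /dev/vdc
root@vm:~# pvcreate /dev/md0
root@vm:~# vgcreate vgtest /dev/md0
root@vm:~# lvcreate -l 100%FREE -n lvtest vgtest
root@vm:~# mkfs.ext4 /dev/vgtest/lvtest
###################################

Plus the matching /etc/fstab entry, of course. If that's not how you built
it, let me know and I'll match your setup instead.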
The boot (and system residency: root, swap, /home, /var, the works) is on a
32GB microSD card. I’ve added to that an eSATA 1TB hard disk (currently
configured as a single filesystem using LVM) and a 7-port USB 2.0 hub with 5
of the ports each holding a 32GB USB flash stick. Those 5 devices are
configured as a software (md) RAID6 array (I wanted to get some experience
with RAID6) providing about 90GB of usable space, configured with LVM as a
single logical volume. It’s the RAID6 array (or rather the lv on it) that is
having the problem.

I’ve managed to make it work using a cron script that runs at reboot time
(crontab has: @reboot bash -x bin/mount-backup) and the mount-backup script
looks like this:

###################################
logger -s -t mount-backup 'Mounting the /backup filesystem'
(
  let count=10   # don't try uselessly forever if it fails
  # If the by-label symlink doesn't exist, take remedial action.
  while [ ! -h /dev/disk/by-label/BACKUP ]
  do
    let count=$count-1
    [ $count -lt 1 ] && exit 1
    sleep 10            # give things some time to settle
    cat /proc/mdstat    # show some debugging information
    # see if the raid has assembled and can be used
    /sbin/vgchange -ay
  done
  # If the fsck isn't perfect, quit and wait for human intervention
  /sbin/fsck -nf /backup && /bin/mount /backup
) | logger -s -t mount-backup
###################################

This works. Interestingly, without the sleep loop the vgchange fails.

Now, you say that a VM with two virtual disks configured as RAID1 with a
logical volume works fresh out of the box. This makes me wonder if it’s some
kind of timing problem… It takes a few seconds for the freshly rebooted
system to find the USB flash sticks and assemble them, so some timeout gets
triggered in the systemd stuff on my setup, while your setup has no such
physical constraints (everything is available immediately). That’s just a
guess… but fortunately, it’s a testable guess!

My setup is (at the moment) just for experimentation and learning, no actual
useful work.
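If the timing guess is right, one systemd-native alternative to the cron
workaround might be to give the device extra time in /etc/fstab and to keep
a missing array from dropping the boot into emergency mode. Something like
this (a sketch; the mount point and label are from my setup, but the 120s
figure is an arbitrary number I picked, and I haven't tested this yet):

###################################
# /etc/fstab entry: wait up to 120s for the array's device to appear,
# and don't fail the boot if it never does (nofail)
LABEL=BACKUP  /backup  ext4  defaults,nofail,x-systemd.device-timeout=120s  0  2
###################################

That still wouldn't explain why the lv isn't activated, but it might at
least let the boot finish cleanly while I investigate.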
So I can re-install it at will or make any changes I need to track this
down. Your suggestion about trying sysvinit will be a good place to start.
If that works with my workaround script disabled, the next experiment will
be to try systemd with rootdelay=10 on the kernel command line. I will also
try the VM setup, just to see if I can replicate your result. After that,
I’m not sure; any suggestions will be appreciated!

I’ll get back to you when I’ve made those tests. Real-life(TM) will
probably prevent me from doing that before the weekend.

Enjoy! And thanks for all the help!

Rick

-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org