Hello Niels,

On Mon, 2015-10-05 at 11:35 +0000, Baumgartner Niels, Bedag wrote:
> Hello Ritesh
> 
> > VGs in this case are:
> > 
> > vg_system => 20 GiB
> > vg_services => 1.0 TiB
> > 
> > Is that correct ?
> 
> Correct.
> 
> > And from the logs, it looks like these Physical Volumes were
> > created on top of the SCSI devices ?
> 
> This is true for the PV in vg_system. This PV was created during
> installation and I didn't get debian installer to work with multipath
> (install disk-detect/multipath/enable=true didn't work), so I created
> it directly on a scsi device. I'm guessing you made this assumption
> based on the "Found duplicate PV" messages...?
> 

Hmmmm! This may be an interesting one.

A set of assumptions was made for "root on multipathed device"
setups. Have you checked the README.Debian file? For example, this
snippet:

Where did my FC-connected filesystem go?
========================================

If you were previously mounting a device connected to your system by
Fibre Channel and then installed multipath-tools, you need to change
the way you mount the device. The device must now be accessed using
the identifier by which device-mapper knows it. For example if you
have in /dev/mapper a file like this:

  brw-rw---- 1 root disk 254, 8 2009-01-05 14:35 /dev/mapper/36000393000007d3901000000fef00a2d

then you can mount the filesystem like this:

  mount /dev/mapper/36000393000007d3901000000fef00a2d /mnt

or this

  mount /dev/disk/by-id/scsi-36000393000007d3901000000fef00a2d /mnt

You should prefer the latter, as this will work whether or not
multipath-tools is installed.

Note that with multipath-tools installed you cannot use the device's
node in /dev, e.g.

  # mount /dev/sdc1 /mnt
  mount: /dev/sdc1 already mounted or /mnt busy

The device is 'busy' because it is part of a multipath map. See the
output of 'multipath -l' to confirm this.


The problem is multi-fold. SCSI device names (the /dev/sd* ones) are
not persistent. The persistent names are the ones created in
/dev/disk/by-*/.

Now, in a setup where you mix Device Mapper LVM and Device Mapper
Multipath, you need to ensure that the multipath map is created before
LVM locks the underlying device. You also need to instruct LVM to look
for Physical Volumes in /dev/mapper/.

Sorry, I don't have the up-to-date specifics, but you absolutely need
to ensure that the bare device (/dev/sd*) is not locked by LVM.
Otherwise, the multipath map will not be created. And from your logs,
that might be the case, because both your VGs, vg_system and
vg_services, are active on the SCSI device.
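
As a first check on your side, you could look at which device node
each PV is actually sitting on. Below is a rough sketch, not something
taken from the package: classify_pv is a helper name I made up, and
the patterns are my assumption about how the names look on your
system. It only tells a bare /dev/sd* node apart from something under
/dev/mapper; whether that mapper device really is a multipath map
still has to be confirmed with 'multipath -l'.

```shell
#!/bin/sh
# Hypothetical helper (not part of lvm2 or multipath-tools):
# classify a PV device name as reported by 'pvs'.
classify_pv() {
    case "$1" in
        # Under /dev/mapper: possibly a multipath map, confirm separately.
        /dev/mapper/*)  echo "device-mapper" ;;
        # Bare SCSI disk or partition such as /dev/sdc3.
        /dev/sd[a-z]*)  echo "bare-scsi" ;;
        # Anything else (e.g. /dev/dm-N) is not resolved by this sketch.
        *)              echo "other" ;;
    esac
}

# Usage on a live system (needs root), left commented out on purpose:
#   pvs --noheadings -o pv_name | while read -r pv; do
#       printf '%s %s\n' "$pv" "$(classify_pv "$pv")"
#   done
```

If the PVs show up as bare-scsi, the usual place to restrict what LVM
scans is the filter setting in the devices section of
/etc/lvm/lvm.conf. Please check lvm.conf(5) for the exact syntax on
your release; I am not quoting a tested value here.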

Note: You can't call these SCSI devices paths, because that notion
applies only to a Device Mapper Multipath device.

> The other PV in vg_services was, if I recall correctly, created on
> the multipath device-mapper device.
> 

I doubt that. From your logs:

Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-1: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-0: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-0:0-0: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-1: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-0: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-1: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: Sep 17 14:05:12 | rport-7:0-0: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:05:13 redactedhostname multipath-tools-boot[833]: done.
Sep 17 14:05:13 redactedhostname lvm[912]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdc3 not /dev/sda3
Sep 17 14:05:13 redactedhostname lvm[912]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sde3 not /dev/sdc3
Sep 17 14:05:13 redactedhostname lvm[912]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdg3 not /dev/sde3
Sep 17 14:05:13 redactedhostname lvm[912]: 1 logical volume(s) in volume group "vg_system" now active
Sep 17 14:05:13 redactedhostname lvm[912]: 3 logical volume(s) in volume group "vg_services" now active
Sep 17 14:05:13 redactedhostname lvm[1033]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdc3 not /dev/sda3
Sep 17 14:05:13 redactedhostname systemd-fsck[1036]: /dev/mapper/vg_services-lv_services_redactedhostname: clean, 24/655360 files, 79676/2621440 blocks
Sep 17 14:05:13 redactedhostname lvm[1033]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sde3 not /dev/sdc3
Sep 17 14:05:13 redactedhostname lvm[1033]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdg3 not /dev/sde3
Sep 17 14:05:13 redactedhostname lvm[1033]: 1 logical volume(s) in volume group "vg_system" now active
Sep 17 14:05:13 redactedhostname lvm[1033]: 3 logical volume(s) in volume group "vg_services" now active
Sep 17 14:05:13 redactedhostname kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
Sep 17 14:05:13 redactedhostname lvm[1053]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdc3 not /dev/sda3
Sep 17 14:05:13 redactedhostname lvm[1053]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sde3 not /dev/sdc3
Sep 17 14:05:13 redactedhostname lvm[1053]: Found duplicate PV is1EDNsscfDf7DdFzBM0obX2jZzUMOXa: using /dev/sdg3 not /dev/sde3
Sep 17 14:05:13 redactedhostname lvm[1053]: 1 logical volume(s) in volume group "vg_system" monitored
Sep 17 14:05:13 redactedhostname lvm[1053]: 3 logical volume(s) in volume group "vg_services" monitored

Even though these are devices under /dev/mapper, they are from a
different target, NOT multipath. And those VGs became active because
the SCSI device had the PV metadata.

From your logs:

Sep 17 14:04:41 redactedhostname kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)

So, your rootfs (which you confirmed is on SAN) was mounted. That dm-0
device must be referencing LVM.

And then later this happened:

Sep 17 14:04:42 redactedhostname kernel: EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro
Sep 17 14:04:52 redactedhostname multipath-tools-boot[833]: Discovering and coalescing multipaths...Sep 17 14:04:43 | rport-0:0-1: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:04:52 redactedhostname multipath-tools-boot[833]: Sep 17 14:04:43 | rport-0:0-0: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:04:52 redactedhostname multipath-tools-boot[833]: Sep 17 14:04:43 | rport-0:0-1: failed to set fast_io_fail_tmo to 5, error 22
Sep 17 14:04:52 redactedhostname multipath-tools-boot[833]: Sep 17 14:04:44 | rport-7:0-0: failed to set fast_io_fail_tmo to 5, error 22

That re-mount is, I'm guessing, when it reached the real init. So it
looks like, by the time multipath was brought into the picture, the
devices had already been acquired by LVM.

What do you have to say about my explanation?
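
If you want to repeat this reading against your complete log, the
duplicate-PV messages can be pulled out mechanically. A small sketch,
assuming a POSIX shell with sed and sort; duplicate_pv_devices is a
name I made up, and the pattern is based only on the message format
seen in the lines quoted above:

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print, for each
# distinct "Found duplicate PV" message, the PV UUID followed by the
# device LVM chose and the device it skipped.
duplicate_pv_devices() {
    sed -n 's/.*Found duplicate PV \([^:]*\): using \([^ ]*\) not \([^ ]*\).*/\1 \2 \3/p' \
        | sort -u
}

# Usage (the log path is an assumption, adjust to where your syslog lives):
#   duplicate_pv_devices < /var/log/syslog
```

If every device in that output is a /dev/sd* node and none is under
/dev/mapper, that would be consistent with LVM having seen the same PV
through each SCSI device before any multipath map existed.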
I might be wrong here, because my interpretation is based only on
your logs.

> > The rule for stable updates is to have the same fix present
> > upstream, and then in the Unstable repo. Neither of which is true
> > in this case.
> 
> Ah, I understand now.
> So how are we going to proceed here?
> 
> If you're going to tell me, that you can't push this to stable, I'll
> put a working package on our internal mirror. I opened this bugreport
> because I feel like this should be fixed officially, so others don't
> run into the same issues.
> It'd already be nice if the "[fd43c41] Drop udev rule to invoke
> multipath per path." Fix was pushed to stable, as it seems to resolve
> the boot issue. What are the rules for code in /debian, since there
> is no 'upstream' of this?
> 

I'd love to fix and push, provided it is a bug. First we need to
confirm that. Then, whether it'll be accepted or not is up to the
stable release managers.

-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."