Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-09-30 Thread chenw...@gmail.com

Hi all,
  any news about lvm2 (2.02.98-5) for wheezy?


--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-08-23 Thread Frank Steinborn
Hi, the latest version of lvm2 from unstable (2.02.98-5) fixes this for
us. When installing lvm2, the following dependencies got pulled from
unstable into the wheezy setup:

- libc-dev-bin 2.13-38 -> 2.17-92
- libc6-dev 2.13-38 -> 2.17-92
- locales 2.13-38 -> 2.17-92
- libc6 2.13-38 -> 2.17-92

I talked to wa...@debian.org on IRC; he will prepare an lvm2 package
for wheezy which fixes this bug.

Thank you all for helping track this down!

Have a nice weekend,
Frank

On Fri, Jul 26, 2013 at 5:14 PM, Frank Steinborn  wrote:
> [full quote of the 2013-07-26 message trimmed; the original appears
> later in this thread]



-- 
Frank Steinborn - steinb...@sipgate.de
Telefon: +49 (0)211-63 55 55-87
Mobil: +49 (0) 173 87 87 2 87
Telefax: +49 (0)211-63 55 55-22

sipgate GmbH - Gladbacher Str. 74 - 40219 Düsseldorf
HRB Düsseldorf 39841 - Geschäftsführer: Thilo Salmon, Tim Mois
Steuernummer: 106/5724/7147, Umsatzsteuer-ID: DE219349391

www.sipgate.de - www.sipgate.at - www.sipgate.co.uk





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-08-12 Thread Sven Hartge
Hi *,

just a short note on this bug (maybe one should merge all those
"unable to remove snapshot" bugs?):

I recompiled the lvm2 source from unstable (2.02.98-5) on wheezy and
installed it on 7 of my servers which had problems with the removal of
LVM snapshots while using rsnapshot and mylvmbackup.

Right now, all 7 servers have each run over 1000 iterations of
rsnapshot, thus creating and removing an LVM snapshot during each backup
cycle, and not one of them has failed.

So something in the current version of lvm2 seems to fix the bug, or at
least dramatically reduce the likelihood of it appearing.

With the original lvm2 (2.02.95-7) the lockup appears nearly instantly.

Regards,
Sven.





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-08-02 Thread Bastian Blank
clone -1 -2
retitle -1 lvm2 - Failed snapshot removal produces suspended devices
severity -1 grave
retitle -2 lvm2 - udev calls blkid on to be removed devices
thanks

On Mon, Feb 13, 2012 at 04:03:53PM +, Paul LeoNerd Evans wrote:
> # lvremove vg_cel/backups-20110930
> Do you really want to remove active logical volume backups-20110930? [y/n]: y
>   Unable to deactivate open vg_cel-backups--20110930-cow (254:35)
>   Failed to resume backups-20110930.
>   libdevmapper exiting with 7 device(s) still suspended.

These are in fact two bugs:
- If the cow device is open for some reason (cat is sufficient to
  trigger this), the whole removal fails and devices remain suspended.
- udev calls blkid on cow devices that the kernel wants to remove during
  a "change" event.

Bastian

-- 
Madness has no purpose.  Or reason.  But it may have a goal.
-- Spock, "The Alternative Factor", stardate 3088.7





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-07-26 Thread Urban Loesch

Hi,

we had the same problems with Debian wheezy, LVM2 and DRBD,
but it does not seem to be DRBD-related. It seems to be a problem
between lvm and udevd.

See:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=549691

Stopping udevd before taking the snapshot and starting it again after
removing the snapshot solved the problem for us. It's only a workaround,
but it works.
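A minimal sketch of that sequence, under the assumption of a sysvinit-style udev service (the invoke-rc.d calls and the snapshot path are illustrative, and both commands are overridable hooks so the flow can be exercised without root):

```shell
# Sketch of the workaround above: keep udevd stopped across the snapshot
# removal so it cannot open the -cow device mid-removal.
# UDEV_CTL and LVREMOVE default to the real commands but can be overridden.
UDEV_CTL=${UDEV_CTL:-invoke-rc.d}
LVREMOVE=${LVREMOVE:-lvremove}

remove_snapshot_with_udev_stopped() {
    snap="$1"                      # e.g. /dev/vg0/lv0-snap (illustrative)
    "$UDEV_CTL" udev stop          # udevd can no longer open the -cow device
    "$LVREMOVE" -f "$snap"
    status=$?
    "$UDEV_CTL" udev start        # always restart udevd, even on failure
    return "$status"
}
```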

Regards
Urban


On 26.07.2013 17:14, Frank Steinborn wrote:

[full quote of the 2013-07-26 message trimmed; the original follows
below in this thread]






Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-07-26 Thread Frank Steinborn
Hi,

we are a bit further in debugging this. We installed a Dell PowerEdge R620
(the same hardware as used in our DRBD cluster where this problem happens). As
no one in this thread had brought DRBD into play, I didn't expect any
interaction with it related to this bug. However, we were not able to
reproduce with just LVM2 (e.g. configure an LV, do IO on the LV, remove the LV, hang).

So we installed a second machine and put DRBD on top of the LVs. And voilà,
as soon as we create a snapshot of the LV that DRBD sits on top of and remove
this snapshot, it fails about 1/3 of the time.

Some facts:

root@drbd-primary:~# lvremove --force /dev/vg0/lv0-snap
  Unable to deactivate open vg0-lv0--snap-cow (254:3)
  Failed to resume lv0-snap.
  libdevmapper exiting with 1 device(s) still suspended.

After this, "dmsetup info" gives the following output:

<<< snip >>>

Name:  vg0-lv0--snap
State: ACTIVE
Read Ahead:256
Tables present:LIVE
Open count:0
Event number:  0
Major, minor:  254, 1
Number of targets: 1
UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYy4WFhwy43CZA1g7zKFGF915pLAOIPvFZ

Name:  vg0-lv0-real
State: ACTIVE
Read Ahead:0
Tables present:LIVE
Open count:1
Event number:  0
Major, minor:  254, 2
Number of targets: 1
UUID:
LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYC3ppjt1CZ3AcZR2hNz1VT5CHdM4RR32j-real

Name:  vg0-lv0
State: SUSPENDED
Read Ahead:256
Tables present:LIVE & INACTIVE
Open count:2
Event number:  0
Major, minor:  254, 0
Number of targets: 1
UUID: LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYC3ppjt1CZ3AcZR2hNz1VT5CHdM4RR32j

Name:  vg0-lv0--snap-cow
State: ACTIVE
Read Ahead:0
Tables present:LIVE
Open count:0
Event number:  0
Major, minor:  254, 3
Number of targets: 1
UUID:
LVM-M0Z897O16CAiYbSivOzgSn0M9Ae9TdoYy4WFhwy43CZA1g7zKFGF915pLAOIPvFZ-cow

<<< snap >>>

As you can see, the real LV with DRBD on top is now in state SUSPENDED -
which causes the cluster to be non-functional as IO operations stall on
both the primary and secondary node until one does "dmsetup resume
/dev/vg0/lv0".

Another interesting issue we've seen: after doing "dmsetup resume
/dev/vg0/lv0", lv0-snap doesn't appear to be a snapshot anymore, given the
output of lvs (lv0-snap has no origin anymore):

  LV   VG   Attr LSize   Pool Origin Data%  Move Log Copy%  Convert
  lv0  vg0  -wi-ao-- 200.00g
  lv0-snap vg0  -wi-a---  40.00g


Some miscellaneous notes:
* It _seems_ to only happen when the snapshot is filled to somewhere
around 50-60%.
* We can trigger something like this even without DRBD. When triggered
however, the LV will never end up in SUSPENDED state and a second try of
lvremove will always succeed.
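Since a second lvremove reportedly always succeeds in that case, a simple retry wrapper may serve as a stopgap. This is a hedged sketch, not something from this thread; LVREMOVE is an overridable hook so the loop can be exercised without real LVM devices.

```shell
# Retry lvremove a few times, pausing between attempts, since a second
# try often succeeds per the note above. Illustrative only.
LVREMOVE=${LVREMOVE:-lvremove}

lvremove_retry() {
    lv="$1"               # e.g. /dev/vg0/lv0-snap (illustrative)
    tries="${2:-3}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$LVREMOVE" -f "$lv" && return 0
        i=$((i + 1))
        sleep 1           # brief pause before the next attempt
    done
    return 1
}
```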

That's all we have so far. I already had a private conversation with
wa...@debian.org about this and we will (probably) provide him remote access
to this system as soon as we have the setup reachable from the outside.

Please let me know if I can provide any more information to get this fixed.
I put drbd-dev in cc, maybe someone over there has an idea on this?

@drbd-dev: system is debian wheezy, w/ drbd 8.3.11, lvm2 2.02.95.

Thanks,
Frank


Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-07-21 Thread Nigel Kukard

Same problem here.

It mostly occurs if there was heavy IO load, and mostly on machines
with more than a few CPUs.


So far it has only occurred for me on LSI RAID controllers (92xx) with
dual quad-core Xeon CPUs.


Running 'sync' before lvremove appears to make it occur less frequently.





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-07-03 Thread Frederic Van Espen
We are seeing this as well on a build server where the LVM volumes are
managed by sbuild. Randomly, the snapshots cannot be removed by sbuild
when a build is done, and all the lv* commands hang forever. This
happens several times per day, so it is quite easily reproducible.

The only way around this is to reboot and then manually remove them.

Attached is some output I see when I increase verbosity on schroot.
E: 05lvm: Using logical volume(s) on command line
E: 05lvm: Archiving volume group "main" metadata (seqno 333).
E: 05lvm: Removing snapshot baseline2-61c813e3-f1d7-4bd9-b9c7-0d3f85e8b4de
E: 05lvm: Found volume group "main"
E: 05lvm: Found volume group "main"
E: 05lvm: Loading main-baseline2_i386_chroot table (253:1)
E: 05lvm: Loading main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de table (253:5)
E: 05lvm: main/snapshot0 already not monitored.
E: 05lvm: Suspending main-baseline2_i386_chroot (253:1) with device flush
E: 05lvm: Suspending main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de (253:5) with device flush
E: 05lvm: Suspending main-baseline2_i386_chroot-real (253:6) with device flush
E: 05lvm: Suspending main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de-cow (253:7) with device flush
E: 05lvm: Found volume group "main"
E: 05lvm: Resuming main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de-cow (253:7)
E: 05lvm: Resuming main-baseline2_i386_chroot-real (253:6)
E: 05lvm: Resuming main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de (253:5)
E: 05lvm: Removing main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de-cow (253:7)
device-mapper: remove ioctl on  failed: Device or resource busy
Unable to deactivate main-baseline2--61c813e3--f1d7--4bd9--b9c7--0d3f85e8b4de-cow (253:7)
Failed to resume baseline2-61c813e3-f1d7-4bd9-b9c7-0d3f85e8b4de.
libdevmapper exiting with 1 device(s) still suspended.



Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-06-25 Thread Alasdair G Kergon
On Mon, Jun 24, 2013 at 05:17:21PM +0200, Frank Steinborn wrote:
> What i basically want to ask is - are there any ETAs possible when this
> will be resolved? We are seriously considering a downgrade to squeeze. That
> would be a _huge_ pain, though.
 
I made some suggestions further back on the bug about the sort of things
Debian should investigate.  I do not believe these problems can be
reproduced upstream or in non-Debian-based distributions and so, as
upstream maintainer, it's not something I can justify putting any time
into: My first answer will always be to rebuild the packages using the
upstream code that works fine elsewhere.

Alasdair





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-06-25 Thread Chris Bolt
Another “me too” here. Nightly LVM snapshots cause all I/O to the snapshotted
LV to hang. None of the dmsetup commands bring them back; the only way to
recover is to power cycle the server, corrupting data. This is a serious
regression.





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-06-24 Thread Frank Steinborn
This bug is a really serious issue for us. Let me explain.

We run active/passive MySQL-clusters using Pacemaker and DRBD. For backups,
we create an LVM-snapshot on the passive node, fire up a MySQL server and
dump data from that snapshot.
When the dump is finished, we stop MySQL on the passive node and remove the
snapshot. This fails almost every time due to this bug.

The big problem, though, is that when the bug appears and the LVM subsystem
freezes, our active node hangs completely in iowait, because DRBD is unable to
finish IO operations on the passive node. So basically the whole cluster
fails and we get hard downtime until we take action manually and
do 'lvresume' and/or reboot.

We had a look at some of the proposed workarounds, but they are not an
option for us.

What I basically want to ask is: are there any ETAs for when this
will be resolved? We are seriously considering a downgrade to squeeze. That
would be a _huge_ pain, though.

Thanks,
Frank


Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-03-13 Thread Nikola Kotur

This was almost a show-stopper for me.

If you want to use LVM snapshots until this bug is fixed, there are only
two options.

1) Low level logical volume management

Remove the snapshot from the device mapper before actually removing it:

dmsetup remove YOUR_VG-snap1
dmsetup remove YOUR_VG-snap1-cow

Then you'll be able to lvremove it without making your server useless.
But this looks and feels dirty. Better to...

2) Use thin provisioning [1]

And get arbitrary depth of recursive snapshots for free :)

Seriously, this bug does not exist if you use thin provisioning. Try
this (if your kernel supports it):

Create thin pool first:

lvcreate --size 300M --type thin-pool --thinpool thin_pool YOUR_VG

Create volume:

lvcreate -V4G -T YOUR_VG/thin_pool --name lv1

Then create and remove a snapshot many times:

for i in {1..20}; do lvcreate -L4G -n snap1 --snapshot /dev/YOUR_VG/lv1 ; lvremove -f /dev/YOUR_VG/snap1 ; done

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/device-mapper/thin-provisioning.txt?id=refs/tags/v3.9-rc2

-- 
Nikola Kotur



Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2013-03-13 Thread Nikola Kotur
I have the same issue. Here's my setup:

Debian: 7.0 wheezy
lvm: 2.02.95-6
dmsetup: 2:1.02.74-6
linux: 3.8.2

Using 'dmsetup resume' doesn't help at all; the only solution I have is
to reboot.

-- 
Nikola Kotur





Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-10-07 Thread C. R. Meloche - CREDIL

On 10/07/2012 07:33 AM, Paul LeoNerd Evans wrote:

On Tue, Feb 14, 2012 at 06:20:57PM +, Paul LeoNerd Evans wrote:

On Mon, Feb 13, 2012 at 05:40:34PM +, Alasdair G Kergon wrote:

dmsetup info -c
dmsetup table
dmsetup status

will show the current device-mapper state

It appears to have happened again:

root@cel:~
# lvremove --force vg_cel/backups-20120117
   /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 161061208064: 
Input/output error
   /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 161061265408: 
Input/output error
   /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 0: Input/output 
error
   /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 4096: 
Input/output error
   Unable to deactivate open vg_cel-backups--20120117-cow (254:31)
   Failed to resume backups-20120117.
libdevmapper exiting with 3 device(s) still suspended.

I have the same problem as well. After a reboot I am finally able to
delete the device; by then it is no longer a snapshot. I noted that if
the device is mirrored (LVM mirror) I get the problem every time.
However, if the device is not mirrored (for instance I tried with swap
volumes, which are not), then I do not get the problem. So I mirrored the
entire server and I was able to do my backups. Note that I was unable to
get out of this using the dmsetup resume idea described before; my
devices do not seem to get suspended.


I have a freshly installed Ubuntu server in operation; I'll get to test
on Tuesday to see if I get the same problem. If I do not, I'll report the
software and kernel versions used then. Do we know if the problem is
still there in sid? In the meantime I may have to re-install my server
using md mirrored RAID with LVM on top instead.





[remainder of quoted message trimmed]







Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-10-07 Thread Paul LeoNerd Evans
On Tue, Feb 14, 2012 at 06:20:57PM +, Paul LeoNerd Evans wrote:
> On Mon, Feb 13, 2012 at 05:40:34PM +, Alasdair G Kergon wrote:
> > dmsetup info -c 
> > dmsetup table
> > dmsetup status
> > 
> > will show the current device-mapper state

It appears to have happened again:

root@cel:~
# lvremove --force vg_cel/backups-20120117
  /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 161061208064: 
Input/output error
  /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 161061265408: 
Input/output error
  /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 0: Input/output 
error
  /dev/vg_cel/backups-20120117: read failed after 0 of 4096 at 4096: 
Input/output error
  Unable to deactivate open vg_cel-backups--20120117-cow (254:31)
  Failed to resume backups-20120117.
libdevmapper exiting with 3 device(s) still suspended.

root@cel:~
# lvs

[at this point the process is hard dead, no SIGTERM etc.. can wake it]

[on another terminal...]

root@cel:~
# ps -lp 26009
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY  TIME CMD
4 D 0 26009 25789  0  80   0 -  6384 ?  pts/900:00:00 lvs

root@cel:~
# kill -9 26009

root@cel:~
# ps -lp 26009
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY  TIME CMD
4 D 0 26009 25789  0  80   0 -  6384 -  pts/900:00:00 lvs



> > dmsetup info -c 
> > dmsetup table
> > dmsetup status

Fortunately even in this state these three commands still work OK. Find
attached current outputs from them.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/
Name Maj Min Stat Open Targ Event  UUID 

vg_cel-swap_b254   4 L--w21  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HI16iR35iOEcTzvQkP5FiB3gEM6j7xnFz 
vg_cel-usr32 254   5 L--w01  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H1Z1LYecBeVkgyNiDO9NWwav7Mlz3yaPZ 
vg_cel-usr64 254  37 L--w11  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HA0GN0FOMeOQcnoSSyfqQTP1pO3l0tTnp 
vg_cel-backups--20120219-cow 254  33 L-sw11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HOk3zIGteF4ZW8RMx6gK278OeQhx4XxVK-cow 
vg_cel-home_mlog 254  49 L--w01  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hxv6VYfC1GdWVa1iY27oJbrIlNJL7OkM5 
vg_cel-backups_mlog  254  24 L--w11  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H5c1VA0oGWdEkXfJ1YGEzbT3dQQN4IxM4 
vg_cel-swap_a254   3 L--w21  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HS56oauHoAv4bKEvbNYDY19g3ggwmZURp 
vg_cel-home_mimage_3 254  20 L--w12  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HLYoMacTflKmk0K651ABTvfZoeA7MZQ0c 
vg_cel-var64_mimage_1254  42 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HqoDJcke6dwNryJtgQ3Qx5O3BiPrN3I5f 
vg_cel-backups-real  254  27 L--w21  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H6rPj6EdEYtrNDpD5PJAT0Rcda62AfMKM-real
vg_cel-home_mimage_2 254  19 L--w12  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H2uhR9DbUE2KndNqe9VUxfVnODnKHwDB2 
vg_cel-var64_mimage_0254  41 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hl3MBXMQsrY28guo9JFwIaRgMqe25jHTL 
vg_cel-backups_mimage_1  254  26 L--w12  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HPWt0LqTzySg5Fxl4xULR4XY4KoR1yqGb 
vg_cel-root32_mimage_3   254  11 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HULW60XnmaFAf3M119FLhnfsplFiqHxSl 
vg_cel-backups_mimage_0  254  25 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H0l57MXC4SeQMiDyWRoqRLOoQiUDIN0PY 
vg_cel-root32_mimage_2   254  10 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HyRF0QSO3X1RhgF4tcELhwKtXrrPxq08a 
vg_cel-var32 254  18 L--w01  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HEbzCGZhM0COhWRT4ouh7NHm3Jlk2nfH1 
vg_cel-var64 254  43 L--w11  1 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H5ykOO1tSch01sjuvKUA00ztOVjRxmSfY 
vg_cel-backups--20120117-cow 254  31 L--w01  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hhaj2TQMEr2opc41ZmmfyKvbptO2hm2oI-cow 
vg_cel-var_mlog_mimage_1 254  14 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HO0jSYUkILFzJOZE0t94wHUha3xxRpDPQ 
vg_cel-usr64_mlog_mimage_1   254  51 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HI3mQKNiPg2xc5g66Yk85kkEqYiFkpopJ 
vg_cel-var64_mlog_mimage_1   254  39 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H0OemESnDTo95rCmeJYI0dBhYqXHHayfF 
vg_cel-var_mlog_mimage_0 254  13 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HddToiFoW4yVX1f64FIk8fJw23BETCbaI 
vg_cel-usr64_mlog_mimage_0   254  50 L--w11  0 
LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HvNf7hEo1zdi3QEreZuViYu6Whva4Ad3r 
vg_cel-var64_

Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-04-25 Thread Chris Dunlop
On Mon, Feb 13, 2012 at 04:03:53PM +, Paul LeoNerd Evans wrote:
> Package: lvm2
> Version: 2.02.88-2
> Severity: normal
> 
> Tried and failed to remove an LVM snapshot:
> 
> 
> root@cel:~
> # lvremove vg_cel/backups-20110930
> Do you really want to remove active logical volume backups-20110930? [y/n]: ^C
>   Logical volume backups-20110930 not removed
> 
> root@cel:~
> # lvchange -an vg_cel/backups-20110930
>   Can't change snapshot logical volume "backups-20110930"
> 
> root@cel:~
> # lvremove vg_cel/backups-20110930
> Do you really want to remove active logical volume backups-20110930? [y/n]: y
>   Unable to deactivate open vg_cel-backups--20110930-cow (254:35)
>   Failed to resume backups-20110930.
>   libdevmapper exiting with 7 device(s) still suspended.
> 
> 
> At this point now the entire LVM subsystem is totally frozen. No commands
> ever complete. Any LVM-related command hangs and is not SIGKILLable.

"Me too", with same lvm2 package, on home-grown linux-3.3.1.

Synopsis...

I was able to get out of this state without rebooting using:

  dmsetup resume /dev/mapper/vg00-foo
  dmsetup remove /dev/mapper/vg00-foo-real

Expansion...

I had 2 snapshots, foo and foo2, mounted at /mnt/foo and
/mnt/foo2. I tried removing them like:

  # for d in foo foo2
  do
umount /mnt/${d} && lvremove -f vg00/${d}-snap
  done

...and got:

  Logical volume "foo-snap" successfully removed
  Unable to deactivate open vg00-foo2--snap-cow (253:10)
  Failed to resume foo2-snap.
  libdevmapper exiting with 1 device(s) still suspended.

After that my 'lvs' hung as Paul describes.

Note: this was actually the second time I've had this problem under the
same circumstances; the first time I ended up reluctantly rebooting. I
wonder if doing multiple lvremoves in quick succession has anything to
do with it?

This time I looked a little deeper into it and found this bug report,
which prompted me to look at the dmsetup tooling, with which I've been
extravagantly unfamiliar. So...

  # dmsetup info /dev/mapper/*foo2*
  Name:  vg00-foo2
  State: SUSPENDED <<<
  Read Ahead:26624
  Tables present:LIVE & INACTIVE   <<<
  Open count:2
  Event number:  0
  Major, minor:  253, 2
  Number of targets: 1
  UUID: LVM-nxK1Vn04ULIJaEIiwxsldXVoJAS9rp3APs0zI7cK2SDQf1lM2CHiXAfQsvKRfeWg

  Name:  vg00-foo2-real
  State: ACTIVE
  Read Ahead:0
  Tables present:LIVE
  Open count:1
  Event number:  0
  Major, minor:  253, 9
  Number of targets: 3
  UUID: 
LVM-nxK1Vn04ULIJaEIiwxsldXVoJAS9rp3APs0zI7cK2SDQf1lM2CHiXAfQsvKRfeWg-real

  Name:  vg00-foo2--snap
  State: ACTIVE
  Read Ahead:26624
  Tables present:LIVE
  Open count:0
  Event number:  0
  Major, minor:  253, 8
  Number of targets: 1
  UUID: LVM-nxK1Vn04ULIJaEIiwxsldXVoJAS9rp3AV6mYEfmfj24I3epL0ldVOHeOXfLDi3SI

  Name:  vg00-foo2--snap-cow
  State: ACTIVE
  Read Ahead:0
  Tables present:LIVE
  Open count:0
  Event number:  0
  Major, minor:  253, 10
  Number of targets: 1
  UUID: LVM-nxK1Vn04ULIJaEIiwxsldXVoJAS9rp3AV6mYEfmfj24I3epL0ldVOHeOXfLDi3SI-cow

Not knowing what I was doing, but because of the SUSPENDED state I
tried to resume that device:

  # dmsetup resume /dev/mapper/vg00-foo2

At this point the previously-hung 'lvs' returned. A new 'lvs' showed
the snapshot in question was no longer present; however, I still had:

  # ls -l /dev/mapper/*foo2*
  lrwxrwxrwx 1 root root   7 2012-04-26 14:51 vg00-foo2 -> ../dm-2
  lrwxrwxrwx 1 root root   7 2012-04-26 14:21 vg00-foo2-real -> ../dm-9

Using dmsetup to remove that extra device worked:

  # dmsetup remove /dev/mapper/vg00-foo2-real
  # ls -l /dev/mapper/*foo2*
  lrwxrwxrwx 1 root root   7 2012-04-26 14:51 vg00-foo2 -> ../dm-2
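The recovery steps above can be wrapped into one function. A hedged sketch (the device name follows the example above; DMSETUP is an overridable hook so the sequence can be exercised without root):

```shell
# Sketch of the recovery sequence described above: resume the suspended
# origin device (which unblocks hung lvm commands), then remove the
# leftover -real device.
DMSETUP=${DMSETUP:-dmsetup}

recover_suspended_snapshot_origin() {
    dev="$1"                           # e.g. /dev/mapper/vg00-foo2
    "$DMSETUP" resume "$dev" || return 1
    "$DMSETUP" remove "${dev}-real"    # clean up the stale -real mapping
}
```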

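The recovery steps above can be condensed into a small sketch. This is a hypothetical helper, not part of lvm2: the device name defaults to the vg00-foo2 example from above, and by default it only prints the dmsetup commands (dmsetup needs root and a real device-mapper table; set DRY_RUN=0 to actually execute them).

```shell
#!/bin/sh
# Sketch of the manual recovery above: resume the SUSPENDED snapshot
# origin so hung LVM commands can finish, then remove the leftover
# -real device. Hypothetical helper -- names are illustrative.
DRY_RUN=${DRY_RUN:-1}
DEV=${1:-vg00-foo2}   # device-mapper name of the suspended origin

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"   # dry run: just show the command
    else
        "$@"                   # DRY_RUN=0: really execute (needs root)
    fi
}

run dmsetup resume "/dev/mapper/$DEV"         # unblocks the stuck 'lvs'
run dmsetup remove "/dev/mapper/${DEV}-real"  # clean up the orphaned -real node
```

With DRY_RUN=0 this reproduces the two commands run by hand above; whether the -real device can be removed still depends on nothing holding it open.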

Cheers,

Chris






Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-02-19 Thread Paul LeoNerd Evans
On Tue, Feb 14, 2012 at 06:20:57PM +, Paul LeoNerd Evans wrote:
> On Mon, Feb 13, 2012 at 05:40:34PM +, Alasdair G Kergon wrote:
> > dmsetup info -c 
> > dmsetup table
> > dmsetup status
> > 
> > will show the current device-mapper state
> 
> OK; find attached three .txt files hopefully showing the output from
> these.

Is there anything else useful I can extract from the machine in its
current state? I'd like to reboot it sometime soon so I can manage LVM
again.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-02-14 Thread Paul LeoNerd Evans
On Mon, Feb 13, 2012 at 05:40:34PM +, Alasdair G Kergon wrote:
> dmsetup info -c 
> dmsetup table
> dmsetup status
> 
> will show the current device-mapper state

OK; find attached three .txt files hopefully showing the output from
these.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/
Name                         Maj Min Stat Open Targ Event  UUID
vg_cel-swap_b                254   7 L--w    2    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HI16iR35iOEcTzvQkP5FiB3gEM6j7xnFz
vg_cel-usr32                 254   8 L--w    0    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H1Z1LYecBeVkgyNiDO9NWwav7Mlz3yaPZ
vg_cel-backups--20120213-cow 254  54 L-sw    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hb6S2LpyJBiuOs25EX2OqRKUyhaX3x0X3-cow
vg_cel-usr64                 254  39 L--w    1    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HA0GN0FOMeOQcnoSSyfqQTP1pO3l0tTnp
vg_cel-backups_mlog          254  30 L--w    1    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H5c1VA0oGWdEkXfJ1YGEzbT3dQQN4IxM4
vg_cel-home_mlog             254  24 L--w    1    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hxv6VYfC1GdWVa1iY27oJbrIlNJL7OkM5
vg_cel-swap_a                254   6 L--w    2    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HS56oauHoAv4bKEvbNYDY19g3ggwmZURp
vg_cel-backups--20111022     254  38 L-sw    0    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hbf5eut1ALuptM6fKZG28KppAyYI3toNG
vg_cel-home_mimage_3         254  26 L--w    1    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HLYoMacTflKmk0K651ABTvfZoeA7MZQ0c
vg_cel-backups--20120213     254  53 L-sw    0    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hb6S2LpyJBiuOs25EX2OqRKUyhaX3x0X3
vg_cel-var64_mimage_1        254  44 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HqoDJcke6dwNryJtgQ3Qx5O3BiPrN3I5f
vg_cel-backups-real          254  33 L--w    4    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H6rPj6EdEYtrNDpD5PJAT0Rcda62AfMKM-real
vg_cel-home_mimage_2         254  25 L--w    1    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H2uhR9DbUE2KndNqe9VUxfVnODnKHwDB2
vg_cel-var64_mimage_0        254  43 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hl3MBXMQsrY28guo9JFwIaRgMqe25jHTL
vg_cel-backups_mimage_1      254  32 L--w    1    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HPWt0LqTzySg5Fxl4xULR4XY4KoR1yqGb
vg_cel-backups--20110930     254  36 L--w    0    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HK1nWnhEbWMtMYTyFrUb9JM3aH2CKp4sn
vg_cel-root32_mimage_3       254  14 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HULW60XnmaFAf3M119FLhnfsplFiqHxSl
vg_cel-backups_mimage_0      254  31 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H0l57MXC4SeQMiDyWRoqRLOoQiUDIN0PY
vg_cel-root32_mimage_2       254  13 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HyRF0QSO3X1RhgF4tcELhwKtXrrPxq08a
vg_cel-var32                 254  21 L--w    0    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HEbzCGZhM0COhWRT4ouh7NHm3Jlk2nfH1
vg_cel-backups--20120117-cow 254  52 L-sw    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hhaj2TQMEr2opc41ZmmfyKvbptO2hm2oI-cow
vg_cel-var64                 254  45 L--w    1    1     1  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H5ykOO1tSch01sjuvKUA00ztOVjRxmSfY
vg_cel-var_mlog_mimage_1     254  17 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HO0jSYUkILFzJOZE0t94wHUha3xxRpDPQ
vg_cel-usr64_mlog_mimage_1   254  47 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HI3mQKNiPg2xc5g66Yk85kkEqYiFkpopJ
vg_cel-backups--20110930-cow 254  35 L--w    0    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HK1nWnhEbWMtMYTyFrUb9JM3aH2CKp4sn-cow
vg_cel-var64_mlog_mimage_1   254  41 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3H0OemESnDTo95rCmeJYI0dBhYqXHHayfF
vg_cel-backups--20111022-cow 254  37 L-sw    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3Hbf5eut1ALuptM6fKZG28KppAyYI3toNG-cow
vg_cel-var_mlog_mimage_0     254  16 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HddToiFoW4yVX1f64FIk8fJw23BETCbaI
vg_cel-usr64_mlog_mimage_0   254  46 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HvNf7hEo1zdi3QEreZuViYu6Whva4Ad3r
vg_cel-var64_mlog_mimage_0   254  40 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HhsqXb6OOSOqtxUFfSfMJoL0hK6BIzC2C
vg_cel-var32_mimage_3        254  20 L--w    1    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HJ5WKRVhn9xXOoXQTouET2hi0sGgjo08m
vg_cel-usr64_mimage_1        254  50 L--w    1    1     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HsgeQlYNf4qSfVYNpm6mCByr1ccriY3qe
vg_cel-var32_mimage_2        254  19 L--w    1    2     0  LVM-85CFNaM1i2Op58cnR160cqLbFQ5pOb3HBh8rtUwqitorKB1r031fLGGcPzAy8xnR
vg_cel-usr64_mimage_0        254  49 L--w    1    1     0  LVM-85CFNaM1i2Op

Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-02-13 Thread Alasdair G Kergon
Maybe some udev interaction.

dmsetup info -c 
dmsetup table
dmsetup status

will show the current device-mapper state
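The three diagnostic commands above can be gathered in one pass for attaching to a bug report; a minimal sketch (a hypothetical helper, assuming only that dmsetup is installed and run as root for real output; errors are captured into the files rather than aborting):

```shell
#!/bin/sh
# Capture device-mapper state into three files for a bug report.
# Stderr (e.g. "must be root" or a missing dmsetup) lands in the
# files too, so the loop never stops the shell.
for args in "info -c" "table" "status"; do
    out="dmsetup-$(echo "$args" | tr ' ' '_').txt"
    dmsetup $args > "$out" 2>&1 || true  # tolerate failure in this sketch
done
```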






Bug#659762: lvm2: LVM commands freeze after snapshot delete fails

2012-02-13 Thread Paul LeoNerd Evans
Package: lvm2
Version: 2.02.88-2
Severity: normal

Tried and failed to remove an LVM snapshot:


root@cel:~
# lvremove vg_cel/backups-20110930
Do you really want to remove active logical volume backups-20110930? [y/n]: ^C
  Logical volume backups-20110930 not removed

root@cel:~
# lvchange -an vg_cel/backups-20110930
  Can't change snapshot logical volume "backups-20110930"

root@cel:~
# lvremove vg_cel/backups-20110930
Do you really want to remove active logical volume backups-20110930? [y/n]: y
  Unable to deactivate open vg_cel-backups--20110930-cow (254:35)
  Failed to resume backups-20110930.
  libdevmapper exiting with 7 device(s) still suspended.


At this point now the entire LVM subsystem is totally frozen. No commands
ever complete. Any LVM-related command hangs and is not SIGKILLable.

root@cel:~
# lvs
^C

[elsewhere]
root@cel:~
# ps -ef | grep lvs
root 31791 31510  0 15:20 pts/14   00:00:00 lvs
root 32272 32176  0 15:59 pts/17   00:00:00 grep lvs

root@cel:~
# kill -9 31791

root@cel:~
# kill -9 31791

root@cel:~
# kill -9 31791

root@cel:~
# ps -ef | grep lvs
root 31791 31510  0 15:20 pts/14   00:00:00 lvs
root 32274 32176  0 15:59 pts/17   00:00:00 grep lvs


I tried to strace it to see what it was blocked on. Even strace now hangs:


root@cel:~
# strace -p 31791
Process 31791 attached - interrupt to quit
^C
^C


strace is at least killable though:

^Z
[1]+  Stopped strace -p 31791

root@cel:~
# kill %1

[1]+  Stopped strace -p 31791

root@cel:~
# kill -9 %1

[1]+  Stopped strace -p 31791

root@cel:~
# 
[1]+  Killed  strace -p 31791


In this state the only remedy I have found is a complete reboot of the system.

Other existing LVs do appear to be functioning normally, however, and the
machine generally works fine.

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages lvm2 depends on:
ii  dmsetup             2:1.02.67-2
ii  initscripts         2.88dsf-13.13
ii  libc6               2.13-24
ii  libdevmapper1.02.1  2:1.02.67-2
ii  libreadline5        5.2-11
ii  libudev0            175-3
ii  lsb-base            3.2-28

lvm2 recommends no packages.

lvm2 suggests no packages.

-- no debconf information


