[systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Martin Pitt
Hello all,

I have a better fix now.

Martin Pitt [2015-05-19  8:59 +0200]:
  - Boot, dev-foo.device becomes DEVICE_FOUND_MOUNT/tentative
 
  - Do some more mounts from /dev/foo, e. g.
 
mkdir /tmp/etc /tmp/boot
mount -o bind /etc /tmp/etc
mount -o bind /boot /tmp/boot
 
(In practice, you'd probably do such bind mounts with nspawn --bind
or lxc.mount.entry)
 
tmp-etc.mount and tmp-boot.mount will be BindsTo=dev-foo.device,
and dev-foo.device's status is still unchanged
(DEVICE_FOUND_MOUNT/tentative)
 
  - Now do umount /tmp/boot. This *also* umounts /tmp/etc!

With this patch, the unit states are now as expected:

  Id=dev-foo.device
  BindsTo=
  BoundBy=tmp-boot.mount
  ActiveState=activating
  SubState=tentative

  Id=tmp-boot.mount
  BindsTo=dev-foo.device
  BoundBy=
  ActiveState=active
  SubState=mounted

And /tmp/etc stays around. After manually unmounting /tmp/mount, the
device state properly goes to dead:

  Id=dev-foo.device
  BindsTo=
  BoundBy=
  ActiveState=inactive
  SubState=dead

 So the problem is that this tentative → dead transition only works if
 a device is referenced once, but causes overzealous unmounts if there
 are more references.
 
 We need a reference count, or check if the device is bound by other
 device still. Come to think of it now, we actually already have that:
 
   Id=dev-foo.device
   BoundBy=tmp-boot.mount tmp-etc.mount
 
 But my patch doesn't take that into account yet.

This one does now. I left a fairly comprehensive commit log as this
isn't that easy to explain/understand. If it's too verbose for you I
can also trim it back a bit.

Thanks,

Martin

-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
From bfd1d6f2032dd68d36fbba716ec0e645cb3cbf97 Mon Sep 17 00:00:00 2001
From: Martin Pitt martin.p...@ubuntu.com
Date: Tue, 19 May 2015 09:59:21 +0200
Subject: [PATCH 2/2] device: Fix overzealous unmounting of tentative device
 mounts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Scenario: booting with plan9 fs, a container with writable /sys, or some other
thing with a mount that references a device /dev/foo which isn't actually
available as /dev/foo and in udev, i. e. it is state tentative.

 - Boot, dev-foo.device becomes DEVICE_FOUND_MOUNT/tentative

 - Do some more mounts from /dev/foo, e. g.

   mkdir /tmp/etc /tmp/boot
   mount -o bind /etc /tmp/etc
   mount -o bind /boot /tmp/boot

   (In practice, you'd probably do such bind mounts with nspawn --bind
   or lxc.mount.entry)

   tmp-etc.mount and tmp-boot.mount will be BindsTo=dev-foo.device,
   and dev-foo.device's status is still unchanged
   (DEVICE_FOUND_MOUNT/tentative)

 - Now do umount /tmp/boot. This *also* umounts /tmp/etc!

   Journal says

 systemd[1]: tmp-boot.mount: Changed mounted - dead
 systemd[1]: dev-foo.device: Changed tentative - dead
 systemd[1]: tmp-etc.mount: Changed mounted - unmounting
 systemd[1]: Unmounted /tmp/etc.

  and dev-foo.device is now inactive/dead, which tears down
  the bound tmp-etc.mount. Boom!

Reason: Unmounting /tmp/boot triggers
device_update_found_one(found==DEVICE_FOUND_MOUNT, add==0).
d-found previously was DEVICE_FOUND_MOUNT, and this

 n = add ? (d-found | found) : (d-found  ~found);

computes the new state to 0 (i. e. DEVICE_NOT_FOUND), and calls
device_set_state(DEVICE_DEAD). Thus here we (1) don't consider that
dev-foo.device is still bound by other units (tmp-etc.mount) and (2) lose the
information that dev-foo.device is tentative.

We must only do this tentative → dead transition if there are no other mount
references to dev-foo.device any more, otherwise it must keep its state to
prevent systemd from unmounting everything else from a tentative device as soon
as one mount from it gets unmounted.

Add a check device_update_found_one() to see if there any active mounts bound
to a device when we pick up an unmount from mountinfo, and keep the state if
so.

https://launchpad.net/bugs/102
http://lists.freedesktop.org/archives/systemd-devel/2015-May/031658.html
http://lists.freedesktop.org/archives/systemd-devel/2015-May/031952.html
---
 src/core/device.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/core/device.c b/src/core/device.c
index c784cab..65eb773 100644
--- a/src/core/device.c
+++ b/src/core/device.c
@@ -450,6 +450,25 @@ static void device_update_found_one(Device *d, bool add, DeviceFound found, bool
 if (n == d-found)
 return;
 
+/* If we found a tentative device via mountinfo which is still
+ * referenced by other mounts than the one that just got unmounted, we
+ * must leave it FOUND_MOUNT/tentative; otherwise we'd trigger
+ * unmounting of all other bound mounts. */
+if (d-found == DEVICE_FOUND_MOUNT  n == 0) {
+Iterator i;
+Unit *bound;
+
+

Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread systemd github import bot
Patchset imported to github.
Pull request:
https://github.com/systemd-devs/systemd/compare/master...systemd-mailing-devs:20150519082346.GG3222%40piware.de

--
Generated by https://github.com/haraldh/mail2git
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Lennart Poettering
On Tue, 19.05.15 10:23, Martin Pitt (martin.p...@ubuntu.com) wrote:

 Hello all,
 
 I have a better fix now.
 
  So the problem is that this tentative → dead transition only works if
  a device is referenced once, but causes overzealous unmounts if there
  are more references.

Ah, indeed, that makes sense!

  We need a reference count, or check if the device is bound by other
  device still. Come to think of it now, we actually already have that:
  
Id=dev-foo.device
BoundBy=tmp-boot.mount tmp-etc.mount
  
  But my patch doesn't take that into account yet.
 
 This one does now. I left a fairly comprehensive commit log as this
 isn't that easy to explain/understand. If it's too verbose for you I
 can also trim it back a bit.

I have now committed a different fix now, that keeps counting of the
mount points in mount.c, instead of reaching over from device.c.

I only gave this light testing, would be cool if you could check if
this fixes things for you.

http://cgit.freedesktop.org/systemd/systemd/commit/?id=fcd8b266edf0df2b85079fcf7b099cd4028740e6

This commit will now collect two sets of devices while going through
/proc/self/mountinfo: the devices of lines that are no longer there,
and the devices of lines that are there. Only for devices in the
former set that are not in the latter we will now propagate an event
to device.c.

Does this make sense?

 +/* If we found a tentative device via mountinfo which is still
 + * referenced by other mounts than the one that just got unmounted, 
 we
 + * must leave it FOUND_MOUNT/tentative; otherwise we'd trigger
 + * unmounting of all other bound mounts. */
 +if (d-found == DEVICE_FOUND_MOUNT  n == 0) {
 +Iterator i;
 +Unit *bound;
 +
 +SET_FOREACH(bound, UNIT(d)-dependencies[UNIT_BOUND_BY], i) {
 +if (bound-type == UNIT_MOUNT 
 +MOUNT(bound)-state != MOUNT_DEAD  
 MOUNT(bound)-state != MOUNT_FAILED) {
 +log_unit_debug(UNIT(d), Still bound by %s 
 (%s), keeping state.,
 +   bound-id, 
 mount_state_to_string(MOUNT(bound)-state));
 +return;
 +}
 +}
 +log_unit_debug(UNIT(d), Not referenced by any mounts any 
 more, changing to dead.);
 +}
 +
  previous = d-found;
  d-found = n;

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Martin Pitt
Hey Lennart,

Lennart Poettering [2015-05-19 13:56 +0200]:
 I have now committed a different fix now, that keeps counting of the
 mount points in mount.c, instead of reaching over from device.c.
 
 I only gave this light testing, would be cool if you could check if
 this fixes things for you.
 
 http://cgit.freedesktop.org/systemd/systemd/commit/?id=fcd8b266edf0df2b85079fcf7b099cd4028740e6
 
 This commit will now collect two sets of devices while going through
 /proc/self/mountinfo: the devices of lines that are no longer there,
 and the devices of lines that are there. Only for devices in the
 former set that are not in the latter we will now propagate an event
 to device.c.
 
 Does this make sense?

It does, and it indeed should avoid some round trips. However, it does
not work yet. I added this for extra debugging:

--- a/src/core/device.c
+++ b/src/core/device.c
@@ -771,6 +771,9 @@ int device_found_node(Manager *m, const char *node, bool 
add, DeviceFound found,
 assert(m);
 assert(node);
 
+if (node[0] == '/')
+log_warning( device_found_node node %s add %i found %i 
now %i, node, add, found, now);
+
 /* This is called whenever we find a device referenced in
  * /proc/swaps or /proc/self/mounts. Such a device might be
  * mounted/enabled at a time where udev has not finished

After unmounting /tmp/etc, my dev-sda3.device (which plays the role of
dev-foo.device) still becomes dead, and /tmp/boot gets unmounted:

 unmounting /tmp/etc 
Id=dev-sda3.device
BindsTo=
BoundBy=
ActiveState=inactive
SubState=dead

Id=tmp-boot.mount
BindsTo=
BoundBy=
ActiveState=inactive
SubState=dead

Journal follows. The first bits are from the manual mount commands:


| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]: tmp-etc.mount: Changed dead - mounted
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]: tmp-boot.mount: Changed dead - mounted
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1

Now umount /tmp/etc happens:

| systemd[1]: tmp-etc.mount: Changed mounted - dead
| systemd[1]:  device_found_node node /dev/sda3 add 0 found 2 now 1

^ So device_found_node() already gets called here, although there
should still be another active mount on sda3. This now causes the
usual cleanup unmount slaughter:

| systemd[1]: dev-sda3.device: Changed tentative - dead
| systemd[1]: tmp-boot.mount: Trying to enqueue job tmp-boot.mount/stop/replace
| systemd[1]: tmp-boot.mount: Installed new job tmp-boot.mount/stop as 357
| systemd[1]: tmp-boot.mount: Enqueued job tmp-boot.mount/stop as 357
| systemd[1]: tmp-etc.mount: Collecting.
| systemd[1]: Failed to reset devices.list on 
/lxc/test/system.slice/tmp-boot.mount: Permission denied

(FTR, I get thousands of those, but that's unrelated)

| systemd[1]: tmp-boot.mount: About to execute: /bin/umount /tmp/boot -n
| systemd[1]: tmp-boot.mount: Forked /bin/umount as 641
| systemd[1]: tmp-boot.mount: Changed mounted - unmounting
| systemd[1]: Unmounting /tmp/boot...
| systemd[641]: tmp-boot.mount: Executing: /bin/umount /tmp/boot -n

I'm not sure where this comes from now; there is no manual mount
command to bring back /tmp/boot. It looks like it bounces, and
quickly remounts /tmp/boot and then unmounts it again:

| systemd[1]:  device_found_node node /dev/sda3 add 1 found 2 now 1
| systemd[1]: dev-sda3.device: Changed dead - tentative
| systemd[1]: tmp-boot.mount: Trying to enqueue job tmp-boot.mount/start/fail
| systemd[1]: Requested transaction contradicts existing jobs: Resource 
deadlock avoided
| systemd[1]:  device_found_node node /dev/sda3 add 0 found 2 now 1
| systemd[1]: dev-sda3.device: Changed tentative - dead
| systemd[1]: Received SIGCHLD from PID 641 (umount).
| systemd[1]: Child 641 (umount) died (code=exited, status=0/SUCCESS)
| systemd[1]: tmp-boot.mount: Child 641 belongs to tmp-boot.mount
| systemd[1]: tmp-boot.mount: Mount process exited, code=exited status=0
| systemd[1]: tmp-boot.mount: Changed unmounting - dead
| systemd[1]: tmp-boot.mount: Job tmp-boot.mount/stop finished, result=done
| systemd[1]: Unmounted /tmp/boot.
| systemd[1]: tmp-boot.mount: Collecting.
| systemd[1]: dev-sda3.device: Collecting.

My first hunch is that this is caused by calling
mount_load_proc_self_mountinfo() in mount_dispatch_io()
(src/core/mount.c:1682) *before* it goes through that new
SET_FOREACH() loop. That call will already see the removed mount and
call device_found_node() with the removal.

I'll debug this more closely ASAP, just need to finish something else
first.

Martin
-- 
Martin Pitt| 

Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Martin Pitt
Martin Pitt [2015-05-19 15:17 +0200]:
 My first hunch is that this is caused by calling
 mount_load_proc_self_mountinfo() in mount_dispatch_io()
 (src/core/mount.c:1682) *before* it goes through that new
 SET_FOREACH() loop. That call will already see the removed mount and
 call device_found_node() with the removal.

That was a red herring. Turns out that your patch only added devices
to around which had mount-just_mounted || mount-just_changed,
which is obviously not the case for mounts which have been around for
a while (and these are the ones we need to keep!). This patch fixes
that; although I'm unsure yet whether we *also* need to record the ones
with mount-just_mounted || mount-just_changed; my gut feeling says
yes, and we need to restructure this a bit to run the set_put(around, ...)
for all mount-is_mounted Mount units. WDYT?

Anyway, with this it behaves as it should.

Martin
-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
From fb089bb6dfccd63b395fbda8461bba6ad3541827 Mon Sep 17 00:00:00 2001
From: Martin Pitt martin.p...@ubuntu.com
Date: Tue, 19 May 2015 15:55:23 +0200
Subject: [PATCH] mount: Fix mountinfo check for devices which are still around

Fix commit fcd8b266: For recording the devices for which there are still mounts
around in mountinfo we need to consider the mounts which did *not* just get
mounted or changed, but actually have existed for a longer time.

TODO: check if we *also* need to consider mounts with just_{mounted,changed}
---
 src/core/mount.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/core/mount.c b/src/core/mount.c
index c1a9ea5..06e0807 100644
--- a/src/core/mount.c
+++ b/src/core/mount.c
@@ -1751,12 +1751,11 @@ static int mount_dispatch_io(sd_event_source *source, int fd, uint32_t revents,
 break;
 }
 
-if (mount-parameters_proc_self_mountinfo.what) {
-
-if (set_ensure_allocated(around, string_hash_ops)  0 ||
-set_put(around, mount-parameters_proc_self_mountinfo.what)  0)
-log_oom();
-}
+} else if (mount-state != MOUNT_DEAD  mount-state != MOUNT_FAILED 
+   mount-parameters_proc_self_mountinfo.what) {
+if (set_ensure_allocated(around, string_hash_ops)  0 ||
+set_put(around, mount-parameters_proc_self_mountinfo.what)  0)
+log_oom();
 }
 
 /* Reset the flags for later calls */
-- 
2.1.4



signature.asc
Description: Digital signature
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Umut Tezduyar Lindskog
On Tue, May 19, 2015 at 1:56 PM, Lennart Poettering
lenn...@poettering.net wrote:
 On Tue, 19.05.15 10:23, Martin Pitt (martin.p...@ubuntu.com) wrote:

 Hello all,

 I have a better fix now.

  So the problem is that this tentative → dead transition only works if
  a device is referenced once, but causes overzealous unmounts if there
  are more references.

 Ah, indeed, that makes sense!

  We need a reference count, or check if the device is bound by other
  device still. Come to think of it now, we actually already have that:
 
Id=dev-foo.device
BoundBy=tmp-boot.mount tmp-etc.mount
 
  But my patch doesn't take that into account yet.

 This one does now. I left a fairly comprehensive commit log as this
 isn't that easy to explain/understand. If it's too verbose for you I
 can also trim it back a bit.

 I have now committed a different fix now, that keeps counting of the
 mount points in mount.c, instead of reaching over from device.c.

 I only gave this light testing, would be cool if you could check if
 this fixes things for you.

 http://cgit.freedesktop.org/systemd/systemd/commit/?id=fcd8b266edf0df2b85079fcf7b099cd4028740e6

 This commit will now collect two sets of devices while going through
 /proc/self/mountinfo: the devices of lines that are no longer there,
 and the devices of lines that are there. Only for devices in the
 former set that are not in the latter we will now propagate an event
 to device.c.

 Does this make sense?

 +/* If we found a tentative device via mountinfo which is still
 + * referenced by other mounts than the one that just got unmounted, 
 we
 + * must leave it FOUND_MOUNT/tentative; otherwise we'd trigger
 + * unmounting of all other bound mounts. */
 +if (d-found == DEVICE_FOUND_MOUNT  n == 0) {
 +Iterator i;
 +Unit *bound;
 +
 +SET_FOREACH(bound, UNIT(d)-dependencies[UNIT_BOUND_BY], i) 
 {
 +if (bound-type == UNIT_MOUNT 
 +MOUNT(bound)-state != MOUNT_DEAD  
 MOUNT(bound)-state != MOUNT_FAILED) {
 +log_unit_debug(UNIT(d), Still bound by %s 
 (%s), keeping state.,
 +   bound-id, 
 mount_state_to_string(MOUNT(bound)-state));
 +return;
 +}
 +}
 +log_unit_debug(UNIT(d), Not referenced by any mounts any 
 more, changing to dead.);
 +}
 +
  previous = d-found;
  d-found = n;

 Lennart

Didn't work for me. Removed device doesn't go in to either cases, so
it never makes it to the around set but it makes it to gone set.

if (!mount-is_mounted)
{

}
else if (mount-just_mounted || mount-just_changed)
{

}

Umut


 --
 Lennart Poettering, Red Hat
 ___
 systemd-devel mailing list
 systemd-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/systemd-devel
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Lennart Poettering
On Tue, 19.05.15 16:05, Martin Pitt (martin.p...@ubuntu.com) wrote:

 That was a red herring. Turns out that your patch only added devices
 to around which had mount-just_mounted || mount-just_changed,
 which is obviously not the case for mounts which have been around for
 a while (and these are the ones we need to keep!). This patch fixes
 that; although I'm unsure yet whether we *also* need to record the ones
 with mount-just_mounted || mount-just_changed; my gut feeling says
 yes, and we need to restructure this a bit to run the set_put(around, ...)
 for all mount-is_mounted Mount units. WDYT?

Yes, we need to also record those which were just mounted or changed,
in case systemd rereads /proc/self/mountinfo at a time where two
mounts where created at once from the same device.

I now committed a change to implement that, can you check if this
fixes the issue for you?

Thanks!

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Martin Pitt
Hey Lennart,

Lennart Poettering [2015-05-19 17:23 +0200]:
 Yes, we need to also record those which were just mounted or changed,
 in case systemd rereads /proc/self/mountinfo at a time where two
 mounts where created at once from the same device.

OK, as I suspected.

 I now committed a change to implement that, can you check if this
 fixes the issue for you?

It does, and it looks much more straightforward now.

Many thanks for your patience, this was a hairy thing to untangle..

Martin
-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)


signature.asc
Description: Digital signature
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] [PATCH v3] device: Fix overzealous unmounting of tentative device mounts

2015-05-19 Thread Lennart Poettering
On Tue, 19.05.15 17:31, Martin Pitt (martin.p...@ubuntu.com) wrote:

 Hey Lennart,
 
 Lennart Poettering [2015-05-19 17:23 +0200]:
  Yes, we need to also record those which were just mounted or changed,
  in case systemd rereads /proc/self/mountinfo at a time where two
  mounts where created at once from the same device.
 
 OK, as I suspected.
 
  I now committed a change to implement that, can you check if this
  fixes the issue for you?
 
 It does, and it looks much more straightforward now.

Perfect! Thanks for tracking this down and keeping it up even when I
was too dumb to understand what the issue was!

Thanks!

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel