Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread Ian Kent
On Wed, 2020-08-05 at 13:27 +0200, Miklos Szeredi wrote:
> On Wed, Aug 5, 2020 at 1:23 PM Ian Kent  wrote:
> > On Wed, 2020-08-05 at 09:45 +0200, Miklos Szeredi wrote:
> > > Hmm, what's the other possibility for lost notifications?
> > 
> > In user space that is:
> > 
> > Multi-threaded application races, single threaded applications and
> > signal processing races, other bugs ...
> 
> Okay, let's fix the bugs then.

It's the the bugs you don't know about that get you, in this case
the world "is" actually out to get you, ;)

Ian



Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread David Howells
Miklos Szeredi  wrote:

> Shoun't we just make sure that the likelyhood of overruns is low

That's not necessarily easy.  To avoid overruns you need a bigger buffer.  The
buffer is preallocated from unswappable kernel space.  Yes, you can increase
the size of the buffer, but it eats out of your pipe bufferage limit.

Further, it's a *general* notifications queue, not just for a specific
purpose, but that means it might get connected to multiple sources, and doing
something like tearing down a container might generate enough notifications to
overrun the queue.

> and if it happens, just reinitialize everthing from scratch (shouldn't be
> *that* expensive).

If you then spend time reinitialising everything, you're increasing the
likelihood of racing with further events.  Further, there multiple expenses:
firstly, you have to tear down and discard all the data that you've spent time
setting up; secondly, it takes time doing all this; thirdly, it takes cpu
cycles away from applications.

The reason I put the event counters in there and made it so that fsinfo()
could read all the mounts in a subtree and their event counters in one go is
to make it faster for the user to find out what changed in the event that a
notification is lost.

I have a patch (not included here as it occasionally induces oopses) that
attempts to make this doable under the RCU read lock so that it doesn't
prevent mounts from taking place during the scan.

David



Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread Ian Kent
On Wed, 2020-08-05 at 09:45 +0200, Miklos Szeredi wrote:
> On Wed, Aug 5, 2020 at 4:46 AM Ian Kent  wrote:
> > Coming back to an actual use case.
> > 
> > What I said above is one aspect but, since I'm looking at this
> > right
> > now with systemd, and I do have the legacy code to fall back to,
> > the
> > "just reset everything" suggestion does make sense.
> > 
> > But I'm struggling to see how I can identify notification buffer
> > overrun in libmount, and overrun is just one possibility for lost
> > notifications, so I like the idea that, as a library user, I can
> > work out that I need to take action based on what I have in the
> > notifications themselves.
> 
> Hmm, what's the other possibility for lost notifications?

In user space that is:

Multi-threaded application races, single threaded applications and
signal processing races, other bugs ...

For example systemd has it's own event handling sub-system and handles
half a dozen or so event types of which the mount changes are one. It's
fairly complex so I find myself wondering if I can trust it and
wondering if there are undiscovered bugs in it. The answer to the
former is probably yes but the answer to the later is also probably
yes.

Maybe I just paranoid!
Ian




Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread Miklos Szeredi
On Wed, Aug 5, 2020 at 1:23 PM Ian Kent  wrote:
>
> On Wed, 2020-08-05 at 09:45 +0200, Miklos Szeredi wrote:

> > Hmm, what's the other possibility for lost notifications?
>
> In user space that is:
>
> Multi-threaded application races, single threaded applications and
> signal processing races, other bugs ...

Okay, let's fix the bugs then.

Thanks,
Miklos


Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread Miklos Szeredi
On Wed, Aug 5, 2020 at 6:07 PM David Howells  wrote:
>
> Miklos Szeredi  wrote:
>
> > Shoun't we just make sure that the likelyhood of overruns is low
>
> That's not necessarily easy.  To avoid overruns you need a bigger buffer.  The
> buffer is preallocated from unswappable kernel space.  Yes, you can increase
> the size of the buffer, but it eats out of your pipe bufferage limit.
>
> Further, it's a *general* notifications queue, not just for a specific
> purpose, but that means it might get connected to multiple sources, and doing
> something like tearing down a container might generate enough notifications to
> overrun the queue.
>
> > and if it happens, just reinitialize everthing from scratch (shouldn't be
> > *that* expensive).
>
> If you then spend time reinitialising everything, you're increasing the
> likelihood of racing with further events.  Further, there multiple expenses:
> firstly, you have to tear down and discard all the data that you've spent time
> setting up; secondly, it takes time doing all this; thirdly, it takes cpu
> cycles away from applications.
>
> The reason I put the event counters in there and made it so that fsinfo()
> could read all the mounts in a subtree and their event counters in one go is
> to make it faster for the user to find out what changed in the event that a
> notification is lost.

That's just overdesigning it, IMO.

If the protocol is extensible (as you state) then the counters can be
added as needed.  And unless the above CPU cycle wastage is actually
observed in practice, the whole thing is unnecessary.

Thanks,
Miklos


Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-05 Thread Miklos Szeredi
On Wed, Aug 5, 2020 at 4:46 AM Ian Kent  wrote:
>

> Coming back to an actual use case.
>
> What I said above is one aspect but, since I'm looking at this right
> now with systemd, and I do have the legacy code to fall back to, the
> "just reset everything" suggestion does make sense.
>
> But I'm struggling to see how I can identify notification buffer
> overrun in libmount, and overrun is just one possibility for lost
> notifications, so I like the idea that, as a library user, I can
> work out that I need to take action based on what I have in the
> notifications themselves.

Hmm, what's the other possibility for lost notifications?

Thanks,
Miklos


Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-04 Thread Ian Kent
On Wed, 2020-08-05 at 10:05 +0800, Ian Kent wrote:
> On Tue, 2020-08-04 at 15:56 +0200, Miklos Szeredi wrote:
> > On Mon, Aug 03, 2020 at 02:37:50PM +0100, David Howells wrote:
> > > Provide support for the handling of an overrun in a watch
> > > queue.  In the
> > > event that an overrun occurs, the watcher needs to be able to
> > > find
> > > out what
> > > it was that they missed.  To this end, previous patches added
> > > event
> > > counters to struct mount.
> > 
> > So this is optimizing the buffer overrun case?
> > 
> > Shoun't we just make sure that the likelyhood of overruns is low
> > and
> > if it
> > happens, just reinitialize everthing from scratch (shouldn't be
> > *that*
> > expensive).
> 
> But maybe not possible if you are using notifications for tracking
> state in user space, you need to know when the thing you have needs
> to be synced because you missed something and it's during the
> notification processing you actually have the object that may need
> to be refreshed.
> 
> > Trying to find out what was missed seems like just adding
> > complexity
> > for no good
> > reason.

Coming back to an actual use case.

What I said above is one aspect but, since I'm looking at this right
now with systemd, and I do have the legacy code to fall back to, the
"just reset everything" suggestion does make sense.

But I'm struggling to see how I can identify notification buffer
overrun in libmount, and overrun is just one possibility for lost
notifications, so I like the idea that, as a library user, I can
work out that I need to take action based on what I have in the
notifications themselves.

Ian



Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-04 Thread Ian Kent
On Tue, 2020-08-04 at 15:56 +0200, Miklos Szeredi wrote:
> On Mon, Aug 03, 2020 at 02:37:50PM +0100, David Howells wrote:
> > Provide support for the handling of an overrun in a watch
> > queue.  In the
> > event that an overrun occurs, the watcher needs to be able to find
> > out what
> > it was that they missed.  To this end, previous patches added event
> > counters to struct mount.
> 
> So this is optimizing the buffer overrun case?
> 
> Shoun't we just make sure that the likelyhood of overruns is low and
> if it
> happens, just reinitialize everthing from scratch (shouldn't be
> *that*
> expensive).

But maybe not possible if you are using notifications for tracking
state in user space, you need to know when the thing you have needs
to be synced because you missed something and it's during the
notification processing you actually have the object that may need
to be refreshed.

> 
> Trying to find out what was missed seems like just adding complexity
> for no good
> reason.
> 



Re: [PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-04 Thread Miklos Szeredi
On Mon, Aug 03, 2020 at 02:37:50PM +0100, David Howells wrote:
> Provide support for the handling of an overrun in a watch queue.  In the
> event that an overrun occurs, the watcher needs to be able to find out what
> it was that they missed.  To this end, previous patches added event
> counters to struct mount.

So this is optimizing the buffer overrun case?

Shoun't we just make sure that the likelyhood of overruns is low and if it
happens, just reinitialize everthing from scratch (shouldn't be *that*
expensive).

Trying to find out what was missed seems like just adding complexity for no good
reason.



[PATCH 10/18] fsinfo: Provide notification overrun handling support [ver #21]

2020-08-03 Thread David Howells
Provide support for the handling of an overrun in a watch queue.  In the
event that an overrun occurs, the watcher needs to be able to find out what
it was that they missed.  To this end, previous patches added event
counters to struct mount.

To make them accessible, they can be retrieved using fsinfo() and the
FSINFO_ATTR_MOUNT_INFO attribute.

struct fsinfo_mount_info {
__u64   mnt_unique_id;
__u64   mnt_attr_changes;
__u64   mnt_topology_changes;
__u64   mnt_subtree_notifications;
...
};

There's a uniquifier and some event counters:

 (1) mnt_unique_id - This is an effectively non-repeating ID given to each
 mount object on creation.  This allows the caller to check that the
 mount ID didn't get reused (the 32-bit mount ID is more efficient to
 look up).

 (2) mnt_attr_changes - Count of attribute changes on a mount object.

 (3) mnt_topology_changes - Count of alterations to the mount tree that
 affected this node.

 (4) mnt_subtree_notifications - Count of mount object event notifications
 that were generated in the subtree rooted at this node.  This excludes
 events generated on this node itself and does not include superblock
 events.

The counters are also accessible through the FSINFO_ATTR_MOUNT_CHILDREN
attribute, where a list of all the children of a mount can be scanned.  The
record returned for each child includes the sum of the counters for that
child.  An additional record is added at the end for the queried object and
that also includes the sum of its counters

The mnt_topology_changes counter is also included in
FSINFO_ATTR_MOUNT_TOPOLOGY.

Signed-off-by: David Howells 
---

 fs/mount_notify.c   |2 ++
 fs/namespace.c  |   21 +
 include/uapi/linux/fsinfo.h |7 +++
 samples/vfs/test-fsinfo.c   |   10 --
 4 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/fs/mount_notify.c b/fs/mount_notify.c
index 57eebae51cb1..57995c27ca88 100644
--- a/fs/mount_notify.c
+++ b/fs/mount_notify.c
@@ -93,6 +93,8 @@ void notify_mount(struct mount *trigger,
n.watch.info= info_flags | watch_sizeof(n);
n.triggered_on  = trigger->mnt_unique_id;
 
+   smp_wmb(); /* See fsinfo_generic_mount_info(). */
+
switch (subtype) {
case NOTIFY_MOUNT_EXPIRY:
case NOTIFY_MOUNT_READONLY:
diff --git a/fs/namespace.c b/fs/namespace.c
index b5c2a3b4f96d..122c12f9512b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4282,6 +4282,17 @@ int fsinfo_generic_mount_info(struct path *path, struct 
fsinfo_context *ctx)
p->mnt_unique_id= m->mnt_unique_id;
p->mnt_id   = m->mnt_id;
 
+#ifdef CONFIG_MOUNT_NOTIFICATIONS
+   p->mnt_subtree_notifications = 
atomic_long_read(>mnt_subtree_notifications);
+   p->mnt_topology_changes = atomic_long_read(>mnt_topology_changes);
+   p->mnt_attr_changes = atomic_long_read(>mnt_attr_changes);
+#endif
+
+   /* Record the counters before reading the attributes as we're not
+* holding a lock.  Paired with a write barrier in notify_mount().
+*/
+   smp_rmb();
+
flags = READ_ONCE(m->mnt.mnt_flags);
if (flags & MNT_READONLY)
p->attr |= MOUNT_ATTR_RDONLY;
@@ -4319,6 +4330,9 @@ int fsinfo_generic_mount_topology(struct path *path, 
struct fsinfo_context *ctx)
 
m = real_mount(path->mnt);
 
+#ifdef CONFIG_MOUNT_NOTIFICATIONS
+   p->mnt_topology_changes = atomic_long_read(>mnt_topology_changes);
+#endif
p->parent_id = m->mnt_parent->mnt_id;
 
if (path->mnt == root.mnt) {
@@ -4445,6 +4459,13 @@ static void fsinfo_store_mount(struct fsinfo_context 
*ctx, const struct mount *p
record.mnt_unique_id= p->mnt_unique_id;
record.mnt_id   = p->mnt_id;
record.parent_id= is_root ? p->mnt_id : p->mnt_parent->mnt_id;
+
+#ifdef CONFIG_MOUNT_NOTIFICATIONS
+   record.mnt_notify_sum   = (atomic_long_read(>mnt_attr_changes) +
+  atomic_long_read(>mnt_topology_changes) +
+  
atomic_long_read(>mnt_subtree_notifications));
+#endif
+
memcpy(ctx->buffer + usage, , sizeof(record));
 }
 
diff --git a/include/uapi/linux/fsinfo.h b/include/uapi/linux/fsinfo.h
index f0a352b7028e..b021466dee0f 100644
--- a/include/uapi/linux/fsinfo.h
+++ b/include/uapi/linux/fsinfo.h
@@ -100,6 +100,9 @@ struct fsinfo_mount_info {
__u64   mnt_unique_id;  /* Kernel-lifetime unique mount ID */
__u32   mnt_id; /* Mount identifier (use with 
AT_FSINFO_MOUNTID_PATH) */
__u32   attr;   /* MOUNT_ATTR_* flags */
+   __u64   mnt_attr_changes;   /* Number of attribute changes to this 
mount. */
+   __u64   mnt_topology_changes;   /* Number of topology changes to this 
mount. */
+   __u64