Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
On Tue, Apr 26, 2016 at 09:36:23AM -0500, Serge E. Hallyn wrote:
> > In the ideal world there is no mountinfo file, but
> > /proc/self/mountinfo//
> > directory with individual files, so every subsystem and filesystem has
> > absolute freedom to store there all relevant information. The result will
> > be also lucky kernel that does not have to always generate entire huge
> > mountinfo file for all mountpoints... etc. :-)
>
> Yeah mountinfo does seem like a big stick to swing around every time I want
> one little piece of information. Also might be good to have per-fstype
> directories so we can just look under /proc/self/mountsdir/cgroupfs/ for
> only cgroupfs s.
>
> There we might also find open fds for source and mountdir, kinda fitting
> in with previous discussions of separating bdev_open() and mountat().
>
> BTW, assuming this would in fact report source and mountpoint location
> with fds, these would really (through realpath) be reported relative to
> the reader's namespace, as I'm doing and advocating here.

So, what's the consensus here? Is everyone okay with the posted patches?

Thanks.

--
tejun
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
Hello,

On Sun, Apr 17, 2016 at 03:04:32PM -0500, serge.hal...@ubuntu.com wrote:
> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
> +			    struct kernfs_root *kf_root)
> +{
> +	int len = 0, ret = 0;
> +	char *buf = NULL;
> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> +	struct cgroup *ns_cgroup;
> +
> +	mutex_lock(&cgroup_mutex);
> +	spin_lock_bh(&css_set_lock);
> +	ns_cgroup = cset_cgroup_from_root(ns->root_cset, kf_cgroot);
> +	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, NULL, 0);
> +	if (len > 0)
> +		buf = kmalloc(len + 1, GFP_ATOMIC);

Ugh... What's up with GFP_ATOMIC? Just allocate maximum sized buffer
before grabbing any locks.

Thanks.

--
tejun
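The suggestion above can be sketched in userspace C. This is only an analogy under stated assumptions, not the kernel code: `struct shared_path` and `snapshot_path()` are hypothetical names, a pthread mutex stands in for the cgroup locks, and `malloc(PATH_MAX)` stands in for a sleeping kernel allocation done before any lock is taken. The point is that the worst-case buffer is allocated first, so nothing inside the critical section needs to allocate.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical shared state standing in for data guarded by a lock. */
struct shared_path {
    pthread_mutex_t lock;
    const char *path;   /* only read or changed while holding lock */
};

/*
 * Return a malloc'ed copy of the guarded path, or NULL on failure.
 * The caller frees the result.
 */
char *snapshot_path(struct shared_path *sp)
{
    char *buf;
    size_t len;

    assert(sp != NULL);

    /* Allocate the maximum-sized buffer before taking any lock. */
    buf = malloc(PATH_MAX);
    if (!buf)
        return NULL;

    pthread_mutex_lock(&sp->lock);
    /* Only a bounded copy happens inside the critical section. */
    len = strlen(sp->path);
    if (len < PATH_MAX)
        memcpy(buf, sp->path, len + 1);
    pthread_mutex_unlock(&sp->lock);

    if (len >= PATH_MAX) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The trade-off is a larger allocation than strictly needed in exchange for not having to size the buffer while the locks are held.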
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
On Tue, Apr 26, 2016 at 12:29:25PM +0200, Karel Zak wrote:
> On Mon, Apr 25, 2016 at 09:42:07PM -0500, Serge E. Hallyn wrote:
> > > Looking at it another way... The value we're talking about shows us
> > > the path of the root dentry of a cgroup mount. If a task in cgns2
> > > rooted at /a/b/c mounts a cgroupfs, it will see '/' as the root dentry.
> > > If a task in cgns1 rooted at /a/b looks at that mountinfo, '/' would
> > > be misleading. It really should be '/c'.
> >
> > So I think that for cgroup mount entries in mountinfo to be useful (i.e.
> > to criu) we either need the root dentry path to be given as relative to
> > the reader's cgroup namespace (as I have it in this patchset), or else
> > we need to add another piece of information in the mountinfo entry, such
> > as the nsfd inode number of the cgroup namespace in which it was
> > mounted.
>
> In the ideal world there is no mountinfo file, but /proc/self/mountinfo//
> directory with individual files, so every subsystem and filesystem has
> absolute freedom to store there all relevant information. The result will
> be also lucky kernel that does not have to always generate entire huge
> mountinfo file for all mountpoints... etc. :-)

Yeah mountinfo does seem like a big stick to swing around every time I want
one little piece of information. Also might be good to have per-fstype
directories so we can just look under /proc/self/mountsdir/cgroupfs/ for
only cgroupfs s.

There we might also find open fds for source and mountdir, kinda fitting
in with previous discussions of separating bdev_open() and mountat().

BTW, assuming this would in fact report source and mountpoint location
with fds, these would really (through realpath) be reported relative to
the reader's namespace, as I'm doing and advocating here.

-serge
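For context on what "one little piece of information" costs today, here is a minimal sketch of pulling the root field and the filesystem type out of a single mountinfo line. It assumes the field layout documented in proc(5): mount ID, parent ID, major:minor, root, mount point, options, optional fields, a lone "-", then fstype, source and super options. The function name `parse_mountinfo_line()` and the sample line in the test are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the root (field 4) and the filesystem type from one
 * /proc/<pid>/mountinfo line. The root is returned as written in the
 * file, i.e. still carrying octal escapes such as \040 for a space.
 * Returns 0 on success, -1 if the line is malformed or a buffer is
 * too small.
 */
int parse_mountinfo_line(const char *line, char *root, size_t root_sz,
                         char *fstype, size_t fstype_sz)
{
    char rootbuf[4096], fsbuf[256];
    const char *sep;

    assert(line && root && fstype);

    /* Skip mount ID, parent ID and major:minor, then read the root. */
    if (sscanf(line, "%*d %*d %*s %4095s", rootbuf) != 1)
        return -1;

    /* Optional fields end at a lone "-"; the fstype follows it. */
    sep = strstr(line, " - ");
    if (!sep || sscanf(sep + 3, "%255s", fsbuf) != 1)
        return -1;

    if (strlen(rootbuf) >= root_sz || strlen(fsbuf) >= fstype_sz)
        return -1;
    strcpy(root, rootbuf);
    strcpy(fstype, fsbuf);
    return 0;
}
```

A caller interested only in cgroup mounts would still have to read and parse every line of the file and discard the rest, which is the overhead being discussed.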
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
On Mon, Apr 25, 2016 at 09:42:07PM -0500, Serge E. Hallyn wrote:
> > Looking at it another way... The value we're talking about shows us
> > the path of the root dentry of a cgroup mount. If a task in cgns2
> > rooted at /a/b/c mounts a cgroupfs, it will see '/' as the root dentry.
> > If a task in cgns1 rooted at /a/b looks at that mountinfo, '/' would
> > be misleading. It really should be '/c'.
>
> So I think that for cgroup mount entries in mountinfo to be useful (i.e.
> to criu) we either need the root dentry path to be given as relative to
> the reader's cgroup namespace (as I have it in this patchset), or else
> we need to add another piece of information in the mountinfo entry, such
> as the nsfd inode number of the cgroup namespace in which it was
> mounted.

In the ideal world there is no mountinfo file, but /proc/self/mountinfo//
directory with individual files, so every subsystem and filesystem has
absolute freedom to store there all relevant information. The result will
be also lucky kernel that does not have to always generate entire huge
mountinfo file for all mountpoints... etc. :-)

    Karel

--
 Karel Zak
 http://karelzak.blogspot.com
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
Quoting Serge E. Hallyn (se...@hallyn.com):
> Quoting Serge E. Hallyn (se...@hallyn.com):
> > Quoting Eric W. Biederman (ebied...@xmission.com):
> > > "Serge E. Hallyn" writes:
> > >
> > > >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > > >> index 671dc05..9a0d7b3 100644
> > > >> --- a/kernel/cgroup.c
> > > >> +++ b/kernel/cgroup.c
> > > >> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root
> > > >> *dst_root, u16 ss_mask)
> > > >>  	return 0;
> > > >>  }
> > > >>
> > > >> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node
> > > >> *kf_node,
> > > >> +			    struct kernfs_root *kf_root)
> > > >> +{
> > > >> +	int len = 0, ret = 0;
> > > >> +	char *buf = NULL;
> > > >> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> > > >> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> > > >> +	struct cgroup *ns_cgroup;
> > > >> +
> > > >> +	mutex_lock(&cgroup_mutex);
> > > >
> > > > Hm, I can't grab the cgroup mutex here because I already have the
> > > > namespace_sem. But that's required by cset_cgroup_from_root(). Can
> > > > I just call that under rcu_read_lock() instead? (Not without
> > > > changing the lockdep_assert_held()). Is there another way to get the
> > > > info needed here?
> > >
> > > Do we need the current cgroup namespace information at all?
> > >
> > > Could we not get the relevant cgroup namespace from the mount of
> > > cgroupfs?
> >
> > I don't think so. That was my first inclination. But at show_path()
> > all we have is the vfsmount->mnt_root. Since all cgroup namespaces
> > for a hierarchy share the same dentry tree and superblock, there's
> > no way to tell where the mount's namespace root is supposed to be.
> >
> > whether we did
> >
> > # enter new cgroup namespace rooted at cgroup /user.slice/user-1000.slice
> > mount -t cgroup -o freezer freezer /mnt
> >
> > or
> >
> > mount --bind /sys/fs/cgroup/freezer/user.slice/user-1000.slice /mnt
> >
> > the mountinfo entry will be the same.
> >
> > > In general the better path is not to have the contents of files depend on
> > > who is reading the file.
>
> And actually, while as I said above this was my first inclination, I now
> think that's wrong. /proc/$$/cgroup is virtualized per the reader. The
> point of this patch is to make mountinfo virtualized analogously to
> /proc/$$/cgroup, so that we can be certain how a particular cgroup dentry
> relates to a task's actual cgroup. So the mountinfo dentry root path
> should in fact depend on the reader.
>
> Looking at it another way... The value we're talking about shows us
> the path of the root dentry of a cgroup mount. If a task in cgns2
> rooted at /a/b/c mounts a cgroupfs, it will see '/' as the root dentry.
> If a task in cgns1 rooted at /a/b looks at that mountinfo, '/' would
> be misleading. It really should be '/c'.

So I think that for cgroup mount entries in mountinfo to be useful (i.e.
to criu) we either need the root dentry path to be given as relative to
the reader's cgroup namespace (as I have it in this patchset), or else
we need to add another piece of information in the mountinfo entry, such
as the nsfd inode number of the cgroup namespace in which it was
mounted.

-serge
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
Quoting Serge E. Hallyn (se...@hallyn.com):
> Quoting Eric W. Biederman (ebied...@xmission.com):
> > "Serge E. Hallyn" writes:
> >
> > >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > >> index 671dc05..9a0d7b3 100644
> > >> --- a/kernel/cgroup.c
> > >> +++ b/kernel/cgroup.c
> > >> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root
> > >> *dst_root, u16 ss_mask)
> > >>  	return 0;
> > >>  }
> > >>
> > >> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node
> > >> *kf_node,
> > >> +			    struct kernfs_root *kf_root)
> > >> +{
> > >> +	int len = 0, ret = 0;
> > >> +	char *buf = NULL;
> > >> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> > >> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> > >> +	struct cgroup *ns_cgroup;
> > >> +
> > >> +	mutex_lock(&cgroup_mutex);
> > >
> > > Hm, I can't grab the cgroup mutex here because I already have the
> > > namespace_sem. But that's required by cset_cgroup_from_root(). Can
> > > I just call that under rcu_read_lock() instead? (Not without
> > > changing the lockdep_assert_held()). Is there another way to get the
> > > info needed here?
> >
> > Do we need the current cgroup namespace information at all?
> >
> > Could we not get the relevant cgroup namespace from the mount of
> > cgroupfs?
>
> I don't think so. That was my first inclination. But at show_path()
> all we have is the vfsmount->mnt_root. Since all cgroup namespaces
> for a hierarchy share the same dentry tree and superblock, there's
> no way to tell where the mount's namespace root is supposed to be.
>
> whether we did
>
> # enter new cgroup namespace rooted at cgroup /user.slice/user-1000.slice
> mount -t cgroup -o freezer freezer /mnt
>
> or
>
> mount --bind /sys/fs/cgroup/freezer/user.slice/user-1000.slice /mnt
>
> the mountinfo entry will be the same.
>
> > In general the better path is not to have the contents of files depend on
> > who is reading the file.

And actually, while as I said above this was my first inclination, I now
think that's wrong. /proc/$$/cgroup is virtualized per the reader. The
point of this patch is to make mountinfo virtualized analogously to
/proc/$$/cgroup, so that we can be certain how a particular cgroup dentry
relates to a task's actual cgroup. So the mountinfo dentry root path
should in fact depend on the reader.

Looking at it another way... The value we're talking about shows us
the path of the root dentry of a cgroup mount. If a task in cgns2
rooted at /a/b/c mounts a cgroupfs, it will see '/' as the root dentry.
If a task in cgns1 rooted at /a/b looks at that mountinfo, '/' would
be misleading. It really should be '/c'.

If there were security implications those might override this. But there
is no security benefit to this. (The usual security argument is about
the opener vs the reader, not the mounter versus the reader, but in
either case I maintain there is no security benefit to virtualizing
these paths)
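The '/c' example above can be made concrete with a small userspace sketch. It is not the kernel's kernfs_path_from_node(), which operates on kernfs nodes and handles more cases. The function name `path_relative_to_ns_root()` is hypothetical, it works on plain path strings, and it simply reports an error when the node is not at or below the namespace root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the path of `node` relative to `ns_root` into `out`.
 * Both inputs are absolute, normalized paths with no trailing slash
 * (except for the root itself, "/").
 * Returns 0 on success, -1 if `node` is not at or below `ns_root`
 * or if `out` is too small.
 */
int path_relative_to_ns_root(const char *ns_root, const char *node,
                             char *out, size_t out_sz)
{
    size_t rlen;
    const char *rest;
    int n;

    assert(ns_root && node && out);

    /* Treat "/" as an empty prefix so that every path is below it. */
    rlen = strcmp(ns_root, "/") == 0 ? 0 : strlen(ns_root);

    if (strncmp(node, ns_root, rlen) != 0)
        return -1;
    rest = node + rlen;
    /* The prefix must end on a component boundary: /a/b is not above /a/bc. */
    if (*rest != '\0' && *rest != '/')
        return -1;

    /* An empty remainder means the node is the namespace root itself. */
    n = snprintf(out, out_sz, "%s", *rest ? rest : "/");
    return (n < 0 || (size_t)n >= out_sz) ? -1 : 0;
}
```

With these inputs, a mount whose root dentry is /a/b/c is shown as '/' to a reader rooted at /a/b/c and as '/c' to a reader rooted at /a/b, matching the cgns2 and cgns1 cases in the message.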
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
Quoting Eric W. Biederman (ebied...@xmission.com):
> "Serge E. Hallyn" writes:
>
> >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> >> index 671dc05..9a0d7b3 100644
> >> --- a/kernel/cgroup.c
> >> +++ b/kernel/cgroup.c
> >> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root
> >> *dst_root, u16 ss_mask)
> >>  	return 0;
> >>  }
> >>
> >> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node
> >> *kf_node,
> >> +			    struct kernfs_root *kf_root)
> >> +{
> >> +	int len = 0, ret = 0;
> >> +	char *buf = NULL;
> >> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> >> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> >> +	struct cgroup *ns_cgroup;
> >> +
> >> +	mutex_lock(&cgroup_mutex);
> >
> > Hm, I can't grab the cgroup mutex here because I already have the
> > namespace_sem. But that's required by cset_cgroup_from_root(). Can
> > I just call that under rcu_read_lock() instead? (Not without
> > changing the lockdep_assert_held()). Is there another way to get the
> > info needed here?
>
> Do we need the current cgroup namespace information at all?
>
> Could we not get the relevant cgroup namespace from the mount of
> cgroupfs?

I don't think so. That was my first inclination. But at show_path()
all we have is the vfsmount->mnt_root. Since all cgroup namespaces
for a hierarchy share the same dentry tree and superblock, there's
no way to tell where the mount's namespace root is supposed to be.

whether we did

# enter new cgroup namespace rooted at cgroup /user.slice/user-1000.slice
mount -t cgroup -o freezer freezer /mnt

or

mount --bind /sys/fs/cgroup/freezer/user.slice/user-1000.slice /mnt

the mountinfo entry will be the same.

> In general the better path is not to have the contents of files depend on
> who is reading the file.

-serge
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
"Serge E. Hallyn" writes:

>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>> index 671dc05..9a0d7b3 100644
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root
>> *dst_root, u16 ss_mask)
>>  	return 0;
>>  }
>>
>> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node
>> *kf_node,
>> +			    struct kernfs_root *kf_root)
>> +{
>> +	int len = 0, ret = 0;
>> +	char *buf = NULL;
>> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
>> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
>> +	struct cgroup *ns_cgroup;
>> +
>> +	mutex_lock(&cgroup_mutex);
>
> Hm, I can't grab the cgroup mutex here because I already have the
> namespace_sem. But that's required by cset_cgroup_from_root(). Can
> I just call that under rcu_read_lock() instead? (Not without
> changing the lockdep_assert_held()). Is there another way to get the
> info needed here?

Do we need the current cgroup namespace information at all?

Could we not get the relevant cgroup namespace from the mount of
cgroupfs?

In general the better path is not to have the contents of files depend on
who is reading the file.

Eric
Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
On Sun, Apr 17, 2016 at 03:04:32PM -0500, serge.hal...@ubuntu.com wrote:
> From: Serge Hallyn
>
> When showing a cgroupfs entry in mountinfo, show the
> path of the mount root dentry relative to the reader's
> cgroup namespace root.
>
> Signed-off-by: Serge Hallyn
> ---
>  fs/kernfs/mount.c      | 14 ++
>  include/linux/kernfs.h |  2 ++
>  kernel/cgroup.c        | 35 +++
>  3 files changed, 51 insertions(+)
>
> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> index f73541f..3b78724 100644
> --- a/fs/kernfs/mount.c
> +++ b/fs/kernfs/mount.c
> @@ -15,6 +15,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "kernfs-internal.h"
>
> @@ -40,6 +41,18 @@ static int kernfs_sop_show_options(struct seq_file *sf,
> struct dentry *dentry)
>  	return 0;
>  }
>
> +static int kernfs_sop_show_path(struct seq_file *sf, struct dentry *dentry)
> +{
> +	struct kernfs_node *node = dentry->d_fsdata;
> +	struct kernfs_root *root = kernfs_root(node);
> +	struct kernfs_syscall_ops *scops = root->syscall_ops;
> +
> +	if (scops && scops->show_path)
> +		return scops->show_path(sf, node, root);
> +
> +	return seq_dentry(sf, dentry, " \t\n\\");
> +}
> +
>  const struct super_operations kernfs_sops = {
>  	.statfs		= simple_statfs,
>  	.drop_inode	= generic_delete_inode,
> @@ -47,6 +60,7 @@ const struct super_operations kernfs_sops = {
>
>  	.remount_fs	= kernfs_sop_remount_fs,
>  	.show_options	= kernfs_sop_show_options,
> +	.show_path	= kernfs_sop_show_path,
>  };
>
>  /**
> diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
> index c06c442..30f089e 100644
> --- a/include/linux/kernfs.h
> +++ b/include/linux/kernfs.h
> @@ -152,6 +152,8 @@ struct kernfs_syscall_ops {
>  	int (*rmdir)(struct kernfs_node *kn);
>  	int (*rename)(struct kernfs_node *kn, struct kernfs_node *new_parent,
>  		      const char *new_name);
> +	int (*show_path)(struct seq_file *sf, struct kernfs_node *kn,
> +			 struct kernfs_root *root);
>  };
>
>  struct kernfs_root {
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 671dc05..9a0d7b3 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root
> *dst_root, u16 ss_mask)
>  	return 0;
>  }
>
> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
> +			    struct kernfs_root *kf_root)
> +{
> +	int len = 0, ret = 0;
> +	char *buf = NULL;
> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> +	struct cgroup *ns_cgroup;
> +
> +	mutex_lock(&cgroup_mutex);

Hm, I can't grab the cgroup mutex here because I already have the
namespace_sem. But that's required by cset_cgroup_from_root(). Can
I just call that under rcu_read_lock() instead? (Not without
changing the lockdep_assert_held()). Is there another way to get the
info needed here?

> +	spin_lock_bh(&css_set_lock);
> +	ns_cgroup = cset_cgroup_from_root(ns->root_cset, kf_cgroot);
> +	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, NULL, 0);
> +	if (len > 0)
> +		buf = kmalloc(len + 1, GFP_ATOMIC);
> +	if (buf)
> +		ret = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, len +
> 1);
> +
> +	spin_unlock_bh(&css_set_lock);
> +	mutex_unlock(&cgroup_mutex);
> +
> +	if (len <= 0)
> +		return len;
> +	if (!buf)
> +		return -ENOMEM;
> +	if (ret == len) {
> +		seq_escape(sf, buf, " \t\n\\");
> +		ret = 0;
> +	} else if (ret >= 0)
> +		ret = -EINVAL;
> +	kfree(buf);
> +	return ret;
> +}
> +
>  static int cgroup_show_options(struct seq_file *seq,
>  			       struct kernfs_root *kf_root)
>  {
> @@ -5430,6 +5464,7 @@ static struct kernfs_syscall_ops cgroup_kf_syscall_ops
> = {
>  	.mkdir = cgroup_mkdir,
>  	.rmdir = cgroup_rmdir,
>  	.rename = cgroup_rename,
> +	.show_path = cgroup_show_path,
>  };
>
>  static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
> --
> 2.7.4
>
> ___
> Containers mailing list
> contain...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers
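The dispatch in kernfs_sop_show_path() follows a common pattern: call an optional per-filesystem hook if one is registered, otherwise fall back to generic behaviour. A minimal userspace sketch of that pattern follows. All names here (`struct fs_ops`, `struct fs_root`, `show_path()`, `ns_show_path()`) are hypothetical stand-ins, and plain strings replace the seq_file and dentry arguments of the real code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for kernfs_syscall_ops. */
struct fs_ops {
    /* Optional hook; NULL means "use the generic behaviour". */
    int (*show_path)(char *out, size_t out_sz, const char *node_path);
};

/* Hypothetical stand-in for kernfs_root. */
struct fs_root {
    const struct fs_ops *ops;   /* may be NULL */
};

/* Generic fallback: report the path unchanged. */
static int default_show_path(char *out, size_t out_sz, const char *node_path)
{
    size_t len = strlen(node_path);

    if (len >= out_sz)
        return -1;
    memcpy(out, node_path, len + 1);
    return 0;
}

/* Prefer the filesystem's hook when present, else use the fallback. */
int show_path(const struct fs_root *root, char *out, size_t out_sz,
              const char *node_path)
{
    assert(root && out && node_path);

    if (root->ops && root->ops->show_path)
        return root->ops->show_path(out, out_sz, node_path);
    return default_show_path(out, out_sz, node_path);
}

/* Example hook that tags the path so the dispatch is observable. */
int ns_show_path(char *out, size_t out_sz, const char *node_path)
{
    int n = snprintf(out, out_sz, "ns:%s", node_path);

    return (n < 0 || (size_t)n >= out_sz) ? -1 : 0;
}
```

Checking both the ops table and the individual hook for NULL keeps filesystems that do not care about path virtualization on the generic path with no changes on their side.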
[PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
From: Serge Hallyn

When showing a cgroupfs entry in mountinfo, show the
path of the mount root dentry relative to the reader's
cgroup namespace root.

Signed-off-by: Serge Hallyn
---
 fs/kernfs/mount.c      | 14 ++
 include/linux/kernfs.h |  2 ++
 kernel/cgroup.c        | 35 +++
 3 files changed, 51 insertions(+)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index f73541f..3b78724 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include

 #include "kernfs-internal.h"

@@ -40,6 +41,18 @@ static int kernfs_sop_show_options(struct seq_file *sf, struct dentry *dentry)
 	return 0;
 }

+static int kernfs_sop_show_path(struct seq_file *sf, struct dentry *dentry)
+{
+	struct kernfs_node *node = dentry->d_fsdata;
+	struct kernfs_root *root = kernfs_root(node);
+	struct kernfs_syscall_ops *scops = root->syscall_ops;
+
+	if (scops && scops->show_path)
+		return scops->show_path(sf, node, root);
+
+	return seq_dentry(sf, dentry, " \t\n\\");
+}
+
 const struct super_operations kernfs_sops = {
 	.statfs		= simple_statfs,
 	.drop_inode	= generic_delete_inode,
@@ -47,6 +60,7 @@ const struct super_operations kernfs_sops = {

 	.remount_fs	= kernfs_sop_remount_fs,
 	.show_options	= kernfs_sop_show_options,
+	.show_path	= kernfs_sop_show_path,
 };

 /**
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index c06c442..30f089e 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -152,6 +152,8 @@ struct kernfs_syscall_ops {
 	int (*rmdir)(struct kernfs_node *kn);
 	int (*rename)(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		      const char *new_name);
+	int (*show_path)(struct seq_file *sf, struct kernfs_node *kn,
+			 struct kernfs_root *root);
 };

 struct kernfs_root {
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 671dc05..9a0d7b3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
 	return 0;
 }

+static int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
+			    struct kernfs_root *kf_root)
+{
+	int len = 0, ret = 0;
+	char *buf = NULL;
+	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
+	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
+	struct cgroup *ns_cgroup;
+
+	mutex_lock(&cgroup_mutex);
+	spin_lock_bh(&css_set_lock);
+	ns_cgroup = cset_cgroup_from_root(ns->root_cset, kf_cgroot);
+	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, NULL, 0);
+	if (len > 0)
+		buf = kmalloc(len + 1, GFP_ATOMIC);
+	if (buf)
+		ret = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, len + 1);
+
+	spin_unlock_bh(&css_set_lock);
+	mutex_unlock(&cgroup_mutex);
+
+	if (len <= 0)
+		return len;
+	if (!buf)
+		return -ENOMEM;
+	if (ret == len) {
+		seq_escape(sf, buf, " \t\n\\");
+		ret = 0;
+	} else if (ret >= 0)
+		ret = -EINVAL;
+	kfree(buf);
+	return ret;
+}
+
 static int cgroup_show_options(struct seq_file *seq,
 			       struct kernfs_root *kf_root)
 {
@@ -5430,6 +5464,7 @@ static struct kernfs_syscall_ops cgroup_kf_syscall_ops = {
 	.mkdir = cgroup_mkdir,
 	.rmdir = cgroup_rmdir,
 	.rename = cgroup_rename,
+	.show_path = cgroup_show_path,
 };

 static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
--
2.7.4
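As a userspace aside: the value a show_path callback emits ends up as the fourth field ("root") of each /proc/self/mountinfo line, so that is where the effect of this patch would be visible to a reader inside a cgroup namespace. A minimal sketch of pulling that field out of one line follows. The struct mountinfo_entry and parse_mountinfo_line() names are hypothetical and exist only for this example, and the fixed buffer sizes are an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields this sketch cares about. */
struct mountinfo_entry {
	char root[256];		/* field 4: mount root within the filesystem */
	char mountpoint[256];	/* field 5: mount point seen by the reader */
	char fstype[64];	/* first field after the "-" separator */
};

/* Parse one mountinfo line; returns 0 on success, -1 on malformed input. */
static int parse_mountinfo_line(const char *line, struct mountinfo_entry *e)
{
	const char *sep;

	assert(line && e);

	/* Skip mount ID, parent ID and major:minor, then take root and mount point. */
	if (sscanf(line, "%*d %*d %*u:%*u %255s %255s", e->root, e->mountpoint) != 2)
		return -1;

	/*
	 * The optional fields end at a lone "-". Spaces inside fields are
	 * escaped as \040, so " - " can only be that separator.
	 */
	sep = strstr(line, " - ");
	if (!sep || sscanf(sep + 3, "%63s", e->fstype) != 1)
		return -1;
	return 0;
}
```

Feeding it lines read with fgets() from /proc/self/mountinfo and printing e.root for entries whose fstype is "cgroup" should show the path reported for each cgroupfs mount. The paths come back still escaped (for example \040 for a space), since the sketch does no unescaping.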