On Sat, Mar 11, 2017 at 6:13 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> PS: AFAICS, simple mount --bind of your pid-only mount will suddenly
> expose the full thing.  And as for the lifetimes making no sense...
> note that you are simply not freeing these structures of yours.
> Try to handle that and you'll get a serious PITA all over the
> place.
>
> What are you trying to achieve, anyway?  Why not add a second vfsmount
> pointer per pid_namespace and make it initialized on demand, at the
> first attempt of no-pid mount?  Just have a separate no-pid instance
> created for those namespaces where it had been asked for, with
> separate superblock and dentry tree not containing anything other
> that pid-only parts + self + thread-self...

Can't we just make procfs work like most other filesystems and have
each mount have its own superblock?  If we need to do something funky
to stat() output to keep existing userspace working, I think that's
okay.

As far as I can tell, proc_mnt is very nearly useless -- it seems to
be used for proc_flush_task (which claims to be purely an optimization
and could be preserved in the common case where there's only one
relevant mount) and for sysctl_binary.  For the latter, we could
create proc_mnt but make actual user-initiated mounts be new
superblocks anyway.
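
To make the stat() concern concrete, here is a minimal userspace sketch, not kernel code. It assumes that userspace can see a change to per-mount superblocks mainly through st_dev: as far as I understand, two procfs mounts in the same pid namespace currently share a superblock and so report the same device number, and with one superblock per mount they would differ unless stat() output is adjusted. The helper name same_device() is made up for illustration.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not an existing API): compare the device numbers
 * that stat() reports for two paths.
 *
 * Returns 1 if both paths report the same st_dev, 0 if they differ,
 * and -1 if either stat() call fails.
 */
int same_device(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    return sa.st_dev == sb.st_dev;
}
```

Calling same_device() on the roots of two procfs mounts (the mount points are up to the caller) would show whether existing userspace sees them as one filesystem or two.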