On 02/21/2017 05:57 PM, Oleg Nesterov wrote:
> On 02/18, Alexey Gladkov wrote:
>>
>> This patch allows mounting only the pid-related part of /proc, without
>> the remaining objects. Since this is an add-on to /proc, flags applied to
>> /proc also affect this pidfs filesystem.
> 
> I leave this to you and Eric, but imo it would be nice to avoid another
> filesystem.
> 
>> Why not implement it as another flag to /proc ?
>>
>> The /proc flags are stored in the pid_namespace and are global to the
>> namespace. This means that if you add a flag to hide everything except
>> the pids, it will act on all mounted instances of /proc.
> 
> But perhaps we can use mnt_flags? For example, let's abuse MNT_NODEV, see
> the simple patch below. Not sure it is correct/complete, just to illustrate
> the idea.
> 
> With this patch you can mount proc with -o nodev and it will only show
> pids/self/thread-self:
> 
>       # mkdir /tmp/D
>       # mount -t proc -o nodev none /tmp/D
>       # ls /tmp/D
>       1   11  13  15  17  19  20  22  24  28  3   31  33  4  56  7  9     thread-self
>       10  12  14  16  18  2   21  23  27  29  30  32  34  5  6   8  self
>       # cat /tmp/D/meminfo
>       cat: /tmp/D/meminfo: No such file or directory
>       # ls /tmp/D/irq
>       ls: cannot open directory /tmp/D/irq: No such file or directory
> 
> No?

Yes!!! If this whole effort with pidfs and overlayfs moves forward, I would
prefer to see the nodev procfs version rather than another fs.

As far as the overlayfs part is concerned, having an overlayfs mounted on /proc
inside a container may cause problems, because applications sometimes check
that /proc really contains procfs (by checking statfs.f_type == PROC_SUPER_MAGIC
or by reading /proc/mounts).
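
For illustration, the statfs-based check mentioned above might look roughly
like the sketch below. The helper name is_procfs() is hypothetical and not
taken from any particular application; it assumes a Linux system with the
<linux/magic.h> uapi header available.

```c
#include <sys/vfs.h>      /* statfs(2), struct statfs */
#include <linux/magic.h>  /* PROC_SUPER_MAGIC */

/*
 * is_procfs() is a hypothetical helper used only for illustration.
 * Returns 1 if path lives on procfs, 0 if it lives on some other
 * filesystem, and -1 if statfs() fails (for example, no such path).
 */
static int is_procfs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return -1;

	/* f_type is a signed word on some architectures, so compare as unsigned. */
	return (unsigned long)st.f_type == (unsigned long)PROC_SUPER_MAGIC;
}
```

With an overlayfs mounted on /proc, statfs() reports the overlayfs magic
rather than PROC_SUPER_MAGIC, so a check of this kind would return 0 and the
application would conclude that /proc is not usable.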

-- Pavel

> Oleg.
> 
> 
> --- a/fs/proc/generic.c
> +++ b/fs/proc/generic.c
> @@ -305,11 +305,22 @@ int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
>  
>  int proc_readdir(struct file *file, struct dir_context *ctx)
>  {
> +     int mnt_flags = file->f_path.mnt->mnt_flags;
>       struct inode *inode = file_inode(file);
>  
> +     if (mnt_flags & MNT_NODEV)
> +             return 1;
> +
>       return proc_readdir_de(PDE(inode), file, ctx);
>  }
>  
> +static int proc_dir_open(struct inode *inode, struct file *file)
> +{
> +     if (file->f_path.mnt->mnt_flags & MNT_NODEV)
> +             return -ENOENT;
> +     return 0;
> +}
> +
>  /*
>   * These are the generic /proc directory operations. They
>   * use the in-memory "struct proc_dir_entry" tree to parse
> @@ -319,6 +330,7 @@ static const struct file_operations proc_dir_operations = {
>       .llseek                 = generic_file_llseek,
>       .read                   = generic_read_dir,
>       .iterate_shared         = proc_readdir,
> +     .open                   = proc_dir_open,
>  };
>  
>  /*
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -318,12 +318,16 @@ proc_reg_get_unmapped_area(struct file *file, unsigned long orig_addr,
>  
>  static int proc_reg_open(struct inode *inode, struct file *file)
>  {
> +     int mnt_flags = file->f_path.mnt->mnt_flags;
>       struct proc_dir_entry *pde = PDE(inode);
>       int rv = 0;
>       int (*open)(struct inode *, struct file *);
>       int (*release)(struct inode *, struct file *);
>       struct pde_opener *pdeo;
>  
> +     if (mnt_flags & MNT_NODEV)
> +             return -ENOENT;
> +
>       /*
>        * Ensure that
>        * 1) PDE's ->release hook will be called no matter what
> 