On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen <d...@sr71.net> wrote: > > From: Dave Hansen <dave.han...@linux.intel.com> > > Physical addresses are sensitive information. There are > existing, known exploits that are made easier if physical > information is available. Here is one example: > > http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf > > If you know the physical address of something you also know at > which kernel virtual address you can find something (modulo > highmem). It means that things that keep the kernel from > accessing user mappings (like SMAP/SMEP) can be worked around > because the _kernel_ mapping can get used instead. > > But, /proc/$pid/pagemap exposes the physical addresses of all > pages accessible to userspace. This works against all of the > efforts to keep kernel addresses out of places where unprivileged > apps can find them. > > This patch introduces a "paranoid" option for /proc. It can be > enabled like this: > > mount -o remount,paranoid /proc > > Or when /proc is mounted initially. When 'paranoid' mode is > active, opens to /proc/$pid/pagemap will return -EPERM for users > without CAP_SYS_RAWIO. It can be disabled like this: > > mount -o remount,notparanoid /proc > > The option is applied to the pid namespace, so an app that wanted > a separate policy from the rest of the system could get run in > its own pid namespace. > > I'm not really that stuck on the name. I'm not opposed to making > it apply only to pagemap or to giving it a pagemap-specific > name. > > pagemap is also the kind of feature that could be used to escalate > privileged from root in to the kernel. It probably needs to be > protected in the same way that /dev/mem or module loading is in > cases where the kernel needs to be protected from root, thus the > choice to use CAP_SYS_RAWIO. > > Signed-off-by: Dave Hansen <dave.han...@linux.intel.com>
Seems reasonable. I would note that even CAP_SYS_RAWIO isn't enough to actually do anything with RAM in /dev/mem. That's entirely controlled by CONFIG_STRICT_DEVMEM. I think /proc/kpagecount and /proc/kpageflags should get filtered as well, instead of them relying on the uid=0 check. Reviewed-by: Kees Cook <keesc...@chromium.org> -Kees > --- > > b/fs/proc/root.c | 10 +++++++++- > b/fs/proc/task_mmu.c | 11 +++++++++++ > b/include/linux/pid_namespace.h | 1 + > 3 files changed, 21 insertions(+), 1 deletion(-) > > diff -puN fs/proc/root.c~privileged-pagemap fs/proc/root.c > --- a/fs/proc/root.c~privileged-pagemap 2015-03-09 13:33:12.104796793 -0700 > +++ b/fs/proc/root.c 2015-03-09 13:33:12.111797109 -0700 > @@ -39,10 +39,12 @@ static int proc_set_super(struct super_b > } > > enum { > - Opt_gid, Opt_hidepid, Opt_err, > + Opt_gid, Opt_hidepid, Opt_paranoid, Opt_notparanoid, Opt_err, > }; > > static const match_table_t tokens = { > + {Opt_paranoid, "paranoid"}, > + {Opt_notparanoid, "notparanoid"}, > {Opt_hidepid, "hidepid=%u"}, > {Opt_gid, "gid=%u"}, > {Opt_err, NULL}, > @@ -70,6 +72,12 @@ static int proc_parse_options(char *opti > return 0; > pid->pid_gid = make_kgid(current_user_ns(), option); > break; > + case Opt_paranoid: > + pid->paranoid = 1; > + break; > + case Opt_notparanoid: > + pid->paranoid = 0; > + break; > case Opt_hidepid: > if (match_int(&args[0], &option)) > return 0; > diff -puN fs/proc/task_mmu.c~privileged-pagemap fs/proc/task_mmu.c > --- a/fs/proc/task_mmu.c~privileged-pagemap 2015-03-09 13:33:12.106796883 > -0700 > +++ b/fs/proc/task_mmu.c 2015-03-09 13:33:12.112797154 -0700 > @@ -1322,9 +1322,20 @@ out: > > static int pagemap_open(struct inode *inode, struct file *file) > { > + struct pid_namespace *ns = inode->i_sb->s_fs_info; > + > pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about " > "to stop being page-shift some time soon. See the " > "linux/Documentation/vm/pagemap.txt for details.\n"); > + > + /* > + * Use the RAWIO capability bit. If you can not go open > + * /dev/mem, then you also have no business knowing the > + * physical addresses of things. > + */ > + if (ns->paranoid && !capable(CAP_SYS_RAWIO)) > + return -EPERM; > + > return 0; > } > > diff -puN include/linux/pid_namespace.h~privileged-pagemap > include/linux/pid_namespace.h > --- a/include/linux/pid_namespace.h~privileged-pagemap 2015-03-09 > 13:33:12.108796973 -0700 > +++ b/include/linux/pid_namespace.h 2015-03-09 13:33:12.112797154 -0700 > @@ -43,6 +43,7 @@ struct pid_namespace { > struct work_struct proc_work; > kgid_t pid_gid; > int hide_pid; > + int paranoid; > int reboot; /* group exit code if this pidns was rebooted */ > struct ns_common ns; > }; > _ -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/