Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
Dave Hansen writes:

> On 03/12/2015 03:35 PM, Andrew Morton wrote:
>> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen wrote:
>>> From: Dave Hansen
>>>
>>> Physical addresses are sensitive information. There are
>>> existing, known exploits that are made easier if physical
>>> information is available. Here is one example:
>>>
>>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>> Do we really need to disable pagemap entirely? What happens if we just
>> obscure the addresses (ie: zero them)?
>
> I think we have 3 basic options:
>
> 1. Disable it entirely (-EPERM or whatever). Apps using it break
>    quickly and fairly obviously (diagnosable with an strace)
> 2. Zero it, or return some nonsensical thing for the physical address
>    portion, but maintain exporting the PTE flags. Apps only caring
>    about PTE flags work, but anything trying to do lookups in
>    /proc/kpageflags breaks. If we zero it, apps may get confused
>    thinking they have the _actual_ pfn=0.
> 3. Scramble it in some way obscuring the physical address. Unscramble
>    it upon access to /proc/kpageflags.
>
> I think you're suggesting (2). Doesn't that risk silently breaking
> apps?

I think (3), where the scramble is something like AES, is likely to
scramble this well and still protect us from known-plaintext attacks.

>>> pagemap is also the kind of feature that could be used to escalate
>>> privileges from root into the kernel. It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>>
>> Confused. If you have root, you can do mount -o notparanoid.
>
> Good point. I guess it doesn't protect us much here unless we also
> restrict the ability to remount.

And the ability to unmount...

A write-once sysctl or a boot-time-only parameter is much more likely to
be useful in the scenario where you are concerned about root.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On 03/12/2015 03:35 PM, Andrew Morton wrote:
> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen wrote:
>> From: Dave Hansen
>>
>> Physical addresses are sensitive information. There are
>> existing, known exploits that are made easier if physical
>> information is available. Here is one example:
>>
>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
> Do we really need to disable pagemap entirely? What happens if we just
> obscure the addresses (ie: zero them)?

I think we have 3 basic options:

1. Disable it entirely (-EPERM or whatever). Apps using it break
   quickly and fairly obviously (diagnosable with an strace)
2. Zero it, or return some nonsensical thing for the physical address
   portion, but maintain exporting the PTE flags. Apps only caring
   about PTE flags work, but anything trying to do lookups in
   /proc/kpageflags breaks. If we zero it, apps may get confused
   thinking they have the _actual_ pfn=0.
3. Scramble it in some way obscuring the physical address. Unscramble
   it upon access to /proc/kpageflags.

I think you're suggesting (2). Doesn't that risk silently breaking
apps?

>> pagemap is also the kind of feature that could be used to escalate
>> privileges from root into the kernel. It probably needs to be
>> protected in the same way that /dev/mem or module loading is in
>> cases where the kernel needs to be protected from root, thus the
>> choice to use CAP_SYS_RAWIO.
>
> Confused. If you have root, you can do mount -o notparanoid.

Good point. I guess it doesn't protect us much here unless we also
restrict the ability to remount.
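To make option 2 concrete, here is a minimal C sketch of zeroing the PFN field of a pagemap entry while keeping the flag bits. The bit positions used (PFN in bits 0-54, "swapped" in bit 62, "present" in bit 63) follow Documentation/vm/pagemap.txt. The macro and helper names are hypothetical and are not taken from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One 64-bit /proc/PID/pagemap entry, per Documentation/vm/pagemap.txt:
 * bits 0-54 hold the PFN when the page is present, bit 62 means
 * swapped, bit 63 means present. Names below are illustrative only. */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)
#define PM_SWAPPED  (UINT64_C(1) << 62)
#define PM_PRESENT  (UINT64_C(1) << 63)

static bool pagemap_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

/* Hypothetical helper for option 2: clear the PFN, keep every flag bit. */
static uint64_t pagemap_hide_pfn(uint64_t entry)
{
	return entry & ~PM_PFN_MASK;
}

/* Extract the PFN field; it is only meaningful for present pages. */
static uint64_t pagemap_pfn(uint64_t entry)
{
	assert(pagemap_present(entry));
	return entry & PM_PFN_MASK;
}
```

With this scheme a hidden entry is bit-for-bit identical to a present page that really sits at PFN 0, which is the confusion described in option 2.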
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen wrote:

>
> From: Dave Hansen
>
> Physical addresses are sensitive information. There are
> existing, known exploits that are made easier if physical
> information is available. Here is one example:
>
> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem). It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace. This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc. It can be
> enabled like this:
>
> 	mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially. When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO. It can be disabled like this:
>
> 	mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name. I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.

Do we really need to disable pagemap entirely? What happens if we just
obscure the addresses (ie: zero them)?

> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel. It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.

Confused. If you have root, you can do mount -o notparanoid.
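As an illustration of how an application might notice the proposed behaviour, the following C sketch opens a pagemap file and reports the errno from open(). The helper name probe_pagemap is hypothetical. The expectation of EPERM in paranoid mode comes from the patch description quoted above and assumes a kernel carrying that patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Try to open a pagemap file read-only.
 * Returns 0 on success, or the errno value set by open() on failure. */
static int probe_pagemap(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}
```

On a kernel with the patch applied, /proc mounted with "paranoid", and a caller lacking CAP_SYS_RAWIO, probe_pagemap("/proc/self/pagemap") would be expected to return EPERM. That outcome is an assumption drawn from the description, not something verified here.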
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
Dave Hansen writes:

> On 03/09/2015 05:03 PM, Kees Cook wrote:
>> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman wrote:
>>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>>> appropriate random number ought to keep from revealing page numbers or
>>> page adjacencies while not requiring any changes in userspace.
>>>
>>> That way the revealed pfn and the physical pfn would be different but
>>> you could still use pagemap for its intended purpose.
>>
>> If this could be done in a way where it was sufficiently hard to
>> expose the random number, we should absolutely do this.
>
> We would need something which is both reversible (so that the given
> offsets can still be used in /proc/kpageflags) and also hard to do a
> known-plaintext-type attack on it.
>
> Transparent huge pages are a place where userspace knows the
> relationship between 512 adjacent physical addresses. That represents a
> good chunk of known data. Surely there are more of these kinds of things.
>
> Right now, for instance, the ways in which a series of sequential
> allocations come out of the page allocator are fairly deterministic. We
> would also need to do some kind of allocator randomization to ensure
> that userspace couldn't make good guesses about the physical addresses
> of things coming out of the allocator.
>
> Or, we just be sure and turn the darn thing off. :)

Yes. If we are worried about something, a big off switch is fine.

As for a one-to-one transform that is resistant to known-plaintext
attacks, I think that is the definition of a cipher. That is, we should
just use AES or something well known to encrypt the page frame numbers if
we want to hide them. I don't know if the block mode of AES would be a
problem or not.

Eric
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On 03/09/2015 05:03 PM, Kees Cook wrote:
> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman wrote:
>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>> appropriate random number ought to keep from revealing page numbers or
>> page adjacencies while not requiring any changes in userspace.
>>
>> That way the revealed pfn and the physical pfn would be different but
>> you could still use pagemap for its intended purpose.
>
> If this could be done in a way where it was sufficiently hard to
> expose the random number, we should absolutely do this.

We would need something which is both reversible (so that the given
offsets can still be used in /proc/kpageflags) and also hard to do a
known-plaintext-type attack on it.

Transparent huge pages are a place where userspace knows the
relationship between 512 adjacent physical addresses. That represents a
good chunk of known data. Surely there are more of these kinds of things.

Right now, for instance, the ways in which a series of sequential
allocations come out of the page allocator are fairly deterministic. We
would also need to do some kind of allocator randomization to ensure
that userspace couldn't make good guesses about the physical addresses
of things coming out of the allocator.

Or, we just be sure and turn the darn thing off. :)
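The known-plaintext concern can be illustrated for the multiplicative blinding quoted above. In this C sketch (helper names are hypothetical), two blinded values whose underlying PFNs are known to be adjacent, as inside a transparent huge page, give away the secret multiplier by a single subtraction, because (pfn + 1) * key - pfn * key = key modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative blinding as proposed in the thread: multiply the PFN
 * by a secret odd key modulo 2^32 (uint32_t arithmetic wraps around). */
static uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	assert(key & 1);	/* an even key would not be one-to-one */
	return pfn * key;
}

/* Given the blinded values of two PFNs known to be adjacent (pfn and
 * pfn + 1), their difference modulo 2^32 is the key itself. */
static uint32_t recover_key(uint32_t blinded, uint32_t blinded_next)
{
	return blinded_next - blinded;
}
```

This only shows the weakness of plain multiplication; it says nothing about how a real cipher would behave.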
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On 03/09/2015 04:08 PM, Eric W. Biederman wrote:
> If the concern is to protect against root getting into the kernel (the
> "trusted_kernel" snake-oil), just compile out the pagemap file. Nothing
> else is remotely interesting from a maintenance point of view.

The paper I linked to showed one example of how pagemap makes a
user->kernel exploit _easier_. Note that the authors had another way of
actually doing the exploit when pagemap was not available, but it
required some more trouble than if pagemap was around.

I mentioned the "trusted_kernel" stuff as an aside. It's really not the
main concern.
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman wrote:
>
> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
> appropriate random number ought to keep from revealing page numbers or
> page adjacencies while not requiring any changes in userspace.
>
> That way the revealed pfn and the physical pfn would be different but
> you could still use pagemap for its intended purpose.

If this could be done in a way where it was sufficiently hard to
expose the random number, we should absolutely do this. And this could
be done for socket handles in INET_DIAG too. We have a lot of these
kinds of "handle" leaks where the handles can be regarded as private
information leakage.

-Kees

--
Kees Cook
Chrome OS Security
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
appropriate random number ought to keep from revealing page numbers or
page adjacencies while not requiring any changes in userspace.

That way the revealed pfn and the physical pfn would be different but
you could still use pagemap for its intended purpose.

Eric
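A minimal C sketch of such a blinding function follows, assuming a 32-bit PFN and an odd secret multiplier (only odd numbers are invertible modulo 2^32). The function names are hypothetical. The inverse is computed with the standard Newton iteration for modular inverses modulo a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative inverse of an odd key modulo 2^32 by Newton
 * iteration: each step doubles the number of correct low bits. */
static uint32_t mul_inverse32(uint32_t key)
{
	uint32_t inv = key;	/* correct to 3 bits: key * key == 1 (mod 8) */
	int i;

	assert(key & 1);	/* only odd keys are invertible mod 2^32 */
	for (i = 0; i < 4; i++)	/* 3 -> 6 -> 12 -> 24 -> 48 correct bits */
		inv *= 2u - key * inv;
	return inv;
}

/* Blind a PFN: multiply by the key, wrapping modulo 2^32. */
static uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	return pfn * key;
}

/* Undo the blinding by multiplying with the inverse of the key. */
static uint32_t unblind_pfn(uint32_t blinded, uint32_t key)
{
	return blinded * mul_inverse32(key);
}
```

The mapping is one-to-one, so a consumer of blinded values could have them translated back on lookup. Whether the multiplier stays secret is a separate question.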
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On Mon, Mar 9, 2015 at 4:08 PM, Eric W. Biederman wrote:
> Kees Cook writes:
>
>> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman wrote:
>>> Dave Hansen writes:
>>>
>>>> From: Dave Hansen
>>>>
>>>> Physical addresses are sensitive information. There are
>>>> existing, known exploits that are made easier if physical
>>>> information is available. Here is one example:
>>>>
>>>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>>>
>>>> If you know the physical address of something you also know at
>>>> which kernel virtual address you can find something (modulo
>>>> highmem). It means that things that keep the kernel from
>>>> accessing user mappings (like SMAP/SMEP) can be worked around
>>>> because the _kernel_ mapping can get used instead.
>>>>
>>>> But, /proc/$pid/pagemap exposes the physical addresses of all
>>>> pages accessible to userspace. This works against all of the
>>>> efforts to keep kernel addresses out of places where unprivileged
>>>> apps can find them.
>>>>
>>>> This patch introduces a "paranoid" option for /proc. It can be
>>>> enabled like this:
>>>>
>>>> 	mount -o remount,paranoid /proc
>>>>
>>>> Or when /proc is mounted initially. When 'paranoid' mode is
>>>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>>>> without CAP_SYS_RAWIO. It can be disabled like this:
>>>>
>>>> 	mount -o remount,notparanoid /proc
>>>>
>>>> The option is applied to the pid namespace, so an app that wanted
>>>> a separate policy from the rest of the system could get run in
>>>> its own pid namespace.
>>>>
>>>> I'm not really that stuck on the name. I'm not opposed to making
>>>> it apply only to pagemap or to giving it a pagemap-specific
>>>> name.
>>>>
>>>> pagemap is also the kind of feature that could be used to escalate
>>>> privileges from root into the kernel. It probably needs to be
>>>> protected in the same way that /dev/mem or module loading is in
>>>> cases where the kernel needs to be protected from root, thus the
>>>> choice to use CAP_SYS_RAWIO.
>>>
>>>
>>> There is already a way to make pagemap go away. It is called
>>> CONFIG_PROC_PAGE_MONITOR.
>>>
>>> I suspect the right answer here is if you enable kernel address
>>> randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the
>>> two options conflict with each other.
>>
>> It's not a good idea to make CONFIG options conflict with each other
>> like this as it puts distros in a tricky spot to decide which to use.
>> Allowing both and having a runtime flag of some kind tends to be the
>> better option (e.g. kASLR vs Hibernation).
>
> But there is a fundamental conflict. As such it might as well be
> expressed in Kconfig.

Hm? I was using kASLR vs Hibernation as an example of two features that,
while currently at odds with each other, are available as a
runtime-selectable option (putting "kaslr" on the command line enables it
and disables hibernation, rather than forcing a CONFIG choice to pick one
or the other).

>
>>> That is a lot less code and a lot less to maintain.
>>>
>>> On the other hand if this is truly a valuable interface that you can't
>>> part with we need an alternative to pagemaps that does the same job
>>> without the exploit potential. And I don't know how to do that.
>>>
>>> Arguing in favor of just making the options conflict is the fact that
>>> kernel address randomization is pretty much snake oil. At least on
>>> x86_64 the address pool is so small it can be trivially brute forced. I
>>> think there are maybe 10 bits you can randomize within.
>>>
>>> As for a way to disable this I expect it would do better with something
>>> like a set once flag that prevents a process and all of its children
>>> from accessing this file.
>>>
>>> *Blink* *Blink* Did you say you are worried about escalating privileges
>>> from root into the kernel space? That is nonsense. We give root the
>>> power to shoot themselves in the foot and any proc option will be
>>> something that root will be able to get around.
>>>
>>> The pieces of the patch description don't add up.
>>
>> No, that's an entirely valid use-case. You can trust the kernel but
>> not root. This is the point of the "trusted_kernel" patch series that
>> disables all sorts of dangerous interfaces that allow root to get at
>> physical memory.
>>
>> This situation is more a memory leak than a direct compromise, so it
>> seems like providing at least some runtime control of it (separate
>> from potential future "trusted_kernel" stuff) makes sense.
>
> I am too tired to argue about the kASLR snake-oil.

No problem. :)

>
> I do not think a proc mount option is at all appropriate for controlling
> the behavior of the pagemap file. And "paranoid" is entirely too
> generic of a string to have any meaning.
>
> Either just tighten the permissions when kASLR is enabled, or have the
> file go away entirely.
>
> If you want run-time knobs there are all kinds of run-time knobs you can
> use.
>
> If the concern is to prot
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
Kees Cook writes:

> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman wrote:
>> Dave Hansen writes:
>>
>>> From: Dave Hansen
>>>
>>> Physical addresses are sensitive information. There are
>>> existing, known exploits that are made easier if physical
>>> information is available. Here is one example:
>>>
>>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>>
>>> If you know the physical address of something you also know at
>>> which kernel virtual address you can find something (modulo
>>> highmem). It means that things that keep the kernel from
>>> accessing user mappings (like SMAP/SMEP) can be worked around
>>> because the _kernel_ mapping can get used instead.
>>>
>>> But, /proc/$pid/pagemap exposes the physical addresses of all
>>> pages accessible to userspace. This works against all of the
>>> efforts to keep kernel addresses out of places where unprivileged
>>> apps can find them.
>>>
>>> This patch introduces a "paranoid" option for /proc. It can be
>>> enabled like this:
>>>
>>> 	mount -o remount,paranoid /proc
>>>
>>> Or when /proc is mounted initially. When 'paranoid' mode is
>>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>>> without CAP_SYS_RAWIO. It can be disabled like this:
>>>
>>> 	mount -o remount,notparanoid /proc
>>>
>>> The option is applied to the pid namespace, so an app that wanted
>>> a separate policy from the rest of the system could get run in
>>> its own pid namespace.
>>>
>>> I'm not really that stuck on the name. I'm not opposed to making
>>> it apply only to pagemap or to giving it a pagemap-specific
>>> name.
>>>
>>> pagemap is also the kind of feature that could be used to escalate
>>> privileges from root into the kernel. It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>>
>>
>> There is already a way to make pagemap go away. It is called
>> CONFIG_PROC_PAGE_MONITOR.
>>
>> I suspect the right answer here is if you enable kernel address
>> randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the
>> two options conflict with each other.
>
> It's not a good idea to make CONFIG options conflict with each other
> like this as it puts distros in a tricky spot to decide which to use.
> Allowing both and having a runtime flag of some kind tends to be the
> better option (e.g. kASLR vs Hibernation).

But there is a fundamental conflict. As such it might as well be
expressed in Kconfig.

>> That is a lot less code and a lot less to maintain.
>>
>> On the other hand if this is truly a valuable interface that you can't
>> part with we need an alternative to pagemaps that does the same job
>> without the exploit potential. And I don't know how to do that.
>>
>> Arguing in favor of just making the options conflict is the fact that
>> kernel address randomization is pretty much snake oil. At least on
>> x86_64 the address pool is so small it can be trivially brute forced. I
>> think there are maybe 10 bits you can randomize within.
>>
>> As for a way to disable this I expect it would do better with something
>> like a set once flag that prevents a process and all of its children
>> from accessing this file.
>>
>> *Blink* *Blink* Did you say you are worried about escalating privileges
>> from root into the kernel space? That is nonsense. We give root the
>> power to shoot themselves in the foot and any proc option will be
>> something that root will be able to get around.
>>
>> The pieces of the patch description don't add up.
>
> No, that's an entirely valid use-case. You can trust the kernel but
> not root. This is the point of the "trusted_kernel" patch series that
> disables all sorts of dangerous interfaces that allow root to get at
> physical memory.
>
> This situation is more a memory leak than a direct compromise, so it
> seems like providing at least some runtime control of it (separate
> from potential future "trusted_kernel" stuff) makes sense.

I am too tired to argue about the kASLR snake-oil.

I do not think a proc mount option is at all appropriate for controlling
the behavior of the pagemap file. And "paranoid" is entirely too
generic of a string to have any meaning.

Either just tighten the permissions when kASLR is enabled, or have the
file go away entirely.

If you want run-time knobs there are all kinds of run-time knobs you can
use.

If the concern is to protect against root getting into the kernel (the
"trusted_kernel" snake-oil), just compile out the pagemap file. Nothing
else is remotely interesting from a maintenance point of view.

As I said.

Nacked-by: "Eric W. Biederman"

Eric
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman wrote:
> Dave Hansen writes:
>
>> From: Dave Hansen
>>
>> Physical addresses are sensitive information. There are
>> existing, known exploits that are made easier if physical
>> information is available. Here is one example:
>>
>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>
>> If you know the physical address of something you also know at
>> which kernel virtual address you can find something (modulo
>> highmem). It means that things that keep the kernel from
>> accessing user mappings (like SMAP/SMEP) can be worked around
>> because the _kernel_ mapping can get used instead.
>>
>> But, /proc/$pid/pagemap exposes the physical addresses of all
>> pages accessible to userspace. This works against all of the
>> efforts to keep kernel addresses out of places where unprivileged
>> apps can find them.
>>
>> This patch introduces a "paranoid" option for /proc. It can be
>> enabled like this:
>>
>> 	mount -o remount,paranoid /proc
>>
>> Or when /proc is mounted initially. When 'paranoid' mode is
>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>> without CAP_SYS_RAWIO. It can be disabled like this:
>>
>> 	mount -o remount,notparanoid /proc
>>
>> The option is applied to the pid namespace, so an app that wanted
>> a separate policy from the rest of the system could get run in
>> its own pid namespace.
>>
>> I'm not really that stuck on the name. I'm not opposed to making
>> it apply only to pagemap or to giving it a pagemap-specific
>> name.
>>
>> pagemap is also the kind of feature that could be used to escalate
>> privileges from root into the kernel. It probably needs to be
>> protected in the same way that /dev/mem or module loading is in
>> cases where the kernel needs to be protected from root, thus the
>> choice to use CAP_SYS_RAWIO.
>
>
> There is already a way to make pagemap go away. It is called
> CONFIG_PROC_PAGE_MONITOR.
>
> I suspect the right answer here is if you enable kernel address
> randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the
> two options conflict with each other.

It's not a good idea to make CONFIG options conflict with each other
like this as it puts distros in a tricky spot to decide which to use.
Allowing both and having a runtime flag of some kind tends to be the
better option (e.g. kASLR vs Hibernation).

> That is a lot less code and a lot less to maintain.
>
> On the other hand if this is truly a valuable interface that you can't
> part with we need an alternative to pagemaps that does the same job
> without the exploit potential. And I don't know how to do that.
>
> Arguing in favor of just making the options conflict is the fact that
> kernel address randomization is pretty much snake oil. At least on
> x86_64 the address pool is so small it can be trivially brute forced. I
> think there are maybe 10 bits you can randomize within.
>
> As for a way to disable this I expect it would do better with something
> like a set once flag that prevents a process and all of its children
> from accessing this file.
>
> *Blink* *Blink* Did you say you are worried about escalating privileges
> from root into the kernel space? That is nonsense. We give root the
> power to shoot themselves in the foot and any proc option will be
> something that root will be able to get around.
>
> The pieces of the patch description don't add up.

No, that's an entirely valid use-case. You can trust the kernel but
not root. This is the point of the "trusted_kernel" patch series that
disables all sorts of dangerous interfaces that allow root to get at
physical memory.

This situation is more a memory leak than a direct compromise, so it
seems like providing at least some runtime control of it (separate
from potential future "trusted_kernel" stuff) makes sense.

-Kees

>
> Nacked-by: "Eric W. Biederman"
>
> Eric

--
Kees Cook
Chrome OS Security
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
Dave Hansen writes:

> From: Dave Hansen
>
> Physical addresses are sensitive information. There are
> existing, known exploits that are made easier if physical
> information is available. Here is one example:
>
> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem). It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace. This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc. It can be
> enabled like this:
>
> 	mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially. When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO. It can be disabled like this:
>
> 	mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name. I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.
>
> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel. It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.

There is already a way to make pagemap go away. It is called
CONFIG_PROC_PAGE_MONITOR.

I suspect the right answer here is if you enable kernel address
randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the
two options conflict with each other.

That is a lot less code and a lot less to maintain.

On the other hand if this is truly a valuable interface that you can't
part with we need an alternative to pagemaps that does the same job
without the exploit potential. And I don't know how to do that.

Arguing in favor of just making the options conflict is the fact that
kernel address randomization is pretty much snake oil. At least on
x86_64 the address pool is so small it can be trivially brute forced. I
think there are maybe 10 bits you can randomize within.

As for a way to disable this I expect it would do better with something
like a set once flag that prevents a process and all of its children
from accessing this file.

*Blink* *Blink* Did you say you are worried about escalating privileges
from root into the kernel space? That is nonsense. We give root the
power to shoot themselves in the foot and any proc option will be
something that root will be able to get around.

The pieces of the patch description don't add up.

Nacked-by: "Eric W. Biederman"

Eric
Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen wrote:
>
> From: Dave Hansen
>
> Physical addresses are sensitive information. There are
> existing, known exploits that are made easier if physical
> information is available. Here is one example:
>
> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem). It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace. This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc. It can be
> enabled like this:
>
> 	mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially. When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO. It can be disabled like this:
>
> 	mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name. I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.
>
> pagemap is also the kind of feature that could be used to escalate
> privileged from root in to the kernel. It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.
>
> Signed-off-by: Dave Hansen

Seems reasonable. I would note that even CAP_SYS_RAWIO isn't enough to
actually do anything with RAM in /dev/mem. That's entirely controlled
by CONFIG_STRICT_DEVMEM.

I think /proc/kpagecount and /proc/kpageflags should get filtered as
well, instead of them relying on the uid=0 check.

Reviewed-by: Kees Cook

-Kees

> ---
>
>  b/fs/proc/root.c                |   10 +-
>  b/fs/proc/task_mmu.c            |   11 +++
>  b/include/linux/pid_namespace.h |    1 +
>  3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff -puN fs/proc/root.c~privileged-pagemap fs/proc/root.c
> --- a/fs/proc/root.c~privileged-pagemap	2015-03-09 13:33:12.104796793 -0700
> +++ b/fs/proc/root.c	2015-03-09 13:33:12.111797109 -0700
> @@ -39,10 +39,12 @@ static int proc_set_super(struct super_b
>  }
>
>  enum {
> -	Opt_gid, Opt_hidepid, Opt_err,
> +	Opt_gid, Opt_hidepid, Opt_paranoid, Opt_notparanoid, Opt_err,
>  };
>
>  static const match_table_t tokens = {
> +	{Opt_paranoid, "paranoid"},
> +	{Opt_notparanoid, "notparanoid"},
>  	{Opt_hidepid, "hidepid=%u"},
>  	{Opt_gid, "gid=%u"},
>  	{Opt_err, NULL},
> @@ -70,6 +72,12 @@ static int proc_parse_options(char *opti
>  				return 0;
>  			pid->pid_gid = make_kgid(current_user_ns(), option);
>  			break;
> +		case Opt_paranoid:
> +			pid->paranoid = 1;
> +			break;
> +		case Opt_notparanoid:
> +			pid->paranoid = 0;
> +			break;
>  		case Opt_hidepid:
>  			if (match_int(&args[0], &option))
>  				return 0;
> diff -puN fs/proc/task_mmu.c~privileged-pagemap fs/proc/task_mmu.c
> --- a/fs/proc/task_mmu.c~privileged-pagemap	2015-03-09 13:33:12.106796883 -0700
> +++ b/fs/proc/task_mmu.c	2015-03-09 13:33:12.112797154 -0700
> @@ -1322,9 +1322,20 @@ out:
>
>  static int pagemap_open(struct inode *inode, struct file *file)
>  {
> +	struct pid_namespace *ns = inode->i_sb->s_fs_info;
> +
>  	pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about "
>  			"to stop being page-shift some time soon. See the "
>  			"linux/Documentation/vm/pagemap.txt for details.\n");
> +
> +	/*
> +	 * Use the RAWIO capability bit. If you can not go open
> +	 * /dev/mem, then you also have no business knowing the
> +	 * physical addresses of things.
> +	 */
> +	if (ns->paranoid && !capable(CAP_SYS_RAWIO))
> +		return -EPERM;
> +
>  	return 0;
>  }
>
> diff -puN include/linux/pid_namespace.h~privileged-pagemap include/linux/pid_namespace.h
> --- a/include/linux/pid_namespace.h~privileged-pagemap	2015-03-09 13:33:12.108796973 -0700
> +++ b/include/linux/pid_namespace.h	2015-03-09 13:33:12.112797154 -0700
> @@ -43,6 +43,7 @@ struct pid_namespace {
>  	struct work_struct proc_work;
>  	kgid_t pid_gid;
>  	int hide_pid;
> +	int paranoid;
>  	int reboot; /