subject:"Re\: \[RFC PATCH\] seccomp\: Add protection keys into seccomp

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2022-11-16 Thread Stephen Röttger

On Tue, Nov 15, 2022 at 5:16 AM Michael Sammler  wrote:
> > We're currently working on a feature in chromium that uses pkeys for
> > in-process isolation. Being able to use the pkey state in the seccomp
> > filter would be pretty useful for this. For example, it would allow
> > us to enforce that no code outside the isolated thread would ever
> > map/mprotect executable memory.
> > We can probably do something similar by adding instruction pointer
> > checks to the seccomp filter, but that feels quite hacky and this
> > feature would make a much nicer implementation.
> >
> > Are there any plans to make a version 2 of this patch?
>
> Thanks for your interest in this patch, but I am now working on other 
> projects and currently don't plan to make a version 2 of this patch.

I'd be happy to take over writing a version 2 for this.

Kees and Dave, does this feature overall look good to you?

>From the discussion, I think there are two proposed changes:
* use an architecture-generic interface as Ram Pai suggested (i.e. add
a read_pkey function)
* ensure to restore the pkru value or fetch it from the xsave buffer


smime.p7s
Description: S/MIME Cryptographic Signature

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2022-11-14 Thread Michael Sammler




> We're currently working on a feature in chromium that uses pkeys for
> in-process isolation. Being able to use the pkey state in the seccomp
> filter would be pretty useful for this. For example, it would allow
> us to enforce that no code outside the isolated thread would ever
> map/mprotect executable memory.
> We can probably do something similar by adding instruction pointer
> checks to the seccomp filter, but that feels quite hacky and this
> feature would make a much nicer implementation.
>
> Are there any plans to make a version 2 of this patch?

Thanks for your interest in this patch, but I am now working on other projects 
and currently don't plan to make a version 2 of this patch.

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2022-11-14 Thread Stephen Röttger

> Add the current value of an architecture specific protection keys
> register (currently PKRU on x86) to data available for seccomp-bpf
> programs to work on. This allows filters based on the currently
> enabled protection keys.

We're currently working on a feature in chromium that uses pkeys for
in-process isolation. Being able to use the pkey state in the seccomp
filter would be pretty useful for this. For example, it would allow
us to enforce that no code outside the isolated thread would ever
map/mprotect executable memory.
We can probably do something similar by adding instruction pointer
checks to the seccomp filter, but that feels quite hacky and this
feature would make a much nicer implementation.

Are there any plans to make a version 2 of this patch?


smime.p7s
Description: S/MIME Cryptographic Signature

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-30 Thread Michael Sammler


On 10/29/2018 11:33 PM, Dave Hansen wrote:

But, that's really an implementation detail.  The effect on the ABI and
how this might constrain future pkeys use is my bigger worry.

I'd also want to make sure that your specific use-case is compatible
with all the oddities of pkeys, like the 'clone' and signal behavior.
Some of that is spelled out here:

http://man7.org/linux/man-pages/man7/pkeys.7.html



I think my specific use case is compatible with these oddities since the 
untrusted code is not allowed to install signal handlers or spawn 
threads himself but only through the trusted code, which ensures, that 
the PKRU has the correct value when control passes back to the untrusted 
code. But I agree that these oddities are something a programmer needs 
to take into account if he uses pkeys in general (and this patch adds 
new considerations to it if the program installs a seccomp filter and 
e.g. wants to do system calls from a signal handler).



One thing that's a worry is that we have never said that you *can't*
write to arbitrary permissions in PKRU.  I can totally see some really
paranoid code saying, "I'm about to do something risky, so I'll turn off
access to *all* pkeys", or " turn off all access except my current
stack".  If they did that, they might also inadvertently disable access
to certain seccomp-restricted syscalls.

We can fix that up by documenting restrictions like "code should never
change the access rights of any pkey other than those that it
allocated", but that doesn't help any old code (of which I hope there is
relatively little).



I can also see paranoid code wanting to do something like you describe 
and I think, that we should try to not forbid this behaviour. 
Documenting restrictions on code which writes to the PKRU as you 
describe would be one way to go, but this would disallow this paranoid 
use case if I understand it correctly because the code would not be 
allowed to disable access to all except of one pkey.


My idea would be to not put the blame on the code which writes to the 
PKRU but on the code which installs the seccomp filter based on the 
PKRU: It has to make sure, that it allows the system calls which the 
paranoid code should be allowed to do when the pkey of the paranoid code 
is the (only) enabled pkey. This could be ensured e.g. by the paranoid 
code providing the pkey it intends to use to the code which installs the 
seccomp filter. This of course means, that you need to know what you are 
doing, when you install a seccomp filter which depends on the PKRU, but 
I think this is already the case when you install a seccomp filter 
without this patch (but maybe someone with more seccomp experience than 
me can say better how much reasoning is needed to install a seccomp 
filter without and with this patch).


-- Michael

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Dave Hansen

On 10/29/18 2:55 PM, Michael Sammler wrote:
>> PKRU getting reset on signals, and the requirement now that it *can't*
>> be changed if you make syscalls probably needs to get thought about very
>> carefully before we do this, though.
> I am not sure, whether I follow you. Are you saying, that PKRU is
> currently not guaranteed to be preserved across system calls?
> This would make it very hard to use protection keys if libc does not
> save and restore the PKRU before/after systemcalls (and I am not aware
> of this).

It's preserved *across* system calls, but you have to be a bit careful
using it _inside_ the kernel.  We could context switch off to something
else, and not think that we need to restore PKRU until _just_ before we
return to userspace.

> Or do you mean, that the kernel might want to use the PKRU register for
> its own purposes while it is executing?

That, or we might keep another process's PKRU state in the register if
we don't think anyone is using it.  Now that I think about it, I think
Rik (cc'd, who was working on those patches) *had* to explicitly restore
PKRU because it's hard to tell where we might do a copy_to/from_user()
and need it.

> Then the solution you proposed in another email in this thread would
> work: instead of providing the seccomp filter with the current value of
> the PKRU (which might be different from what the user space expects) use
> the user space value which must have been saved somewhere (otherwise it
> would not be possible to restore it).

Yep, that's the worst-case scenario: either fetch PKRU out of the XSAVE
buffer (current->fpu->something), or just restore them using an existing
API before doing RDPKRU.

But, that's really an implementation detail.  The effect on the ABI and
how this might constrain future pkeys use is my bigger worry.

I'd also want to make sure that your specific use-case is compatible
with all the oddities of pkeys, like the 'clone' and signal behavior.
Some of that is spelled out here:

http://man7.org/linux/man-pages/man7/pkeys.7.html

One thing that's a worry is that we have never said that you *can't*
write to arbitrary permissions in PKRU.  I can totally see some really
paranoid code saying, "I'm about to do something risky, so I'll turn off
access to *all* pkeys", or " turn off all access except my current
stack".  If they did that, they might also inadvertently disable access
to certain seccomp-restricted syscalls.

We can fix that up by documenting restrictions like "code should never
change the access rights of any pkey other than those that it
allocated", but that doesn't help any old code (of which I hope there is
relatively little).

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Michael Sammler


Am 29.10.2018 um 18:29 schrieb Dave Hansen:


On 10/29/18 9:48 AM, Jann Horn wrote:

On Mon, Oct 29, 2018 at 5:37 PM Dave Hansen  wrote:

I'm not sure this is a great use for PKRU.  I *think* the basic problem
is that you want to communicate some rights information down into a
filter, and you want to communicate it with PKRU.  While it's handy to
have an extra register that nobody (generally) mucks with, I'm not quite
convinced that we want to repurpose it this way.

That's not how I understand it; I believe that the context is probably
https://arxiv.org/pdf/1801.06822.pdf ?
My understanding is that PKRU is used for lightweight in-process
sandboxing, and to extend this sandbox protection to the syscall
interface, it is necessary to expose PKRU state to seccomp filters.
In other words, this isn't using PKRU exclusively for passing rights
into a filter, but it has to use PKRU anyway.

PKRU gives information about rights to various bits of application data.
  From that, a seccomp filter can infer the context, and thus the ability
for the code to call a given syscall at a certain point in time.

This makes PKRU an opt-in part of the syscall ABI, which is pretty
interesting.  We _could_ do the same kind of thing with any callee-saved
general purpose register, but PKRU is particularly attractive because
there is only one instruction that writes to it (well, outside of
XSAVE*), and random library code is very unlikely at this point to be
using it.
I agree with you on the points, why PKRU is particularly attractive but 
I think the most important point is that PKRU is _not_ a general purpose 
register, but is already used to control access to some resource 
(memory). This patch would allow to also control access to another 
resource (system calls) using the PKRU. This is why it makes sense to 
use the PKRU in this patch instead of another callee-saved register.

PKRU getting reset on signals, and the requirement now that it *can't*
be changed if you make syscalls probably needs to get thought about very
carefully before we do this, though.
I am not sure, whether I follow you. Are you saying, that PKRU is 
currently not guaranteed to be preserved across system calls?
This would make it very hard to use protection keys if libc does not 
save and restore the PKRU before/after systemcalls (and I am not aware 
of this).


Or do you mean, that the kernel might want to use the PKRU register for 
its own purposes while it is executing?
Then the solution you proposed in another email in this thread would 
work: instead of providing the seccomp filter with the current value of 
the PKRU (which might be different from what the user space expects) use 
the user space value which must have been saved somewhere (otherwise it 
would not be possible to restore it).


Or are you afraid, that one part of a user space program installs a 
seccomp filter, which blocks system calls based on the PKRU, and another 
part of the same program (maybe a library) changes the PKRU in a way, 
which the first part did not expect and the program dies because it 
tries to do a forbidden system call?
I don't know whether the kernel can (and wants) do anything against 
this. This problem also exists without this patch if you replace system 
call with memory access.


-- Michael

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Dave Hansen

On 10/29/18 9:48 AM, Jann Horn wrote:
> On Mon, Oct 29, 2018 at 5:37 PM Dave Hansen  wrote:
>> I'm not sure this is a great use for PKRU.  I *think* the basic problem
>> is that you want to communicate some rights information down into a
>> filter, and you want to communicate it with PKRU.  While it's handy to
>> have an extra register that nobody (generally) mucks with, I'm not quite
>> convinced that we want to repurpose it this way.
> 
> That's not how I understand it; I believe that the context is probably
> https://arxiv.org/pdf/1801.06822.pdf ?
> My understanding is that PKRU is used for lightweight in-process
> sandboxing, and to extend this sandbox protection to the syscall
> interface, it is necessary to expose PKRU state to seccomp filters.
> In other words, this isn't using PKRU exclusively for passing rights
> into a filter, but it has to use PKRU anyway.

PKRU gives information about rights to various bits of application data.
 From that, a seccomp filter can infer the context, and thus the ability
for the code to call a given syscall at a certain point in time.

This makes PKRU an opt-in part of the syscall ABI, which is pretty
interesting.  We _could_ do the same kind of thing with any callee-saved
general purpose register, but PKRU is particularly attractive because
there is only one instruction that writes to it (well, outside of
XSAVE*), and random library code is very unlikely at this point to be
using it.

PKRU getting reset on signals, and the requirement now that it *can't*
be changed if you make syscalls probably needs to get thought about very
carefully before we do this, though.

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Dave Hansen

On 10/29/18 10:02 AM, Michael Sammler wrote:
>>> Also, I'm not sure the kernel provides the PKRU guarantees you want at
>>> the moment.  Our implementation *probably* works, but it's mostly by
>>> accident.
> I don't know, which guarantees about the PKRU are provided at the
> moment, but the only guarantee needed for this patch is, that the kernel
> does not change the bits of the PKRU register, which belong to pkeys
> allocated by the user program, between the syscall entry and the call to
> secure_computing(). Is there are use case where the kernel would like to
> modify these bits of the PKRU?

We've been talking about doing more lax save/restore of the XSAVE
content (PKRU is part of this content).  We would, for instance, only
restore it when returning to userspace, but PKRU might not be up-to-date
with the value in current->fpu.

It's not a deal-breaker with your approach, it's just something to be
careful of and make sure PKRU is up-to-date before you go use it.

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Michael Sammler


On 10/29/2018 05:48 PM, Ram Pai wrote:


On Mon, Oct 29, 2018 at 09:25:15AM -0700, Kees Cook wrote:

On Mon, Oct 29, 2018 at 4:23 AM, Michael Sammler  wrote:

Add the current value of an architecture specific protection keys
register (currently PKRU on x86) to data available for seccomp-bpf
programs to work on. This allows filters based on the currently
enabled protection keys.

Support for protection keys on the POWER architecture is not part of
this patch since I do not have access to a PowerPC, but adding support
for it can be achieved by setting sd->pkeys to the AMR register in
populate_seccomp_data.

Maybe you can use a generic read_pkey() function, and each arch can
provide their own implementation.  In the case of powerpc it is
mfspr(SPRN_AMR);

something like the code below might help?

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 92a9962..d79efe3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -89,6 +89,11 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
  #define __mm_pkey_is_reserved(pkey) (reserved_allocation_mask & \
   pkey_alloc_mask(pkey))
  
+static inline u64 read_pkey()

+{
+return mfspr(SPRN_AMR);
+}
+
  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
  {
if (pkey < 0 || pkey >= arch_max_pkey())
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 19b137f..feea114 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,6 +9,11 @@
  extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
  
+static inline u64 read_pkey()

+{
+return read_pkru();
+}
+
  static inline bool arch_pkeys_enabled(void)
  {
return boot_cpu_has(X86_FEATURE_OSPKE);
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 2955ba97..f1538c7 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -13,6 +13,11 @@
  #define PKEY_DEDICATED_EXECUTE_ONLY 0
  #define ARCH_VM_PKEY_FLAGS 0
  
+static inline u64 read_pkey()

+{
+   return 0;
+}
+
  static inline int vma_pkey(struct vm_area_struct *vma)
  {
return 0;

RP
Thank you very much, this helps indeed. I will add this to the next 
version of this patch.



One use case for this patch is disabling unnecessary system calls for a
library (e.g. network i/o for a crypto library) while the library runs
without disabling the system calls for the whole program (by changing
the protection keys before and after the library executes). Using this
one could ensure that the library behaves a expected (e.g. the crypto
library not sending private keys to a malicious server).

This patch also enables lightweight sandboxing of untrusted code using
memory protection keys: Protection keys provide memory isolation but
for a sandbox system call isolation is needed as well. This patch
allows writing a seccomp filter to prevent system calls by the
untrusted code while still allowing system calls for the trusted code.

Isn't PKU instruction based? Couldn't a malicious library just change
the state of the MPK registers? This seems like an easy way to bypass
any filters that used PKU. I'm not convinced this is a meaningful
barrier that should be enforced by seccomp.

This can also be done with a signal handler with SECCOMP_RET_TRAP and
check instruction pointer vs PKU state.

-Kees


An alternative design would be to extend (c)BPF with a new instruction
to read the state of a protection key. This alternate design would
provide a simpler interface to the user space since the BPF program
would not need to deal with the architecture specific pkeys field in
seccomp_data, but the question is, how much of an advantage this would
be as the nr field in seccomp_data is already architecture specific.
Adding a new instruction for BPF programs is more complicated than
this patch and might be a breaking change.

Results of selftests/seccomp_benchmark.c on a x86 machine with pkeys
support:

With patch:
Benchmarking 33554432 samples...
28.019505558 - 18.676858522 = 9342647036
getpid native: 278 ns
42.279109885 - 28.019657031 = 14259452854
getpid RET_ALLOW: 424 ns
Estimated seccomp overhead per syscall: 146 ns

Without patch:
Benchmarking 33554432 samples...
28.059619466 - 18.706769155 = 9352850311
getpid native: 278 ns
42.299228279 - 28.059761804 = 14239466475
getpid RET_ALLOW: 424 ns
Estimated seccomp overhead per syscall: 146 ns

Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
Cc: Ram Pai 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-...@vger.kernel.org
Signed-off-by: Michael Sammler 
---
Changes to the previous version:
- added motivation, notes about POWER, alternative design and benchmark results 
to the commit log
- renamed pkru field in seccomp_data to pkeys
- changed size of pkru field to __u64 and removed reserved field
- added test for x86

  arch/mips/kernel/ptrace.c

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Michael Sammler


On 10/29/2018 05:48 PM, Jann Horn wrote:

On Mon, Oct 29, 2018 at 5:37 PM Dave Hansen  wrote:

On 10/29/18 9:25 AM, Kees Cook wrote:

On Mon, Oct 29, 2018 at 4:23 AM, Michael Sammler  wrote:

Add the current value of an architecture specific protection keys
register (currently PKRU on x86) to data available for seccomp-bpf
programs to work on. This allows filters based on the currently
enabled protection keys.

How does the current "assignment" of protection keys to the various uses
get communicated to the filter?

I assume that you first allocate your protection keys, then install the filter?

Yes, but I agree that it should probably be documented, that the filter 
should only look at the parts of the PKRU, which belong to pkeys the 
user space program allocated (if the kernel wants to use some parts of 
the PKRU for its own purposes).

I'm not sure this is a great use for PKRU.  I *think* the basic problem
is that you want to communicate some rights information down into a
filter, and you want to communicate it with PKRU.  While it's handy to
have an extra register that nobody (generally) mucks with, I'm not quite
convinced that we want to repurpose it this way.

That's not how I understand it; I believe that the context is probably
https://arxiv.org/pdf/1801.06822.pdf ?
My understanding is that PKRU is used for lightweight in-process
sandboxing, and to extend this sandbox protection to the syscall
interface, it is necessary to expose PKRU state to seccomp filters.
In other words, this isn't using PKRU exclusively for passing rights
into a filter, but it has to use PKRU anyway.
Yes, https://arxiv.org/pdf/1801.06822.pdf is indeed the context and what 
you say is correct.

Also, I'm not sure the kernel provides the PKRU guarantees you want at
the moment.  Our implementation *probably* works, but it's mostly by
accident.
I don't know, which guarantees about the PKRU are provided at the 
moment, but the only guarantee needed for this patch is, that the kernel 
does not change the bits of the PKRU register, which belong to pkeys 
allocated by the user program, between the syscall entry and the call to 
secure_computing(). Is there are use case where the kernel would like to 
modify these bits of the PKRU?


-- MIchael

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Ram Pai

On Mon, Oct 29, 2018 at 09:25:15AM -0700, Kees Cook wrote:
> On Mon, Oct 29, 2018 at 4:23 AM, Michael Sammler  wrote:
> > Add the current value of an architecture specific protection keys
> > register (currently PKRU on x86) to data available for seccomp-bpf
> > programs to work on. This allows filters based on the currently
> > enabled protection keys.
> >
> > Support for protection keys on the POWER architecture is not part of
> > this patch since I do not have access to a PowerPC, but adding support
> > for it can be achieved by setting sd->pkeys to the AMR register in
> > populate_seccomp_data.

Maybe you can use a generic read_pkey() function, and each arch can
provide their own implementation.  In the case of powerpc it is
mfspr(SPRN_AMR);

something like the code below might help?

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 92a9962..d79efe3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -89,6 +89,11 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
 #define __mm_pkey_is_reserved(pkey) (reserved_allocation_mask & \
   pkey_alloc_mask(pkey))
 
+static inline u64 read_pkey()
+{
+return mfspr(SPRN_AMR);
+}
+
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
if (pkey < 0 || pkey >= arch_max_pkey())
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 19b137f..feea114 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,6 +9,11 @@
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
 
+static inline u64 read_pkey()
+{
+return read_pkru();
+}
+
 static inline bool arch_pkeys_enabled(void)
 {
return boot_cpu_has(X86_FEATURE_OSPKE);
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 2955ba97..f1538c7 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -13,6 +13,11 @@
 #define PKEY_DEDICATED_EXECUTE_ONLY 0
 #define ARCH_VM_PKEY_FLAGS 0
 
+static inline u64 read_pkey()
+{
+   return 0;
+}
+
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
return 0;

RP

> >
> > One use case for this patch is disabling unnecessary system calls for a
> > library (e.g. network i/o for a crypto library) while the library runs
> > without disabling the system calls for the whole program (by changing
> > the protection keys before and after the library executes). Using this
> > one could ensure that the library behaves a expected (e.g. the crypto
> > library not sending private keys to a malicious server).
> >
> > This patch also enables lightweight sandboxing of untrusted code using
> > memory protection keys: Protection keys provide memory isolation but
> > for a sandbox system call isolation is needed as well. This patch
> > allows writing a seccomp filter to prevent system calls by the
> > untrusted code while still allowing system calls for the trusted code.
> 
> Isn't PKU instruction based? Couldn't a malicious library just change
> the state of the MPK registers? This seems like an easy way to bypass
> any filters that used PKU. I'm not convinced this is a meaningful
> barrier that should be enforced by seccomp.
> 
> This can also be done with a signal handler with SECCOMP_RET_TRAP and
> check instruction pointer vs PKU state.
> 
> -Kees
> 
> >
> > An alternative design would be to extend (c)BPF with a new instruction
> > to read the state of a protection key. This alternate design would
> > provide a simpler interface to the user space since the BPF program
> > would not need to deal with the architecture specific pkeys field in
> > seccomp_data, but the question is, how much of an advantage this would
> > be as the nr field in seccomp_data is already architecture specific.
> > Adding a new instruction for BPF programs is more complicated than
> > this patch and might be a breaking change.
> >
> > Results of selftests/seccomp_benchmark.c on a x86 machine with pkeys
> > support:
> >
> > With patch:
> > Benchmarking 33554432 samples...
> > 28.019505558 - 18.676858522 = 9342647036
> > getpid native: 278 ns
> > 42.279109885 - 28.019657031 = 14259452854
> > getpid RET_ALLOW: 424 ns
> > Estimated seccomp overhead per syscall: 146 ns
> >
> > Without patch:
> > Benchmarking 33554432 samples...
> > 28.059619466 - 18.706769155 = 9352850311
> > getpid native: 278 ns
> > 42.299228279 - 28.059761804 = 14239466475
> > getpid RET_ALLOW: 424 ns
> > Estimated seccomp overhead per syscall: 146 ns
> >
> > Cc: Kees Cook 
> > Cc: Andy Lutomirski 
> > Cc: Will Drewry 
> > Cc: Ram Pai 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-...@vger.kernel.org
> > Signed-off-by: Michael Sammler 
> > ---
> > Changes to the previous version:
> > - added motivation, notes about POWER, alternative design and benchmark 
> > results to the commit log
> > - renamed pkru field in seccomp_data to pkeys
> > - changed

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Dave Hansen

On 10/29/18 9:25 AM, Kees Cook wrote:
> On Mon, Oct 29, 2018 at 4:23 AM, Michael Sammler  wrote:
>> Add the current value of an architecture specific protection keys
>> register (currently PKRU on x86) to data available for seccomp-bpf
>> programs to work on. This allows filters based on the currently
>> enabled protection keys.

How does the current "assignment" of protection keys to the various uses
get communicated to the filter?

I'm not sure this is a great use for PKRU.  I *think* the basic problem
is that you want to communicate some rights information down into a
filter, and you want to communicate it with PKRU.  While it's handy to
have an extra register that nobody (generally) mucks with, I'm not quite
convinced that we want to repurpose it this way.

Also, I'm not sure the kernel provides the PKRU guarantees you want at
the moment.  Our implementation *probably* works, but it's mostly by
accident.

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

2018-10-29 Thread Kees Cook

On Mon, Oct 29, 2018 at 4:23 AM, Michael Sammler  wrote:
> Add the current value of an architecture specific protection keys
> register (currently PKRU on x86) to data available for seccomp-bpf
> programs to work on. This allows filters based on the currently
> enabled protection keys.
>
> Support for protection keys on the POWER architecture is not part of
> this patch since I do not have access to a PowerPC, but adding support
> for it can be achieved by setting sd->pkeys to the AMR register in
> populate_seccomp_data.
>
> One use case for this patch is disabling unnecessary system calls for a
> library (e.g. network i/o for a crypto library) while the library runs
> without disabling the system calls for the whole program (by changing
> the protection keys before and after the library executes). Using this
> one could ensure that the library behaves a expected (e.g. the crypto
> library not sending private keys to a malicious server).
>
> This patch also enables lightweight sandboxing of untrusted code using
> memory protection keys: Protection keys provide memory isolation but
> for a sandbox system call isolation is needed as well. This patch
> allows writing a seccomp filter to prevent system calls by the
> untrusted code while still allowing system calls for the trusted code.

Isn't PKU instruction based? Couldn't a malicious library just change
the state of the MPK registers? This seems like an easy way to bypass
any filters that used PKU. I'm not convinced this is a meaningful
barrier that should be enforced by seccomp.

This can also be done with a signal handler with SECCOMP_RET_TRAP and
check instruction pointer vs PKU state.

-Kees

>
> An alternative design would be to extend (c)BPF with a new instruction
> to read the state of a protection key. This alternate design would
> provide a simpler interface to the user space since the BPF program
> would not need to deal with the architecture specific pkeys field in
> seccomp_data, but the question is, how much of an advantage this would
> be as the nr field in seccomp_data is already architecture specific.
> Adding a new instruction for BPF programs is more complicated than
> this patch and might be a breaking change.
>
> Results of selftests/seccomp_benchmark.c on a x86 machine with pkeys
> support:
>
> With patch:
> Benchmarking 33554432 samples...
> 28.019505558 - 18.676858522 = 9342647036
> getpid native: 278 ns
> 42.279109885 - 28.019657031 = 14259452854
> getpid RET_ALLOW: 424 ns
> Estimated seccomp overhead per syscall: 146 ns
>
> Without patch:
> Benchmarking 33554432 samples...
> 28.059619466 - 18.706769155 = 9352850311
> getpid native: 278 ns
> 42.299228279 - 28.059761804 = 14239466475
> getpid RET_ALLOW: 424 ns
> Estimated seccomp overhead per syscall: 146 ns
>
> Cc: Kees Cook 
> Cc: Andy Lutomirski 
> Cc: Will Drewry 
> Cc: Ram Pai 
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-...@vger.kernel.org
> Signed-off-by: Michael Sammler 
> ---
> Changes to the previous version:
> - added motivation, notes about POWER, alternative design and benchmark 
> results to the commit log
> - renamed pkru field in seccomp_data to pkeys
> - changed size of pkru field to __u64 and removed reserved field
> - added test for x86
>
>  arch/mips/kernel/ptrace.c |   1 +
>  arch/x86/entry/common.c   |   1 +
>  include/uapi/linux/seccomp.h  |   3 +
>  kernel/seccomp.c  |   1 +
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 107 
> +-
>  5 files changed, 112 insertions(+), 1 deletion(-)
>
> diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
> index e5ba56c0..a58dd04d 100644
> --- a/arch/mips/kernel/ptrace.c
> +++ b/arch/mips/kernel/ptrace.c
> @@ -1277,6 +1277,7 @@ asmlinkage long syscall_trace_enter(struct pt_regs 
> *regs, long syscall)
> for (i = 0; i < 6; i++)
> sd.args[i] = args[i];
> sd.instruction_pointer = KSTK_EIP(current);
> +   sd.pkeys = 0;
>
> ret = __secure_computing();
> if (ret == -1)
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 3b2490b8..20c51bf2 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -98,6 +98,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
> sd.arch = arch;
> sd.nr = regs->orig_ax;
> sd.instruction_pointer = regs->ip;
> +   sd.pkeys = read_pkru();
>  #ifdef CONFIG_X86_64
> if (arch == AUDIT_ARCH_X86_64) {
> sd.args[0] = regs->di;
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 9efc0e73..3aa2d934 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -52,12 +52,15 @@
>   * @instruction_pointer: at the time of the system call.
>   * @args: up to 6 system call arguments always stored

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data

13 matches

Site Navigation

Mail list logo

Footer information