Re: [RFC] kref: pin objects with dangerously high reference count

2016-06-25 Thread Jann Horn
On Sun, Jun 26, 2016 at 2:03 AM, PaX Team  wrote:
> On 25 Jun 2016 at 3:13, Jann Horn wrote:
>
>> Since 2009 or so, PaX had reference count overflow mitigation code. My main
>> reasons for reinventing the wheel are:
>>
>>  - PaX adds arch-specific code, both in the atomic_t operations and in
>>exception handlers. I'd like to keep the code as
>>architecture-independent as possible, especially without adding
>>complexity to assembler code, to make it more maintainable and
>>auditable.
>
> complexity is a few simple lines of asm insns in what is already asm, hardly
> a big deal ;). in exchange you'd lose on code density, performance

Yes. It would probably be hard to get an interrupt instruction instead
of a (longer) call instruction without arch-specific code, and the cmp
operation probably can't be removed without putting the conditional
jump into assembly. I guess the main impact of this is higher
instruction cache pressure, leading to more cache faults? (Since
executing an extra cmp should afaik be really fast? But I don't know
much about optimization.)

Now I'm wondering how often atomic ops actually occur in the kernel...
in some grsec build I have here, I see 3687 "int 4" calls in a vmlinux
file that is 25MB big. In the optimal case of 8 additional bytes (3
for call instead of int, 5 for cmp) compared to PaX, that's about 30KB
more compared to assembly... hm, a ~0.1% increase in kernel size is
quite a lot. I guess one remaining question is how many overflow
checks have to happen in hot code paths.

(One of the hottest overflow checks in PaX is probably the one for
f_count in get_file_rcu() - in a multithreaded process, that's called
for every read() or write() -, and inlined copies of get_file()
probably account for a bunch of overflow checks. I believe that all of
these could be removed on 64bit because f_count is an atomic_long_t.
Why does PaX have overflow checks explicitly for atomic64_t? At least
for reference counters, I think that 64bit overflows shouldn't matter;
even the unrealistic worst-case (2^64)/(4GHz) is still over 100
years.)

I guess you might be right about inline asm being a better choice.

> and race windows.

Oh? I thought my code was race-free.

>>  - The refcounting hardening from PaX does not handle the "simple reference
>>count overflow" case when just the refcounting hardening code is used as
>>a standalone patch.
>
> i don't think that this is quite true as stated as handling this case depends
> on the exact sequencing of events. there're 3 cases in practice:
[...]
> 2. non-local use, safe sequence
>
>   inc_refcount -> object reference escapes to non-local memory
>
>   same as above, this case is not exploitable because the reference wouldn't
>   escape to non-local memory on overflow.

Ah, true. I somewhat dislike having to oops here, but I guess it's
safe; the only way to exploit oopses that I know of is to abuse them
for a very slow and ugly refcount overincrement, and with refcount
hardening, that shouldn't be an issue.

> 3. non-local use, unsafe sequence
>
>object reference escapes to non-local memory -> inc_refcount
>
>this case may already be a logic bug (the object could be freed between 
> these
>two actions if there's no other synchronization mechanism protecting them) 
> and
>a reference would escape even though the refcount itself wouldn't actually 
> be
>incremented. further decrements could then trigger the overdecrement case 
> and
>be exploitable as a use-after-free bug.
>
>the REFCOUNT feature in PaX wasn't designed to handle this case because 
> it's
>too rare to be worth the additional code and performance impact it'd 
> require
>to saturate the refcount in this case as well.

True, this sequence probably doesn't occur very often since acquiring
the new reference first is more obviously safe, and I don't remember
seeing it anywhere.

That said: I think that this sequence would usually be safe: As long
as the thread that makes the object globally available eventually
increments the reference counter (while still having a valid reference
to the object; but that's implicitly true, since otherwise a refcount
increment would be obviously unsafe), the object won't go away before
the refcount increment.

The case where the existing reference to the object is via RCU is an
exception because the current refcount can be zero and the refcount
increment can fail.

But yes, this sequence is less safe than the others because if an oops
happens before the increment, the reference count would end up being
too low.


Re: [RFC] kref: pin objects with dangerously high reference count

2016-06-25 Thread PaX Team
On 25 Jun 2016 at 3:13, Jann Horn wrote:

> Since 2009 or so, PaX had reference count overflow mitigation code. My main
> reasons for reinventing the wheel are:
> 
>  - PaX adds arch-specific code, both in the atomic_t operations and in
>exception handlers. I'd like to keep the code as
>architecture-independent as possible, especially without adding
>complexity to assembler code, to make it more maintainable and
>auditable.

complexity is a few simple lines of asm insns in what is already asm, hardly
a big deal ;). in exchange you'd lose on code density, performance and race
windows.

>  - The refcounting hardening from PaX does not handle the "simple reference
>count overflow" case when just the refcounting hardening code is used as
>a standalone patch.

i don't think that this is quite true as stated as handling this case depends
on the exact sequencing of events. there're 3 cases in practice:

1. local use

   inc_refcount -> use referenced object -> dec_refcount

   in this case inc_refcount would detect and prevent the overflow, regardless
   of how many outstanding non-local or local references there are.

2. non-local use, safe sequence

   inc_refcount -> object reference escapes to non-local memory

   same as above, this case is not exploitable because the reference wouldn't
   escape to non-local memory on overflow.

3. non-local use, unsafe sequence

   object reference escapes to non-local memory -> inc_refcount

   this case may already be a logic bug (the object could be freed between these
   two actions if there's no other synchronization mechanism protecting them) 
and
   a reference would escape even though the refcount itself wouldn't actually be
   incremented. further decrements could then trigger the overdecrement case and
   be exploitable as a use-after-free bug.

   the REFCOUNT feature in PaX wasn't designed to handle this case because it's
   too rare to be worth the additional code and performance impact it'd require
   to saturate the refcount in this case as well.

cheers,
 PaX Team



[RFC] kref: pin objects with dangerously high reference count

2016-06-24 Thread Jann Horn
Reference counting is a frequent source of (security) trouble in the
kernel. Here are the three main things that can, as far as I know, go wrong
with it, together with examples of specific bugs:

 - reference count overdecrement: If something (e.g. a normally-unused
   error path) decrements a reference counter more than it was previously
   incremented, this can cause the reference counter to end up lower than
   it was before. This will cause the object to be freed before the last
   reference to the object is gone.
   This bug class is hard to mitigate - it isn't very different from a
   generic use-after-free.
   Vulnerability examples:
 commit 8358b02bf67d ("bpf: fix double-fdput [...]")
   Bug examples where I haven't looked at impact:
 commit 75f0b68b75da ("debugfs: [...]: avoid double fops release")
 commit 121323ae668e ("drivers/perf: [...]: Fix reference count[...]")

 - reference count overincrement: If an error path omits a necessary
   reference count decrement, this can cause the reference counter to end
   up higher than it was before even though no new reference to the object
   was created. If such a bug is triggered around 2^32 times, the (32 bits
   wide) reference counter overflows to a small value. After dropping a
   few references so that exactly 2^32 references remain, the object will
   be freed prematurely.
   This kind of bug can easily occur when mistakes are made while writing
   error handling code.
   Vulnerability examples:
 commit 23567fd052a9 ("KEYS: Fix keyring ref leak [...]")
   Bug examples where I haven't looked at impact:
 commit b10e3e90485e ("debugfs: [...]: free proxy on ->open() failure")
 commit 3ea411c56ef5 ("android: fix reference leak in [...]")
 commit 8ed9e0e1b602 ("Input: turbografx - fix reference counting")

 - simple reference count overflow: If there are over 2^32 legitimate
   references to an object, a 32-bit reference counter can overflow. Since
   this can't happen on 32-bit architectures and normal references require
   at least a pointer, the minimum amount of physical memory required to
   hit this bug class is sizeof(void*) * 2^32 = 32 GiB ~= 34.36 GB.
   One example of a reference count overflow that could be triggered in
   some uncommon configurations on systems with 40GB of RAM is
   commit 92117d8443bc ("bpf: fix refcnt overflow").
   The nasty thing about this bug class is that it doesn't require a "real"
   reference counting bug, and with increasing amounts of physical memory,
   the number of reachable instances of this issue increases. (Probably not
   at 64GB or so; but it might get dangerous around 1TB or so.)

This patch is a first step towards addressing overincrements and simple
overflows, but not overdecrements.

My concerns when choosing an appropriate fix implementation are:

 - memory usage: An easy fix would be to increase the size of refcounts to
   64 bits. However, that would both increase memory usage and mess with
   data structures that have been designed carefully to e.g. fit into a
   single cacheline.
   (Note: Because file descriptor tables can contain lots of struct file
   pointers and are relatively dense, struct file already has a 64-bit
   refcount.)
 - CPU usage: Refcounting is a central, relatively hot piece of kernel
   code, so a fix must not use a lot of CPU time. In particular, additional
   or less deterministic atomic operations should be avoided.
 - complexity: A fix shouldn't involve significant changes in arch-specific
   code.
 - crashiness: While these kinds of bugs currently cause (potentially
   exploitable) crashes and oopsing would be a big improvement over that,
   ideally the system should just keep going without killing anything.
 - impact on non-refcounting code: atomic_t is a generic atomic type, and
   while a lot of code uses it for reference counting, it also has other
   users for whom reference count hardening might mess up things.

Since 2009 or so, PaX had reference count overflow mitigation code. My main
reasons for reinventing the wheel are:

 - PaX adds arch-specific code, both in the atomic_t operations and in
   exception handlers. I'd like to keep the code as
   architecture-independent as possible, especially without adding
   complexity to assembler code, to make it more maintainable and
   auditable.
 - The refcounting hardening from PaX does not handle the "simple reference
   count overflow" case when just the refcounting hardening code is used as
   a standalone patch. (On a system with the full PaX patch, the active
   response code would probably prevent exploitation of this case.)

I think that the biggest disadvantages of my code are:

 - It increases the size of the inline code for kref_get_unless_zero() and,
   if CONFIG_HARDEN_KREF_BIGMEM is enabled, of kref_sub(). (Also of
   kref_put_mutex(), but that method isn't used often.)
 - If CONFIG_HARDEN_KREF_BIGMEM is enabled, its kref_sub() implementation
   always has to do an additional