Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free

2020-10-05 Thread Daniel Micay
It will reuse the memory for other things when the whole slab is freed
though. Not really realistic to change that without it being backed by
virtual memory along with higher-level management of regions to avoid
intense fragmentation and metadata waste. It would depend a lot on
having much finer-grained slab caches, otherwise it's not going to be
much of an alternative to a quarantine feature. Even then, a
quarantine feature is still useful, but is less suitable for a
mainstream feature due to performance cost. Even a small quarantine
has a fairly high performance cost.
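
For illustration, a minimal sketch of the kind of FIFO quarantine being
discussed (illustrative userspace code, not kernel API): freed objects sit in
a fixed-size ring and only the evicted entry is actually freed, so reuse is
delayed at the cost of memory held in limbo.

#include <stddef.h>

#define QUARANTINE_SLOTS 4096

static void *ring[QUARANTINE_SLOTS];
static size_t head;

/* Returns the object evicted from quarantine (NULL early on); the caller
 * passes that to the real free path instead of freeing `p` directly. */
void *quarantine_swap(void *p)
{
    void *evicted = ring[head];
    ring[head] = p;
    head = (head + 1) % QUARANTINE_SLOTS;
    return evicted;
}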


Re: [PATCH 1/2] x86/pkeys: copy pkey state at fork()

2018-10-29 Thread Daniel Micay
Yeah, a no-op pkey_alloc flag tied to this patch to provide a way to
detect if pkey state is preserved on fork, since kernels without the
patch would report EINVAL. Something like
PKEY_ASSERT_FORK_INHERIT_STATE would make sense. Otherwise, it's
going to be quite painful to adopt this in userspace software without kernel
version checks. Most software can't accept causing a crash / abort
after forking in environments not applying all the stable kernel
patches, or in fresh OS installs that aren't fully updated + rebooted yet.
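
A rough sketch of the userspace probe such a flag would enable (the flag name
and value are hypothetical; all that's assumed is the EINVAL-on-unknown-flags
behavior described above):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <errno.h>

/* hypothetical flag; the real value would come from the kernel headers */
#ifndef PKEY_ASSERT_FORK_INHERIT_STATE
#define PKEY_ASSERT_FORK_INHERIT_STATE (1U << 0)
#endif

/* 1: pkey state survives fork, 0: unpatched kernel, -1: other error */
static int pkey_state_survives_fork(void)
{
    int pkey = pkey_alloc(PKEY_ASSERT_FORK_INHERIT_STATE, 0);
    if (pkey < 0)
        return errno == EINVAL ? 0 : -1;
    pkey_free(pkey);
    return 1;
}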


Re: [PATCH 1/2] x86/pkeys: copy pkey state at fork()

2018-10-26 Thread Daniel Micay
On Fri, 26 Oct 2018 at 18:12, Andy Lutomirski  wrote:
>
>
>
> > On Oct 26, 2018, at 2:39 PM, Daniel Micay  wrote:
> >
> > I ended up working around this with a pthread_atfork handler disabling
> > my usage of the feature in the child process for the time being. I
> > don't have an easy way to detect if the bug is present within a
> > library so
>
> Can you not just make sure that the fix is backported to all relevant kernels?

There are too many different distribution kernels and I won't be in
control of where the software is used.

> I suppose we could add a new flag for pkey_get() or something.

That would work, since I can apply the workaround (disabling the
feature in child processes) if I get EINVAL. The flag wouldn't need to
do anything, just existing and being tied to this patch so I have a
way to detect that I can safely use MPK after fork.

> > I'm going to need a kernel version check with a table of
> > kernel releases fixing the problem for each stable branch.
>
> That won’t work right on distro kernels. Please don’t go there.

If it's backported to an earlier version, it will just mean missing a
chance to use it. I'm not going to assume that it behaves a certain
way based on having an old kernel, but rather I just won't use it in a
forked child on an older kernel version if I don't have a way to
detect the problem. It's for a few different uses in library code so
I can't have a runtime test trying to detect it with clone(...). I'd
definitely prefer a proper way to detect that I can use it after fork. I really
don't want to have a hack like that which is why I'm bringing it up.


Re: [PATCH 1/2] x86/pkeys: copy pkey state at fork()

2018-10-26 Thread Daniel Micay
I ended up working around this with a pthread_atfork handler disabling
my usage of the feature in the child process for the time being. I
don't have an easy way to detect if the bug is present within a
library so I'm going to need a kernel version check with a table of
kernel releases fixing the problem for each stable branch.
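
A minimal sketch of that workaround (the flag and handler names are made up
for illustration):

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool pkeys_usable = true;

/* the child may be running on a kernel that wiped the pkey state */
static void disable_pkeys_in_child(void)
{
    atomic_store(&pkeys_usable, false);
}

__attribute__((constructor))
static void install_fork_handler(void)
{
    pthread_atfork(NULL, NULL, disable_pkeys_in_child);
}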

It would be helpful if there was a new cpuinfo flag to check if the
MPK state is preserved on fork in addition to the existing ospke flag.
The problem will fade away over time but in my experience there are a
lot of people using distributions with kernels not incorporating all
of the stable fixes. I expect other people will run into the problem
once hardware with MPK is more widely available and other people try
to use it for various things like moving GC or assorted security
features. Someone will end up running software adopting it on an older
kernel with the problem.

The clobbering issue I found with MAP_FIXED_NOREPLACE isn't quite
as annoying because it was easy to make a runtime test usable in a library
to see if the feature works properly.


Re: [regression v4.17-rc0] Re: FORTIFY_SOURCE breaks ARM compilation in -next -- was Re: ARM compile failure in Re: linux-next: Tree for Apr 4

2018-04-20 Thread Daniel Micay
Well, that's not related, it's just this:

#ifdef __GNUC__
#if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
#error Your compiler is too buggy; it is known to miscompile kernels.
#error Known good compilers: 3.3, 4.x
#endif
#if GCC_VERSION >= 40800 && GCC_VERSION < 40803
#error Your compiler is too buggy; it is known to miscompile kernels
#error and result in filesystem corruption and oopses.
#endif
#endif


Re: [regression v4.17-rc0] Re: FORTIFY_SOURCE breaks ARM compilation in -next -- was Re: ARM compile failure in Re: linux-next: Tree for Apr 4

2018-04-20 Thread Daniel Micay
On 20 April 2018 at 15:15, Pavel Machek  wrote:
> Hi!
>
>> >> Hi! Sorry I lost this email in my inbox. It seems this is specific to
>> >> a particular subset of arm architectures? (My local builds of arm all
>> >> succeed, for example. Can you send your failing config?) I'll take a
>> >> closer look on Monday if Daniel doesn't beat me to it.
>> >
>> > Daniel, Kees: any news?
>> >
>> > I'm aware you did not specify which Monday :-).
>>
>> Hi! Sorry, I got distracted. So the .config you sent me builds fine
>> with my cross compiler. I suspect this is something specific to ELDK's
>> compiler. I can try some other compiler versions. What version of gcc
>> is failing?
>
> I have:
>
> pavel@duo:/data/l/linux-n900$ arm-linux-gnueabi-gcc --version
> arm-linux-gnueabi-gcc (GCC) 4.7.2
> Copyright (C) 2012 Free Software Foundation, Inc.
> pavel@duo:/data/l/linux-n900$ arm-linux-gnueabi-ld --version
> GNU ld (GNU Binutils) 2.23.1.20121113
> Copyright 2012 Free Software Foundation, Inc.
>
> Let me try with eldk-5.6:
>
> pavel@duo:/data/l/linux-n900$ arm-linux-gnueabi-gcc --version
> arm-linux-gnueabi-gcc (GCC) 4.8.2
> pavel@duo:/data/l/linux-n900$ arm-linux-gnueabi-ld --version
> GNU ld (GNU Binutils) 2.24
>
> make  lib/string.o
>
> Seems to succeed, so I guess full build will succeed, too...

It doesn't imply that a full build would work.


Re: [PATCH] drivers: net: wireless: ath: ath9: dfs: remove VLA usage

2018-03-13 Thread Daniel Micay
> I don't think the difference between C and C++ const pointer
> conversions

I mean I don't think the difference between them was intended.


Re: [PATCH] drivers: net: wireless: ath: ath9: dfs: remove VLA usage

2018-03-13 Thread Daniel Micay
No, it's undefined behavior to write to a const variable. The `static`
and `const` on the variable both change the code generation in the
real world as permitted / encouraged by the standard. It's placed in
read-only memory. Trying to write to it will break. It's not
"implemented defined" to write to it, it's "undefined behavior" i.e.
it's considered incorrect. There a clear distinction between those in
the standard.

You're confusing having a real `const` for a variable with having it
applied to a pointer. It's well-defined to cast away const from a
pointer and write to what it points at if it's not actually const. If
it is const, that's broken.

There's nothing implementation defined about either case.
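
Concretely (an illustrative example, not code from the patch):

static const int n = 5;

void broken(void)
{
    *(int *)&n = 6;  /* undefined behavior: the object itself is const
                        and may be placed in read-only memory */
}

void fine(void)
{
    int x = 5;
    const int *p = &x;
    *(int *)p = 6;   /* well-defined: casting away const is fine when
                        the pointed-to object isn't actually const */
}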

The C standard could have considered `static const` variables to work
as constant expressions just like the C++ standard. They borrowed it
from there but made it less useful than const in what became the C++
standard. They also used stricter rules for the permitted implicit
conversions of const pointers which made those much less usable, i.e.
converting `int **` to `const int *const *` wasn't permitted like C++.
I don't think the difference between C and C++ const pointer
conversions was intended; it's a quirk of them being standardized on
different timelines and ending up with different versions of the same thing. On
the other hand, they might have left out having ever so slightly more
useful constant expressions on purpose since people can use #define.


Re: [PATCH] drivers: net: wireless: ath: ath9: dfs: remove VLA usage

2018-03-10 Thread Daniel Micay
> Just wondering. Is this actually a VLA. FFT_NUM_SAMPLES was static const so
> not really going to show a lot of variation. This array will always have the
> same size on the stack.

The issue is that unlike in C++, a `static const` can't be used in a
constant expression in C. It's unclear why C is defined that way but
it's how it is and there isn't currently a GCC extension making more
things into constant expressions like C++.


Re: VLA removal (was Re: [RFC 2/2] lustre: use VLA_SAFE)

2018-03-07 Thread Daniel Micay
On 7 March 2018 at 13:09, Linus Torvalds  wrote:
> On Wed, Mar 7, 2018 at 9:37 AM, Kees Cook  wrote:
>>
>> Building with -Wvla, I see 209 unique locations reported in 60 directories:
>> http://paste.ubuntu.com/p/srQxwPQS9s/
>
> Ok, that's not so bad. Maybe Greg could even add it to one of those
> things he encourages new people to do?
>
> Because at least *some* of them are pretty trivial. For example,
> looking at the core code, I was surprised to see something in
> lib/btree.c

Some are probably just the issue of technically having a VLA that's
not really a VLA:

static const int size = 5;

void foo(void) {
  int x[size];
}

% gcc -c -Wvla foo.c
foo.c: In function ‘foo’:
foo.c:4:3: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
   int x[size];
   ^~~

I don't really understand why the C standard didn't make `static
const` declarations usable as constant expressions like C++. They made
the pointer conversions more painful too.

It would be nice to get rid of those cases to use -Werror=vla though.


Re: [RFC PATCH] Randomization of address chosen by mmap.

2018-03-05 Thread Daniel Micay
On 5 March 2018 at 08:09, Ilya Smith  wrote:
>
>> On 4 Mar 2018, at 23:56, Matthew Wilcox  wrote:
>> Thinking about this more ...
>>
>> - When you call munmap, if you pass in the same (addr, length) that were
>>   used for mmap, then it should unmap the guard pages as well (that
>>   wasn't part of the patch, so it would have to be added)
>> - If 'addr' is higher than the mapped address, and length at least
>>   reaches the end of the mapping, then I would expect the guard pages to
>>   "move down" and be after the end of the newly-shortened mapping.
>> - If 'addr' is higher than the mapped address, and the length doesn't
>>   reach the end of the old mapping, we split the old mapping into two.
>>   I would expect the guard pages to apply to both mappings, insofar as
>>   they'll fit.  For an example, suppose we have a five-page mapping with
>>   two guard pages (MGG), and then we unmap the fourth page.  Now we
>>   have a three-page mapping with one guard page followed immediately
>>   by a one-page mapping with two guard pages (MMMGMGG).
>
> I’m analysing that approach and see many more problems:
> - each time you call mmap like this, you still increase the count of vmas as
> my patch did
> - the vma_merge feature now shouldn’t work at all, until MAP_FIXED is set or
> PROT_GUARD(0)
> - the entropy you provide is like 16 bits, which is really not so hard to
> brute-force
> - in your patch you don’t use vm_guard at address searching; I see many roots
> of bugs here
> - if you unmap/remap one page inside a region, the vma_guard field will show
> head or tail pages for the vma, not both; the kernel doesn’t know how to
> handle it
> - user mode now chooses entropy with the PROT_GUARD macro, but where does it
> get it? User mode shouldn’t be responsible for entropy at all

I didn't suggest this as the way of implementing fine-grained
randomization but rather a small starting point for hardening address
space layout further. I don't think it should be tied to a mmap flag
but rather something like a personality flag or a global sysctl. It
doesn't need to be random at all to be valuable, and it's just a first
step. It doesn't mean there can't be switches between random pivots
like OpenBSD mmap, etc. I'm not so sure that randomly switching around
is going to result in isolating things very well though.

The VMA count issue is at least something very predictable with a
performance cost only for kernel operations.

> I can’t understand what direction this conversation is going in. I was talking
> about a weak implementation in the Linux kernel but got many comments that
> ASLR should be implemented in user mode, which is really weird to me.

That's not what I said. I was saying that splitting things into
regions based on the type of allocation works really well and allows
for high entropy bases, but that the kernel can't really do that right
now. It could split up code that starts as PROT_EXEC into a region but
that's generally not how libraries are mapped in so it won't know
until mprotect which is obviously too late. Unless it had some kind of
type key passed from userspace, it can't really do that.

> I think it is possible to add GUARD pages into my implementation, but
> initially the problem was about the entropy of address choosing. I would
> like to resolve it step by step.

Starting with fairly aggressive fragmentation of the address space is
going to be a really hard sell. The costs of a very spread out address
space in terms of TLB misses, etc. are unclear. Starting with enforced
gaps (1 page) and randomization for those wouldn't rule out having
finer-grained randomization, like randomly switching between different
regions. This needs to be cheap enough that people want to enable it,
and the goals need to be clearly spelled out. The goal needs to be
clearer than "more randomization == good" and then accepting a high
performance cost for that.

I'm not dictating how things should be done, I don't have any say
about that. I'm just trying to discuss it.


Re: [RFC PATCH] Randomization of address chosen by mmap.

2018-03-03 Thread Daniel Micay
On 3 March 2018 at 08:58, Ilya Smith  wrote:
> Hello Daniel, thanks for sharing you experience!
>
>> On 1 Mar 2018, at 00:02, Daniel Micay  wrote:
>>
>> I don't think it makes sense for the kernel to attempt mitigations to
>> hide libraries. The best way to do that is in userspace, by having the
>> linker reserve a large PROT_NONE region for mapping libraries (both at
>> initialization and for dlopen) including a random gap to act as a
>> separate ASLR base.
> Why this is the best and what is the limit of this large region?
> Let’s think out of the box.
> What you say here means you made a separate memory region for libraries
> without changing the kernel. But the basic idea is that you have a separate
> region for libraries only. Probably you also want to have separate regions
> for any thread stack, for mmapped files, shared memory, etc. This allows
> protecting memory regions of different types from each other. It is
> impossible to implement this without keeping the whole memory map. This map
> should be secure from any leak attack to prevent ASLR bypass. The only way
> to do it is to implement it in the kernel and provide different syscalls
> like uselib or allocstack, etc. This one is really hard in the current
> kernel implementation.

There's the option of reserving PROT_NONE regions and managing memory
within them using a similar best-fit allocation scheme to get separate
random bases. The kernel could offer something like that but it's
already possible to do it for libc mmap usage within libc as we did
for libraries.

The kernel's help is needed to cover non-libc users of mmap, i.e. not
the linker, malloc, etc. It's not possible for libc to assume that
everything goes through the libc mmap/mremap/munmap wrappers and it
would be a mess so I'm not saying the kernel doesn't have a part to
play. I'm only saying it makes sense to look at the whole picture and
if something can be done better in libc or the linker, to do it there
instead. There isn't an API for dividing stuff up into regions, so it
has to be done in userspace right now and I think it works a lot
better when it's an option.

>
> My approach was to hide memory regions from attacker and from each other.
>
>> If an attacker has library addresses, it's hard to
>> see much point in hiding the other libraries from them.
>
> In some cases the attacker has only one leak for the whole attack. And we
> should do our best to make even this leak useless.
>
>> It does make
>> sense to keep them from knowing the location of any executable code if
>> they leak non-library addresses. An isolated library region + gap is a
>> feature we implemented in CopperheadOS and it works well, although we
>> haven't ported it to Android 7.x or 8.x.
> This one is interesting to know and I would like to try to attack it, but
> it's out of the scope of the current conversation.

I don't think it's out-of-scope. There are different approaches to
this kind of finer-grained randomization and they can be done
together.

>> I don't think the kernel can
>> bring much / anything to the table for it. It's inherently the
>> responsibility of libc to randomize the lower bits for secondary
>> stacks too.
>
> I think every bit of a secondary stack should be randomized to provide the
> attacker with as little information as we can.

The issue is that the kernel is only providing a mapping so it can add
a random gap or randomize it in other ways but it's ultimately up to
libc and other userspace code to do randomization without those
mappings.

A malloc implementation is similarly going to request fairly large
mappings from the kernel to manage a bunch of stuff within them
itself. The kernel can't protect against stuff like heap spray attacks
very well all by itself. It definitely has a part to play in that but
is a small piece of it (unless the malloc impl actually manages
virtual memory regions itself, which is already done by
performance-oriented allocators for very different reasons).

>> Fine-grained randomized mmap isn't going to be used if it causes
>> unpredictable levels of fragmentation or has a high / unpredictable
>> performance cost.
>
> Let’s pretend any chosen address is purely random and always satisfies the
> request. At some point we fail to mmap a new chunk of size N. What does this
> mean? It means that all chunks of size N are occupied and we can’t even find
> a place between them. Now let’s count already allocated memory. Let’s
> pretend at least one page lies on each of these occupied chunks. So the
> count of these pages is TASK_SIZE / N. Total bytes already allocated is
> PAGE_SIZE * TASK_SIZE / N. Now we can calculate. TASK_SIZE is 2



Re: [RFC PATCH] Randomization of address chosen by mmap.

2018-02-28 Thread Daniel Micay
The option to add at least one guard page would be useful whether or
not it's tied to randomization. It's not feasible to do that in
userspace for mmap as a whole, only specific users of mmap like malloc
and it adds significant overhead vs. a kernel implementation. It could
optionally let you choose a minimum and maximum guard region size with
it picking random sizes if they're not equal. It's important for it to
be an enforced gap rather than something that can be filled in by
another allocation. It will obviously help a lot more when it's being
used with a hardened allocator designed to take advantage of this
rather than glibc malloc or jemalloc.
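
For reference, a rough sketch of what a hardened allocator has to do per
region today without kernel help (assumes size is page-aligned):

#include <sys/mman.h>
#include <unistd.h>

static void *alloc_with_guards(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    /* reserve the whole range as PROT_NONE, then open up the middle,
     * leaving one inaccessible guard page on each side */
    char *base = mmap(NULL, size + 2 * page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base + page, size, PROT_READ | PROT_WRITE)) {
        munmap(base, size + 2 * page);
        return NULL;
    }
    return base + page;
}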

I don't think it makes sense for the kernel to attempt mitigations to
hide libraries. The best way to do that is in userspace, by having the
linker reserve a large PROT_NONE region for mapping libraries (both at
initialization and for dlopen) including a random gap to act as a
separate ASLR base. If an attacker has library addresses, it's hard to
see much point in hiding the other libraries from them. It does make
sense to keep them from knowing the location of any executable code if
they leak non-library addresses. An isolated library region + gap is a
feature we implemented in CopperheadOS and it works well, although we
haven't ported it to Android 7.x or 8.x. I don't think the kernel can
bring much / anything to the table for it. It's inherently the
responsibility of libc to randomize the lower bits for secondary
stacks too.

Fine-grained randomized mmap isn't going to be used if it causes
unpredictable levels of fragmentation or has a high / unpredictable
performance cost. I don't think it makes sense to approach it
aggressively in a way that people can't use. The OpenBSD randomized
mmap is a fairly conservative implementation to avoid causing
excessive fragmentation. I think they do a bit more than adding random
gaps by switching between different 'pivots' but that isn't very high
benefit. The main benefit is having random bits of unmapped space all
over the heap when combined with their hardened allocator which
heavily uses small mmap mappings and has a fair bit of malloc-level
randomization (it's a bitmap / hash table based slab allocator using
4k regions with a page span cache and we use a port of it to Android
with added hardening features but we're missing the fine-grained mmap
rand it's meant to have underneath what it does itself).

The default vm.max_map_count = 65530 is also a major problem for doing
fine-grained mmap randomization of any kind and there's the 32-bit
reference count overflow issue on high memory machines with
max_map_count * pid_max which isn't resolved yet.


Re: [RFC] Warn the user when they could overflow mapcount

2018-02-08 Thread Daniel Micay
I think there are likely legitimate programs mapping something a bunch of times.

Falling back to a global object -> count mapping (an rbtree / radix
trie or whatever) with a lock once it hits saturation wouldn't risk
breaking something. It would permanently leave the inline count
saturated and just use the address of the inline counter as the key
for the map to find the 64-bit counter. Once it gets to 0 in the map,
it can delete it from the map and do the standard freeing process,
avoiding leaks. It would really just make it a 64-bit reference count
heavily size optimized for the common case. It would work elsewhere
too, not just this case.
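
A rough userspace sketch of that scheme (the map_set / map_get helpers are
hypothetical, and the decrement path, which would delete the map entry at
zero, is omitted):

#include <stdatomic.h>
#include <stdint.h>
#include <pthread.h>

#define SATURATED UINT32_MAX

static pthread_mutex_t overflow_lock = PTHREAD_MUTEX_INITIALIZER;

/* hypothetical global map keyed by the inline counter's address */
void map_set(void *key, uint64_t v);
uint64_t map_get(void *key);

void count_inc(_Atomic uint32_t *count)
{
    uint32_t old = atomic_load(count);
    /* fast path: plain atomic increment while far from saturation */
    while (old < SATURATED - 1) {
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return;
    }
    /* slow path: the 64-bit count lives in the locked global map */
    pthread_mutex_lock(&overflow_lock);
    if (atomic_load(count) != SATURATED) {
        map_set(count, (uint64_t)atomic_load(count) + 1);
        atomic_store(count, SATURATED); /* inline count stays saturated */
    } else {
        map_set(count, map_get(count) + 1);
    }
    pthread_mutex_unlock(&overflow_lock);
}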


Re: [RFC] Warn the user when they could overflow mapcount

2018-02-08 Thread Daniel Micay
I guess it could saturate and then switch to tracking the count via an
object pointer -> count mapping with a global lock? Whatever the
solution is should probably be a generic one since it's a recurring
issue.


Re: [RFC] Warn the user when they could overflow mapcount

2018-02-08 Thread Daniel Micay
I don't think the kernel can get away with the current approach.
Object sizes and counts on 64-bit should be 64-bit unless there's a
verifiable reason they can get away with 32-bit. Having it leak
memory isn't okay, just much less bad than vulnerabilities exploitable
beyond just denial of service.

Every 32-bit reference count should probably have a short comment
explaining why it can't overflow on 64-bit... if that can't be written
or it's too complicated to demonstrate, it probably needs to be
64-bit. It's one of many pervasive forms of integer overflows in the
kernel... :(


Re: [RFC] Warn the user when they could overflow mapcount

2018-02-08 Thread Daniel Micay
>> That seems pretty bad.  So here's a patch which adds documentation to the
>> two sysctls that a sysadmin could use to shoot themselves in the foot,
>> and adds a warning if they change either of them to a dangerous value.
>
> I have negative feelings about this patch, mostly because AFAICS:
>
>  - It documents an issue instead of fixing it.
>  - It likely only addresses a small part of the actual problem.

The standard max_map_count / pid_max are very low and there are many
situations where either or both need to be raised.

VM fragmentation in long-lived processes is a major issue. There are
allocators like jemalloc designed to minimize VM fragmentation by
never unmapping memory but they're relying on not having anything else
using mmap regularly so they can have all their ranges merged
together, unless they decide to do something like making a 1TB
PROT_NONE mapping up front to slowly consume. If you Google this
sysctl name, you'll find lots of people running into the limit. If
you're using a debugging / hardened allocator designed to use a lot of
guard pages, the standard max_map_count is close to unusable...

I think the same thing applies to pid_max. There are too many
reasonable reasons to increase it. Process-per-request is quite
reasonable if you care about robustness / security and want to sandbox
each request handler. Look at Chrome / Chromium: it's currently
process-per-site-instance, but they're moving to having more processes
with site isolation to isolate iframes into their own processes to
work towards enforcing the boundaries between sites at a process
level. It's way worse for fine-grained server-side sandboxing. Using a
lot of processes like this does counter VM fragmentation especially if
long-lived processes doing a lot of work are mostly avoided... but if
your allocators like using guard pages you're still going to hit the
limit.

I do think the default value in the documentation should be fixed but
if there's a clear problem with raising these it really needs to be
fixed. Google either of the sysctl names and look at all the people
running into issues and needing to raise them. It's only going to
become more common to raise these with people trying to use lots of
fine-grained sandboxing. Process-per-request is back in style.


Re: [kernel-hardening] Re: [PATCHv2 5/7] printk: allow kmsg to be encrypted using public key encryption

2018-01-16 Thread Daniel Micay
> Do you have any backing from makers of such devices? I'd like to hear
> from Google's Android team or whoever that would turn this feature on.

(I'm not a Google employee, but I work on Android security and
contribute some of that to AOSP.)

Android restricts access to dmesg with SELinux, so it's not really an
issue there. They previously used dmesg_restrict but switched to
SELinux for fine-grained control without using the capability. In
general, the access control features / toggles proposed on
kernel-hardening don't do much for Android because there's nothing
unconfined and there's no system administrator reconfiguring the
system and disabling SELinux to make it work. The toggles like
dmesg_restrict tend to be too coarse without a good way to make
exceptions.

Unprivileged processes including apps can't access dmesg, debugfs,
sysfs (other than a tiny set of exceptions), procfs outside of
/proc/net and /proc/PID (but it uses hidepid=2) and a few whitelisted
files, etc. That's mostly done with SELinux along with using it for
ioctl command whitelisting to have per-device per-domain whitelists of
commands instead of using their global seccomp-bpf policy for it. The
hidepid=2 usage is an example of a feature where SELinux doesn't quite
fit the task as an alternative since apps are all in the untrusted_app
/ isolated_app domains but each have a unique uid/gid. It works
because hidepid has a gid option to set fine-grained exceptions,
rather than only being able to bypass it with a capability.

They do have some out-of-tree access control features and finding ways
to upstream those might be useful:

There's the added perf_event_paranoid=3 level which is set by default.
It's toggled off when a developer using profiling tools via Android
Debug Bridge (USB) after enabling ADB support in the OS and
whitelisting their key. That was submitted upstream again for Android
but it didn't end up landing. An LSM hook for perf events might help
but Android uses a fully static SELinux policy and would need a way to
toggle it off / on.

They also have group-based control over sockets used to implement the
INTERNET permission but technically they could double the number of
app domains and use SELinux for that if they absolutely had to avoid
an out-of-tree patch. They might not design it the same way today
since it predates using SELinux.


Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules

2017-11-30 Thread Daniel Micay
It was suggested that the feature would only be adopted in niches like
Android and I pointed out that it's not really relevant to Android.

It's a waste of time to try convincing me that it's useful elsewhere. I
never said or implied that it wasn't.

On Thu, 2017-11-30 at 09:50 +0100, Djalal Harouni wrote:
> On Thu, Nov 30, 2017 at 7:51 AM, Daniel Micay 
> wrote:
> [...]
> > Lots of potential module attack surface also gets eliminated by
> > default
> > via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
> > commands, etc. The global seccomp whitelist might be relevant in
> > some
> > cases too.
> 
> In embedded systems we can't maintain a SELinux policy; distro manpower
> can hardly manage it. We have abstracted seccomp etc., but the kernel
> inherited the difficult multiplexed things, plus all other paths that
> trigger this.

It's cheaper to use an existing system like Android Things where device
makers only need to make their apps and perhaps some userspace hardware
drivers for cases not covered by mainline kernel drivers. I don't think
it makes sense for every device vendor to manage an OS and I seriously
doubt that's how the ecosystem is going to end up as it matures.

> > Android devices like to build everything into the kernel too, so
> > even if
> > they weren't using a module this feature wouldn't usually help them.
> > It
> > would need to work like this existing sysctl:
> > 
> > net.ipv4.tcp_available_congestion_control = cubic reno lp
> > 
> > i.e. whitelists for functionality offered by the modules, not just
> > whether they can be loaded.
> 
> Yes, but it is hard to maintain a whitelist policy, the code is hardly
> maintained... if you include everything you should have an LSM policy
> or something like that, and compiling kernels is an expert thing.

I'm not talking about whitelist vs. blacklist, compiling kernels or
anything like that.

> Otherwise IMHO the kernel should provide default secure behaviour on
> how to load or add new functionality to the running one. From a user
> perspective, a switch "yes/no" that a privileged entity will
> *understand* and assume is what should be there, and the switch or
> flag as discussed here is local to processes, the sysctl will be
> removed. IMO it should come from userspace point of view, cause as an
> example the sysctl:
> 
> net.ipv4.tcp_available_congestion_control = cubic reno lp
> 
> Is kernel thing, too technical, userspace developers, admins or
> privileged entity will not understand what cubic or reno mean.

Congestion control algorithms are being used as an example in terms of
the mechanism being used to control which are available to unprivileged
users. The obscurity of congestion control algorithms is irrelevant.

>  Doing
> the same per functionality directly like this seems to much of a
> burden compared to the use case. The kernel maybe can do this to
> advance the art of the networking stack and for advanced cases, but in
> IMHO a sane default behaviour + an abstracted process/sandbox flag
> "yes/no" for most others, userspace developers and humans is what
> should be provided and we need the kernel to help here.

There are cases where unprivileged module auto-loading is relied upon
like network protocols. Having configuration for which protocols can be
used by unprivileged users is superior to limiting only which ones can
be auto-loaded. That's why I brought up the existing congestion control
knob. It works well in terms of having a whitelist of the sane, widely
used cases with exposing anything obscure requiring configuration. They
happen to be implemented as modules too.

Killing off unprivileged module loading other than a few cases like that
makes sense, and then those can provide similar control with similarly
sane defaults.


Re: [kernel-hardening] Re: [PATCH v5 next 5/5] net: modules: use request_module_cap() to load 'netdev-%s' modules

2017-11-29 Thread Daniel Micay
> And once you disable it by default, and it becomes purely opt-in, that
> means that nothing will change for most cases. Some embedded people
> that do their own thing (ie Android) might change, but normal
> distributions probably won't.
> 
> Yes, Android may be 99% of the users, and yes, the embedded world in
> general needs to be secure, but I'd still like this to be something
> that helps _everybody_.

Android devices won't get much benefit since they ship a tiny set of
modules chosen for the device. The kernels already get very stripped
down to the bare minimum vs. enabling every feature and driver available
and shipping it all by default on a traditional distribution.

Lots of potential module attack surface also gets eliminated by default
via their SELinux whitelists for /dev, /sys, /proc, debugfs, ioctl
commands, etc. The global seccomp whitelist might be relevant in some
cases too.

Android devices like to build everything into the kernel too, so even if
they weren't using a module this feature wouldn't usually help them. It
would need to work like this existing sysctl:

net.ipv4.tcp_available_congestion_control = cubic reno lp

i.e. whitelists for functionality offered by the modules, not just
whether they can be loaded.


Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces

2017-11-06 Thread Daniel Micay
On Mon, 2017-11-06 at 16:14 -0600, Serge E. Hallyn wrote:
> Quoting Daniel Micay (danielmi...@gmail.com):
> > Substantial added attack surface will never go away as a problem.
> > There
> > aren't a finite number of vulnerabilities to be found.
> 
> There's varying levels of usefulness and quality.  There is code which
> I
> want to be able to use in a container, and code which I can't ever see
> a
> reason for using there.  The latter, especially if it's also in a
> staging driver, would be nice to have a toggle to disable.
> 
> You're not advocating dropping the added attack surface, only adding a
> way of dealing with an 0day after the fact.  Privilege raising 0days
> can
> exist anywhere, not just in code which only root in a user namespace
> can
> exercise.  So from that point of view, ksplice seems a more complete
> solution.  Why not just actually fix the bad code block when we know
> about it?

That's not what I'm advocating. I only care about it for proactive
attack surface reduction downstream. I have no interest in using it to
block access to known vulnerabilities.

> Finally, it has been well argued that you can gain many new caps from
> having only a few others.  Given that, how could you ever be sure
> that,
> if an 0day is found which allows root in a user ns to abuse
> CAP_NET_ADMIN against the host, just keeping CAP_NET_ADMIN from them
> would suffice?

I didn't suggest using it that way...

>  It seems to me that the existing control in
> /proc/sys/kernel/unprivileged_userns_clone might be the better duct
> tape
> in that case.

There's no such thing as unprivileged_userns_clone in mainline.

The advantage of this over unprivileged_userns_clone in Debian and maybe
some other distributions is not giving up unprivileged app containers /
sandboxes implemented via user namespaces.  For example, Chromium's user
namespace sandbox likely only needs to have CAP_SYS_CHROOT. Chromium
will be dropping their setuid sandbox, forcing usage of user namespaces
to avoid losing the sandbox which will greatly increase local kernel
attack surface on the host by exposing netfilter management, etc. to
unprivileged users.

The proposed approach isn't necessarily the best way to implement this
kind of mitigation but I think it's filling a real need.


Re: [kernel-hardening] Re: [PATCH resend 2/2] userns: control capabilities of some user namespaces

2017-11-06 Thread Daniel Micay
Substantial added attack surface will never go away as a problem. There
aren't a finite number of vulnerabilities to be found.


Re: [kernel-hardening] [PATCH v2 0/5] stackprotector: ascii armor the stack canary

2017-09-19 Thread Daniel Micay
> Brad trolls us all lightly with this trivia question:
> 
> https://twitter.com/grsecurity/status/905246423591084033

I'll respond to your proposed scenario rather than guessing at what is
being suggested there and if it's actually the same thing as what you've
brought up.

They've stated many times that stack canaries are near useless and
enabling PaX UDEREF on i386 actually disabled stack canaries and
presumably still does after the last public patches:

+   select HAVE_CC_STACKPROTECTOR   if X86_64 || !PAX_MEMORY_UDEREF

Making the kernel more vulnerable to remote stack overflow exploits like
this month's overhyped Bluetooth bug to improve security against local
exploits seems like much more of a compromise. Not that i386 has any
real relevance anymore, but I think that's some interesting context...

It's not like upstream has a monopoly on needing to make hard choices
for security features, I don't see this as one of those hard choices
like whether sacrificing availability for SIZE_OVERFLOW makes sense.

> "For #trivia can you describe one scenario where this change actually
> helps exploitation using similar C string funcs?"
> 
> I suppose the expected answer is:
> 
> The change helps exploitation when the overwriting string ends just
> before the canary.  Its NUL overwriting the NUL byte in the canary
> would
> go undetected.  Before this change, there would be a 255/256 chance of
> detection.
>
> I hope this was considered.  The change might still be a good
> tradeoff,
> or it might not, depending on which scenarios are more likely (leak of
> canary value or the string required in an exploit ending at that exact
> byte location), and we probably lack such statistics.

It was considered that guaranteeing some forms of string overflows are
contained within the frame can have a negative impact but I don't think
it's significant.

There would need to be something worth overwriting between the source of
the overflow and the canary. Since SSP reorders local variables within
the stack frame based on risk of overflow, the targets would be locals
that Clang/GCC considers to have a higher risk of overflow. An array can
only have other arrays between it and the canary and there's ordering by
size (small vs. large at least) within the array section. If there was a
zero byte in any of that other data between the source of the overflow
and the canary, this wouldn't make a difference anyway.

Since the canary is only checked on return, it's also only relevant if
control flow still makes it to the function epilogue and the check still
matters after the attacker has accomplished their goal.

The chance of there being a zero as a leading byte is already 1/256, but
there could also be a NUL elsewhere in the canary compromising a subset
of the entropy for this kind of scenario too so it's a tiny bit higher
than a 1/256 chance already. Not much resistance to brute force if it
really does ever end up mattering, unless canaries with NUL in leading
positions were actually *rejected*...
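
For reference, the armoring itself is just masking the canary so its first
byte in memory is zero; a rough little-endian sketch:

#include <stdint.h>

/* on little-endian, clearing the low byte zeroes the first byte of the
 * canary in memory, so a C-string overflow (which has to append a
 * terminating NUL) can never reproduce the full canary value */
static uintptr_t armor_canary(uintptr_t random_canary)
{
    return random_canary & ~(uintptr_t)0xff;
}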

There's a more substantial compromise between zero and non-zero poison
on free for production use. This was part of a set of changes where
CopperheadOS switched from non-zero fill -> zero fill on free for the
kernel and userspace and added a leading zero byte on 64-bit for heap
and stack canaries in the production builds, while keeping around the old
way of doing things for some forms of debug builds. Non-zero fill on
free ended up causing too much harm by turning benign bugs into bugs
that might have been exploitable even beyond DoS: in some cases tied to
non-NUL-terminated strings, non-zero poisoning got rid of zeroes that
were otherwise all over the heap containing them, and in others it broke
code wrongly depending on uninitialized data being zero in practice. I
think there's more of an argument to be had about poison-on-free as a
mitigation. This leading zero byte? Not so much, beyond perhaps wanting
the option to turn it off for bug finding... but there are better tools
for that now.

> I am not proposing to revert the change.  I had actually contemplated
> speaking up when this was discussed, but did not for lack of a better
> suggestion.  We could put/require a NUL in the middle of the canary,
> but
> with the full canary being only 64-bit at most that would also make
> some
> attacks easier.
> 
> So this is JFYI.  No action needed on it, I think.

It might make sense to put it in the middle, but a scenario where it
matters seems unlikely and as you mention it would make it weaker in
some other cases. I think it's already the best choice right now.

I'd choose XOR canaries over a leading zero byte if it could only be one
or the other, but I'm not really convinced that there's any significant
loss compared to the normal canaries.


Re: nios2 crash due to 'init/main.c: extract early boot entropy from the passed cmdline'

2017-09-11 Thread Daniel Micay
On Mon, 2017-09-11 at 10:35 -0700, Guenter Roeck wrote:
> On Mon, Sep 11, 2017 at 09:36:00AM -0700, Kees Cook wrote:
> > On Sat, Sep 9, 2017 at 8:58 PM, Guenter Roeck 
> > wrote:
> > > Hi,
> > > 
> > > I noticed that nios2 images crash in mainline. Bisect points to
> > > commit
> > > 33d72f3822d7 ("init/main.c: extract early boot entropy from the
> > > passed
> > > cmdline").  Bisect log is attached.
> > > 
> > > As far as I can see, the problem is seen because
> > > add_device_randomness()
> > > calls random_get_entropy(). However, the underlying timer function
> > > used by the nios2 architecture (nios2_timer_read) is not yet
> > > initialized,
> > > causing a NULL pointer access and crash. A sample crash log is at
> > > http://kerneltests.org/builders/qemu-nios2-master/builds/175/steps/qemubuildcommand/logs/stdio
> > 
> > Oh, yikes. Do you have a full call trace? (Does this come through
> > get_cycles() or via the It seems like we could either initialize the
> > timer earlier or allow it to fall back when not initialized...
> > 
> 
> nios2 doesn't give me a traceback. I followed it by adding debug
> messages.
> The code path is through get_cycles().
> 
> On nios2:
> 
> static u64 nios2_timer_read(struct clocksource *cs)
> {
>   struct nios2_clocksource *nios2_cs = to_nios2_clksource(cs);
>   unsigned long flags;
>   u32 count;
> 
>   local_irq_save(flags);
>   count = read_timersnapshot(&nios2_cs->timer);   // <- not
> initialized
>   local_irq_restore(flags);
> 
>   /* Counter is counting down */
>   return ~count;
> }
> 
> cycles_t get_cycles(void)
> {
> return nios2_timer_read(&nios2_cs.cs);
> }
> EXPORT_SYMBOL(get_cycles);
> 
> Guenter

Maybe it should WARN and return 0 for now if that's NULL?
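
Something like this, roughly (timer_initialized stands in for however the
driver can tell the clocksource has been set up; the name is hypothetical):

cycles_t get_cycles(void)
{
    if (!timer_initialized) {
        WARN_ON_ONCE(1);
        return 0;
    }
    return nios2_timer_read(&nios2_cs.cs);
}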


Re: [PATCHv3 2/2] extract early boot entropy from the passed cmdline

2017-08-17 Thread Daniel Micay
> I did say 'external attacker' but it could be made clearer.

Er, s/say/mean to imply/

I do think it will have some local value after Android 8 which should
start shipping in a few days though.

I'll look into having the kernel stash some entropy in pstore soon since
that seems like it could be a great improvement. I'm not sure how often
/ where it should hook into for regularly refreshing it though. Doing it
only on powering down isn't ideal.


Re: [kernel-hardening] [PATCHv2 2/2] extract early boot entropy from the passed cmdline

2017-08-16 Thread Daniel Micay
On Wed, 2017-08-16 at 21:58 -0700, Kees Cook wrote:
> On Wed, Aug 16, 2017 at 9:56 PM, Nick Kralevich 
> wrote:
> > On Wed, Aug 16, 2017 at 3:46 PM, Laura Abbott 
> > wrote:
> > > From: Daniel Micay 
> > > 
> > > Existing Android bootloaders usually pass data useful as early
> > > entropy
> > > on the kernel command-line. It may also be the case on other
> > > embedded
> > > systems. Sample command-line from a Google Pixel running
> > > CopperheadOS:
> > > 
> > 
> > Why is it better to put this into the kernel, rather than just rely
> > on
> > the existing userspace functionality which does exactly the same
> > thing? This is what Android already does today:
> > https://android-review.googlesource.com/198113
> 
> That's too late for setting up the kernel stack canary, among other
> things. The kernel will also be generating some early secrets for slab
> cache canaries, etc. That all needs to happen well before init is
> started.
> 
> -Kees
> 

It's also unfortunately the kernel's global stack canary for the entire
boot since unlike x86 there aren't per-task canaries. GCC / Clang access
it via a segment register on x86 vs. a global on other architectures.

In theory it could be task-local elsewhere but doing it efficiently
would imply reserving a register to store the random value. I think that
may actually end up helping performance more than it hurts by not
needing to read the global stack canary value from cache repeatedly. If
stack canaries were augmented into something more (XOR in the retaddr
and offer the option of more coverage than STRONG) it would be more
important.
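
Roughly what the compiler emits on non-x86 targets (simplified; the real
saved canary lives in a stack slot below the return address):

extern unsigned long __stack_chk_guard;  /* one global for every task */
void __stack_chk_fail(void);

void example(void)
{
    unsigned long canary = __stack_chk_guard;
    /* ... function body ... */
    if (canary != __stack_chk_guard)
        __stack_chk_fail();
}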


Re: [PATCHv3 2/2] extract early boot entropy from the passed cmdline

2017-08-16 Thread Daniel Micay
On Wed, 2017-08-16 at 23:31 -0400, Theodore Ts'o wrote:
> On Wed, Aug 16, 2017 at 04:14:58PM -0700, Laura Abbott wrote:
> > From: Daniel Micay 
> > 
> > Existing Android bootloaders usually pass data useful as early
> > entropy
> > on the kernel command-line. It may also be the case on other
> > embedded
> > systems.
> 
> May I suggest a slight adjustment to the beginning commit description?
> 
>    Feed the boot command-line to the /dev/random entropy pool
>
>    Existing Android bootloaders usually pass data which may not be
>    known by an external attacker on the kernel command-line.  It may
>    also be the case on other embedded systems.  Sample command-line
>    from a Google Pixel running CopperheadOS
> 
> The idea here is to if anything, err on the side of under-promising
> the amount of security we can guarantee that this technique will
> provide.  For example, how hard is it really for an attacker who has
> an APK installed locally to get the device serial number?  Or the OS
> version?  And how much variability is there in the bootloader stages
> in milliseconds?

The serial number is currently accessible to local apps up until Android
7.x so it doesn't have value if the adversary has local access. Access
to it without the READ_PHONE_STATE permission is being removed for apps
targeting Android 8.0 and will presumably be restructed for all apps at
some point in the future:

https://android-developers.googleblog.com/2017/04/changes-to-device-identifiers-in.html

Some bootloader stages vary a bit in time each boot. There's not much
variance or measurement precision so there's only a small amount of
entropy from this. The ones that consistently vary in timing do so
independently from each other so that helps a bit. Also worth noting
that before Android 8.0+, local apps can access the boot times since
it's written to a system property. After Android 8.0+, all that stuff is
inaccessible to them (no permission to get them) since there's a
whitelisting model for system property access.

> I think we should definitely do this.  So this is more of a request to
> be very careful what we promise in the commit description, not an
> objection to the change itself.

I did say 'external attacker' but it could be made clearer. It's
primarily aimed at getting a tiny bit of extra entropy for the kernel
stack canary and other probabilistic exploit mitigations set up in early
boot. On non-x86 archs, i.e. 99.9% of Android devices, the kernel stack
canary remains the same after it's set up in that early boot code.

Android devices almost all have a hardware RNG and Android init blocks
until a fair bit of data is read from it along with restoring entropy
that's regularly saved while running, but unfortunately that's not
available at this point in the boot process.

The kernel could save / restore entropy using pstore (which at least
Nexus / Pixel devices have - not sure about others). I don't know how
early that could feasibly be done. Ideally it would do that combined
with early usage of the hwrng.


Re: drivers/tty/serial/8250/8250_fintek.c:364: warning: 'probe_data' is used uninitialized in this function

2017-08-09 Thread Daniel Micay
On Wed, 2017-08-09 at 17:32 +0200, Arnd Bergmann wrote:
> On Wed, Aug 9, 2017 at 5:07 PM, kbuild test robot
>  wrote:
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > head:   bfa738cf3dfae2111626650f86135f93c5ff0a22
> > commit: 6974f0c4555e285ab217cee58b6e874f776ff409
> > include/linux/string.h: add the option of fortified string.h
> > functions
> > date:   4 weeks ago
> > config: x86_64-randconfig-v0-08092220 (attached as .config)
> > compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
> > reproduce:
> > git checkout 6974f0c4555e285ab217cee58b6e874f776ff409
> > # save the attached .config to linux build tree
> > make ARCH=x86_64
> > 
> > All warnings (new ones prefixed by >>):
> > 
> >In file included from include/linux/bitmap.h:8,
> > from include/linux/cpumask.h:11,
> > from arch/x86/include/asm/cpumask.h:4,
> > from arch/x86/include/asm/msr.h:10,
> > from arch/x86/include/asm/processor.h:20,
> > from arch/x86/include/asm/cpufeature.h:4,
> > from arch/x86/include/asm/thread_info.h:52,
> > from include/linux/thread_info.h:37,
> > from arch/x86/include/asm/preempt.h:6,
> > from include/linux/preempt.h:80,
> > from include/linux/spinlock.h:50,
> > from include/linux/seqlock.h:35,
> > from include/linux/time.h:5,
> > from include/linux/stat.h:18,
> > from include/linux/module.h:10,
> > from drivers/tty/serial/8250/8250_fintek.c:11:
> >include/linux/string.h: In function 'strcpy':
> >include/linux/string.h:209: warning: '__f' is static but
> > declared in inline function 'strcpy' which is not static
> >include/linux/string.h:211: warning: '__f' is static but
> > declared in inline function 'strcpy' which is not static
> 
> 
> This clearly comes from __trace_if() when CONFIG_PROFILE_ALL_BRANCHES
> is enabled. I did not see the warning with gcc-7.1.1, and I guess this
> only
> happens on older compilers like the gcc-4.4 that was used here.
> 
> What is the reason for __FORTIFY_INLINE to be "extern __always_inline"
> rather than "static __always_inline"? If they cannot just be 'static',
> maybe
> this can be changed to depend on the compiler version.
> 
>Arnd

It's important to get the semantics of using extern right. It means if you do
something like &memcpy, it resolves to the address of the direct symbol
instead of generating a useless thunk for that object file. It might
also be required for Clang compatibility, I don't remember.
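
A simplified illustration of the distinction (not the kernel's actual
header): with extern gnu_inline semantics, taking the function's address
binds to the real out-of-line symbol, while `static` would force each
translation unit taking the address to emit its own local copy.

#include <stddef.h>

void *memcpy(void *, const void *, size_t);

/* fortify-style wrapper: inlined at every call site, but &memcpy still
 * resolves to the real out-of-line memcpy */
extern __inline__ __attribute__((__always_inline__, __gnu_inline__))
void *memcpy(void *d, const void *s, size_t n)
{
    /* the kernel's version does __builtin_object_size checks here */
    return __builtin_memcpy(d, s, n);
}

void *(*memcpy_ptr)(void *, const void *, size_t) = memcpy;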

It could have a compiler version dependency or maybe one specifically
tied to old compiler && CONFIG_PROFILE_ALL_BRANCHES / other options that
conflict with it like that.


Re: [PATCH] infiniband: avoid overflow warning

2017-07-31 Thread Daniel Micay
On Mon, 2017-07-31 at 14:18 -0700, Kees Cook wrote:
> On Mon, Jul 31, 2017 at 2:10 PM, Arnd Bergmann  wrote:
> > On Mon, Jul 31, 2017 at 10:58 PM, Kees Cook 
> > wrote:
> > > On Mon, Jul 31, 2017 at 12:30 AM, Arnd Bergmann 
> > > wrote:
> > > > On Mon, Jul 31, 2017 at 9:08 AM, Moni Shoua 
> > > > wrote:
> > > > > > break;
> > > > > > default:
> > > > > > return -EINVAL;
> > > > > 
> > > > > what happens if you replace 16 with sizeof(struct in6_addr)?
> > > > 
> > > > Same thing: the problem is that gcc already knows the size of
> > > > the structure we
> > > > pass in here, and it is in fact shorter.
> > > 
> > > So gcc is ignoring both the cast (to 16 byte struct in6_addr) and
> > > the
> > > caller's actual 128 byte struct sockaddr_storage, and looking only
> > > at
> > > struct sockaddr? That seems really weird.
> > 
> > Using a sockaddr_storage on the stack would address the warning, but
> > the question was about just changing the hardcoded 16 to a sizeof()
> > operation, and that has no effect.
> 
> Right, I didn't mean that; I was curious why the fortify macro
> resulted in an error at all. The callers are casting from struct
> sockaddr_storage (large enough) to struct sockaddr (not large enough),
> and then the inline is casting back to sockaddr_in6 (large enough). I
> would have expected fortify to check either sockaddr_storage or
> sockaddr_in6, but not sockaddr.
> 
> -Kees
> 

I don't think that's quite what's happening. It will report unknown if
it can't find allocation sites or other guarantees of size. There are no
alloc_size markers yet so it *mostly* does stack / global checks. It
won't infer sizes based on pointer types. It's not a heuristic.

However, fortify / -fsanitize=object-size can indirectly uncover other
forms of undefined behavior. Code may rely on doing something
undefined that GCC actively assumes can't happen for optimization. The
compile-time errors can have false positives due to dead code, but if
the code is actually called with the size that it rejects as invalid,
there is almost certainly a bug.


Re: [PATCH] infiniband: avoid overflow warning

2017-07-31 Thread Daniel Micay
On Mon, 2017-07-31 at 21:19 +0200, Arnd Bergmann wrote:
> On Mon, Jul 31, 2017 at 6:17 PM, Bart Van Assche wrote:
> > On Mon, 2017-07-31 at 18:04 +0200, Arnd Bergmann wrote:
> > > On Mon, Jul 31, 2017 at 5:32 PM, Bart Van Assche wrote:
> > > > So inetaddr_event() assigns AF_INET so .sin_family and gcc warns
> > > > about code
> > > > that is only executed if .sin_family == AF_INET6? Since this
> > > > warning is the
> > > > result of incorrect interprocedural analysis by gcc, shouldn't
> > > > this be
> > > > reported as a bug to the gcc authors?
> > > 
> > > I think the interprocedural analysis here is just a little worse
> > > than it could
> > > be, but it's not actually correct.  It's not gcc that prints the
> > > warning (if
> > > it did, then I'd agree it would be a gcc bug) but the warning is
> > > triggered
> > > intentionally by the fortified version of memcpy in
> > > include/linux/string.h.
> > > 
> > > The problem as I understand it is that gcc cannot guarantee that
> > > it
> > > tracks the value of addr->sa_family at  least as far as the size
> > > of the
> > > stack object, and it has no strict reason to do so, so the inlined
> > > rdma_ip2gid() will still contain both cases.
> > 
> > Hello Arnd,
> > 
> > Had you already considered to uninline the rdma_ip2gid() function?
> 
> Not really, that would prevent the normal optimization from happening,
> so that would be worse than uninlining addr_event() as I tried.
> 
> It would of course get rid of the warning, so if you think that's a
> better
> solution, I won't complain.
> 
>Arnd

You could also use __memcpy but using a struct assignment seems cleaner.
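
A sketch of the struct-assignment alternative (illustrative function, not
the actual rdma_ip2gid):

#include <netinet/in.h>
#include <sys/socket.h>

static void ip6_to_addr(const struct sockaddr *addr, struct in6_addr *out)
{
    const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)addr;
    /* struct assignment copies the same 16 bytes but gives the compiler
     * a typed object rather than a raw memcpy size to second-guess */
    *out = sin6->sin6_addr;
}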

The compile-time fortify errors aren't perfect since they rely on GCC
being able to optimize them out and there can be dead code that has
intentional overflows, etc. Their purpose is just to move many runtime
errors (which don't have these false positives) to compile-time since
it's a lot less painful to deal with a few false positives like this
than errors slipping through to runtime (although it's less important
since it's going to be using WARN for the time being).

If the compile-time errors would removed, all of the real overflows
would still be detected at runtime. One option would be having something
like #define __NO_FORTIFY but *only* disabling the compile-time part of
the checks to work around false positives. *shrug*


Re: [PATCH] fortify: Use WARN instead of BUG for now

2017-07-27 Thread Daniel Micay
I think the 'else' added in the proposed patch makes it too complicated
for GCC to optimize out the __attribute__((error)) checks before they're
considered to be errors. It's not needed so it's probably best to just
avoid doing something like that. The runtime checks can't get false
positives from overly complex code but the compile-time ones depend on
GCC being able to reliably optimize them out.

This might be easier for GCC:

if (__builtin_constant_p(size) && condition_a) {
compiletimeerror();
}

if (__builtin_constant_p(size) && condition_b) {
compiletimeerror();
}

than the current:

if (__builtin_constant_p(size)) {
if (condition_a) {
compiletimeerror();
}

if (condition_b) {
compiletimeerror();
}
}

but it hasn't had a false positive like that with the current code.

Removing __noreturn is making the inline code more complex from GCC's
perspective too, but hopefully it's neither reducing coverage (i.e. not
making it less able to resolve __builtin_object_size - intuitively it
shouldn't impact it much but you never know) or making GCC unable to
deal with the compile-time checks.


Re: [PATCH] fortify: Use WARN instead of BUG for now

2017-07-26 Thread Daniel Micay
> Maybe we could do two phases? One to s/BUG/WARN/ and the second to
> improve the message?

s/fortify_panic/fortify_overflow/ + use WARN + remove __noreturn makes
sense as one commit. Still think the *option* of __noreturn + BUG should
be kept there even just for measuring the size overhead, behind
!COMPILE_TEST && EXPERT if it needs to be for now. If you're fully removing __noreturn
then the entry in tools/objtool/check.c for __noreturn functions also
won't make sense (either way it needs to use the new name).

I think improving error messages should be done a bit differently though
and it'll be easier to not tie these things together.


Re: [PATCH] fortify: Use WARN instead of BUG for now

2017-07-26 Thread Daniel Micay
It should just be renamed from fortify_panic -> fortify_error, including
the copy in arch/x86/boot/compressed/misc.c. It can use WARN instead of
BUG, with a 'default n', !COMPILE_TEST option to switch back to BUG.
Otherwise it needs to be patched downstream when that's wanted.

I don't think splitting it is the right approach to improving the
runtime error handling. That only makes sense for the compile-time
errors due to the limitations of __attribute__((error)). Can we think
about that before changing it? Just make it use WARN for now.
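
Roughly (the CONFIG_FORTIFY_SOURCE_BUG name is made up here, just to
show the shape of it):

void fortify_error(const char *name)
{
#ifdef CONFIG_FORTIFY_SOURCE_BUG
	BUG();	/* optional old behavior, useful for measuring overhead */
#else
	WARN(1, "detected buffer overflow in %s\n", name);
#endif
}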

The best debugging experience would be passing along the sizes and
having the fortify_error function convert that into nice error messages.
For memcpy(p, q, n), n can be larger than both the detected sizes of p
and q, not just either one. The error should just be saying the function
name and printing the copy size and maximum sizes of p and q. That's
going to increase the code size too but I think splitting it will be
worse and it goes in the wrong direction in terms of complexity. It's
going to make future extensions / optimization harder if it's split.
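
Something along these lines, extending the sketch above with the sizes
(the signature is hypothetical and every inline call site would need
updating to pass them):

void fortify_error(const char *name, size_t size, size_t p_size,
		   size_t q_size)
{
	WARN(1, "%s: %zu byte copy, dest max %zu, src max %zu\n",
	     name, size, p_size, q_size);
}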


Re: [lkp-robot] [include/linux/string.h] 6974f0c455: kernel_BUG_at_lib/string.c

2017-07-25 Thread Daniel Micay
It was known that there were going to be bugs to work through, many of
them relatively benign like the leaks of data near string constants
(probably other string constants) in rodata. It makes sense to have it
default to WARN with BUG / noreturn as a non-default configuration
option for it, I guess with !COMPILE_TEST like UBSAN_SANITIZE_ALL. I
don't think there's any sane way to bound the length of either reads /
writes. It needs to either WARN + continue on into doing the overflow or
use BUG. Trying to correct it might make things worse and would make
this more complicated / bug-prone. It already has enough subtle edge
cases to deal with.

I think 'benign' is a better term than 'false positive' because there
hasn't been a non-bug found yet. They're mostly not security vulns but
they're undefined behavior.


Re: [lkp-robot] [include/linux/string.h] 6974f0c455: kernel_BUG_at_lib/string.c

2017-07-19 Thread Daniel Micay
> So the fortify_string code has decided that only a single-byte (or
> empty) memcpy is ok.
> 
> And that, in turn, seems to be because we're copying from
> optprobe_template_entry, which is declared as
> 
> extern __visible kprobe_opcode_t optprobe_template_entry;
> 
> so the fortify code decides it's a single character.
> 
> Does just changing all those things to be declared as arrays fix
> things?

Yeah, that fixes it because GCC will consider the size of 'char foo[]'
unknown (i.e. (size_t)-1 from __builtin_object_size).

GCC doesn't know this essentially constant value at compile-time so it
wasn't a compile-time error:

#define TMPL_END_IDX \
((long)&optprobe_template_end - (long)&optprobe_template_entry)

-fsanitize=object-size works the same way for pointer dereferences so
replacing might fix some issues for CONFIG_UBSAN_SANITIZE_ALL. I guess
that's way too noisy at the moment thus the !COMPILE_TEST.
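
Side by side (the array-typed declaration is the hypothetical fix, and
the comments show what GCC reports):

typedef u32 kprobe_opcode_t;

extern kprobe_opcode_t optprobe_template_entry;
/* __builtin_object_size(&optprobe_template_entry, 0) == sizeof(u32),
 * so a fortified memcpy() of TMPL_END_IDX bytes trips the runtime check */

extern kprobe_opcode_t optprobe_template_entry_array[];
/* incomplete array type: __builtin_object_size(..., 0) == (size_t)-1,
 * i.e. unknown, so no check can fire */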


Re: [lkp-robot] [include/linux/string.h] 6974f0c455: kernel_BUG_at_lib/string.c

2017-07-19 Thread Daniel Micay
> [8.134886]  arch_prepare_optimized_kprobe+0xd5/0x171
> [8.134886]  arch_prepare_optimized_kprobe+0xd5/0x171

Probably this:

/* Copy arch-dep-instance from template */
memcpy(buf, &optprobe_template_entry, TMPL_END_IDX);

Not a real bug, just technically undefined because these are u32:

typedef u32 kprobe_opcode_t;

extern __visible kprobe_opcode_t optprobe_template_entry;
extern __visible kprobe_opcode_t optprobe_template_val;
extern __visible kprobe_opcode_t optprobe_template_call;
extern __visible kprobe_opcode_t optprobe_template_end;
extern __visible kprobe_opcode_t optprobe_template_sub_sp;
extern __visible kprobe_opcode_t optprobe_template_add_sp;
extern __visible kprobe_opcode_t optprobe_template_restore_begin;
extern __visible kprobe_opcode_t optprobe_template_restore_orig_insn;
extern __visible kprobe_opcode_t optprobe_template_restore_end;

Could be switched to unknown size arrays like optprobe_template_entry[]
but it might be best to just mark the kprobe code with #define
__NO_FORTIFY.


Re: [PATCH] replace incorrect strscpy use in FORTIFY_SOURCE

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 16:51 -0700, Kees Cook wrote:
> On Fri, Jul 14, 2017 at 2:28 PM, Daniel Micay 
> wrote:
> > Using strscpy was wrong because FORTIFY_SOURCE is passing the maximum
> > possible size of the outermost object, but strscpy defines the count
> > parameter as the exact buffer size, so this could copy past the end of
> > the source. This would still be wrong with the planned usage of
> > __builtin_object_size(p, 1) for intra-object overflow checks since it's
> > the maximum possible size of the specified object with no guarantee of
> > it being that large.
> > 
> > Reuse of the fortified functions like this currently makes the runtime
> > error reporting less precise but that can be improved later on.
> > 
> > Signed-off-by: Daniel Micay 
> 
> Thanks for fixing this! Linus, do you want to take this directly or
> have it go via -mm where fortify landed originally?
> 
> Acked-by: Kees Cook 
> 
> As far as testing goes, was the NFS tree not in -next, or was a test
> not running against -next? I'm curious why it took until the NFS tree
> landed in Linus's tree for this to get noticed. Fortify was in -next
> for a while...
> 
> -Kees

Not sure but one issue is that v1 wasn't broken and that's what I most
heavily tested myself. I then switched to testing with intra-object size
checks on top of it, and there would have needed to be a case like this
to break with those stricter checks:

char a[2];
char b[3];
char *p = cond ? a : b;
strcpy(p, c);

__builtin_object_size(p, 0) / __builtin_object_size(p, 1) will both
return 3 there, which is wrong with that incorrect strscpy usage.

I wouldn't have found it via NFS since I've been testing with the string
(but not memory) functions switched to the stricter type 1.

I started on some test cases for FORTIFY_SOURCE but I was testing to
make sure it catches all the bugs it's supposed to catch, it's probably
a good idea to write some more generic string edge case tests too.

One of the somewhat subtle cases (which is handled properly already):

struct foo {
	char a[2];
	char b;
};

struct foo x;
x.a[0] = '1';
x.a[1] = '2';
x.b = '\0';

strlen(x.a); // BUG with stricter intra-object overflow checks on
strnlen(x.a, 3); // BUG with stricter intra-object overflow checks on
strnlen(x.a, 2); // no overflow

Anyway, it'd be really good if other people looked closely at these. I
wasn't sure what to do with test cases that I've made though. It seems
ones that are *supposed* to BUG should go in lkdtm, and should the other
tests just be together with those? Some should also pass w/o the intra
object overflow checks, but BUG with them enabled.
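
For example, a sketch of a test that's *supposed* to BUG, borrowing the
lkdtm naming convention (the probe name is made up):

static noinline void lkdtm_FORTIFY_SOURCE_READ(void)
{
	char src[5] = "1234";
	char dst[16];
	volatile size_t size = sizeof(dst);	/* defeat constant folding */

	/* q_size is 5 but 16 bytes are requested, so the runtime check
	 * in the fortified memcpy() should hit fortify_panic() */
	memcpy(dst, src, size);
}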


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).

Agree, it's very important for this code to be correct and the string
functions have some subtleties so it needs scrutiny. I messed up strcpy
between v1 and v2 trying to add a proper read overflow check. My fault
for not looking more closely at strscpy before adopting it based on my
misinterpretation of the API.

This is primarily a bug finding feature right now and it has gotten a
few fixed that actually matter (most were unimportant memcpy read past
end of string constant but not all). I don't think it has another bug
like this strscpy misuse itself, but there will need to be some more
fixes for minor read overflows, etc. elsewhere in the tree before it'll
actually make sense as a hardening feature because it can turn a benign
read overflow into a DoS via BUG(). I think it will be fine for 4.13,
but I definitely wouldn't propose 'default y' for a while, even if there
was no performance cost (and there is).

Fix for this issue is here in case anyone looks only at this thread
(realized I should have passed send-email a reply id):

http://marc.info/?l=linux-fsdevel&m=150006772418003&w=2


[PATCH] replace incorrect strscpy use in FORTIFY_SOURCE

2017-07-14 Thread Daniel Micay
Using strscpy was wrong because FORTIFY_SOURCE is passing the maximum
possible size of the outermost object, but strscpy defines the count
parameter as the exact buffer size, so this could copy past the end of
the source. This would still be wrong with the planned usage of
__builtin_object_size(p, 1) for intra-object overflow checks since it's
the maximum possible size of the specified object with no guarantee of
it being that large.

Reuse of the fortified functions like this currently makes the runtime
error reporting less precise but that can be improved later on.

Signed-off-by: Daniel Micay 
---
 include/linux/string.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index 96f5a5fd0377..049866760e8b 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -202,17 +202,6 @@ void __read_overflow2(void) __compiletime_error("detected read beyond size of object passed as 2nd parameter");
 void __write_overflow(void) __compiletime_error("detected write beyond size of object passed as 1st parameter");
 
 #if !defined(__NO_FORTIFY) && defined(__OPTIMIZE__) && defined(CONFIG_FORTIFY_SOURCE)
-__FORTIFY_INLINE char *strcpy(char *p, const char *q)
-{
-   size_t p_size = __builtin_object_size(p, 0);
-   size_t q_size = __builtin_object_size(q, 0);
-   if (p_size == (size_t)-1 && q_size == (size_t)-1)
-   return __builtin_strcpy(p, q);
-   if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)
-   fortify_panic(__func__);
-   return p;
-}
-
 __FORTIFY_INLINE char *strncpy(char *p, const char *q, __kernel_size_t size)
 {
size_t p_size = __builtin_object_size(p, 0);
@@ -391,6 +380,18 @@ __FORTIFY_INLINE void *kmemdup(const void *p, size_t size, gfp_t gfp)
fortify_panic(__func__);
return __real_kmemdup(p, size, gfp);
 }
+
+/* defined after fortified strlen and memcpy to reuse them */
+__FORTIFY_INLINE char *strcpy(char *p, const char *q)
+{
+   size_t p_size = __builtin_object_size(p, 0);
+   size_t q_size = __builtin_object_size(q, 0);
+   if (p_size == (size_t)-1 && q_size == (size_t)-1)
+   return __builtin_strcpy(p, q);
+   memcpy(p, q, strlen(q) + 1);
+   return p;
+}
+
 #endif
 
 #endif /* _LINUX_STRING_H_ */
-- 
2.13.3



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> The reason q_size isn't used is because it doesn't yet prevent read
> overflow. The commit message mentions that among the current
> limitations
> along with __builtin_object_size(ptr, 1).

Er rather, in strlcat, q_size is unused after the fast path because
strnlen obtains the constant again itself.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 13:50 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay 
> wrote:
> > 
> > If strscpy treats the count parameter as a *guarantee* of the dest size
> > rather than a limit,
> 
> No, it's a *limit*.
> 
> And by a *limit*, I mean that we know that we can access both source
> and destination within that limit.

FORTIFY_SOURCE needs to be able to pass a limit without implying that
there's a minimum. That's the distinction I was trying to make. It's
wrong to use anything where it's interpreted as a minimum here. Using
__builtin_object_size(ptr, 0) vs. __builtin_object_size(ptr, 1) doesn't
avoid the problem. __builtin_object_size(ptr, 1) returns a maximum among
the possible buffer sizes just like 0. It's just stricter, i.e. catches
intra-object overflow, which isn't desirable for the first take since it
will cause compatibility issues. There's code using memcpy,
copy_to_user, etc. to read / write multiple fields with a pointer to the
first one passed as the source / destination.
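
A made-up example of that pattern:

struct input_report {
	u16 x;
	u16 y;
	u16 buttons;
};

static unsigned long send_report(const struct input_report *rep,
				 void __user *buf)
{
	/* intentionally copies x, y and buttons in one call;
	 * __builtin_object_size(&rep->x, 1) == 2 would flag this,
	 * while type 0 sees the whole struct and stays quiet */
	return copy_to_user(buf, &rep->x, 3 * sizeof(u16));
}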

> > My initial patch used strlcpy there, because I wasn't aware of strscpy
> > before it was suggested:
> 
> Since I'm looking at this, I note that the "strlcpy()" code is
> complete garbage too, and has that same
> 
>  p_size == (size_t)-1 && q_size == (size_t)-1
> 
> check which is wrong.  Of course, in strlcpy, q_size is never actually
> *used*, so the whole check seems bogus.

That check is only an optimization. __builtin_object_size always returns
a constant, and it's (size_t)-1 when no limit could be found.

The reason q_size isn't used is because it doesn't yet prevent read
overflow. The commit message mentions that among the current limitations
along with __builtin_object_size(ptr, 1).

> But no, strlcpy() is complete garbage, and should never be used. It is
> truly a shit interface, and anybody who uses it is by definition
> buggy.
> 
> Why? Because the return value of "strlcpy()" is defined to be ignoring
> the limit, so you FUNDAMENTALLY must not use that thing on untrusted
> source strings.
> 
> But since the whole *point* of people using it is for untrusted
> sources, it by definition is garbage.
> 
> Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
> a reason we defined "strscpy()" as the way to do safe copies
> (strncpy(), of course, is broken for both lack of NUL termination
> _and_ for excessive NUL termination when a NUL did exist).

Sure, it doesn't prevent read overflow (but it's not worse than strcpy,
which is the purpose) which is why I said this:

"The fortified string functions should place a limit on reads from the
source. For strcat/strcpy, these could just be a variant of strlcat /
strlcpy using the size limit as a bound on both the source and
destination, with the return value only reporting whether truncation
occurred rather than providing the source length. It would be an easier
API for developers to use too and not that it really matters but it
would be more efficient for the case where truncation is intended."

That's why strscpy was suggested and I switched to that + updated the
commit message to only mention strcat, but it's wrong to use it here
because __builtin_object_size(p, 0) / __builtin_object_size(p, 1) are
only a guaranteed maximum length with no minimum guarantee.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:
> 
> http://www.openwall.com/lists/kernel-hardening/2017/05/04/11
> 
> I was wrong to move it to strscpy. It could be switched back to strlcpy
> again unless the kernel considers the count parameter to be a guarantee
> that could be leveraged in the future. Using the fortified strlen +
> memcpy would provide the improvement that strscpy was meant to provide
> there over strlcpy.

Similarly, the FORTIFY_SOURCE strcat uses strlcat with the assumption
that the count parameter is a limit, not a guarantee to exploit for
optimization. There's only a C implementation and it currently doesn't
treat it as one, but if it's meant to be a guarantee then the strcat
needs to be changed too.

I'll make a fix moving away both from the existing functions.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 12:58 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
> > 
> > > yet when I look at the generated code for __ip_map_lookup, I see
> > > 
> > > >	movl	$32, %edx	#,
> > > >	movq	%r13, %rsi	# class,
> > > >	leaq	48(%rax), %rdi	#, tmp126
> > > >	call	strscpy	#
> > > > 
> > > > what's the bug here? Look at that third argument - %rdx. It is
> > > initialized to 32.
> > 
> > It's not a compiler bug, it's a bug in our strcpy().
> > Whoever wrote this strcpy() into strscpy() code apparently didn't
> > read the gcc manual about __builtin_object_size() carefully enough.
> > 
> > Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
> > 
> > __builtin_object_size(ptr, type) returns a constant number
> > of bytes from 'ptr' to the end of the object 'ptr'
> > pointer points to. "type" is an integer constant from 0 to
> > 3. If the least significant bit is clear, objects
> > are whole variables, if it is set, a closest surrounding
> > subobject is considered the object a pointer points to.
> > The second bit determines if maximum or minimum of remaining
> > bytes is computed.
> > 
> > We have type = 0 in strcpy(), so the least significant bit is clear.
> > So the 'ptr' is considered as a pointer to the whole
> > variable i.e. pointer to struct ip_map ip;
> > And the number of bytes from 'ip.m_class' to the end of the ip
> > object is exactly 32.
> > 
> > I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the strring, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 
> There is something actively *evil* about it. Daniel, Kees, please jump
> on this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus

The issue is the usage of strscpy then, not the __builtin_object_size
type parameter. The type is set to 0 rather than 1 to be more lenient by
not detecting intra-object overflow, which is going to come later.

If strscpy treats the count parameter as a *guarantee* of the dest size
rather than a limit, it's wrong to use it there, whether or not the type
parameter for __builtin_object_size is 0 or 1 since it can still return
a larger size. It's a limit with no guaranteed minimum.

My initial patch used strlcpy there, because I wasn't aware of strscpy
before it was suggested:

http://www.openwall.com/lists/kernel-hardening/2017/05/04/11

I was wrong to move it to strscpy. It could be switched back to strlcpy
again unless the kernel considers the count parameter to be a guarantee
that could be leveraged in the future. Using the fortified strlen +
memcpy would provide the improvement that strscpy was meant to provide
there over strlcpy.


Re: [kernel-hardening] Re: [PATCH v3] mm: Add SLUB free list pointer obfuscation

2017-07-06 Thread Daniel Micay
On Thu, 2017-07-06 at 10:55 -0500, Christoph Lameter wrote:
> On Thu, 6 Jul 2017, Kees Cook wrote:
> 
> > On Thu, Jul 6, 2017 at 6:43 AM, Christoph Lameter 
> > wrote:
> > > On Wed, 5 Jul 2017, Kees Cook wrote:
> > > 
> > > > @@ -3536,6 +3565,9 @@ static int kmem_cache_open(struct
> > > > kmem_cache *s, unsigned long flags)
> > > >  {
> > > >   s->flags = kmem_cache_flags(s->size, flags, s->name, s-
> > > > >ctor);
> > > >   s->reserved = 0;
> > > > +#ifdef CONFIG_SLAB_FREELIST_HARDENED
> > > > + s->random = get_random_long();
> > > > +#endif
> > > > 
> > > >   if (need_reserve_slab_rcu && (s->flags &
> > > > SLAB_TYPESAFE_BY_RCU))
> > > >   s->reserved = sizeof(struct rcu_head);
> > > > 
> > > 
> > > So if an attacker knows the internal structure of data then he can
> > > simply
> > > dereference page->kmem_cache->random to decode the freepointer.
> > 
> > That requires a series of arbitrary reads. This is protecting
> > against
> > attacks that use an adjacent slab object write overflow to write the
> > freelist pointer. This internal structure is very reliable, and has
> > been the basis of freelist attacks against the kernel for a decade.
> 
> These reads are not arbitrary. You can usually calculate the page struct
> address easily from the address and then do a couple of loads to get
> there.

You're describing an arbitrary read vulnerability: an attacker able to
read the value of an address of their choosing. Requiring a powerful
additional primitive rather than only a small fixed size overflow or a
weak use-after-free vulnerability to use a common exploit vector is
useful.
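
For reference, a minimal sketch of the obfuscation under discussion (the
version that eventually landed also mixes in the storage address):

static inline void *freelist_ptr(const struct kmem_cache *s, void *ptr)
{
#ifdef CONFIG_SLAB_FREELIST_HARDENED
	/* overwriting the stored freelist pointer without knowing
	 * s->random now yields a wild pointer, not a chosen one */
	return (void *)((unsigned long)ptr ^ s->random);
#else
	return ptr;
#endif
}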

A deterministic mitigation would be better, but I don't think an extra
slab allocator for hardened kernels would be welcomed. Since there isn't
a separate allocator for that niche, SLAB or SLUB are used. The ideal
would be bitmaps in `struct page` but that implies another allocator,
using single pages for the smallest size classes and potentially needing
to bloat `struct page` even with that.

There's definitely a limit to the hardening that can be done for SLUB,
but unless forking it into a different allocator is welcome that's what
will be suggested. Similarly, the slab freelist randomization feature is
a much weaker mitigation than it could be without these constraints
placed on it. This is much lower complexity than that and higher value
though...

> Ok so you get rid of the old attacks because we did not have that
> hardening in effect when they designed their approaches?
> 
> > It is a probabilistic defense, but then so is the stack protector.
> > This is a similar defense; while not perfect it makes the class of
> > attack much more difficult to mount.
> 
> Na I am not convinced of the "much more difficult". Maybe they will just
> have to upgrade their approaches to fetch the proper values to decode.

To fetch the values they would need an arbitrary read vulnerability or
the ability to dump them via uninitialized slab allocations as an extra
requirement.

An attacker can similarly bypass the stack canary by reading them from
stack frames via a stack buffer read overflow or uninitialized variable
usage leaking stack data. On non-x86, at least with SMP, the stack
canary is just a global variable that remains the same after
initialization too. That doesn't make it useless, although the kernel
doesn't have many linear overflows on the stack which is the real issue
with it as a mitigation. Despite that, most people are using kernels
with stack canaries, and that has a significant performance cost unlike
these kinds of changes.


Re: [PATCH v2] kref: Avoid null pointer dereference after WARN

2017-06-27 Thread Daniel Micay
On Tue, 2017-06-27 at 12:34 -0700, Kees Cook wrote:
> On Tue, Jun 27, 2017 at 12:26 PM, Jason A. Donenfeld 
> wrote:
> > On Tue, Jun 27, 2017 at 9:22 PM, Andi Kleen 
> > wrote:
> > > Who would actually set mman_min_addr incorrectly?
> > 
> > Historically there have been quite a few bypasses of mmap_min_addr,
> > actually. This is well-trodden ground.
> 
> Targeting things in /proc/sys via confused privileged helpers is
> extremely common. See Chrome OS pwn2own exploits (targetting modprobe
> sysctl), and plenty of others. Modern attack methodology is rarely a
> single-bug attack, but rather a chain of bugs, which may include
> producing or exploiting weak userspace configurations to soften the
> kernel.
> 
> Regardless, it's a fair point that checking this unconditionally is
> wasteful. Strangely this doesn't help:
> 
> -   BUG_ON(release == NULL);
> +   if (!__builtin_constant_p(release))
> +          BUG_ON(release == NULL);
> 
> When nearly all callers pass a function directly:
> 
> ...
> drivers/block/rbd.c:	kref_put(&spec->kref, rbd_spec_free);
> drivers/char/hw_random/core.c:	kref_put(&rng->ref, cleanup_rng);
> drivers/char/ipmi/ipmi_msghandler.c:	kref_put(&e->intf->refcount, intf_free);
> ...
> 
> Hmmm
> 
> -Kees

It doesn't mean the address is constant if there's a fixed function
being passed to it. It's not known at compile-time and if the code can
be relocated it's not known at link-time.

I don't personally care about checks like this but I split it out with
some others just because it was there already.

Clang has a nullability attribute which is similar to nonnull but it
doesn't cause UB when violated, so if GCC picked that up it could be
added all over the place as an annotation on parameters to trigger
warnings. There's a sanitizer for it, so it can be made to trap with
-fsanitize=nullability -fsanitize-trap=nullability.
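
As an illustration, a hypothetical annotated prototype (not something
the kernel does today):

struct kref;

/* unlike __attribute__((nonnull)), passing NULL here isn't undefined
 * behavior, so the compiler can't delete the check and the sanitizer
 * can reliably trap on it */
void kref_put(struct kref *kref,
	      void (* _Nonnull release)(struct kref *kref));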


Re: [PATCH v2] binfmt_elf: Use ELF_ET_DYN_BASE only for PIE

2017-06-27 Thread Daniel Micay
On Tue, 2017-06-27 at 16:49 +0200, Michal Hocko wrote:
> On Wed 21-06-17 10:32:01, Kees Cook wrote:
> > The ELF_ET_DYN_BASE position was originally intended to keep loaders
> > away from ET_EXEC binaries. (For example, running "/lib/ld-
> > linux.so.2
> > /bin/cat" might cause the subsequent load of /bin/cat into where the
> > loader had been loaded.) With the advent of PIE (ET_DYN binaries
> > with
> > an INTERP Program Header), ELF_ET_DYN_BASE continued to be used
> > since
> > the kernel was only looking at ET_DYN. However, since
> > ELF_ET_DYN_BASE
> > is traditionally set at the top 1/3rd of the TASK_SIZE, a
> > substantial
> > portion of the address space is unused.
> > 
> > For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs
> > are loaded below the mmap region. This means they can be made to
> > collide
> > (CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with
> > pathological
> > stack regions. Lowering ELF_ET_DYN_BASE solves both by moving
> > programs
> > above the mmap region in all cases, and will now additionally avoid
> > programs falling back to the mmap region by enforcing MAP_FIXED for
> > program loads (i.e. if it would have collided with the stack, now it
> > will fail to load instead of falling back to the mmap region).
> 
> I do not understand this part. MAP_FIXED will simply unmap whatever
> was under the requested range, how it could help failing anything? So
> what would happen if something was mapped in that region, or is this
> impossible? Moreover MAP_FIXED close to stack will inhibit the stack
> gap
> protection.

I don't think there's a reason to use MAP_FIXED. PaX likely ignores the
address hint with RANDMMAP in that code, which would explain it there.


Re: [kernel-hardening] non-x86 per-task stack canaries

2017-06-26 Thread Daniel Micay
On Mon, 2017-06-26 at 14:04 -0700, Kees Cook wrote:
> Hi,
> 
> The stack protector functionality on x86_64 uses %gs:0x28 (%gs is the
> percpu area) for __stack_chk_guard, and all other architectures use a
> global variable instead. This means we never change the stack canary
> on non-x86 architectures which allows for a leak in one task to expose
> the canary in another task.
> 
> I'm curious what thoughts people may have about how to get this
> correctly implemented. Teaching the compiler about per-cpu data sounds
> exciting. :)
> 
> -Kees

arm64 has many integer registers so I don't think reserving one would
hurt performance, especially in the kernel where hot numeric loops
barely exist. It would reduce the cost of SSP by getting rid of the
memory read for the canary value. On the other hand, using per-cpu data
would likely be higher cost than the global. x86 has segment registers
but most archs probably need to do something more painful.
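
Sketched in C, the per-function cost on a global-guard architecture
looks roughly like this, and a reserved callee-saved register would turn
both guard loads into register reads:

extern unsigned long __stack_chk_guard;	/* plain global on arm64 today */
extern void __stack_chk_fail(void);

void protected_function(void)
{
	unsigned long canary = __stack_chk_guard;	/* load on entry */
	char buf[64];

	buf[0] = '\0';
	/* ... code that could overflow buf ... */

	if (canary != __stack_chk_guard)	/* reload + compare on exit */
		__stack_chk_fail();
}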

It's safe as long as it's a callee-saved register. It should be enforced
that there's no assembly spilling it and calling into C code without the
random canary. There's very little assembly using registers like x28 so
it wouldn't be that bad. It's possible there's one where nothing needs
to be changed, there only needs to be a check to make sure it stays that
way.

It would be a step towards making SSP cheap enough to expand it into a
feature like the StackGuard XOR canaries.

Samsung has a return address XOR feature based on reserving a register,
and while RAP's probabilistic return address mitigation isn't open-
source, it was stated that it reserves a register on x86_64, where
registers aren't as plentiful as on arm64.


Re: [kernel-hardening] [PATCH] [RFC] binfmt_elf: Use ELF_ET_DYN_BASE only for PIE

2017-06-21 Thread Daniel Micay
On Wed, 2017-06-21 at 10:28 -0700, Kees Cook wrote:
> On Wed, Jun 21, 2017 at 10:27 AM, Daniel Micay 
> wrote:
> > On Wed, 2017-06-21 at 10:23 -0700, Kees Cook wrote:
> > > On Wed, Jun 21, 2017 at 5:07 AM, Rik van Riel 
> > > wrote:
> > > > On Tue, 2017-06-20 at 22:58 -0700, Kees Cook wrote:
> > > > > +/* This is the base location for PIE (ET_DYN with INTERP)
> > > > > loads.
> > > > > */
> > > > > +#define ELF_ET_DYN_BASE  0x400000UL
> > > > 
> > > > This value is good for 32 bit binaries, but for 64
> > > > bit binaries you probably want to put ELF_ET_DYN_BASE
> > > > at 4GB or higher.
> > > > 
> > > > The latter is necessary because Android uses the
> > > > lower 4GB of address space for its JVM runtime,
> > > > with 32 bit pointers inside that part of the otherwise
> > > > 64 bit address space.
> > > > 
> > > > In other words:
> > > > 
> > > > #define ELF_ET_DYN_BASE (mmap_is_ia32() ? 0x400000UL :
> > > > 0x100000000UL)
> > > 
> > > Ah, interesting. Okay, that should be fine. I'll adjust it.
> > > 
> > > > > +++ b/fs/binfmt_elf.c
> > > > > 
> > > > > +		 * Therefore, programs are loaded offset from
> > > > > +		 * ELF_ET_DYN_BASE and loaders are loaded into the
> > > > > +		 * independently randomized mmap region (0 load_bias
> > > > > +		 * without MAP_FIXED).
> > > > > +		 */
> > > > > +		if (elf_interpreter) {
> > > > > +			load_bias = ELF_ET_DYN_BASE;
> > > > > +			if (current->flags & PF_RANDOMIZE)
> > > > > +				load_bias += arch_mmap_rnd();
> > > > > +			elf_flags |= MAP_FIXED;
> > > > > +		} else
> > > > > +			load_bias = 0;
> > > > > +
> > > > > +		load_bias -= vaddr;
> > > > 
> > > > I like this, and the big comment telling people how it
> > > > works :)
> > > 
> > > Thanks! It looks like your patch for commenting load_bias never
> > > got
> > > picked up, so I've added some more comments for that and some
> > > other
> > > things too. (Mostly for all the stuff I have to read a few times
> > > when
> > > I look at this code.)
> > > 
> > > -Kees
> > > 
> > 
> > The stack rlimit calculation fix for space potentially lost to ASLR
> > is
> > probably still needed too, right?
> 
> Yes. Was that picked up by akpm already?
> 
> -Kees

I think it was dropped when the ET_DYN changes were dropped.


Re: [kernel-hardening] [PATCH] [RFC] binfmt_elf: Use ELF_ET_DYN_BASE only for PIE

2017-06-21 Thread Daniel Micay
On Wed, 2017-06-21 at 10:23 -0700, Kees Cook wrote:
> On Wed, Jun 21, 2017 at 5:07 AM, Rik van Riel  wrote:
> > On Tue, 2017-06-20 at 22:58 -0700, Kees Cook wrote:
> > > +/* This is the base location for PIE (ET_DYN with INTERP) loads.
> > > */
> > > +#define ELF_ET_DYN_BASE  0x400000UL
> > 
> > This value is good for 32 bit binaries, but for 64
> > bit binaries you probably want to put ELF_ET_DYN_BASE
> > at 4GB or higher.
> > 
> > The latter is necessary because Android uses the
> > lower 4GB of address space for its JVM runtime,
> > with 32 bit pointers inside that part of the otherwise
> > 64 bit address space.
> > 
> > In other words:
> > 
> > #define ELF_ET_DYN_BASE (mmap_is_ia32() ? 0x400000UL :
> > 0x100000000UL)
> 
> Ah, interesting. Okay, that should be fine. I'll adjust it.
> 
> > > +++ b/fs/binfmt_elf.c
> > > 
> > > +		 * Therefore, programs are loaded offset from
> > > +		 * ELF_ET_DYN_BASE and loaders are loaded into the
> > > +		 * independently randomized mmap region (0 load_bias
> > > +		 * without MAP_FIXED).
> > > +		 */
> > > +		if (elf_interpreter) {
> > > +			load_bias = ELF_ET_DYN_BASE;
> > > +			if (current->flags & PF_RANDOMIZE)
> > > +				load_bias += arch_mmap_rnd();
> > > +			elf_flags |= MAP_FIXED;
> > > +		} else
> > > +			load_bias = 0;
> > > +
> > > +		load_bias -= vaddr;
> > 
> > I like this, and the big comment telling people how it
> > works :)
> 
> Thanks! It looks like your patch for commenting load_bias never got
> picked up, so I've added some more comments for that and some other
> things too. (Mostly for all the stuff I have to read a few times when
> I look at this code.)
> 
> -Kees
> 

The stack rlimit calculation fix for space potentially lost to ASLR is
probably still needed too, right?


Re: [kernel-hardening] [PATCH 23/23] mm: Allow slab_nomerge to be set at build time

2017-06-19 Thread Daniel Micay
On Mon, 2017-06-19 at 16:36 -0700, Kees Cook wrote:
> Some hardened environments want to build kernels with slab_nomerge
> already set (so that they do not depend on remembering to set the
> kernel
> command line option). This is desired to reduce the risk of kernel
> heap
> overflows being able to overwrite objects from merged caches,
> increasing
> the difficulty of these attacks. By keeping caches unmerged, these
> kinds
> of exploits can usually only damage objects in the same cache (though
> the
> risk to metadata exploitation is unchanged).

It also further fragments the ability to influence slab cache layout,
i.e. primitives to do things like filling up slabs to set things up for
an exploit might not be able to deal with the target slabs anymore. It
doesn't need to be mentioned but it's something to think about too. In
theory, disabling merging can make it *easier* to get the right layout
too if there was some annoyance that's now split away. It's definitely a
lot more good than bad for security though, but allocator changes have
subtle impact on exploitation. This can make caches more deterministic.


[PATCH v5] add the option of fortified string.h functions

2017-06-18 Thread Daniel Micay
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.

GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and allow
full use of the regular string intrinsics while avoiding the need for a
bunch of _chk functions and per-arch assembly to avoid wrapper overhead.

This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.
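
For reference, the fortified memcpy() shows the shape of the inline
checks (a condensed sketch consistent with the description here; the
archive truncates the include/linux/string.h hunk below):

__FORTIFY_INLINE void *memcpy(void *p, const void *q, __kernel_size_t size)
{
	size_t p_size = __builtin_object_size(p, 0);
	size_t q_size = __builtin_object_size(q, 0);
	if (__builtin_constant_p(size)) {
		if (p_size < size)
			__write_overflow();	/* compile-time error */
		if (q_size < size)
			__read_overflow2();	/* compile-time error */
	}
	if (p_size < size || q_size < size)
		fortify_panic(__func__);	/* runtime check */
	return __builtin_memcpy(p, q, size);
}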

Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:

* Some of the fortified string functions (strncpy, strcat), don't yet
  place a limit on reads from the source based on __builtin_object_size of
  the source buffer.

* Extending coverage to more string functions like strlcat.

* It should be possible to optionally use __builtin_object_size(x, 1) for
  some functions (C strings) to detect intra-object overflows (like
  glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
  approach to avoid likely compatibility issues.

* The compile-time checks should be made available via a separate config
  option which can be enabled by default (or always enabled) once enough
  time has passed to get the issues it catches fixed.

Signed-off-by: Daniel Micay 
---
Changes since v4:
- avoid overly aggressive strnlen check for non-null-terminated strings

 arch/arm64/include/asm/string.h  |   5 +
 arch/x86/boot/compressed/misc.c  |   5 +
 arch/x86/include/asm/string_32.h |   9 ++
 arch/x86/include/asm/string_64.h |   7 ++
 include/linux/string.h   | 200 +++
 lib/string.c |   6 ++
 security/Kconfig |   6 ++
 7 files changed, 238 insertions(+)

diff --git a/arch/arm64/include/asm/string.h b/arch/arm64/include/asm/string.h
index 2eb714c4639f..d0aa42907569 100644
--- a/arch/arm64/include/asm/string.h
+++ b/arch/arm64/include/asm/string.h
@@ -63,6 +63,11 @@ extern int memcmp(const void *, const void *, size_t);
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+
 #endif
 
 #endif
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b3c5a5f030ce..43691238a21d 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -409,3 +409,8 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
debug_putstr("done.\nBooting the kernel.\n");
return output;
 }
+
+void fortify_panic(const char *name)
+{
+   error("detected buffer overflow");
+}
diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e8353ee5c..e9ee84873de5 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -142,7 +142,9 @@ static __always_inline void *__constant_memcpy(void *to, const void *from,
 }
 
 #define __HAVE_ARCH_MEMCPY
+extern void *memcpy(void *, const void *, size_t);
 
+#ifndef CONFIG_FORTIFY_SOURCE
 #ifdef CONFIG_X86_USE_3DNOW
 
 #include 
@@ -195,11 +197,15 @@ static inline void *__memcpy3d(void *to, const void *from, size_t len)
 #endif
 
 #endif
+#endif /* !CONFIG_FORTIFY_SOURCE */
 
 #define __HAVE_ARCH_MEMMOVE
 void *memmove(void *dest, const void *src, size_t n);
 
+extern int memcmp(const void *, const void *, size_t);
+#ifndef CONFIG_FORTIFY_SOURCE
 #define memcmp __builtin_memcmp
+#endif
 
 #define __HAVE_ARCH_MEMCHR
 extern void *memchr(const void *cs, int c, size_t count);
@@ -321,6 +327,8 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 : __memset_generic((s), (c), (count)))
 
 #define __HAVE_ARCH_MEMSET
+extern void *memset(void *, int, size_t);
+#ifndef CONFIG_FORTIFY_SOURCE
 #if (__GNUC__ >= 4)
 #define memset(s, c, count) __builtin_memset(s, c, count)
 #else
@@ -330,6 +338,7 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 (count))   \
 : __memset((s), (c), (count)))
 #endif
+#endif /* !CONFIG_FORTIFY_SOURCE */
 
 /*
  * find the first occurrence of byte 'c', or 1 past the area if none
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 733bae07fb29..309fe644569f 100644
--- a/a

Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
> https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
> 4.13/scsi-queue and for-next branches. I think that's why Kees didn't
> include it but I get he needs to add that.

s/get/guess/

Or is that repo supposed to get pulled into next?


Re: linux-next: build failure after merge of the kspp tree

2017-06-15 Thread Daniel Micay
On Fri, 2017-06-16 at 11:30 +1000, Stephen Rothwell wrote:
> Hi Kees,
> 
> After merging the kspp tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> In file included from include/linux/bitmap.h:8:0,
>  from include/linux/cpumask.h:11,
>  from arch/x86/include/asm/cpumask.h:4,
>  from arch/x86/include/asm/msr.h:10,
>  from arch/x86/include/asm/processor.h:20,
>  from arch/x86/include/asm/cpufeature.h:4,
>  from arch/x86/include/asm/thread_info.h:52,
>  from include/linux/thread_info.h:37,
>  from arch/x86/include/asm/preempt.h:6,
>  from include/linux/preempt.h:80,
>  from include/linux/spinlock.h:50,
>  from include/linux/mmzone.h:7,
>  from include/linux/gfp.h:5,
>  from include/linux/slab.h:14,
>  from drivers/scsi/csiostor/csio_lnode.c:37:
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_dprt_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:471:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> In function 'memcpy',
> inlined from 'csio_append_attrib' at
> drivers/scsi/csiostor/csio_lnode.c:248:2,
> inlined from 'csio_ln_fdmi_rhba_cbfn' at
> drivers/scsi/csiostor/csio_lnode.c:337:2:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> 
> Caused by commit
> 
>   b90d6eba50d7 ("include/linux/string.h: add the option of fortified
> string.h functions")
> 
> I have reverted that commit for today.

That's this one: https://lkml.org/lkml/2017/5/9/613, which is in
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/ in the
4.13/scsi-queue and for-next branches. I think that's why Kees didn't
include it but I get he needs to add that.


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-15 Thread Daniel Micay
On Thu, 2017-06-15 at 16:46 -0700, Kees Cook wrote:
> On Thu, Jun 15, 2017 at 12:12 PM, Andrew Morton
>  wrote:
> > On Wed, 14 Jun 2017 18:56:30 -0700 Kees Cook 
> > wrote:
> > 
> > > > > Caused by commit
> > > > > 
> > > > >   088a5ecf7581 ("include/linux/string.h: add the option of
> > > > > fortified string.h functions")
> > > > > 
> > > > > We really need to fix all the known problems it detects
> > > > > *before*
> > > > > merging this commit ...
> > > > > 
> > > > > I have reverted it for today.
> > > > 
> > > > I am still needing to revert this every day ...
> > > 
> > > I sent a series for -mm (or maintainers) to merge that should
> > > catch
> > > everything. Do you want me to carry it in my kspp tree instead?
> > > (My
> > > original intention was to carry all the fixes and the fortify
> > > patch in
> > > kspp but akpm took it into -mm somewhat unexpectedly, not that I'm
> > > complaining.)
> > 
> > This is all getting a bit foggy in my mind.  Can we please have a
> > full
> > resend of everything?  Sufficient to hopefully produce a tree which
> > has
> > no build-time or run-time regressions?  Including the buildbot's
> > recently-reported alpha and xtensa issues?
> 
> It's been sent a few times (and a few fixes have been collected in
> other trees already). What I've got in my for-next/kspp tree right now
> is all the fixes that haven't already been picked up by other tree
> maintainers, and I added the fortify patch itself to the end of the
> tree too now since sfr asked for that a few hours ago.
> 
> Merged with latest -next, this passes x86_64, i386, arm64, and powerpc
> allmodconfig builds for me. It doesn't pass arm, though. Perhaps we
> need to add an ARCH_HAS_FORTIFY_SOURCE to gate the all*config builds?
> 
> Should we let the dust settle first? I'm happy to do whatever makes
> the most sense, I'm just following what (I understand) sfr suggested
> most recently. :)
> 
> -Kees
> 

If it needs to build and boot on every architecture, I think we should
gate it on i386, x86_64, arm64 or powerpc where it has been tested.

I think I know what has to be fixed for alpha and xtensa but there might
be more problems. It's better to wait for someone willing / able to do
it properly by building it themselves and doing basic runtime testing.




Re: [PATCH] objtool: Add fortify_panic as __noreturn function

2017-06-14 Thread Daniel Micay
> So after that the errors (x86_64 allmodconfig build) are only:
> 
> In file included from include/linux/bitmap.h:8:0,
>  from include/linux/cpumask.h:11,
>  from arch/x86/include/asm/cpumask.h:4,
>  from arch/x86/include/asm/msr.h:10,
>  from arch/x86/include/asm/processor.h:20,
>  from arch/x86/include/asm/cpufeature.h:4,
>  from arch/x86/include/asm/thread_info.h:52,
>  from include/linux/thread_info.h:37,
>  from arch/x86/include/asm/preempt.h:6,
>  from include/linux/preempt.h:80,
>  from include/linux/spinlock.h:50,
>  from include/linux/mmzone.h:7,
>  from include/linux/gfp.h:5,
>  from arch/x86/power/hibernate_64.c:11:
> In function 'memcpy',
> inlined from 'relocate_restore_code' at
> arch/x86/power/hibernate_64.c:150:2,
> inlined from 'swsusp_arch_resume' at
> arch/x86/power/hibernate_64.c:186:8:
> include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> ^
> In file included from include/linux/bitmap.h:8:0,
>  from include/linux/cpumask.h:11,
>  from arch/x86/include/asm/cpumask.h:4,
>  from arch/x86/include/asm/msr.h:10,
>  from arch/x86/include/asm/processor.h:20,
>  from arch/x86/include/asm/cpufeature.h:4,
>  from arch/x86/include/asm/thread_info.h:52,
>  from include/linux/thread_info.h:37,
>  from arch/x86/include/asm/preempt.h:6,
>  from include/linux/preempt.h:80,
>  from include/linux/spinlock.h:50,
>  from include/linux/mmzone.h:7,
>  from include/linux/gfp.h:5,
>  from include/linux/mm.h:9,
>  from kernel/kexec_file.c:15:
> In function 'memcmp',
> inlined from 'kexec_load_purgatory' at kernel/kexec_file.c:900:6:
> include/linux/string.h:348:4: error: call to '__read_overflow'
> declared with attribute error: detected read beyond size of object
> passed as 1st parameter
> __read_overflow();
> ^
> 

Kees has the remaining ones here now:

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=for-next/kspp

Not entirely sure what happened to the powerpc bits though.


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-14 Thread Daniel Micay
On Wed, 2017-06-14 at 19:16 -0700, Kees Cook wrote:
On Wed, Jun 14, 2017 at 7:06 PM, Stephen Rothwell wrote:
> > Hi Kees,
> > 
> > On Wed, 14 Jun 2017 18:56:30 -0700 Kees Cook 
> > wrote:
> > > 
> > > I sent a series for -mm (or maintainers) to merge that should
> > > catch
> > > everything. Do you want me to carry it in my kspp tree instead?
> > > (My
> > > original intention was to carry all the fixes and the fortify
> > > patch in
> > > kspp but akpm took it into -mm somewhat unexpectedly, not that I'm
> > > complaining.)
> > 
> > Andrew is currently not responding to stuff (he warned us last
> > week),
> > so if you do put the series in your tree, I will remove it from my
> > copy
> > of Andrew's quilt series.  All the fixes necessary to make this
> > usable
> > should really be in the same tree if possible.
> > 
> > (I am assuming that the fix patches have been reviewed and acked by
> > appropriate people.)
> 
> Mostly, yes. But they're usually trivial. All have been sent out for
> review (some a few times).
> 
> Given that the fixes are already scattered between various trees and
> you merge -mm last in -next, I'll just carry the remaining fixes and
> you can leave akpm's fortify patch in -mm.
> 
> -Kees

I think compile-time fixes should be totally done for 32-bit x86,
x86_64, arm64 and powerpc but the patches just aren't all queued up in
-mm/-next.

The arch mailing list was pinged about this which is how the powerpc
folks got involved and fixed the issues there, including at least one
runtime one. Not sure where (if anywhere) those are queued up, but Kees
could pick those up too.

For other archs, these might end up being issues requiring #ifndef
CONFIG_FORTIFY_SOURCE wrapped around them as was done for 32-bit x86, or
just removal because it's obsolete with GCC from the past 6+ years:

arch/alpha/include/asm/string.h:21:#define memcpy __builtin_memcpy
arch/m68k/include/asm/string.h:63:#define memcmp(d, s, n) __builtin_memcmp(d, s, n)
arch/m68k/include/asm/string.h:67:#define memset(d, c, n) __builtin_memset(d, c, n)
arch/m68k/include/asm/string.h:71:#define memcpy(d, s, n) __builtin_memcpy(d, s, n)

I didn't do anything about those because I can't compile it or test it
right now, but maybe someone can be pinged about it. Might be a one or
two other archs with the same issue. An -mm build bot complained about
alpha (probably that) and xtensa (seems to define inline functions in
arch/xtensa/include/asm/string.h).


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-14 Thread Daniel Micay
On Wed, 2017-06-14 at 19:19 -0700, Kees Cook wrote:
> On Wed, Jun 14, 2017 at 7:12 PM, Daniel Micay 
> wrote:
> > On Thu, 2017-06-15 at 12:04 +1000, Stephen Rothwell wrote:
> > > Hi Daniel,
> > > 
> > > On Wed, 14 Jun 2017 21:58:42 -0400 Daniel Micay wrote:
> > > > 
> > > > They're false positive warnings. I think objtool has a list of
> > > > __noreturn functions and needs fortify_panic added there. It's just
> > > > a warning so it doesn't need to happen immediately.
> > > 
> > > There are *lots* of them, so it does need to be fixed (Linus will
> > > be
> > > very irritated if it hits his tree like that).  Create a patch to
> > > objtool and get the x86 guys to Ack it?
> > 
> > Okay, I'll send a patch. It turns out that it's very
> > straightforward.
> > 
> > c1fad9ef7ed14aad464972e6444e7a3bd5670f26 is an example of an earlier
> > one.
> 
> Oops, I just sent one too! :P I can use whatever; like you said, it's
> a trivial fix.
> 
> -Kees

Less work for me is better ;).


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-14 Thread Daniel Micay
On Thu, 2017-06-15 at 12:04 +1000, Stephen Rothwell wrote:
> Hi Daniel,
> 
> On Wed, 14 Jun 2017 21:58:42 -0400 Daniel Micay  > wrote:
> > 
> > They're false positive warnings. I think objtool has a list of
> > __noreturn functions and needs fortify_panic added there. It's just
> > a
> > warning so it doesn't need to happen immediately.
> 
> There are *lots* of them, so it does need to be fixed (Linus will be
> very irritated if it hits his tree like that).  Create a patch to
> objtool and get the x86 guys to Ack it?

Okay, I'll send a patch. It turns out that it's very straightforward.

c1fad9ef7ed14aad464972e6444e7a3bd5670f26 is an example of an earlier
one.


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-14 Thread Daniel Micay
On Wed, 2017-06-14 at 18:56 -0700, Kees Cook wrote:
> On Wed, Jun 14, 2017 at 6:35 PM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > On Mon, 5 Jun 2017 17:01:17 +1000 Stephen Rothwell wrote:
> > > 
> > > After merging the akpm-current tree, today's linux-next build
> > > (x86_64
> > > allmodconfig) failed like this:
> > > 
> > > sound/pcmcia/pdaudiocf/pdaudiocf.o: warning: objtool: .text:
> > > unexpected end of section
> > > arch/x86/ras/mce_amd_inj.o: warning: objtool: inj_readme_read()
> > > falls through to next function extcpu_fops_open()
> > > sound/sound_core.o: warning: objtool:
> > > register_sound_special_device() falls through to next function
> > > register_sound_special()
> 
> Are these related to the fortify patch? I wouldn't expect that...

They're false positive warnings. I think objtool has a list of
__noreturn functions and needs fortify_panic added there. It's just a
warning so it doesn't need to happen immediately.
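
Assuming it's still the hard-wired global_noreturns[] table in
tools/objtool/check.c, the fix is a one-line addition, roughly:

static const char * const global_noreturns[] = {
	"__stack_chk_fail",
	"panic",
	"do_exit",
	/* ... */
	"fortify_panic",	/* new entry */
};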


Re: [kernel-hardening] Re: [PATCH v3 04/13] crypto/rng: ensure that the RNG is ready before using

2017-06-07 Thread Daniel Micay
On Wed, 2017-06-07 at 18:26 +0100, Mark Rutland wrote:
> On Wed, Jun 07, 2017 at 01:00:25PM -0400, Daniel Micay wrote:
> > > On the better bootloaders, an initramfs segment can be loaded
> > > independently (and you can have as many as required), which makes
> > > an
> > > early_initramfs a more palatable vector to inject large amounts of
> > > entropy into the next boot than, say, modifying the kernel image
> > > directly at every boot/shutdown to stash entropy in there
> > > somewhere.
> 
> [...]
>  
> > I didn't really understand the device tree approach, as I mentioned a
> > few times before. Passing via the kernel cmdline is a lot simpler than
> > modifying the device tree in-memory, and persistent modification isn't
> > an option unless verified boot is missing anyway.
> 
> I might be missing something here, but the command line is inside of
> the
> device tree, at /chosen/bootargs, so modifying the kernel command line
> *is* modifying the device tree in-memory.
> 
> For arm64, we have a /chosen/kaslr-seed property that we hope
> FW/bootloaders fill in, and similar could be done for some initial
> entropy, provided appropriate HW/FW support.
> 
> Thanks,
> Mark.

I was assuming it was simpler since bootloaders are already setting it,
but it seems I'm wrong about that.


Re: [kernel-hardening] Re: [PATCH v3 04/13] crypto/rng: ensure that the RNG is ready before using

2017-06-07 Thread Daniel Micay
> On the better bootloaders, an initramfs segment can be loaded
> independently (and you can have as many as required), which makes an
> early_initramfs a more palatable vector to inject large amounts of
> entropy into the next boot than, say, modifying the kernel image
> directly at every boot/shutdown to stash entropy in there somewhere.

Modifying the kernel image on storage isn't compatible with verified
boot so it's not really a solution. The kernel, initrd and rest of the
OS are signed and verified on operating systems like Android, Android
Things, ChromeOS and many embedded devices, etc. putting some basic
effort into security. I didn't really understand the device tree
approach, as I mentioned a few times before. Passing via the kernel
cmdline is a lot simpler than modifying the device tree in-memory, and
persistent modification isn't an option unless verified boot is missing
anyway.


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-05 Thread Daniel Micay
> It also probably finds more architecture-specific issues and may need
> compatibility fixes for them. I could mark it as compatible with only
> arm64 and x86(_64) since they're what I've tested to build and work at
> runtime and the compile-time errors could be turned into warnings for
> now, if it's mandatory that FORTIFY_SOURCE=y doesn't find problems at
> compile-time anywhere. If it's a warning, it will still catch the
> issue
> at runtime like the rest where the size isn't a constant.

I'm already leaving out intra-object overflow checks and the alloc_size
attributes in this initial submission to make it easier to land so
scaling it back a bit more (errors -> warnings, gating on archs) isn't a
problem if it's needed to get started.


Re: linux-next: build failure after merge of the akpm-current tree

2017-06-05 Thread Daniel Micay
On Mon, 2017-06-05 at 17:01 +1000, Stephen Rothwell wrote:
> Hi Andrew,
> 
> After merging the akpm-current tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> sound/pcmcia/pdaudiocf/pdaudiocf.o: warning: objtool: .text:
> unexpected end of section
> arch/x86/ras/mce_amd_inj.o: warning: objtool: inj_readme_read() falls
> through to next function extcpu_fops_open()
> sound/sound_core.o: warning: objtool: register_sound_special_device()
> falls through to next function register_sound_special()
> 
> and many more like those.

It looks like that's complaining about the __noreturn fortify_panic. It
isn't a compile-time error, just a false positive warning which needs to
be fixed in objtool. I guess it needs another symbol added to a hard-
wired list of __noreturn functions which sounds like it exists.

> In file included from /home/sfr/next/next/include/linux/bitmap.h:8:0,
>  from /home/sfr/next/next/include/linux/cpumask.h:11,
>  from
> /home/sfr/next/next/arch/x86/include/asm/cpumask.h:4,
>  from
> /home/sfr/next/next/arch/x86/include/asm/msr.h:10,
>  from
> /home/sfr/next/next/arch/x86/include/asm/processor.h:20,
>  from
> /home/sfr/next/next/arch/x86/include/asm/cpufeature.h:4,
>  from
> /home/sfr/next/next/arch/x86/include/asm/thread_info.h:52,
>  from
> /home/sfr/next/next/include/linux/thread_info.h:37,
>  from
> /home/sfr/next/next/arch/x86/include/asm/preempt.h:6,
>  from /home/sfr/next/next/include/linux/preempt.h:80,
>  from /home/sfr/next/next/include/linux/spinlock.h:50,
>  from /home/sfr/next/next/include/linux/mmzone.h:7,
>  from /home/sfr/next/next/include/linux/gfp.h:5,
>  from
> /home/sfr/next/next/arch/x86/power/hibernate_64.c:11:
> In function 'memcpy',
> inlined from 'relocate_restore_code' at
> /home/sfr/next/next/arch/x86/power/hibernate_64.c:150:2,
> inlined from 'swsusp_arch_resume' at
> /home/sfr/next/next/arch/x86/power/hibernate_64.c:185:8:
> /home/sfr/next/next/include/linux/string.h:309:4: error: call to
> '__read_overflow2' declared with attribute error: detected read beyond
> size of object passed as 2nd parameter
> __read_overflow2();
> ^
> 
> Caused by commit
> 
>   088a5ecf7581 ("include/linux/string.h: add the option of fortified
> string.h functions")
> 
> We really need to fix all the known problems it detects *before*
> merging this commit ...
> 
> I have reverted it for today.

The errors caught at compile-time, including these, are all fixed, but
not all of the fixes have landed in -next yet. They're in various other
trees. GCC 7 came out and found an extra bug or two, which are now fixed
too. I guess those need to be queued up with it in -mm if it's going to
land there, but I wasn't sure how things were going to work.

It also probably finds more architecture-specific issues and may need
compatibility fixes for them. I could mark it as compatible with only
arm64 and x86(_64) since they're what I've tested to build and work at
runtime and the compile-time errors could be turned into warnings for
now, if it's mandatory that FORTIFY_SOURCE=y doesn't find problems at
compile-time anywhere. If it's a warning, it will still catch the issue
at runtime like the rest where the size isn't a constant.


Re: [kernel-hardening] [PATCH] powerpc: Increase ELF_ET_DYN_BASE to 1TB for 64-bit applications

2017-06-05 Thread Daniel Micay
Rather than doing this, the base should just be split for an ELF
interpreter, as PaX does. It makes sense for a standalone executable to be
as low in the address space as possible. Doing that ASAP fixes issues
like this and opens up the possibility of fixing stack mapping ASLR
entropy on various architectures. It should be a pretty small change.


Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

2017-06-03 Thread Daniel Micay
On 3 June 2017 at 18:54, Jeffrey Walton  wrote:
> On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris  wrote:
>> ...
>> Of course this will fail on systems with no high-res timer. Are there
>> still some of those? It might take about 1000 times as long on a
>> system that lacks the realtime library's nanosecond timer but has the
>> Posix standard microsecond timer, implying a delay time in the
>> milliseconds. Would that be acceptable in those cases?
>
> A significant portion of the use cases should include mobile devices.
> Device sales outnumbered desktop and server sales several years ago.
>
> Many devices are sensor rich. Even the low-end ones come with
> accelerometers for gaming. A typical one has 3 or 4 sensors, and
> higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them.
>
> There's no wanting for entropy in many of the use cases. The thing
> that is lacking seems to be taking advantage of it.
>
> Jeff

Hardware random number generator support is also standard on even
low-end mobile devices. The Linux kernel now knows to feed some of the
entropy from those hardware random generators into the kernel CSPRNG
when the driver is initialized but that doesn't happen until fairly
late in the kernel's boot process. The sensors present the same issue.
They aren't available when the kernel starts needing entropy for
features like SSP and KASLR or other early boot uses, unlike
RDRAND/RDSEED on modern x86_64 CPUs.

For userspace, Android's init system blocks until a certain amount of
entropy from the hardware random generator has been fed into the kernel
CSPRNG. It's possible for there to be no hwrandom but I think that's
very rare now since the standard SoCs used everywhere have it available.
The device vendor would probably need to go out of their way to break
it. Android also regularly saves a persistent random seed and restores
it on boot. It also mixes in entropy from the hardware generator
regularly, since the kernel didn't know how to do that before, just like
it didn't know how to grab any initial entropy from the hardware
generator.

I don't think it's worth worrying too much about mobile. The real issue
is slimmer embedded devices that in many cases don't even save / restore
a seed, or that generate keys on first boot before a restored seed could
help. At least if you're not focused on KASLR and other early
probabilistic kernel exploit mitigations, where there's currently no way
to get entropy in early boot unless the bootloader helps.
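For reference, the seed save/restore scheme amounts to something like
this sketch (illustrative path and size, not Android's actual
implementation; note that writing to /dev/urandom mixes the data into
the pool without crediting entropy):

#include <stdio.h>

#define SEED "/var/lib/random-seed"     /* illustrative path */

static void copy(const char *from, const char *to)
{
        unsigned char buf[512];
        FILE *in = fopen(from, "rb"), *out = fopen(to, "wb");

        if (in && out && fread(buf, 1, sizeof(buf), in) == sizeof(buf))
                fwrite(buf, 1, sizeof(buf), out);
        if (in)
                fclose(in);
        if (out)
                fclose(out);
}

int main(void)
{
        copy(SEED, "/dev/urandom");     /* restore saved seed at boot */
        copy("/dev/urandom", SEED);     /* stash a fresh seed for next boot */
        return 0;
}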


Re: [kernel-hardening] [PATCH 3/6] x86/mmap: properly account for stack randomization in mmap_base

2017-06-03 Thread Daniel Micay
On Fri, 2017-06-02 at 21:46 -0700, Kees Cook wrote:
> On Fri, Jun 2, 2017 at 8:20 AM,   wrote:
> > From: Rik van Riel 
> > 
> > When RLIMIT_STACK is, for example, 256MB, the current code results
> > in
> > a gap between the top of the task and mmap_base of 256MB, failing to
> > take into account the amount by which the stack address was
> > randomized.
> > In other words, the stack gets less than RLIMIT_STACK space.
> 
> Is this entirely accurate? The top of the task would be task_size, but
> this code is using task_size / 6 * 5 as the bottom of stack / top of
> mmap gap_max. Is there a reason for this?

MIN_GAP / MAX_GAP are only the boundaries that this gap is clamped to.

If it's not smaller than MIN_GAP, MIN_GAP isn't used. If it's not larger
than MAX_GAP, MAX_GAP isn't used. The stack randomization is currently
only taken into account for MIN_GAP. This only fixes that bug by always
taking it into account. It's not a subjective design change.

The MAX_GAP value is 5/6 of the address space which is overly large but
that's a separate bug.
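As a standalone sketch of the clamping being described (the constants
and names only loosely follow arch/x86/mm/mmap.c):

#include <stdio.h>

#define TASK_SIZE  0xC0000000UL          /* illustrative 32-bit split */
#define MIN_GAP    (128UL << 20)         /* floor on the stack gap */
#define MAX_GAP    (TASK_SIZE / 6 * 5)   /* the overly large ceiling */

static unsigned long mmap_base(unsigned long stack_limit,
                               unsigned long stack_rnd)
{
        /* the fix: always add the randomization to the gap, so the stack
           still gets the full RLIMIT_STACK after being randomized down */
        unsigned long gap = stack_limit + stack_rnd;

        if (gap < MIN_GAP)
                gap = MIN_GAP;
        else if (gap > MAX_GAP)
                gap = MAX_GAP;
        return TASK_SIZE - gap;
}

int main(void)
{
        printf("%#lx\n", mmap_base(256UL << 20, 16UL << 20));
        return 0;
}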


Re: [kernel-hardening] [PATCH 0/6] move mmap_area and PIE binaries away from the stack

2017-06-03 Thread Daniel Micay
On Fri, 2017-06-02 at 21:37 -0700, Kees Cook wrote:
> 
> Patches 3 and 4 need some (minor?) adjustments

It's currently a bug fix. Doing something further would be more than
fixing the bug and should probably be separate.


Re: [PATCH 2/6] x86/elf: move 32 bit ELF_ET_DYN_BASE to 256MB

2017-06-03 Thread Daniel Micay
On Fri, 2017-06-02 at 21:22 -0700, Kees Cook wrote:
> On Fri, Jun 2, 2017 at 8:20 AM,   wrote:
> > From: Rik van Riel 
> > 
> > When setting up mmap_base, we take care to start the mmap base
> > below the maximum extent to which the stack will grow. However,
> > we take no such precautions with PIE binaries, which are placed
> > at 5/6 of TASK_SIZE plus a random offset. As a result, 32 bit PIE
> > binaries can end up smack in the middle of where the stack (which
> > is randomized down) is supposed to go.
> > 
> > That problem can be avoided by putting the 32 bit ELF_ET_DYN_BASE
> > at 256MB, which is a value linux-hardened and grsecurity have used
> > for a long time now without any known (to me) bug reports.
> > 
> > Signed-off-by: Rik van Riel 
> > ---
> >  arch/x86/include/asm/elf.h | 23 ---
> >  1 file changed, 12 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
> > index e8ab9a46bc68..dafa098cc05a 100644
> > --- a/arch/x86/include/asm/elf.h
> > +++ b/arch/x86/include/asm/elf.h
> > @@ -245,12 +245,23 @@ extern int force_personality32;
> >  #define CORE_DUMP_USE_REGSET
> >  #define ELF_EXEC_PAGESIZE  4096
> > 
> > +/*
> > + * True on X86_32 or when emulating IA32 on X86_64
> > + */
> > +static inline int mmap_is_ia32(void)
> > +{
> > +   return IS_ENABLED(CONFIG_X86_32) ||
> > +  (IS_ENABLED(CONFIG_COMPAT) &&
> > +   test_thread_flag(TIF_ADDR32));
> > +}
> > +
> >  /* This is the location that an ET_DYN program is loaded if
> > exec'ed.  Typical
> > use of this is to invoke "./ld.so someprog" to test out a new
> > version of
> > the loader.  We need to make sure that it is out of the way of
> > the program
> > that it will "exec", and that there is sufficient room for the
> > brk.  */
> > 
> > -#define ELF_ET_DYN_BASE        (TASK_SIZE / 3 * 2)
> > +#define ELF_ET_DYN_BASE        (mmap_is_ia32() ? 0x10000000UL : \
> > +                                (TASK_SIZE / 3 * 2))
> 
> I like the idea of this change, but I don't like this implementation
> for a few reasons.
> 
> Before that, let me just to state what I think our obvious goal is: we
> want to have the userspace memory layout wasting as little space as
> possible, and have regions separated enough to maximize the amount of
> ASLR entropy we can use, all without allowing them to collide.
> 
> In the previous non-PIE world's common execution case, the kernel loaded
> statically located ET_EXEC programs and had brk, stack, and mmap
> randomized. The only ET_DYN program that got directly executed was the
> loader, but that was rare. In normal execution, it would get mapped
> after the ET_EXEC. When testing new loaders and running it directly
> like the comment above hints at (e.g. "/lib64/ld-linux-x86-64.so.2
> /bin/some-et-exec"), the loader would get loaded first, so it might
> collide with the static ET_EXEC if it wasn't moved away from the
> ET_EXEC range, which is why ELF_ET_DYN_BASE exists and a test for
> ET_DYN is done. It would be better to force ET_DYN into mmap base when
> no PT_INTERP is found (see binfmt_elf.c's elf_interpreter).
> 
> For the PIE case we end up with the ET_DYN at one end of the mmap
> range, and mmap base at the high-address end (growing down toward the
> ET_DYN). As a result, I'd like to see the executable ET_DYN (PIE) case
> use the lowest possible base address. The (TASK_SIZE / 3 * 2) thing
> has always bothered me as being highly wasteful of address space now
> that most userspaces are doing full PIE. It made sense for the rare
> directly loaded loaders, but now it's not great. (My first step toward
> fixing this was getting all the architectures to agree on how
> randomization was happening, and I think the next step is to fix the
> loader collision issue and then bring all the architecture's
> ELF_ET_DYN_BASE way way down (and likely rename the define to be more
> clear).)
> 
> As a result of these things, I'm not sure I like the 256MB change, as
> I'd rather we get the PIE base as low as possible. I *think* this
> should match the ET_EXEC addresses: 32-bit x86 ET_EXEC loads at
> 0x8048000. 64-bit x86 ET_EXEC loads at 0x400000 (though maybe we can
> take 32-bit lower still). However, the kernel needs to notice that
> when running a loader, it should go into the mmap base like a .so,
> which will keep it out of the way no matter what. However, then the
> question becomes "can the brk be placed sanely?" since the brk is
> allocated with whatever is mapped first.
>
> So, specifically:
> 
> I don't believe this is safe in the face of a loader running an
> ET_EXEC. If the loader is mapped at 0x10000000 (or otherwise very low
> value of PIE ASLR), and the ET_EXEC is mapped at 0x8048000, so the
> ET_EXEC had better not be 128MB or larger...
> 
> I think loading an ET_DYN that lacks PT_INTERP should be mapped into
> the mmap base so that we can lower ELF_ET_DYN_BASE as far as possible

Re: [PATCH v4] add the option of fortified string.h functions

2017-06-02 Thread Daniel Micay
On Fri, 2017-06-02 at 14:07 -0700, Andrew Morton wrote:
> On Fri, 26 May 2017 05:54:04 -0400 Daniel Micay <danielmi...@gmail.com> wrote:
> 
> > This adds support for compiling with a rough equivalent to the glibc
> > _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
> > overflow checks for string.h functions when the compiler determines
> > the
> > size of the source or destination buffer at compile-time. Unlike
> > glibc,
> > it covers buffer reads in addition to writes.
> 
> Did we find a bug in drivers/infiniband/sw/rxe/rxe_resp.c?
> 
> i386 allmodconfig:
> 
> In file included from ./include/linux/bitmap.h:8:0,
>  from ./include/linux/cpumask.h:11,
>  from ./include/linux/mm_types_task.h:13,
>  from ./include/linux/mm_types.h:4,
>  from ./include/linux/kmemcheck.h:4,
>  from ./include/linux/skbuff.h:18,
>  from drivers/infiniband/sw/rxe/rxe_resp.c:34:
> In function 'memcpy',
> inlined from 'send_atomic_ack.constprop' at
> drivers/infiniband/sw/rxe/rxe_resp.c:998:2,
> inlined from 'acknowledge' at
> drivers/infiniband/sw/rxe/rxe_resp.c:1026:3,
> inlined from 'rxe_responder' at
> drivers/infiniband/sw/rxe/rxe_resp.c:1286:10:
> ./include/linux/string.h:309:4: error: call to '__read_overflow2'
> declared with attribute error: detected read beyond size of object
> passed as 2nd parameter
> __read_overflow2();
> 
> 
> If so, can you please interpret this for the infiniband developers?

It copies sizeof(skb->cb) bytes with memcpy which is 48 bytes since cb
is a 48 byte char array in `struct sk_buff`. The source buffer is a
`struct rxe_pkt_info`:

struct rxe_pkt_info {
struct rxe_dev  *rxe;   /* device that owns packet */
struct rxe_qp   *qp;/* qp that owns packet */
struct rxe_send_wqe *wqe;   /* send wqe */
u8  *hdr;   /* points to bth */
u32 mask;   /* useful info about pkt */
u32 psn;/* bth psn of packet */
u16 pkey_index; /* partition of pkt */
u16 paylen; /* length of bth - icrc */
u8  port_num;   /* port pkt received on */
u8  opcode; /* bth opcode of packet */
u8  offset; /* bth offset from pkt->hdr */
};

That looks like 32 bytes (1 byte of padding) on 32-bit and 48 bytes on
64-bit (1 byte of padding), so on 32-bit there's a read overflow of 16
bytes from the stack here.
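A reduced example of the pattern being flagged, with hypothetical types
standing in for the rxe ones (this isn't the driver's actual fix):

#include <string.h>

struct pkt_info { void *a, *b, *c; char *d; unsigned e, f; };  /* < 48 bytes */
struct skb_like { char cb[48]; };

void ack(struct skb_like *skb, struct pkt_info *pkt)
{
        /* flagged: reads sizeof(skb->cb) == 48 bytes out of *pkt */
        /* memcpy(skb->cb, pkt, sizeof(skb->cb)); */

        memcpy(skb->cb, pkt, sizeof(*pkt));     /* bounded by the source */
}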


Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

2017-06-02 Thread Daniel Micay
On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote:
> (Meanwhile...)
> 
> In my own code, I'm currently playing with a workaround that looks
> like this:
> 
> --- a/src/main.c
> +++ b/src/main.c
> 
> +#include <linux/random.h>
> +#include <linux/completion.h>
> 
> +struct rng_initializer {
> +   struct completion done;
> +   struct random_ready_callback cb;
> +};
> +static void rng_initialized_callback(struct random_ready_callback
> *cb)
> +{
> +   complete(&container_of(cb, struct rng_initializer, cb)->done);
> +}
> +
> static int __init mod_init(void)
> {
>int ret;
> +   struct rng_initializer rng = {
> +   .done = COMPLETION_INITIALIZER(rng.done),
> +   .cb = { .owner = THIS_MODULE, .func =
> rng_initialized_callback }
> +   };
> +
> +   ret = add_random_ready_callback(&rng.cb);
> +   if (!ret)
> +   wait_for_completion(&rng.done);
> +   else if (ret != -EALREADY)
> +   return ret;
> 
>do_things_with_get_random_bytes_maybe();
> 
> Depending on the situation, however, I could imagine that
> wait_for_completion never returning, if its blocking activity that
> contributes to the seed actually being available, if this is called
> from a compiled-in module, so I find this a bit sub-optimal...

One of the early uses is initializing the stack canary value for SSP in
very early boot. If that blocks, it's going to block nearly anything
else from happening.

On x86, that's only the initial canary since the per-task canaries end
up being used. Elsewhere, at least without SMP disabled or changes to
GCC, that initial canary is all there is, so the entropy matters.


Re: [kernel-hardening] Re: [PATCH v7 2/2] security: tty: make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-05-30 Thread Daniel Micay
On Tue, 2017-05-30 at 19:00 -0400, Matt Brown wrote:
> On 5/30/17 4:22 PM, Daniel Micay wrote:
> > > Thanks, I didn't know that android was doing this. I still think
> > > this
> > > feature
> > > is worthwhile for people to be able to harden their systems
> > > against
> > > this attack
> > > vector without having to implement a MAC.
> > 
> > Since there's a capable LSM hook for ioctl already, it means it
> > could go
> > in Yama with ptrace_scope but core kernel code would still need to
> > be
> > changed to track the owning tty. I think Yama vs. core kernel
> > shouldn't
> > matter much anymore due to stackable LSMs.
> > 
> 
> What does everyone think about a v8 that moves this feature under Yama
> and uses
> the file_ioctl LSM hook?

It would only make a difference if it could be fully contained there, as
in not depending on tracking the tty owner.


Re: [kernel-hardening] Re: [PATCH v7 2/2] security: tty: make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-05-30 Thread Daniel Micay
> Thanks, I didn't know that android was doing this. I still think this
> feature
> is worthwhile for people to be able to harden their systems against
> this attack
> vector without having to implement a MAC.

Since there's a capable LSM hook for ioctl already, it means it could go
in Yama with ptrace_scope but core kernel code would still need to be
changed to track the owning tty. I think Yama vs. core kernel shouldn't
matter much anymore due to stackable LSMs.

Not the case for perf_event_paranoid=3 where a) there's already a sysctl
exposed which would be unfortunate to duplicate, b) there isn't an LSM
hook yet (AFAIK).

The toggles for ptrace and perf events are more useful though since
they're very commonly used debugging features vs. this obscure, rarely
used ioctl that in practice no one will notice is missing. It's still
friendlier to have a toggle than a seccomp policy requiring a reboot to
get rid of it, or worse compiling it out of the kernel.


Re: [kernel-hardening] Re: [PATCH v7 2/2] security: tty: make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-05-30 Thread Daniel Micay
> Seccomp requires the program in question to "opt-in" so to speak and set
> certain restrictions on itself. However as you state above, any TIOCSTI
> protection doesn't matter if the program correctly allocates a tty/pty pair.
> This protections seeks to protect users from programs that don't do things
> correctly. Rather than killing bugs, this feature attempts to kill an entire
> bug class that shows little sign of slowing down in the world of containers 
> and
> sandboxes.

It's possible to do it in PID1 as root without NO_NEW_PRIVS set, but
there isn't an existing implementation of that. It's not included in
init systems like systemd. Unlike with this sysctl, there would be no
way to toggle it off at runtime once that's done. If a system
administrator wants to enable it, they'll need to modify a
configuration file and reboot, if it was even supported by the init
system.
perf_event_paranoid=3. Meanwhile, perf_event_paranoid=3 is a mandatory
requirement for every Android device and toggling it at runtime is
*necessary* since that's exposed as a system property writable by the
Android Debug Bridge shell user (i.e. physical access via USB + ADB
enabled within the OS + ADB key of the ADB client accepted). There's
less use case for TIOCSTI so toggling it on at runtime isn't as
important, but a toggle like this is a LOT friendlier than a seccomp
blacklist even if that was supported by common init systems, and it's
not.


[PATCH v4] add the option of fortified string.h functions

2017-05-26 Thread Daniel Micay
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.

GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and allow
full use of the regular string intrinsics while avoiding the need for a
bunch of _chk functions and per-arch assembly to avoid wrapper overhead.

This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.

Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:

* Some of the fortified string functions (strncpy, strcat) don't yet
  place a limit on reads from the source based on __builtin_object_size of
  the source buffer.

* Extending coverage to more string functions like strlcat.

* It should be possible to optionally use __builtin_object_size(x, 1) for
  some functions (C strings) to detect intra-object overflows (like
  glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
  approach to avoid likely compatibility issues.

* The compile-time checks should be made available via a separate config
  option which can be enabled by default (or always enabled) once enough
  time has passed to get the issues it catches fixed.

Signed-off-by: Daniel Micay 
---
Changes since v3:
- Worked around conflicts with x86 string function defines, which were mostly
  legacy optimizations not relevant to current GCC. At some point these headers
  could be stripped down to drop all these hacks not needed with GCC from the
  past 6+ years.

 arch/arm64/include/asm/string.h  |   5 +
 arch/x86/boot/compressed/misc.c  |   5 +
 arch/x86/include/asm/string_32.h |   9 ++
 arch/x86/include/asm/string_64.h |   7 ++
 include/linux/string.h   | 200 +++
 lib/string.c |   6 ++
 security/Kconfig |   6 ++
 7 files changed, 238 insertions(+)

diff --git a/arch/arm64/include/asm/string.h b/arch/arm64/include/asm/string.h
index 2eb714c4639f..d0aa42907569 100644
--- a/arch/arm64/include/asm/string.h
+++ b/arch/arm64/include/asm/string.h
@@ -63,6 +63,11 @@ extern int memcmp(const void *, const void *, size_t);
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+
 #endif
 
 #endif
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b3c5a5f030ce..43691238a21d 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -409,3 +409,8 @@ asmlinkage __visible void *extract_kernel(void *rmode, 
memptr heap,
debug_putstr("done.\nBooting the kernel.\n");
return output;
 }
+
+void fortify_panic(const char *name)
+{
+   error("detected buffer overflow");
+}
diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e8353ee5c..e9ee84873de5 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -142,7 +142,9 @@ static __always_inline void *__constant_memcpy(void *to, 
const void *from,
 }
 
 #define __HAVE_ARCH_MEMCPY
+extern void *memcpy(void *, const void *, size_t);
 
+#ifndef CONFIG_FORTIFY_SOURCE
 #ifdef CONFIG_X86_USE_3DNOW
 
#include <asm/mmx.h>
@@ -195,11 +197,15 @@ static inline void *__memcpy3d(void *to, const void 
*from, size_t len)
 #endif
 
 #endif
+#endif /* !CONFIG_FORTIFY_SOURCE */
 
 #define __HAVE_ARCH_MEMMOVE
 void *memmove(void *dest, const void *src, size_t n);
 
+extern int memcmp(const void *, const void *, size_t);
+#ifndef CONFIG_FORTIFY_SOURCE
 #define memcmp __builtin_memcmp
+#endif
 
 #define __HAVE_ARCH_MEMCHR
 extern void *memchr(const void *cs, int c, size_t count);
@@ -321,6 +327,8 @@ void *__constant_c_and_count_memset(void *s, unsigned long 
pattern,
 : __memset_generic((s), (c), (count)))
 
 #define __HAVE_ARCH_MEMSET
+extern void *memset(void *, int, size_t);
+#ifndef CONFIG_FORTIFY_SOURCE
 #if (__GNUC__ >= 4)
 #define memset(s, c, count) __builtin_memset(s, c, count)
 #else
@@ -330,6 +338,7 @@ void *__constant_c_and_count_memset(void *s, unsigned long 
pattern,
 (count))   \
 : __memset((s), (c), (count)))
 #endif
+#endif /* !CONFIG_FORTIFY_SOURCE */
 
 /*
  * find the first occurre

Re: [PATCH v3] add the option of fortified string.h functions

2017-05-26 Thread Daniel Micay
On Thu, 2017-05-25 at 20:40 -0700, Kees Cook wrote:
> On Mon, May 22, 2017 at 4:10 PM, Daniel Micay <danielmi...@gmail.com>
> wrote:
> > diff --git a/arch/x86/include/asm/string_64.h
> > b/arch/x86/include/asm/string_64.h
> > index 733bae07fb29..3c5b26e07b85 100644
> > --- a/arch/x86/include/asm/string_64.h
> > +++ b/arch/x86/include/asm/string_64.h
> > @@ -77,6 +77,11 @@ int strcmp(const char *cs, const char *ct);
> >  #define memcpy(dst, src, len) __memcpy(dst, src, len)
> >  #define memmove(dst, src, len) __memmove(dst, src, len)
> >  #define memset(s, c, n) __memset(s, c, n)
> > +
> > +#ifndef __NO_FORTIFY
> > +#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc.
> > */
> > +#endif
> > +
> >  #endif
> > 
> >  #define __HAVE_ARCH_MEMCPY_MCSAFE 1
> 
> Ah-ha, this same KASAN exclusion is missing for string_32.h, which is
> what I think akpm tripped over in build tests.
> 
> -Kees

It's not KASAN-related but rather some cruft that's still around in the
32-bit x86 header. It unnecessarily defines memcpy as __builtin_memcpy
even though the built-in is already used on modern GCC, while the 64-bit 
header only does a similar define for GCC < 4.3. I'll just make it stop
doing that with fortify enabled.


Re: [PATCH v3] add the option of fortified string.h functions

2017-05-24 Thread Daniel Micay
On Tue, 2017-05-23 at 19:12 -0700, Kees Cook wrote:
> On Tue, May 23, 2017 at 3:48 PM, Andrew Morton
>  wrote:
> > On Mon, 22 May 2017 19:10:25 -0400 Daniel Micay <danielmi...@gmail.com> wrote:
> > 
> > > This adds support for compiling with a rough equivalent to the
> > > glibc
> > > _FORTIFY_SOURCE=1 feature, providing compile-time and runtime
> > > buffer
> > > overflow checks for string.h functions when the compiler
> > > determines the
> > > size of the source or destination buffer at compile-time. Unlike
> > > glibc,
> > > it covers buffer reads in addition to writes.
> > > 
> > > GNU C __builtin_*_chk intrinsics are avoided because they would
> > > force a
> > > much more complex implementation. They aren't designed to detect
> > > read
> > > overflows and offer no real benefit when using an implementation
> > > based
> > > on inline checks. Inline checks don't add up to much code size and
> > > allow
> > > full use of the regular string intrinsics while avoiding the need
> > > for a
> > > bunch of _chk functions and per-arch assembly to avoid wrapper
> > > overhead.
> > > 
> > > This detects various overflows at compile-time in various drivers
> > > and
> > > some non-x86 core kernel code. There will likely be issues caught
> > > in
> > > regular use at runtime too.
> > > 
> > > Future improvements left out of initial implementation for
> > > simplicity,
> > > as it's all quite optional and can be done incrementally:
> > > 
> > > * Some of the fortified string functions (strncpy, strcat) don't
> > > yet
> > >   place a limit on reads from the source based on
> > > __builtin_object_size of
> > >   the source buffer.
> > > 
> > > * Extending coverage to more string functions like strlcat.
> > > 
> > > * It should be possible to optionally use __builtin_object_size(x,
> > > 1) for
> > >   some functions (C strings) to detect intra-object overflows
> > > (like
> > >   glibc's _FORTIFY_SOURCE=2), but for now this takes the
> > > conservative
> > >   approach to avoid likely compatibility issues.
> > > 
> > > * The compile-time checks should be made available via a separate
> > > config
> > >   option which can be enabled by default (or always enabled) once
> > > enough
> > >   time has passed to get the issues it catches fixed.
> > 
> > Confused by the final paragraph.  The patch adds
> > CONFIG_FORTIFY_SOURCE
> > so isn't that to-do item completed?
> 
> Right now FORTIFY_SOURCE performs two checks: compile-time (when both
> sizes can be determined) and run-time (when only destination size can
> be determined). They could be split, if it ever makes sense to do so.
> For example, the compile-time checks could be made mandatory once all
> fixes everywhere else have stabilized, and split the run-time checks
> as a new CONFIG (since they technically do add a small bump to .text
> size). I think maybe the best option for the future would be to just
> make the compile-time checks mandatory and have CONFIG_FORTIFY_SOURCE
> just cover the runtime checks. But let's see how far we get as-is.

It'll add a bit of complexity and we'll need to make sure to avoid
causing any performance hit for the compile-time-only mode.

The compile-time checks might also be incredibly annoying with Clang. I
think it might currently ignore the error attribute, so instead of an
error comprehensible to a human it will cause a linker error later on
with no context tied to the source code. The runtime checks should work
fine with Clang already, unlike the approach in glibc.
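For reference, the compile-time side reduces to the error attribute that
__compiletime_error expands to under GCC (a simplified sketch of the
mechanism, not the patch itself):

extern void __read_overflow(void)
        __attribute__((error("detected read beyond size of object")));

static inline void check_read(unsigned long n, unsigned long obj_size)
{
        /* if this call survives optimization, GCC reports the error
           message above at the call site; a compiler ignoring the
           attribute just leaves an undefined symbol for the linker */
        if (__builtin_constant_p(n) && n > obj_size)
                __read_overflow();
}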

> > Also, what does __NO_FORTIFY do?  Nothing ever defines it?
> 
> It's used to disable FORTIFY coverage as needed in the build. It's
> used in this patch already to avoid collision with KASan.

KASan instruments the memcpy, memmove and memset symbols to perform the
sanitizer checking for them.

It needs to bypass that when instrumentation is supposed to be disabled,
so the kernel uses a hack where it #defines memcpy as __memcpy, which
bypasses the instrumented KASan functions. __NO_FORTIFY gets used there
to keep that hack from causing strange interactions with the fortify
code. Without it, the define turns the fortified memcpy into a fortified
__memcpy which calls memcpy via __builtin_memcpy, resulting in KASan
false positives from instrumenting calls that were meant to be left
uninstrumented.

Someone might be able to come up with

[PATCH v3] add the option of fortified string.h functions

2017-05-22 Thread Daniel Micay
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.

GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and allow
full use of the regular string intrinsics while avoiding the need for a
bunch of _chk functions and per-arch assembly to avoid wrapper overhead.

This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.

Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:

* Some of the fortified string functions (strncpy, strcat) don't yet
  place a limit on reads from the source based on __builtin_object_size of
  the source buffer.

* Extending coverage to more string functions like strlcat.

* It should be possible to optionally use __builtin_object_size(x, 1) for
  some functions (C strings) to detect intra-object overflows (like
  glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
  approach to avoid likely compatibility issues.

* The compile-time checks should be made available via a separate config
  option which can be enabled by default (or always enabled) once enough
  time has passed to get the issues it catches fixed.

Signed-off-by: Daniel Micay 
---
Changes since v2:
- add fortified strlcpy
- split the compile-time errors for reads and writes, and specify the parameter
- avoid redefinition of __NO_FORTIFY in KASan-uninstrumented code already 
defining __NO_FORTIFY

 arch/arm64/include/asm/string.h  |   5 +
 arch/x86/boot/compressed/misc.c  |   5 +
 arch/x86/include/asm/string_64.h |   5 +
 include/linux/string.h   | 200 +++
 lib/string.c |   6 ++
 security/Kconfig |   6 ++
 6 files changed, 227 insertions(+)

diff --git a/arch/arm64/include/asm/string.h b/arch/arm64/include/asm/string.h
index 2eb714c4639f..d0aa42907569 100644
--- a/arch/arm64/include/asm/string.h
+++ b/arch/arm64/include/asm/string.h
@@ -63,6 +63,11 @@ extern int memcmp(const void *, const void *, size_t);
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+
 #endif
 
 #endif
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b3c5a5f030ce..43691238a21d 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -409,3 +409,8 @@ asmlinkage __visible void *extract_kernel(void *rmode, 
memptr heap,
debug_putstr("done.\nBooting the kernel.\n");
return output;
 }
+
+void fortify_panic(const char *name)
+{
+   error("detected buffer overflow");
+}
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 733bae07fb29..3c5b26e07b85 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -77,6 +77,11 @@ int strcmp(const char *cs, const char *ct);
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+
 #endif
 
 #define __HAVE_ARCH_MEMCPY_MCSAFE 1
diff --git a/include/linux/string.h b/include/linux/string.h
index 537918f8a98e..66dc841e4c5e 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -187,4 +187,204 @@ static inline const char *kbasename(const char *path)
return tail ? tail + 1 : path;
 }
 
+#define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline))
+#define __RENAME(x) __asm__(#x)
+
+void fortify_panic(const char *name) __noreturn __cold;
+void __read_overflow(void) __compiletime_error("detected read beyond size of 
object passed as 1st parameter");
+void __read_overflow2(void) __compiletime_error("detected read beyond size of 
object passed as 2nd parameter");
+void __write_overflow(void) __compiletime_error("detected write beyond size of 
object passed as 1st parameter");
+
+#if !defined(__NO_FORTIFY) && defined(__OPTIMIZE__) && 
defined(CONFIG_FORTIFY_SOURCE)
+__FORTIFY_INLINE char *strcpy(char *p, const char *q)
+{
+   size_t p_size = __builtin_object_size(p, 0);
+   size_t 

Re: stackprotector: ascii armor the stack canary

2017-05-19 Thread Daniel Micay
On Fri, 2017-05-19 at 17:26 -0400, r...@redhat.com wrote:
> Zero out the first byte of the stack canary value on 64 bit systems,
> in order to prevent unterminated C string overflows from being able
> to successfully overwrite the canary, even if an attacker somehow
> guessed or obtained the canary value.
> 
> Inspired by execshield ascii-armor and PaX/grsecurity.
> 
> Thanks to Daniel Micay for extracting code of similar functionality
> from PaX/grsecurity and making it easy to find in his linux-hardened
> git tree on https://github.com/thestinger/linux-hardened/

To clarify something: this part isn't from PaX / grsecurity. I marked
the commits from PaX / grsecurity as such in their commit messages, and
these are among the ones that aren't from there.

This is from a set of changes that I did for CopperheadOS and forward
ported to mainline recently to start linux-hardened. It was only arm64
for CopperheadOS. The overlap with PaX is that when adding the leading
zero byte for x86, I needed to first fix get_random_int being used for
the per-task canary value. I didn't know PaX fixed it way back in 2007.

I implemented heap canaries for our userspace malloc implementation and
then later did the same thing for slub in the kernel. I added a leading
zero byte to both of those heap canaries later on and then did the SSP
implementation in Bionic and the kernel's arm64 code. I took the idea
from glibc but limited it to 64-bit where there's entropy to spare. The
glibc leading zero might have come from execshield, but I don't know.


Re: [kernel-hardening] Re: [PATCH v6 0/2] security: tty: make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-05-17 Thread Daniel Micay
On Wed, 2017-05-17 at 17:41 +0100, Alan Cox wrote:
> > If we're adjusting applications, they should be made to avoid
> > TIOSCTI
> > completely. This looks to me a lot like the symlink restrictions:
> > yes,
> > userspace should be fixed to the do the right thing, but why not
> > provide support to userspace to avoid the problem entirely?
> 
> We do; it's called pty/tty. There isn't any other way to do this
> correctly
> because TIOCSTI is just one of hundreds of things the attacker can do to
> make your life miserable in the case you create a child process of
> lower
> security privilege and give it your tty file handle or worse (like
> some
> container crapware) your X11 socket fd.
> 
> Does it really matter any more or less if I reprogram your enter key,
> use
> TIOCSTI, set the baud rate, change all your fonts ?
> 
> The mainstream tools like sudo get this right (*). Blocking TIOCSTI
> fixes
> nothing and breaks apps. If it magically fixed the problem it might
> make
> sense but it doesn't. You actually have to get an adult to write the
> relevant code.

It gets it right because it was reported as a vulnerability and fixed,
which is the cycle many of these tools have gone through. It looks like
sudo and coreutils / shadow su were fixed in 2005, but there are more of
these tools.

CVE-2005-4890 (coreutils su, shadow su, sudo), CVE-2016-7545 (SELinux
sandbox utility), CVE-2016-2781 (chroot --userspec), CVE-2016-2779
(util-linux runuser), CVE-2016-2568 (pkexec)

I am not sure if the pkexec vulnerability is even fixed:

https://bugzilla.redhat.com/show_bug.cgi?id=1300746

CVE-2017-5226 is for bubblewrap/flatpak this year, and I'm sure there
were a lot more as I didn't bother to dig very deep.

I completely agree that it should be solved by doing things properly in
each application. However, it isn't solved properly everywhere, and each
new application is making the same mistake. Providing this feature does
not break anything that people use in practice, and it doesn't need to
solve every issue to be quite useful; it only needs to prevent local
privilege escalation in the form of code execution in many cases. Is
there another way to get code execution via that bubblewrap/flatpak bug
with this feature implemented? As far as I can tell, there isn't. It's a
problem even with this feature, but a significantly less serious one.
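The core of the bug class is tiny. A sketch of what a less privileged
process that inherited the controlling tty can do (each byte pushed with
TIOCSTI shows up as if it had been typed):

#include <sys/ioctl.h>

int main(void)
{
        const char cmd[] = "echo injected\n";
        const char *p;

        for (p = cmd; *p; p++)
                ioctl(0, TIOCSTI, p);   /* queue input on the controlling tty */
        return 0;
}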


Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address limit before returning to user-mode

2017-05-12 Thread Daniel Micay
> overflow into adjacent allocations (fixed by VMAP_STACK).

99% fixed, but it's possible to skip over the guard page without
-fstack-check enabled (plus some edge cases need to be fixed in GCC),
unless VLAs were forbidden in addition to the existing large frame size
warning.

I'm not sure about in-tree code, but Qualcomm had some of these
improperly bounded VLA vulnerabilities in their MSM kernel...
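A hypothetical example of the improperly bounded VLA pattern: with an
attacker-controlled length, the stack pointer can jump past the guard
page in a single step unless the compiler probes each page
(-fstack-check):

#include <string.h>

void handle_request(const char *data, size_t untrusted_len)
{
        char buf[untrusted_len];        /* no upper bound on the VLA */

        memcpy(buf, data, untrusted_len);
        /* ... */
}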


Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address limit before returning to user-mode

2017-05-12 Thread Daniel Micay
On Fri, 2017-05-12 at 22:06 +0100, Al Viro wrote:
> On Fri, May 12, 2017 at 09:21:06PM +0100, Russell King - ARM Linux
> wrote:
> > On Fri, May 12, 2017 at 12:30:02PM -0700, Kees Cook wrote:
> > > I'm clearly not explaining things well enough. I shouldn't say
> > > "corruption", I should say "malicious manipulation". The
> > > methodology
> > > of attacks against the stack are quite different from the other
> > > kinds
> > > of attacks like use-after-free, heap overflow, etc. Being able to
> > > exhaust the kernel stack (either due to deep recursion or
> > > unbounded
> > > alloca())
> > 
> > I really hope we don't have alloca() use in the kernel.  Do you have
> > evidence to support that assertion?
> > 
> > IMHO alloca() (or similar) should not be present in any kernel code
> > because we have a limited stack - we have kmalloc() etc for that
> > kind
> > of thing.
> 
> No alloca(), but there are VLAs.  Said that, the whole "what if they
> can bugger thread_info and/or task_struct and go after set_fs() state"
> is idiocy, of course - in that case the box is fucked, no matter what.

VMAP_STACK + -fstack-check would prevent exploiting even an unbounded
VLA / alloca size vs. it being an arbitrary write. -fstack-check
guarantees that at least one byte is written per page as the stack
grows, although there are some unfortunate GCC bugs making it less than
perfect right now... but they recently started caring about it more,
including making it near zero overhead as it was always supposed to be.


Re: [kernel-hardening] Re: [PATCH v6 0/2] security: tty: make TIOCSTI ioctl require CAP_SYS_ADMIN

2017-05-10 Thread Daniel Micay
On Wed, 2017-05-10 at 21:29 +0100, Alan Cox wrote:
> 
> In addition your change to allow it to be used by root in the guest
> completely invalidates any protection you have because I can push
> 
> "rm -rf /\n"
> 
> as root in my namespace and exit
> 
> The tty buffers are not flushed across the context change so the shell
> you return to gets the input and oh dear
> 
> Alan

I might be missing something, but it looks like the patch tracks where
the tty was created and only allows this with CAP_SYS_ADMIN in the ns
where the tty came from.


Re: [PATCH] kexec_file: Adjust type of kexec_purgatory

2017-05-09 Thread Daniel Micay
On Tue, 2017-05-09 at 16:06 -0700, Kees Cook wrote:
> Defining kexec_purgatory as a zero-length char array upsets compile
> time size checking. Since this is entirely runtime sized, switch
> this to void *. This silences the warning generated by the future
> CONFIG_FORTIFY_SOURCE, which did not like the memcmp() of a "0 byte"
> array.
> 
> Cc: Daniel Micay 
> Signed-off-by: Kees Cook 
> ---
>  kernel/kexec_file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b118735fea9d..bc86f85f1329 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -30,7 +30,7 @@
>   * Declare these symbols weak so that if architecture provides a
> purgatory,
>   * these will be overridden.
>   */
> -char __weak kexec_purgatory[0];
> +void * __weak kexec_purgatory;
>  size_t __weak kexec_purgatory_size = 0;
>  
>  static int kexec_calculate_store_digests(struct kimage *image);
> -- 
> 2.7.4

It seems more correct to use `char __weak kexec_purgatory[]`;
otherwise doesn't __builtin_object_size end up as 8, which is still
wrong?


Re: [kernel-hardening] Re: [PATCH v9 1/4] syscalls: Verify address limit before returning to user-mode

2017-05-08 Thread Daniel Micay
On Mon, 2017-05-08 at 09:52 +0200, Ingo Molnar wrote:
> 
> ... it's just not usable in that form for a regular maintenance flow.
> 
> So what would be more useful is to add a specific Sparse check that
> only checks 
> KERNEL_DS, to add it as a regular (.config driven) build option and
> make sure the 
> kernel build has zero warnings.
> 
> From that point on we can declare that this kind of bug won't occur
> anymore, if 
> the Sparse implementation of the check is correct.
> 
> But there's a (big) problem with that development model: Sparse is not
> part of the 
> kernel tree and adding a feature to it while making the kernel depend
> on that 
> brand new feature is a logistical nightmare. The overhead is quite
> similar to 
> adding new features to a compiler - it happens at a glacial pace and
> is only done 
> for major features really, at considerable expense. I don't think this
> is an 
> adequate model for 'extended syntax checking' of the kernel,
> especially when it 
> comes to correctness that has such obvious security impact.
> 
> Thanks,
> 
>   Ingo

There's the option of using GCC plugins now that the infrastructure was
upstreamed from grsecurity. It can be used as part of the regular build
process and as long as the analysis is pretty simple it shouldn't hurt
compile time much.

The problem with doing that is that I don't think there are people with
much experience with GCC contributing upstream, and it's going to be
more work to develop/maintain than some kind of specialized DSL for
analysis. I think a few static analysis plugins are used as part of
maintaining grsecurity, for solving issues like finding false positives
for the REFCOUNT overflow checking feature, so it's something that's
already being done in practice elsewhere.


[tip:core/urgent] stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms

2017-05-05 Thread tip-bot for Daniel Micay
Commit-ID:  5ea30e4e58040cfd6434c2f33dc3ea76e2c15b05
Gitweb: http://git.kernel.org/tip/5ea30e4e58040cfd6434c2f33dc3ea76e2c15b05
Author: Daniel Micay 
AuthorDate: Thu, 4 May 2017 09:32:09 -0400
Committer:  Ingo Molnar 
CommitDate: Fri, 5 May 2017 08:05:13 +0200

stackprotector: Increase the per-task stack canary's random range from 32 bits 
to 64 bits on 64-bit platforms

The stack canary is an 'unsigned long' and should be fully initialized to
random data rather than only 32 bits of random data.

Signed-off-by: Daniel Micay 
Acked-by: Arjan van de Ven 
Acked-by: Rik van Riel 
Acked-by: Kees Cook 
Cc: Arjan van Ven 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: kernel-harden...@lists.openwall.com
Cc: sta...@vger.kernel.org
Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmi...@gmail.com
Signed-off-by: Ingo Molnar 
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 3a4343c..d681f8f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -536,7 +536,7 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
set_task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
-   tsk->stack_canary = get_random_int();
+   tsk->stack_canary = get_random_long();
 #endif
 
/*


[PATCH] use get_random_long for the per-task stack canary

2017-05-04 Thread Daniel Micay
The stack canary is an unsigned long and should be fully initialized to
random data rather than only 32 bits of random data.

Cc: sta...@vger.kernel.org
Signed-off-by: Daniel Micay 
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 56d85fd81411..ff84ff82f56a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -537,7 +537,7 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
set_task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
-   tsk->stack_canary = get_random_int();
+   tsk->stack_canary = get_random_long();
 #endif
 
/*
-- 
2.12.2



Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()

2017-04-08 Thread Daniel Micay
> grsecurity and PaX are great projects.  They have a lot of good ideas,
> and they're put together quite nicely.  The upstream kernel should
> *not* do things differently from they way they are in grsecurity/PaX
> just because it wants to be different.  Conversely, the upstream
> kernel should not do things the same way as PaX just to be more like
> PaX.
>
> Keep in mind that the upstream kernel and grsecurity/PaX operate under
> different constraints.  The upstream kernel tries to keep itself clean
> and to make tree-wide updates rather that keeping compatibility stuff
> around.  PaX and grsecurity presumably want to retain some degree of
> simplicity when porting to newer upstream versions.
>
> In the context of virtually mapped stacks / KSTACKOVERFLOW, this
> naturally leads to different solutions.  The upstream kernel had a
> bunch of buggy drivers that played badly with virtually mapped stacks.
> grsecurity sensibly went for the approach where the buggy drivers kept
> working.  The upstream kernel went for the approach of fixing the
> drivers rather than keeping a compatibility workaround.  Different
> constraints, different solutions.

Sure, but nothing has landed yet and the PaX
implementation is a known quantity with a lot of testing. The
submitted code is aimed at rare writes to globals, but this feature is
more than that and design decisions shouldn't be based on just the
short term. Kees is sensibly landing features by submitting them a
little bit at a time, but a negative side effect of that is too much
focus on just the initial proposed usage.

As a downstream that's going to be making heavy use of mainline
security features as a base due to PaX and grsecurity becoming
commercial-only private patches (*because* of upstream and the
companies involved), I hope things go in the right direction i.e. not
weaker/slower implementations than PaX. For example, while USERCOPY
isn't entirely landed upstream (missing slab whitelisting and user
offset/size), it's the base for the full feature and is going to get
more testing. On the other hand, refcount_t and the slab/page
sanitization work is 100% useless if you're willing to incorporate the
changes to do it without needless performance loss and complexity. PAN
emulation on 64-bit ARM was fresh ground while on ARMv7 a weaker
implementation was used for no better reason than clashing egos. The
upstream virtual mapped stacks will probably be perfectly good, but I
just find it to be such a waste of time and effort to reinvent it and
retread the same ground in terms of finding the bugs.

I actually care a lot more about 64-bit ARM support than I do x86, but
using a portable API for pax_open_kernel (for the simple uses at
least) is separate from choosing the underlying implementation. There
might not be a great way to do it on the architectures I care about
but that doesn't need to hinder x86. It's really not that much code...
A weaker/slower implementation for x86 also encourages the same
elsewhere.

> In the case of rare writes or pax_open_kernel [1] or whatever we want
> to call it, CR3 would work without arch-specific code, and CR0 would
> not.  That's an argument for CR3 that would need to be countered by
> something.  (Sure, avoiding leaks either way might need arch changes.
> OTOH, a *randomized* CR3-based approach might not have as much of a
> leak issue to begin with.)

By randomized do you mean just ASLR? Even if the window of opportunity
to exploit it is small, it's really not the same and this has a lot
more use than just rare writes to small global variables.

I wouldn't consider stack ASLR to be a replacement for making stacks
readable/writable only by the current thread either (required for
race-free return CFI on x86, at least if actual returns are used).

> [1] Contrary to popular belief, I don't sit around reading grsecurity
> code or config options, so I really don't know what this thing is
> called.

Lots of the features aren't actually named. Maybe this could be
considered part of KERNEXEC but I don't think it is.


Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()

2017-04-07 Thread Daniel Micay
>> Fair enough. However, placing a BUG_ON(!(read_cr0() & X86_CR0_WP))
>> somewhere sensible should make those "leaks" visible fast -- and their
>> exploitation impossible, i.e. fail hard.
>
> The leaks surely exist and now we'll just add an exploitable BUG.

That didn't seem to matter for landing a rewrite of KSTACKOVERFLOW
with a bunch of *known* DoS bugs that grsecurity had already dealt
with, left unfixed for no apparent reason other than keeping egos
intact. It looks like there are still some left...

In that case, there also wasn't a security/performance advantage.


Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()

2017-04-07 Thread Daniel Micay
> Not too late to rename it. Scoped write? I think it makes change to

s/change/sense/


Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()

2017-04-07 Thread Daniel Micay
> I probably chose the wrong name for this feature (write rarely).
> That's _usually_ true, but "sensitive_write()" was getting rather
> long. The things that we need to protect with this are certainly stuff
> that doesn't get much writing, but some things are just plain
> sensitive (like page tables) and we should still try to be as fast as
> possible with them.

Not too late to rename it. Scoped write? I think it makes change to
use a different API than PaX for portability too, but not a different
x86 implementation. It's quite important to limit the writes to the
calling thread and it needs to perform well to be introduced widely.

> I'm all for a general case for the infrastructure (as Andy and Mark
> has mentioned), but I don't want to get into the situation where
> people start refusing to use it because it's "too slow" (for example,
> see refcount_t vs net-dev right now).

Meanwhile, the PaX implementation has improved to avoid the issues
that were brought up while only introducing a single always-predicted
(due to code placement) branch on the overflow flag. That seems to
have gone unnoticed upstream, where there's now a much slower
implementation that's not more secure, and is blocked from
introduction in areas where it's most needed based on the performance.
Not to mention that it's opt-in... which is never going to work.
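For reference, the CR0-based approach being argued for reduces to
something like this sketch (kernel context assumed via
asm/special_insns.h and linux/preempt.h; not a drop-in implementation):

static __always_inline unsigned long rare_write_begin(void)
{
        unsigned long cr0;

        /* preemption is disabled so the writable window stays confined
           to the current task's execution on this CPU */
        preempt_disable();
        barrier();
        cr0 = read_cr0();
        write_cr0(cr0 & ~X86_CR0_WP);   /* allow writes to read-only pages */
        return cr0;
}

static __always_inline void rare_write_end(unsigned long cr0)
{
        write_cr0(cr0);                 /* restore write protection */
        barrier();
        preempt_enable();
}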


Re: [PATCH] gcc-plugins: Add structleak for more stack initialization

2017-01-16 Thread Daniel Micay
On Mon, 2017-01-16 at 15:24 +, Mark Rutland wrote:
> Hi,
> 
> On Sat, Jan 14, 2017 at 11:03:14AM +0100, PaX Team wrote:
> > On 13 Jan 2017 at 14:02, Kees Cook wrote:
> > 
> > > +config GCC_PLUGIN_STRUCTLEAK_VERBOSE
> > > + bool "Report initialized variables"
> > > + depends on GCC_PLUGIN_STRUCTLEAK
> > > + depends on !COMPILE_TEST
> > > + help
> > > +   This option will cause a warning to be printed each
> > > time the
> > > +   structleak plugin finds a variable it thinks needs to
> > > be
> > > +   initialized. Since not all existing initializers are
> > > detected
> > > +   by the plugin, this can produce false positive
> > > warnings.
> > 
> > there are no false positives, a variable either has a constructor or
> > it does not ;)
> 
> ... or it has no constructor, but is clobbered by a memset before it
> is
> possibly copied. ;)
> 
> For example:
> 
> arch/arm64/kernel/fpsimd.c: In function 'do_fpsimd_exc':
> arch/arm64/kernel/fpsimd.c:106:12: note: userspace variable will be
> forcibly initialized
>   siginfo_t info;
> 
> ... where the code looks like:
> 
> void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
> {
>   siginfo_t info;
>   unsigned int si_code = 0;
> 
>   if (esr & FPEXC_IOF)
>   si_code = FPE_FLTINV;
>   else if (esr & FPEXC_DZF)
>   si_code = FPE_FLTDIV;
>   else if (esr & FPEXC_OFF)
>   si_code = FPE_FLTOVF;
>   else if (esr & FPEXC_UFF)
>   si_code = FPE_FLTUND;
>   else if (esr & FPEXC_IXF)
>   si_code = FPE_FLTRES;
> 
>   memset(&info, 0, sizeof(info));
>   info.si_signo = SIGFPE;
>   info.si_code = si_code;
>   info.si_addr = (void __user *)instruction_pointer(regs);
> 
>   send_sig_info(SIGFPE, &info, current);
> }
> 
> ... so it's clear to a human that info is initialised prior to use,
> though not by an explicit field initializer.

It's obvious to the compiler too, not just a human. The compiler can
optimize it out when it's so clearly not required, although translation
units will get in the way without LTO. There's existing sophisticated
analysis for automatic initialization with full coverage. That's part of
why I think it makes sense to have a toggle for heuristics vs. full
coverage (incl. non-function-scope locals, that's just a heuristic too).

> > > +/* unused C type flag in all versions 4.5-6 */
> > > +#define TYPE_USERSPACE(TYPE) TYPE_LANG_FLAG_5(TYPE)
> > 
> > FYI, this is a sort of abuse/hack of tree flags and should not be
> > implemented this
> > way in the upstream kernel as it's a finite resource and needs
> > careful verification
> > against all supported gcc versions (these flags are meant for
> > language frontends, i
> > kinda got lucky to have a few of them unused but it's not a robust
> > future-proof
> > approach). instead an attribute should be used to mark these types.
> > whether that
> > can/should be __user itself is a good question since that's another
> > hack where the
> > plugin 'hijacks' a sparse address space attribute (for which gcc
> > 4.6+ has its own
> > facilities and that the checker gcc plugin makes use of thus it's
> > not compatible
> > with structleak as is).
> 
> To me, it seems that the __user annotation can only be an indicator of
> an issue by chance. We have structures with __user pointers in structs
> that will never be copied to userspace, and conversely we have structs
> that don't contain a __user field, but will be copied to userspace.
> 
> Maybe it happens that structs in more complex systems are more likely
> to contain some __user pointer. Was that part of the rationale?
> 
> I wonder if there's any analysis we can do of data passing into
> copy_to_user() and friends. I guess we can't follow the data flow
> across compilation units, but we might be able to follow it well
> enough if we added a new attribute that described whether data was to
> be copied to userspace.
> 
> Thanks,
> Mark.





Re: [kernel-hardening] [PATCH v4 2/2] procfs/tasks: add a simple per-task procfs hidepid= field

2017-01-16 Thread Daniel Micay
> This should permit Linux distributions to more comprehensively lock
> down their services, as it allows an isolated opt-in for hidepid= for
> specific services. Previously hidepid= could only be set system-wide,
> and then specific services had to be excluded by group membership,
> essentially a more complex concept of opt-out.

I think it's a lot easier for them to introduce a proc group and then
figure out the very few exceptions that need it vs. requiring a huge
number of opt-ins. I don't think the issue is difficulty in deploying
it; it's lack of interest. Android deployed it in 7.x without any major
issues. A good way to get people to use it would be adding a proc group
to major distributions and getting systemd to expose a simple toggle for
this, instead of requiring users to add /proc to fstab (not there by
default with systemd) and hard-wire the correct proc gid for that
distribution. Bugs can then be filed for packages needing the proc
group. For systemd itself, logind needs it since it drops the capability
that allows bypassing hidepid. Other than that, it's mostly just polkit.
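
For reference, a hypothetical fstab entry for the manual setup described
above (the group name and its gid are distribution-specific
assumptions):

proc  /proc  proc  nosuid,nodev,noexec,hidepid=2,gid=proc  0  0

With hidepid=2, other users' /proc/[pid] directories are hidden
entirely, and processes in the gid= group are exempt from the
restriction.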



Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-16 Thread Daniel Micay
On Fri, 2016-12-16 at 11:47 -0800, Tom Herbert wrote:
> 
> That's about 3x of jhash speed (7 nsecs). So that might be closer
> to a more palatable replacement for jhash. Do we lose any security
> advantages with halfsiphash?

Have you tested a lower-round SipHash? It's probably best to stick with
the usual construction for anything beyond DoS mitigation, but why not
try SipHash 1-3, 1-2, etc. for DoS mitigation?

Rust and Swift both went with SipHash 1-3 for hash tables.
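
To make the naming concrete, here's a minimal sketch (my illustration,
not the proposed kernel code) of how SipHash-c-d is parameterized by c
compression rounds per message block and d finalization rounds,
specialized to a single 64-bit input word:

#include <stdint.h>

#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))

static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = ROTL(v[1], 13); v[1] ^= v[0]; v[0] = ROTL(v[0], 32);
	v[2] += v[3]; v[3] = ROTL(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = ROTL(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = ROTL(v[1], 17); v[1] ^= v[2]; v[2] = ROTL(v[2], 32);
}

/* SipHash-c-d over one 64-bit word m, keyed with k0/k1. */
static uint64_t siphash_cd(unsigned c, unsigned d,
			   uint64_t k0, uint64_t k1, uint64_t m)
{
	uint64_t v[4] = {
		0x736f6d6570736575ULL ^ k0, 0x646f72616e646f6dULL ^ k1,
		0x6c7967656e657261ULL ^ k0, 0x7465646279746573ULL ^ k1,
	};
	uint64_t b = (uint64_t)8 << 56; /* final block: input length in the top byte */
	unsigned i;

	v[3] ^= m;
	for (i = 0; i < c; i++) sipround(v); /* c rounds per message block */
	v[0] ^= m;

	v[3] ^= b;
	for (i = 0; i < c; i++) sipround(v);
	v[0] ^= b;

	v[2] ^= 0xff;
	for (i = 0; i < d; i++) sipround(v); /* d finalization rounds */

	return v[0] ^ v[1] ^ v[2] ^ v[3];
}

SipHash-2-4 is siphash_cd(2, 4, ...) and SipHash-1-3 is
siphash_cd(1, 3, ...); lowering c roughly halves the per-byte cost on
long inputs at the price of a reduced security margin.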



Re: [kernel-hardening] [PATCH 3/4] Make static usermode helper binaries constant

2016-12-15 Thread Daniel Micay
> So for statics, I think `static const char *` wins due to allowing
> merging (although it doesn't matter here). For non-statics, you end up
> with extra pointer constants. Those could get removed, but Linux
> doesn't have -fvisibility=hidden and I'm not sure how clever linkers
> are. Maybe setting up -fvisibility=hidden to work with monolithic
> non-module-enabled builds could actually be realistic. I expect it'd
> remove a fair bit of bloat, but I'm not sure how much would need to be
> marked as non-hidden other than the userspace ABI.

-fvisibility=hidden + LTO would be really awesome though, since that
doesn't depend on the cleverness of linkers. There's so much that could
be ripped out of real-world monolithic builds. Kinda getting off-topic
now though. LTO is pretty scary from a security perspective, because
undefined behavior that was previously harmless can get much worse once
the optimizer sees across translation unit boundaries.
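
As an illustration of that risk (hypothetical code): once LTO inlines
across what used to be a translation unit boundary, the optimizer can
exploit UB that previously had only local effect.

/* tu1.c */
int deref(int *p)
{
	return *p; /* UB if p == NULL */
}

/* tu2.c */
int deref(int *p);

int maybe(int *p)
{
	int v = deref(p); /* after LTO inlining, *p lets the compiler assume p != NULL */

	if (p == NULL)    /* ... so this check can be deleted entirely */
		return 0;
	return v;
}

Without LTO, the NULL check survives because the compiler can't see into
deref(); with LTO, a latent bug silently becomes an unchecked
dereference plus a deleted safety check.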



Re: [kernel-hardening] [PATCH 3/4] Make static usermode helper binaries constant

2016-12-15 Thread Daniel Micay
> Thanks for the explanation.  I don't think we need to worry about
> merging these strings, but I'll keep it in mind.
> 
> However, the "folklore" of the kernel was to never do:
>   char *foo = "bar";
> but instead do:
>   char foo[] = "bar";
> to save on the extra variable that the former creates.  Is that no
> longer the case and we really should be using '*' to allow gcc to be
> smarter about optimizations?
> 
> > The 'const' qualifier for pointers doesn't really do anything, it's
> > when
> > it's used on the variable (after the pointer) that it can do more
> > than
> > acting as a programming guide.
> 
> Many thanks for the explanations,
> 
> greg k-h

You can see what the compiler has to work with pretty easily from the
LLVM IR:

char *const constant_string_constant = "string";
char *const constant_string_constant2 = "string";
char *non_constant_string_constant = "string";
char *non_constant_string_constant2 = "string";
char non_constant_string_array[] = "string";
char non_constant_string_array2[] = "string";
const char constant_string_array[] = "string";
const char constant_string_array2[] = "string";

Becomes:

@.str = private unnamed_addr constant [7 x i8] c"string\00", align 1
@constant_string_constant = constant i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0), align 8
@constant_string_constant2 = constant i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0), align 8
@non_constant_string_constant = global i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0), align 8
@non_constant_string_constant2 = global i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0), align 8
@non_constant_string_array = global [7 x i8] c"string\00", align 1
@non_constant_string_array2 = global [7 x i8] c"string\00", align 1
@constant_string_array = constant [7 x i8] c"string\00", align 1
@constant_string_array2 = constant [7 x i8] c"string\00", align 1

And with optimization:

@constant_string_constant = local_unnamed_addr constant i8* getelementptr inbounds ([7 x i8], [7 x i8]* @constant_string_array, i64 0, i64 0), align 8
@constant_string_constant2 = local_unnamed_addr constant i8* getelementptr inbounds ([7 x i8], [7 x i8]* @constant_string_array, i64 0, i64 0), align 8
@non_constant_string_constant = local_unnamed_addr global i8* getelementptr inbounds ([7 x i8], [7 x i8]* @constant_string_array, i64 0, i64 0), align 8
@non_constant_string_constant2 = local_unnamed_addr global i8* getelementptr inbounds ([7 x i8], [7 x i8]* @constant_string_array, i64 0, i64 0), align 8
@non_constant_string_array = local_unnamed_addr global [7 x i8] c"string\00", align 1
@non_constant_string_array2 = local_unnamed_addr global [7 x i8] c"string\00", align 1
@constant_string_array = local_unnamed_addr constant [7 x i8] c"string\00", align 1
@constant_string_array2 = local_unnamed_addr constant [7 x i8] c"string\00", align 1

If they're static though, the compiler can see that nothing takes the
address (local_unnamed_addr == unnamed_addr if it's internal) so it
doesn't need separate variables anyway:

static char *const constant_string_constant = "string";
static char *const constant_string_constant2 = "string";

char *foo() {
  return constant_string_constant;
}

char *bar() {
  return constant_string_constant2;
}

Becomes (with optimization):

@.str = private unnamed_addr constant [7 x i8] c"string\00", align 1

; Function Attrs: norecurse nounwind readnone uwtable
define i8* @foo() local_unnamed_addr #0 {
  ret i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0)
}

; Function Attrs: norecurse nounwind readnone uwtable
define i8* @bar() local_unnamed_addr #0 {
  ret i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0)
}

So for statics, I think `static const char *` wins due to allowing
merging (although it doesn't matter here). For non-statics, you end up
with extra pointer constants. Those could get removed, but Linux doesn't
have -fvisibility=hidden and I'm not sure how clever linkers are. Maybe
setting up -fvisibility=hidden to work with monolithic
non-module-enabled builds could actually be realistic. I expect it'd
remove a fair bit of bloat, but I'm not sure how much would need to be
marked as non-hidden other than the userspace ABI.



Re: [kernel-hardening] [PATCH 3/4] Make static usermode helper binaries constant

2016-12-15 Thread Daniel Micay
> To follow up on this, and after staring at too many outputs of the
> compiler, I think what this really should be is:
>   static char const critical_overtemp_path[] =
> "/sbin/critical_overtemp";
> right?
> 
> That way both the variable, and the data, end up in read-only memory
> from what I can tell.
> 
> But, if I do:
>   static const char const critical_overtemp_path[] =
> "/sbin/critical_overtemp";
> then sparse complains to me about:
>   warning: duplicate const
> 
> Is that just sparse being dense, or is the latter one really better
> here?  It seems that both of the above put the data and variable into
> the same segment (.rodata).
> 
> thanks,
> 
> greg k-h

Either 'char *const foo = "bar"' or 'const char *const foo = "bar"' will
also be a string constant in rodata with a pointer in rodata referring
to it. Duplicate string constants get merged without any analysis, as
there's no guarantee of a unique address for the data itself since it's
not a variable. 'const char foo[] = "bar"' goes into rodata too, but the
toolchain can't assume it's safe to merge the strings, since the array
is an object with its own address. The upside is that sizeof works, but
gcc/clang know how to optimize a constant strlen anyway.

The 'const' qualifier on the pointed-to type doesn't really do anything
for code generation; it's 'const' applied to the variable itself (after
the '*') that can do more than act as a programming guide.
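
A compact sketch of the common forms, with my annotations of where a
typical gcc/clang ELF build places things (worth verifying with objdump;
with PIC/PIE, the const pointers land in .data.rel.ro rather than
.rodata):

const char arr[] = "bar";       /* .rodata: distinct array object, unique address, sizeof(arr) == 4 */
char *const p1 = "bar";         /* const pointer in .rodata pointing at mergeable string data */
const char *const p2 = "bar";   /* same placement; the pointee is also const-qualified */
const char *p3 = "bar";         /* writable pointer in .data; the string data itself stays in .rodata */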



Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread Daniel Micay
On Thu, 2016-12-15 at 15:57 +0800, Herbert Xu wrote:
> Jason A. Donenfeld  wrote:
> > 
> > Siphash needs a random secret key, yes. The point is that the hash
> > function remains secure so long as the secret key is kept secret.
> > Other functions can't make the same guarantee, and so nervous
> > periodic
> > key rotation is necessary, but in most cases nothing is done, and so
> > things just leak over time.
> 
> Actually those users that use rhashtable now have a much more
> sophisticated defence against these attacks, dynamic rehashing
> when bucket length exceeds a preset limit.
> 
> Cheers,

Key-independent collisions won't be mitigated by picking a new secret
and rehashing.

A simple solution with clear security properties is ideal.
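
To spell out what key-independent collisions look like (a hypothetical
weak construction, not rhashtable's actual hash): if the key is only
mixed in linearly after a fixed, unkeyed mixing step, any colliding pair
collides under every key, so rekeying and rehashing can never separate
that bucket.

#include <stdint.h>

uint32_t mix(uint32_t m); /* some fixed, unkeyed mixing function */

uint32_t weak_keyed_hash(uint32_t key, uint32_t m)
{
	/* mix(m1) == mix(m2) implies weak_keyed_hash(key, m1) ==
	 * weak_keyed_hash(key, m2) for *every* key */
	return mix(m) ^ key;
}

A PRF like SipHash rules this class out: without knowing the key,
finding any pair that collides under it is computationally infeasible.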



Re: [kernel-hardening] rowhammer protection [was Re: Getting interrupt every million cache misses]

2016-11-01 Thread Daniel Micay
On Tue, 2016-11-01 at 07:33 +0100, Ingo Molnar wrote:
> * Pavel Machek  wrote:
> 
> > I'm not going to buy broken hardware just for a test.
> 
> Can you suggest a method to find heavily rowhammer affected hardware?
> Only by 
> testing it, or are there some chipset IDs ranges or dmidecode info
> that will 
> pinpoint potentially affected machines?
> 
> Thanks,
> 
>   Ingo

You can read the memory timing values, but you can't know whether
they're reasonable for that hardware. Higher-quality memory can have
tighter timings without being broken. The only relevant information
would be the exact memory module, combined with an expensive,
time-consuming effort to build a blacklist based on testing. It doesn't
seem realistic, unless it's done in a coarse way based on brand and
manufacturing date.

I don't know how to get this data on Linux. The CPU-Z tool for Windows
knows how to obtain it, but it's based on a proprietary library.

You definitely don't need to buy broken hardware to test a broken
hardware setup, though. You just need a custom build with a motherboard
that exposes the memory timing configuration. You can make it more
vulnerable by raising the refresh interval (tREF) so rows are refreshed
less often. I wanted to play around with that but haven't gotten around
to it.



Re: [kernel-hardening] Re: [PATCH] fork: make whole stack_canary random

2016-10-31 Thread Daniel Micay
On Mon, 2016-10-31 at 22:38 +0100, Florian Weimer wrote:
> * Daniel Micay:
> 
> > -fstack-stack is supposed to handle a single guard by default, and
> > that's all there is for thread stacks by default.
> 
> Okay, then I'll really have to look at the probing offsets again.
> It's been on my to-do list since about 2012, and arguably, it *is* a
> user-space thing.

This is concerning too:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479

It might be prevented for VLAs by using -fsanitize=vla-bound
-fsanitize-trap=vla-bound, but probably not for alloca (on GCC, the
older -fsanitize-undefined-trap-on-error, since for some reason it
doesn't seem to have the newer flag).
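
As a minimal sketch of the underlying problem (hypothetical code, built
without stack probing): a large, attacker-influenced VLA moves the stack
pointer past the guard page in a single adjustment, so the first access
can land in an unrelated mapping.

#include <string.h>

void use(char *p);

void vulnerable(unsigned long n)
{
	char buf[n];        /* one large SP adjustment, no intervening probes */

	memset(buf, 0, 64); /* may write far beyond the guard page */
	use(buf);
}

Probing in the -fstack-check style touches the stack at page-sized
intervals while allocating, which guarantees the guard page actually
gets hit.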

> And I just realized that we should probably fail at dlopen time if
> some tries to open a DSO which needs an executable stack, rather than
> silently changing all thread stacks to executable. *sigh*
> 
> > > > Note: talking about userspace after the entropy bit. The kernel
> > > > doesn't really -fstack-check, at least in even slightly sane
> > > > code...
> > > 
> > >  
> > > There used to be lots of discussions about kernel stack sizes ...
> > 
> > It should just be banning VLAs, alloca and large stack frames
> > though, if it's not already. There wasn't even support for guard
> > pages with kernel stacks until recently outside grsecurity,
> 
> Which is not surprising, considering that one prime motivation for
> small stacks was to conserve 32-bit address space.  But I'm glad that
> there is now a guard page.  Hopefully, it does not affect performance,
> and on 64-bit, at least there isn't the address space limit to worry
> about.

I think it might actually improve performance overall.

> > and -fstack-check relies on them so it doesn't seem like a great
> > solution for the kernel.
> 
> -fsplit-stack could enforce stack usage limits even without guard
> pages, but of course, there is some run-time overhead, and the limit
> has to come from somewhere (typically the TCB).

Yeah, that's how Rust provided stack safety on LLVM. They ended up
removing it, and they only have stack safety on Windows now, since LLVM
doesn't yet provide stack probing outside Windows. So much for being a
memory-safe language... it's ultimately LLVM's fault though. There were
actually working patches for this submitted by people working on Rust,
but *they never landed* due to endless bikeshedding + scope creep.



Re: [kernel-hardening] Re: [PATCH] fork: make whole stack_canary random

2016-10-31 Thread Daniel Micay
On Mon, 2016-10-31 at 22:22 +0100, Jann Horn wrote:
> On Mon, Oct 31, 2016 at 10:10:41PM +0100, Florian Weimer wrote:
> > * Daniel Micay:
> > 
> > > > It makes a lot of sense on x86_64 where it means the canary is
> > > > still 56 bits. Also, you want -fstack-check for protecting against
> > > > stack overflows rather than stack *buffer* overflows. SSP won't
> > > > really help you in that regard. Sadly, while -fstack-check now
> > > > works well in GCC 6 with little performance cost, it's not
> > > > really a
> > 
> > I think GCC still does not treat the return address push on
> > architectures which have such a CALL instruction as an implicit
> > stack
> > probe.
> > 
> > > > complete feature (and Clang impls it as a no-op!).
> > 
> > How many guard pages at the end of the stack does the kernel
> > guarantee?  I saw some -fstack-check-generated code which seemed to
> > jump over a single guard page.
> 
> Until recently: Zero, no guard pages below stacks, stack overflow
> goes straight into some other allocation.
> Now: One guard page, thanks to a lot of work by Andy Lutomirski.
> (I think that change is in the current 4.9-rc3 kernel, but not in
> any stable kernel yet.)

I think Florian is talking about the main thread stack in userspace.

Hence the next part about how the kernel doesn't actually *guarantee* a
guard page for it at all right now. Userspace has to work around that if
it's worried about it (it can't really happen in practice, but it's not
great that an mmap with a hint + no MAP_FIXED can break it).
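
A sketch of the breakage being referred to (hypothetical code; the hint
address is illustrative): a non-MAP_FIXED hint can still be honored
exactly when the range is free, so a mapping can end up directly below
the main thread stack, consuming the space that would otherwise act as a
gap.

#include <sys/mman.h>

/* Ask for a page at 'hint'; without MAP_FIXED the kernel may still place
 * it exactly there, right below the stack, if the range happens to be
 * free. */
void *map_at_hint(void *hint)
{
	return mmap(hint, 4096, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}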


