> On Feb 16, 2019, at 2:50 PM, Andy Lutomirski <[email protected]> wrote: > > > >>> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner <[email protected]> wrote: >>> >>>> On Sat, 16 Feb 2019, Jann Horn wrote: >>>> On Sat, Feb 16, 2019 at 12:59 AM <[email protected]> wrote: >>>> When extracting an initramfs, a filename may be near an allocation >>>> boundary. >>>> Should that happen, strncopy_from_user will invoke unsafe_get_user which >>>> may cross the allocation boundary. Should that happen, unsafe_get_user will >>>> trigger a page fault, and strncopy_from_user would then bailout to >>>> byte_at_a_time behavior. >>>> >>>> unsafe_get_user is unsafe by nature, and rely on pagefault to detect >>>> boundaries. >>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel >>>> addresses") >>>> it may no longer rely on pagefault as the new page fault handler would >>>> trigger a BUG(). >>>> >>>> This commit allows unsafe_get_user to explicitly trigger pagefaults and >>>> handle them directly with the error target label. >>> >>> Oof. So basically the init code is full of things that just call >>> syscalls instead of using VFS functions (which don't actually exist >>> for everything), and the VFS syscalls use getname_flags(), which uses >>> strncpy_from_user(), which can access out-of-bounds pages on >>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and >>> that in summary means that all the init code is potentially prone to >>> tripping over this? >> >> Not all init code. It should be only the initramfs decompression. >> >>> I don't particularly like this approach to fixing it, but I also don't >>> have any better ideas, so I guess unless someone else has a bright >>> idea, this patch might have to go in. >> >> So we know that this happens in the context of decompress() which calls >> flush_buffer() for every chunk. flush_buffer() gets the start_address and >> the length. We also know that the fault can only happen within: >> >> start_address <= fault_address < start_address + length + 8; >> >> So something like the untested workaround below should cover the initramfs >> oddity and avoid to weaken the protection for all other cases. > > What is the actual problem? We’re not actually demand-faulting this data, > are we? Are we just overrunning the buffer because the from_user helpers are > too clever? Can we fix it for real by having the fancy helpers do *aligned* > loads so that they don’t overrun the buffer? Heck, this might be faster, too.
Indeed. I would argue that the current code is a bug even in the normal case. If I lay out my user address space so that I have f,o,o,o,\0 at the end of a page and I have non-side-effect-free memory after it (MMIO, userfaultfd, etc), then passing a pointer to that “fooo” string to a syscall should *not* overrun the buffer. If I have some time this evening, I’ll see if I can whip up a credible fix.

