Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Mon, Oct 5, 2020 at 6:01 PM Alexander Potapenko wrote: > > On Tue, Sep 29, 2020 at 5:06 PM Mark Rutland wrote: > > > > On Tue, Sep 29, 2020 at 04:51:29PM +0200, Marco Elver wrote: > > > On Tue, 29 Sep 2020 at 16:24, Mark Rutland wrote: > > > [...] > > > > > > > > From other sub-threads it sounds like these addresses are not part of > > > > the linear/direct map. Having kmalloc return addresses outside of the > > > > linear map is going to break anything that relies on virt<->phys > > > > conversions, and is liable to make DMA corrupt memory. There were > > > > problems of that sort with VMAP_STACK, and this is why kvmalloc() is > > > > separate from kmalloc(). > > > > > > > > Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. > > > > > > > > I strongly suspect this isn't going to be safe unless you always use an > > > > in-place carveout from the linear map (which could be the linear alias > > > > of a static carveout). > > > > > > That's an excellent point, thank you! Indeed, on arm64, a version with > > > naive static-pool screams with CONFIG_DEBUG_VIRTUAL. > > > > > > We'll try to put together an arm64 version using a carveout as you > > > suggest. > > > > Great, thanks! > > > > Just to be clear, the concerns for DMA and virt<->phys conversions also > > apply to x86 (the x86 virt<->phys conversion behaviour is more forgiving > > in the common case, but still has cases that can go wrong). > > To clarify, shouldn't kmalloc/kmem_cache allocations used with DMA be > allocated with explicit GFP_DMA? > If so, how practical would it be to just skip such allocations in > KFENCE allocator? AFAIK GFP_DMA doesn't really mean "I will use this allocation for DMA"; it means "I will use this allocation for DMA using some ancient hardware (e.g. stuff on the ISA bus?) that only supports 24-bit physical addresses (or maybe different limits on other architectures)". There's also GFP_DMA32, which means the same thing, except with 32-bit physical addresses.
You can see in e.g. __dma_direct_alloc_pages() that the GFP_DMA32 and GFP_DMA flags are only used if the hardware can't address the full physical address space supported by the CPU.
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 5:06 PM Mark Rutland wrote: > > On Tue, Sep 29, 2020 at 04:51:29PM +0200, Marco Elver wrote: > > On Tue, 29 Sep 2020 at 16:24, Mark Rutland wrote: > > [...] > > > > > > From other sub-threads it sounds like these addresses are not part of > > > the linear/direct map. Having kmalloc return addresses outside of the > > > linear map is going to break anything that relies on virt<->phys > > > conversions, and is liable to make DMA corrupt memory. There were > > > problems of that sort with VMAP_STACK, and this is why kvmalloc() is > > > separate from kmalloc(). > > > > > > Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. > > > > > > I strongly suspect this isn't going to be safe unless you always use an > > > in-place carveout from the linear map (which could be the linear alias > > > of a static carveout). > > > > That's an excellent point, thank you! Indeed, on arm64, a version with > > naive static-pool screams with CONFIG_DEBUG_VIRTUAL. > > > > We'll try to put together an arm64 version using a carveout as you suggest. > > Great, thanks! > > Just to be clear, the concerns for DMA and virt<->phys conversions also > apply to x86 (the x86 virt<->phys conversion behaviour is more forgiving > in the common case, but still has cases that can go wrong). To clarify, shouldn't kmalloc/kmem_cache allocations used with DMA be allocated with explicit GFP_DMA? If so, how practical would it be to just skip such allocations in KFENCE allocator?
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 05:51:58PM +0200, Alexander Potapenko wrote: > On Tue, Sep 29, 2020 at 4:24 PM Mark Rutland wrote: > > > > On Mon, Sep 21, 2020 at 03:26:02PM +0200, Marco Elver wrote: > > > From: Alexander Potapenko > > > > > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > > > low-overhead sampling-based memory safety error detector of heap > > > use-after-free, invalid-free, and out-of-bounds access errors. > > > > > > KFENCE is designed to be enabled in production kernels, and has near > > > zero performance overhead. Compared to KASAN, KFENCE trades performance > > > for precision. The main motivation behind KFENCE's design, is that with > > > enough total uptime KFENCE will detect bugs in code paths not typically > > > exercised by non-production test workloads. One way to quickly achieve a > > > large enough total uptime is when the tool is deployed across a large > > > fleet of machines. > > > > > > KFENCE objects each reside on a dedicated page, at either the left or > > > right page boundaries. The pages to the left and right of the object > > > page are "guard pages", whose attributes are changed to a protected > > > state, and cause page faults on any attempted access to them. Such page > > > faults are then intercepted by KFENCE, which handles the fault > > > gracefully by reporting a memory access error. To detect out-of-bounds > > > writes to memory within the object's page itself, KFENCE also uses > > > pattern-based redzones. The following figure illustrates the page > > > layout: > > > > > > ---+---+---+---+---+---+--- > > > | x | O : | x | : O | x | > > > | x | B : | x | : B | x | > > > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > > > | x | E : ZONE | x | ZONE : E | x | > > > | x | C : | x | : C | x | > > > | x | T : | x | : T | x | > > > ---+---+---+---+---+---+--- > > > > > > Guarded allocations are set up based on a sample interval (can be set > > > via kfence.sample_interval). 
After expiration of the sample interval, a > > > guarded allocation from the KFENCE object pool is returned to the main > > > allocator (SLAB or SLUB). At this point, the timer is reset, and the > > > next allocation is set up after the expiration of the interval. > > > > From other sub-threads it sounds like these addresses are not part of > > the linear/direct map. > For x86 these addresses belong to .bss, i.e. "kernel text mapping" > section, isn't that the linear map? No; the "linear map" is the "direct mapping" on x86, and the "image" or "kernel text mapping" is a distinct VA region. The image mapping aliases (i.e. uses the same physical pages as) a portion of the linear map, and every page in the linear map has a struct page. For the x86_64 virtual memory layout, see: https://www.kernel.org/doc/html/latest/x86/x86_64/mm.html Originally, the kernel image lived in the linear map, but it was split out into a distinct VA range (among other things) to permit KASLR. When that split was made, the x86 virt_to_*() helpers were updated to detect when they were passed a kernel image address, and automatically fix that up as-if they'd been handed the linear map alias of that address. For going one-way from virt->{phys,page} that works ok, but it doesn't survive the round-trip, and introduces redundant work into each virt_to_*() call. As it was largely arch code that was using image addresses, we didn't bother with the fixup on arm64, as we preferred the stronger warning. At the time I was also under the impression that on x86 they wanted to get rid of the automatic fixup, but that doesn't seem to have happened. Thanks, Mark.
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 4:24 PM Mark Rutland wrote: > > On Mon, Sep 21, 2020 at 03:26:02PM +0200, Marco Elver wrote: > > From: Alexander Potapenko > > > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > > low-overhead sampling-based memory safety error detector of heap > > use-after-free, invalid-free, and out-of-bounds access errors. > > > > KFENCE is designed to be enabled in production kernels, and has near > > zero performance overhead. Compared to KASAN, KFENCE trades performance > > for precision. The main motivation behind KFENCE's design, is that with > > enough total uptime KFENCE will detect bugs in code paths not typically > > exercised by non-production test workloads. One way to quickly achieve a > > large enough total uptime is when the tool is deployed across a large > > fleet of machines. > > > > KFENCE objects each reside on a dedicated page, at either the left or > > right page boundaries. The pages to the left and right of the object > > page are "guard pages", whose attributes are changed to a protected > > state, and cause page faults on any attempted access to them. Such page > > faults are then intercepted by KFENCE, which handles the fault > > gracefully by reporting a memory access error. To detect out-of-bounds > > writes to memory within the object's page itself, KFENCE also uses > > pattern-based redzones. The following figure illustrates the page > > layout: > > > > ---+---+---+---+---+---+--- > > | x | O : | x | : O | x | > > | x | B : | x | : B | x | > > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > > | x | E : ZONE | x | ZONE : E | x | > > | x | C : | x | : C | x | > > | x | T : | x | : T | x | > > ---+---+---+---+---+---+--- > > > > Guarded allocations are set up based on a sample interval (can be set > > via kfence.sample_interval). After expiration of the sample interval, a > > guarded allocation from the KFENCE object pool is returned to the main > > allocator (SLAB or SLUB). 
At this point, the timer is reset, and the > > next allocation is set up after the expiration of the interval. > > From other sub-threads it sounds like these addresses are not part of > the linear/direct map. For x86 these addresses belong to .bss, i.e. "kernel text mapping" section, isn't that the linear map? I also don't see lm_alias being used much outside arm64 code. > Having kmalloc return addresses outside of the > linear map is going to break anything that relies on virt<->phys > conversions, and is liable to make DMA corrupt memory. There were > problems of that sort with VMAP_STACK, and this is why kvmalloc() is > separate from kmalloc(). > > Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. Just checked - it doesn't scream on x86. > I strongly suspect this isn't going to be safe unless you always use an > in-place carveout from the linear map (which could be the linear alias > of a static carveout). > > [...] > > > +static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t > > size, gfp_t flags) > > +{ > > + return static_branch_unlikely(&kfence_allocation_key) ? > > __kfence_alloc(s, size, flags) : > > + NULL; > > +} > > Minor (unrelated) nit, but this would be easier to read as: > > static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, > gfp_t flags) > { > if (static_branch_unlikely(&kfence_allocation_key)) > return __kfence_alloc(s, size, flags); > return NULL; > } > > Thanks, > Mark. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 04:51:29PM +0200, Marco Elver wrote: > On Tue, 29 Sep 2020 at 16:24, Mark Rutland wrote: > [...] > > > > From other sub-threads it sounds like these addresses are not part of > > the linear/direct map. Having kmalloc return addresses outside of the > > linear map is going to break anything that relies on virt<->phys > > conversions, and is liable to make DMA corrupt memory. There were > > problems of that sort with VMAP_STACK, and this is why kvmalloc() is > > separate from kmalloc(). > > > > Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. > > > > I strongly suspect this isn't going to be safe unless you always use an > > in-place carveout from the linear map (which could be the linear alias > > of a static carveout). > > That's an excellent point, thank you! Indeed, on arm64, a version with > naive static-pool screams with CONFIG_DEBUG_VIRTUAL. > > We'll try to put together an arm64 version using a carveout as you suggest. Great, thanks! Just to be clear, the concerns for DMA and virt<->phys conversions also apply to x86 (the x86 virt<->phys conversion behaviour is more forgiving in the common case, but still has cases that can go wrong). Other than the code to initialize the page tables for the carveout, I think the carveout code can be generic. Thanks, Mark.
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, 29 Sep 2020 at 16:24, Mark Rutland wrote: [...] > > From other sub-threads it sounds like these addresses are not part of > the linear/direct map. Having kmalloc return addresses outside of the > linear map is going to break anything that relies on virt<->phys > conversions, and is liable to make DMA corrupt memory. There were > problems of that sort with VMAP_STACK, and this is why kvmalloc() is > separate from kmalloc(). > > Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. > > I strongly suspect this isn't going to be safe unless you always use an > in-place carveout from the linear map (which could be the linear alias > of a static carveout). That's an excellent point, thank you! Indeed, on arm64, a version with naive static-pool screams with CONFIG_DEBUG_VIRTUAL. We'll try to put together an arm64 version using a carveout as you suggest. > [...] > > > +static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t > > size, gfp_t flags) > > +{ > > + return static_branch_unlikely(&kfence_allocation_key) ? > > __kfence_alloc(s, size, flags) : > > + NULL; > > +} > > Minor (unrelated) nit, but this would be easier to read as: > > static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, > gfp_t flags) > { > if (static_branch_unlikely(&kfence_allocation_key)) > return __kfence_alloc(s, size, flags); > return NULL; > } Will fix for v5. Thanks, -- Marco
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Mon, Sep 21, 2020 at 03:26:02PM +0200, Marco Elver wrote: > From: Alexander Potapenko > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > low-overhead sampling-based memory safety error detector of heap > use-after-free, invalid-free, and out-of-bounds access errors. > > KFENCE is designed to be enabled in production kernels, and has near > zero performance overhead. Compared to KASAN, KFENCE trades performance > for precision. The main motivation behind KFENCE's design, is that with > enough total uptime KFENCE will detect bugs in code paths not typically > exercised by non-production test workloads. One way to quickly achieve a > large enough total uptime is when the tool is deployed across a large > fleet of machines. > > KFENCE objects each reside on a dedicated page, at either the left or > right page boundaries. The pages to the left and right of the object > page are "guard pages", whose attributes are changed to a protected > state, and cause page faults on any attempted access to them. Such page > faults are then intercepted by KFENCE, which handles the fault > gracefully by reporting a memory access error. To detect out-of-bounds > writes to memory within the object's page itself, KFENCE also uses > pattern-based redzones. The following figure illustrates the page > layout: > > ---+---+---+---+---+---+--- > | x | O : | x | : O | x | > | x | B : | x | : B | x | > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > | x | E : ZONE | x | ZONE : E | x | > | x | C : | x | : C | x | > | x | T : | x | : T | x | > ---+---+---+---+---+---+--- > > Guarded allocations are set up based on a sample interval (can be set > via kfence.sample_interval). After expiration of the sample interval, a > guarded allocation from the KFENCE object pool is returned to the main > allocator (SLAB or SLUB). At this point, the timer is reset, and the > next allocation is set up after the expiration of the interval. 
From other sub-threads it sounds like these addresses are not part of the linear/direct map. Having kmalloc return addresses outside of the linear map is going to break anything that relies on virt<->phys conversions, and is liable to make DMA corrupt memory. There were problems of that sort with VMAP_STACK, and this is why kvmalloc() is separate from kmalloc(). Have you tested with CONFIG_DEBUG_VIRTUAL? I'd expect that to scream. I strongly suspect this isn't going to be safe unless you always use an in-place carveout from the linear map (which could be the linear alias of a static carveout). [...] > +static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, > gfp_t flags) > +{ > + return static_branch_unlikely(&kfence_allocation_key) ? > __kfence_alloc(s, size, flags) : > + NULL; > +} Minor (unrelated) nit, but this would be easier to read as: static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags) { if (static_branch_unlikely(&kfence_allocation_key)) return __kfence_alloc(s, size, flags); return NULL; } Thanks, Mark.
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 3:49 PM Marco Elver wrote: > > On Tue, 29 Sep 2020 at 15:48, Andrey Konovalov wrote: > > On Tue, Sep 29, 2020 at 3:11 PM Marco Elver wrote: > > > > > > On Tue, Sep 29, 2020 at 02:42PM +0200, Andrey Konovalov wrote: > > > [...] > > > > > +*/ > > > > > + index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * > > > > > 2) - 1; > > > > > > > > Why do we subtract 1 here? We do have the metadata entry reserved for > > > > something? > > > > > > Above the declaration of __kfence_pool it says: > > > > > > * We allocate an even number of pages, as it simplifies > > > calculations to map > > > * address to metadata indices; effectively, the very first page > > > serves as an > > > * extended guard page, but otherwise has no special purpose. > > > > > > Hopefully that clarifies the `- 1` here. > > > > So there are two guard pages at the beginning and only then a page > > that holds an object? > > Yes, correct. OK, I see. This isn't directly clear from the comment though, at least for me :)
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, 29 Sep 2020 at 15:48, Andrey Konovalov wrote: > On Tue, Sep 29, 2020 at 3:11 PM Marco Elver wrote: > > > > On Tue, Sep 29, 2020 at 02:42PM +0200, Andrey Konovalov wrote: > > [...] > > > > +*/ > > > > + index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2) > > > > - 1; > > > > > > Why do we subtract 1 here? We do have the metadata entry reserved for > > > something? > > > > Above the declaration of __kfence_pool it says: > > > > * We allocate an even number of pages, as it simplifies > > calculations to map > > * address to metadata indices; effectively, the very first page > > serves as an > > * extended guard page, but otherwise has no special purpose. > > > > Hopefully that clarifies the `- 1` here. > > So there are two guard pages at the beginning and only then a page > that holds an object? Yes, correct.
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 3:11 PM Marco Elver wrote: > > On Tue, Sep 29, 2020 at 02:42PM +0200, Andrey Konovalov wrote: > [...] > > > +*/ > > > + index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2) - > > > 1; > > > > Why do we subtract 1 here? We do have the metadata entry reserved for > > something? > > Above the declaration of __kfence_pool it says: > > * We allocate an even number of pages, as it simplifies calculations > to map > * address to metadata indices; effectively, the very first page > serves as an > * extended guard page, but otherwise has no special purpose. > > Hopefully that clarifies the `- 1` here. So there are two guard pages at the beginning and only then a page that holds an object?
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Tue, Sep 29, 2020 at 02:42PM +0200, Andrey Konovalov wrote: [...] > > +*/ > > + index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2) - 1; > > Why do we subtract 1 here? We do have the metadata entry reserved for > something? Above the declaration of __kfence_pool it says: * We allocate an even number of pages, as it simplifies calculations to map * address to metadata indices; effectively, the very first page serves as an * extended guard page, but otherwise has no special purpose. Hopefully that clarifies the `- 1` here. [...] > > + /* Allocation and free stack information. */ > > + int num_alloc_stack; > > + int num_free_stack; > > + unsigned long alloc_stack[KFENCE_STACK_DEPTH]; > > + unsigned long free_stack[KFENCE_STACK_DEPTH]; > > It was a conscious decision to not use stackdepot, right? Perhaps it makes > sense to document the reason somewhere. Yes; we want to avoid the dynamic allocations that stackdepot does. [...] Thanks, -- Marco
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Mon, Sep 21, 2020 at 3:26 PM Marco Elver wrote: > > From: Alexander Potapenko > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > low-overhead sampling-based memory safety error detector of heap > use-after-free, invalid-free, and out-of-bounds access errors. > > KFENCE is designed to be enabled in production kernels, and has near > zero performance overhead. Compared to KASAN, KFENCE trades performance > for precision. The main motivation behind KFENCE's design, is that with > enough total uptime KFENCE will detect bugs in code paths not typically > exercised by non-production test workloads. One way to quickly achieve a > large enough total uptime is when the tool is deployed across a large > fleet of machines. > > KFENCE objects each reside on a dedicated page, at either the left or > right page boundaries. The pages to the left and right of the object > page are "guard pages", whose attributes are changed to a protected > state, and cause page faults on any attempted access to them. Such page > faults are then intercepted by KFENCE, which handles the fault > gracefully by reporting a memory access error. To detect out-of-bounds > writes to memory within the object's page itself, KFENCE also uses > pattern-based redzones. The following figure illustrates the page > layout: > > ---+---+---+---+---+---+--- > | x | O : | x | : O | x | > | x | B : | x | : B | x | > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > | x | E : ZONE | x | ZONE : E | x | > | x | C : | x | : C | x | > | x | T : | x | : T | x | > ---+---+---+---+---+---+--- > > Guarded allocations are set up based on a sample interval (can be set > via kfence.sample_interval). After expiration of the sample interval, a > guarded allocation from the KFENCE object pool is returned to the main > allocator (SLAB or SLUB). At this point, the timer is reset, and the > next allocation is set up after the expiration of the interval. 
> > To enable/disable a KFENCE allocation through the main allocator's > fast-path without overhead, KFENCE relies on static branches via the > static keys infrastructure. The static branch is toggled to redirect the > allocation to KFENCE. To date, we have verified by running synthetic > benchmarks (sysbench I/O workloads) that a kernel compiled with KFENCE > is performance-neutral compared to the non-KFENCE baseline. > > For more details, see Documentation/dev-tools/kfence.rst (added later in > the series). > > Reviewed-by: Dmitry Vyukov > Co-developed-by: Marco Elver > Signed-off-by: Marco Elver > Signed-off-by: Alexander Potapenko > --- > v3: > * Reports by SeongJae Park: > * Remove reference to Documentation/dev-tools/kfence.rst. > * Remove redundant braces. > * Use CONFIG_KFENCE_NUM_OBJECTS instead of ARRAY_SIZE(...). > * Align some comments. > * Add figure from Documentation/dev-tools/kfence.rst added later in > series to patch description. > > v2: > * Add missing __printf attribute to seq_con_printf, and fix new warning. > [reported by kernel test robot ] > * Fix up some comments [reported by Jonathan Cameron]. > * Remove 2 cases of redundant stack variable initialization > [reported by Jonathan Cameron]. > * Fix printf format [reported by kernel test robot ]. > * Print (in kfence-#nn) after address, to more clearly establish link > between first and second stacktrace [reported by Andrey Konovalov]. > * Make choice between KASAN and KFENCE clearer in Kconfig help text > [suggested by Dave Hansen]. > * Document CONFIG_KFENCE_SAMPLE_INTERVAL=0. > * Shorten memory corruption report line length. > * Make /sys/module/kfence/parameters/sample_interval root-writable for > all builds (to enable debugging, automatic dynamic tweaking). > * Reports by Dmitry Vyukov: > * Do not store negative size for right-located objects > * Only cache-align addresses of right-located objects. > * Run toggle_allocation_gate() after KFENCE is enabled. 
> * Add empty line between allocation and free stacks. > * Add comment about SLAB_TYPESAFE_BY_RCU. > * Also skip internals for allocation/free stacks. > * s/KFENCE_FAULT_INJECTION/KFENCE_STRESS_TEST_FAULTS/ as FAULT_INJECTION > is already overloaded in different contexts. > * Parenthesis for macro variable. > * Lower max of KFENCE_NUM_OBJECTS config variable. > --- > MAINTAINERS| 11 + > include/linux/kfence.h | 174 ++ > init/main.c| 2 + > lib/Kconfig.debug | 1 + > lib/Kconfig.kfence | 63 > mm/Makefile| 1 + > mm/kfence/Makefile | 3 + > mm/kfence/core.c | 733 + > mm/kfence/kfence.h | 102 ++ >
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Fri, 25 Sep 2020 at 13:24, 'SeongJae Park' via kasan-dev wrote: > > On Mon, 21 Sep 2020 15:26:02 +0200 Marco Elver wrote: > > > From: Alexander Potapenko > > > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > > low-overhead sampling-based memory safety error detector of heap > > use-after-free, invalid-free, and out-of-bounds access errors. > > > > KFENCE is designed to be enabled in production kernels, and has near > > zero performance overhead. Compared to KASAN, KFENCE trades performance > > for precision. The main motivation behind KFENCE's design, is that with > > enough total uptime KFENCE will detect bugs in code paths not typically > > exercised by non-production test workloads. One way to quickly achieve a > > large enough total uptime is when the tool is deployed across a large > > fleet of machines. > > > > KFENCE objects each reside on a dedicated page, at either the left or > > right page boundaries. The pages to the left and right of the object > > page are "guard pages", whose attributes are changed to a protected > > state, and cause page faults on any attempted access to them. Such page > > faults are then intercepted by KFENCE, which handles the fault > > gracefully by reporting a memory access error. To detect out-of-bounds > > writes to memory within the object's page itself, KFENCE also uses > > pattern-based redzones. The following figure illustrates the page > > layout: > > > > ---+---+---+---+---+---+--- > > | x | O : | x | : O | x | > > | x | B : | x | : B | x | > > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > > | x | E : ZONE | x | ZONE : E | x | > > | x | C : | x | : C | x | > > | x | T : | x | : T | x | > > ---+---+---+---+---+---+--- > > > > Guarded allocations are set up based on a sample interval (can be set > > via kfence.sample_interval). After expiration of the sample interval, a > > guarded allocation from the KFENCE object pool is returned to the main > > allocator (SLAB or SLUB). 
At this point, the timer is reset, and the > > next allocation is set up after the expiration of the interval. > > > > To enable/disable a KFENCE allocation through the main allocator's > > fast-path without overhead, KFENCE relies on static branches via the > > static keys infrastructure. The static branch is toggled to redirect the > > allocation to KFENCE. To date, we have verified by running synthetic > > benchmarks (sysbench I/O workloads) that a kernel compiled with KFENCE > > is performance-neutral compared to the non-KFENCE baseline. > > > > For more details, see Documentation/dev-tools/kfence.rst (added later in > > the series). > > > > Reviewed-by: Dmitry Vyukov > > Co-developed-by: Marco Elver > > Signed-off-by: Marco Elver > > Signed-off-by: Alexander Potapenko > > --- > > v3: > > * Reports by SeongJae Park: > > * Remove reference to Documentation/dev-tools/kfence.rst. > > * Remove redundant braces. > > * Use CONFIG_KFENCE_NUM_OBJECTS instead of ARRAY_SIZE(...). > > * Align some comments. > > * Add figure from Documentation/dev-tools/kfence.rst added later in > > series to patch description. > > > > v2: > > * Add missing __printf attribute to seq_con_printf, and fix new warning. > > [reported by kernel test robot ] > > * Fix up some comments [reported by Jonathan Cameron]. > > * Remove 2 cases of redundant stack variable initialization > > [reported by Jonathan Cameron]. > > * Fix printf format [reported by kernel test robot ]. > > * Print (in kfence-#nn) after address, to more clearly establish link > > between first and second stacktrace [reported by Andrey Konovalov]. > > * Make choice between KASAN and KFENCE clearer in Kconfig help text > > [suggested by Dave Hansen]. > > * Document CONFIG_KFENCE_SAMPLE_INTERVAL=0. > > * Shorten memory corruption report line length. > > * Make /sys/module/kfence/parameters/sample_interval root-writable for > > all builds (to enable debugging, automatic dynamic tweaking). 
> > * Reports by Dmitry Vyukov: > > * Do not store negative size for right-located objects > > * Only cache-align addresses of right-located objects. > > * Run toggle_allocation_gate() after KFENCE is enabled. > > * Add empty line between allocation and free stacks. > > * Add comment about SLAB_TYPESAFE_BY_RCU. > > * Also skip internals for allocation/free stacks. > > * s/KFENCE_FAULT_INJECTION/KFENCE_STRESS_TEST_FAULTS/ as FAULT_INJECTION > > is already overloaded in different contexts. > > * Parenthesis for macro variable. > > * Lower max of KFENCE_NUM_OBJECTS config variable. > > --- > > MAINTAINERS| 11 + > > include/linux/kfence.h | 174 ++ > > init/main.c| 2 +
Re: [PATCH v3 01/10] mm: add Kernel Electric-Fence infrastructure
On Mon, 21 Sep 2020 15:26:02 +0200 Marco Elver wrote: > From: Alexander Potapenko > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a > low-overhead sampling-based memory safety error detector of heap > use-after-free, invalid-free, and out-of-bounds access errors. > > KFENCE is designed to be enabled in production kernels, and has near > zero performance overhead. Compared to KASAN, KFENCE trades performance > for precision. The main motivation behind KFENCE's design, is that with > enough total uptime KFENCE will detect bugs in code paths not typically > exercised by non-production test workloads. One way to quickly achieve a > large enough total uptime is when the tool is deployed across a large > fleet of machines. > > KFENCE objects each reside on a dedicated page, at either the left or > right page boundaries. The pages to the left and right of the object > page are "guard pages", whose attributes are changed to a protected > state, and cause page faults on any attempted access to them. Such page > faults are then intercepted by KFENCE, which handles the fault > gracefully by reporting a memory access error. To detect out-of-bounds > writes to memory within the object's page itself, KFENCE also uses > pattern-based redzones. The following figure illustrates the page > layout: > > ---+---+---+---+---+---+--- > | x | O : | x | : O | x | > | x | B : | x | : B | x | > | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | > | x | E : ZONE | x | ZONE : E | x | > | x | C : | x | : C | x | > | x | T : | x | : T | x | > ---+---+---+---+---+---+--- > > Guarded allocations are set up based on a sample interval (can be set > via kfence.sample_interval). After expiration of the sample interval, a > guarded allocation from the KFENCE object pool is returned to the main > allocator (SLAB or SLUB). At this point, the timer is reset, and the > next allocation is set up after the expiration of the interval. 
> > To enable/disable a KFENCE allocation through the main allocator's > fast-path without overhead, KFENCE relies on static branches via the > static keys infrastructure. The static branch is toggled to redirect the > allocation to KFENCE. To date, we have verified by running synthetic > benchmarks (sysbench I/O workloads) that a kernel compiled with KFENCE > is performance-neutral compared to the non-KFENCE baseline. > > For more details, see Documentation/dev-tools/kfence.rst (added later in > the series). > > Reviewed-by: Dmitry Vyukov > Co-developed-by: Marco Elver > Signed-off-by: Marco Elver > Signed-off-by: Alexander Potapenko > --- > v3: > * Reports by SeongJae Park: > * Remove reference to Documentation/dev-tools/kfence.rst. > * Remove redundant braces. > * Use CONFIG_KFENCE_NUM_OBJECTS instead of ARRAY_SIZE(...). > * Align some comments. > * Add figure from Documentation/dev-tools/kfence.rst added later in > series to patch description. > > v2: > * Add missing __printf attribute to seq_con_printf, and fix new warning. > [reported by kernel test robot ] > * Fix up some comments [reported by Jonathan Cameron]. > * Remove 2 cases of redundant stack variable initialization > [reported by Jonathan Cameron]. > * Fix printf format [reported by kernel test robot ]. > * Print (in kfence-#nn) after address, to more clearly establish link > between first and second stacktrace [reported by Andrey Konovalov]. > * Make choice between KASAN and KFENCE clearer in Kconfig help text > [suggested by Dave Hansen]. > * Document CONFIG_KFENCE_SAMPLE_INTERVAL=0. > * Shorten memory corruption report line length. > * Make /sys/module/kfence/parameters/sample_interval root-writable for > all builds (to enable debugging, automatic dynamic tweaking). > * Reports by Dmitry Vyukov: > * Do not store negative size for right-located objects > * Only cache-align addresses of right-located objects. > * Run toggle_allocation_gate() after KFENCE is enabled. 
> * Add empty line between allocation and free stacks. > * Add comment about SLAB_TYPESAFE_BY_RCU. > * Also skip internals for allocation/free stacks. > * s/KFENCE_FAULT_INJECTION/KFENCE_STRESS_TEST_FAULTS/ as FAULT_INJECTION > is already overloaded in different contexts. > * Parenthesis for macro variable. > * Lower max of KFENCE_NUM_OBJECTS config variable. > --- > MAINTAINERS| 11 + > include/linux/kfence.h | 174 ++ > init/main.c| 2 + > lib/Kconfig.debug | 1 + > lib/Kconfig.kfence | 63 > mm/Makefile| 1 + > mm/kfence/Makefile | 3 + > mm/kfence/core.c | 733 + > mm/kfence/kfence.h | 102