Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-26 Thread Christoph Lameter
On Mon, 25 Apr 2016, Thomas Garnier wrote:

> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
>
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
>
> Performance results highlighted no major changes:

Ok. alloc/free tests are not affected since this exercises the per cpu
objects. And the other ones as well since most of the overhead occurs on
slab page initialization.

> Before:
> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
>
> After:
> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles

And there is some slight regression with the larger objects. Not sure if
we are really hitting the slab page initialization too much there either.
Pretty minimal in synthetic tests. Can you run something like hackbench
too?

Otherwise this looks ok.

Acked-by: Christoph Lameter 




Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-26 Thread Christoph Lameter
On Mon, 25 Apr 2016, Thomas Garnier wrote:

> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
>
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
>
> Performance results highlighted no major changes:

Ok. alloc/free tests are not affected since this exercises the per cpu
objects. And the other ones as well since most of the overhead occurs on
slab page initialization.

> Before:
> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
>
> After:
> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles

And there is some slight regression with the larger objects. Not sure if
we are really hitting the slab page initialization too much there either.
Pretty minimal in synthetic tests. Can you run something like hackbench
too?

Otherwise this looks ok.

Acked-by: Christoph Lameter 




Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
Make sense. I think it is still valuable to randomize earlier pages. I
will adapt the code, test and send patch v4.

Thanks for the quick feedback,
Thomas

On Mon, Apr 25, 2016 at 5:40 PM, Joonsoo Kim  wrote:
> On Mon, Apr 25, 2016 at 01:39:23PM -0700, Thomas Garnier wrote:
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. Each kmem_cache has its own randomized freelist except
>> early on boot where global lists are used. This security feature reduces
>> the predictability of the kernel SLAB allocator against heap overflows
>> rendering attacks much less stable.
>>
>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available in the boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API. We also generate a shift
>> random number to shift pre-computed freelist for each new set of pages.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>>
>> Performance results highlighted no major changes:
>>
>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>> being the worse case scenario covered by freelist randomization. New
>> slab pages are constantly being created on the 1 allocations.
>> Variance should be mainly due to getting new pages every few
>> allocations.
>>
>> Before:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 99 cycles kfree -> 112 cycles
>> 1 times kmalloc(16) -> 109 cycles kfree -> 140 cycles
>> 1 times kmalloc(32) -> 129 cycles kfree -> 137 cycles
>> 1 times kmalloc(64) -> 141 cycles kfree -> 141 cycles
>> 1 times kmalloc(128) -> 152 cycles kfree -> 148 cycles
>> 1 times kmalloc(256) -> 195 cycles kfree -> 167 cycles
>> 1 times kmalloc(512) -> 257 cycles kfree -> 199 cycles
>> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
>> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
>> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
>> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
>> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 121 cycles
>> 1 times kmalloc(16)/kfree -> 121 cycles
>> 1 times kmalloc(32)/kfree -> 121 cycles
>> 1 times kmalloc(64)/kfree -> 121 cycles
>> 1 times kmalloc(128)/kfree -> 121 cycles
>> 1 times kmalloc(256)/kfree -> 119 cycles
>> 1 times kmalloc(512)/kfree -> 119 cycles
>> 1 times kmalloc(1024)/kfree -> 119 cycles
>> 1 times kmalloc(2048)/kfree -> 119 cycles
>> 1 times kmalloc(4096)/kfree -> 121 cycles
>> 1 times kmalloc(8192)/kfree -> 119 cycles
>> 1 times kmalloc(16384)/kfree -> 119 cycles
>>
>> After:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 130 cycles kfree -> 86 cycles
>> 1 times kmalloc(16) -> 118 cycles kfree -> 86 cycles
>> 1 times kmalloc(32) -> 121 cycles kfree -> 85 cycles
>> 1 times kmalloc(64) -> 176 cycles kfree -> 102 cycles
>> 1 times kmalloc(128) -> 178 cycles kfree -> 100 cycles
>> 1 times kmalloc(256) -> 205 cycles kfree -> 109 cycles
>> 1 times kmalloc(512) -> 262 cycles kfree -> 136 cycles
>> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
>> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
>> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
>> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
>> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 121 cycles
>> 1 times kmalloc(16)/kfree -> 121 cycles
>> 1 times kmalloc(32)/kfree -> 123 cycles
>> 1 times kmalloc(64)/kfree -> 142 cycles
>> 1 times kmalloc(128)/kfree -> 121 cycles
>> 1 times kmalloc(256)/kfree -> 119 cycles
>> 1 times kmalloc(512)/kfree -> 119 cycles
>> 1 times kmalloc(1024)/kfree -> 119 cycles
>> 1 times kmalloc(2048)/kfree -> 119 cycles
>> 1 times kmalloc(4096)/kfree -> 119 cycles
>> 1 times kmalloc(8192)/kfree -> 119 cycles
>> 1 times kmalloc(16384)/kfree -> 119 cycles
>>
>> Signed-off-by: Thomas Garnier 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
Make sense. I think it is still valuable to randomize earlier pages. I
will adapt the code, test and send patch v4.

Thanks for the quick feedback,
Thomas

On Mon, Apr 25, 2016 at 5:40 PM, Joonsoo Kim  wrote:
> On Mon, Apr 25, 2016 at 01:39:23PM -0700, Thomas Garnier wrote:
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. Each kmem_cache has its own randomized freelist except
>> early on boot where global lists are used. This security feature reduces
>> the predictability of the kernel SLAB allocator against heap overflows
>> rendering attacks much less stable.
>>
>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available in the boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API. We also generate a shift
>> random number to shift pre-computed freelist for each new set of pages.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>>
>> Performance results highlighted no major changes:
>>
>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>> being the worse case scenario covered by freelist randomization. New
>> slab pages are constantly being created on the 1 allocations.
>> Variance should be mainly due to getting new pages every few
>> allocations.
>>
>> Before:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 99 cycles kfree -> 112 cycles
>> 1 times kmalloc(16) -> 109 cycles kfree -> 140 cycles
>> 1 times kmalloc(32) -> 129 cycles kfree -> 137 cycles
>> 1 times kmalloc(64) -> 141 cycles kfree -> 141 cycles
>> 1 times kmalloc(128) -> 152 cycles kfree -> 148 cycles
>> 1 times kmalloc(256) -> 195 cycles kfree -> 167 cycles
>> 1 times kmalloc(512) -> 257 cycles kfree -> 199 cycles
>> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
>> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
>> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
>> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
>> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 121 cycles
>> 1 times kmalloc(16)/kfree -> 121 cycles
>> 1 times kmalloc(32)/kfree -> 121 cycles
>> 1 times kmalloc(64)/kfree -> 121 cycles
>> 1 times kmalloc(128)/kfree -> 121 cycles
>> 1 times kmalloc(256)/kfree -> 119 cycles
>> 1 times kmalloc(512)/kfree -> 119 cycles
>> 1 times kmalloc(1024)/kfree -> 119 cycles
>> 1 times kmalloc(2048)/kfree -> 119 cycles
>> 1 times kmalloc(4096)/kfree -> 121 cycles
>> 1 times kmalloc(8192)/kfree -> 119 cycles
>> 1 times kmalloc(16384)/kfree -> 119 cycles
>>
>> After:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 130 cycles kfree -> 86 cycles
>> 1 times kmalloc(16) -> 118 cycles kfree -> 86 cycles
>> 1 times kmalloc(32) -> 121 cycles kfree -> 85 cycles
>> 1 times kmalloc(64) -> 176 cycles kfree -> 102 cycles
>> 1 times kmalloc(128) -> 178 cycles kfree -> 100 cycles
>> 1 times kmalloc(256) -> 205 cycles kfree -> 109 cycles
>> 1 times kmalloc(512) -> 262 cycles kfree -> 136 cycles
>> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
>> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
>> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
>> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
>> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 121 cycles
>> 1 times kmalloc(16)/kfree -> 121 cycles
>> 1 times kmalloc(32)/kfree -> 123 cycles
>> 1 times kmalloc(64)/kfree -> 142 cycles
>> 1 times kmalloc(128)/kfree -> 121 cycles
>> 1 times kmalloc(256)/kfree -> 119 cycles
>> 1 times kmalloc(512)/kfree -> 119 cycles
>> 1 times kmalloc(1024)/kfree -> 119 cycles
>> 1 times kmalloc(2048)/kfree -> 119 cycles
>> 1 times kmalloc(4096)/kfree -> 119 cycles
>> 1 times kmalloc(8192)/kfree -> 119 cycles
>> 1 times kmalloc(16384)/kfree -> 119 cycles
>>
>> Signed-off-by: Thomas Garnier 
>> ---
>> Based on next-20160422

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Joonsoo Kim
On Mon, Apr 25, 2016 at 01:39:23PM -0700, Thomas Garnier wrote:
> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. Each kmem_cache has its own randomized freelist except
> early on boot where global lists are used. This security feature reduces
> the predictability of the kernel SLAB allocator against heap overflows
> rendering attacks much less stable.
> 
> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
> 
> Performance results highlighted no major changes:
> 
> slab_test 1 run on boot. Difference only seen on the 2048 size test
> being the worse case scenario covered by freelist randomization. New
> slab pages are constantly being created on the 1 allocations.
> Variance should be mainly due to getting new pages every few
> allocations.
> 
> Before:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 99 cycles kfree -> 112 cycles
> 1 times kmalloc(16) -> 109 cycles kfree -> 140 cycles
> 1 times kmalloc(32) -> 129 cycles kfree -> 137 cycles
> 1 times kmalloc(64) -> 141 cycles kfree -> 141 cycles
> 1 times kmalloc(128) -> 152 cycles kfree -> 148 cycles
> 1 times kmalloc(256) -> 195 cycles kfree -> 167 cycles
> 1 times kmalloc(512) -> 257 cycles kfree -> 199 cycles
> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 121 cycles
> 1 times kmalloc(16)/kfree -> 121 cycles
> 1 times kmalloc(32)/kfree -> 121 cycles
> 1 times kmalloc(64)/kfree -> 121 cycles
> 1 times kmalloc(128)/kfree -> 121 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 119 cycles
> 1 times kmalloc(1024)/kfree -> 119 cycles
> 1 times kmalloc(2048)/kfree -> 119 cycles
> 1 times kmalloc(4096)/kfree -> 121 cycles
> 1 times kmalloc(8192)/kfree -> 119 cycles
> 1 times kmalloc(16384)/kfree -> 119 cycles
> 
> After:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 130 cycles kfree -> 86 cycles
> 1 times kmalloc(16) -> 118 cycles kfree -> 86 cycles
> 1 times kmalloc(32) -> 121 cycles kfree -> 85 cycles
> 1 times kmalloc(64) -> 176 cycles kfree -> 102 cycles
> 1 times kmalloc(128) -> 178 cycles kfree -> 100 cycles
> 1 times kmalloc(256) -> 205 cycles kfree -> 109 cycles
> 1 times kmalloc(512) -> 262 cycles kfree -> 136 cycles
> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 121 cycles
> 1 times kmalloc(16)/kfree -> 121 cycles
> 1 times kmalloc(32)/kfree -> 123 cycles
> 1 times kmalloc(64)/kfree -> 142 cycles
> 1 times kmalloc(128)/kfree -> 121 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 119 cycles
> 1 times kmalloc(1024)/kfree -> 119 cycles
> 1 times kmalloc(2048)/kfree -> 119 cycles
> 1 times kmalloc(4096)/kfree -> 119 cycles
> 1 times kmalloc(8192)/kfree -> 119 cycles
> 1 times kmalloc(16384)/kfree -> 119 cycles
> 
> Signed-off-by: Thomas Garnier 
> ---
> Based on next-20160422
> ---
>  include/linux/slab_def.h |   4 +
>  init/Kconfig |   9 ++
>  mm/slab.c| 213 
> ++-
>  3 files changed, 224 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/slab_def.h 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Joonsoo Kim
On Mon, Apr 25, 2016 at 01:39:23PM -0700, Thomas Garnier wrote:
> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. Each kmem_cache has its own randomized freelist except
> early on boot where global lists are used. This security feature reduces
> the predictability of the kernel SLAB allocator against heap overflows
> rendering attacks much less stable.
> 
> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
> 
> Performance results highlighted no major changes:
> 
> slab_test 1 run on boot. Difference only seen on the 2048 size test
> being the worse case scenario covered by freelist randomization. New
> slab pages are constantly being created on the 1 allocations.
> Variance should be mainly due to getting new pages every few
> allocations.
> 
> Before:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 99 cycles kfree -> 112 cycles
> 1 times kmalloc(16) -> 109 cycles kfree -> 140 cycles
> 1 times kmalloc(32) -> 129 cycles kfree -> 137 cycles
> 1 times kmalloc(64) -> 141 cycles kfree -> 141 cycles
> 1 times kmalloc(128) -> 152 cycles kfree -> 148 cycles
> 1 times kmalloc(256) -> 195 cycles kfree -> 167 cycles
> 1 times kmalloc(512) -> 257 cycles kfree -> 199 cycles
> 1 times kmalloc(1024) -> 393 cycles kfree -> 251 cycles
> 1 times kmalloc(2048) -> 649 cycles kfree -> 228 cycles
> 1 times kmalloc(4096) -> 806 cycles kfree -> 370 cycles
> 1 times kmalloc(8192) -> 814 cycles kfree -> 411 cycles
> 1 times kmalloc(16384) -> 892 cycles kfree -> 455 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 121 cycles
> 1 times kmalloc(16)/kfree -> 121 cycles
> 1 times kmalloc(32)/kfree -> 121 cycles
> 1 times kmalloc(64)/kfree -> 121 cycles
> 1 times kmalloc(128)/kfree -> 121 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 119 cycles
> 1 times kmalloc(1024)/kfree -> 119 cycles
> 1 times kmalloc(2048)/kfree -> 119 cycles
> 1 times kmalloc(4096)/kfree -> 121 cycles
> 1 times kmalloc(8192)/kfree -> 119 cycles
> 1 times kmalloc(16384)/kfree -> 119 cycles
> 
> After:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 130 cycles kfree -> 86 cycles
> 1 times kmalloc(16) -> 118 cycles kfree -> 86 cycles
> 1 times kmalloc(32) -> 121 cycles kfree -> 85 cycles
> 1 times kmalloc(64) -> 176 cycles kfree -> 102 cycles
> 1 times kmalloc(128) -> 178 cycles kfree -> 100 cycles
> 1 times kmalloc(256) -> 205 cycles kfree -> 109 cycles
> 1 times kmalloc(512) -> 262 cycles kfree -> 136 cycles
> 1 times kmalloc(1024) -> 342 cycles kfree -> 157 cycles
> 1 times kmalloc(2048) -> 701 cycles kfree -> 238 cycles
> 1 times kmalloc(4096) -> 803 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 835 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 896 cycles kfree -> 441 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 121 cycles
> 1 times kmalloc(16)/kfree -> 121 cycles
> 1 times kmalloc(32)/kfree -> 123 cycles
> 1 times kmalloc(64)/kfree -> 142 cycles
> 1 times kmalloc(128)/kfree -> 121 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 119 cycles
> 1 times kmalloc(1024)/kfree -> 119 cycles
> 1 times kmalloc(2048)/kfree -> 119 cycles
> 1 times kmalloc(4096)/kfree -> 119 cycles
> 1 times kmalloc(8192)/kfree -> 119 cycles
> 1 times kmalloc(16384)/kfree -> 119 cycles
> 
> Signed-off-by: Thomas Garnier 
> ---
> Based on next-20160422
> ---
>  include/linux/slab_def.h |   4 +
>  init/Kconfig |   9 ++
>  mm/slab.c| 213 
> ++-
>  3 files changed, 224 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
> index 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:38 PM, Andrew Morton
 wrote:
> On Mon, 25 Apr 2016 14:14:33 -0700 Thomas Garnier  wrote:
>
>> >>> + /* Get best entropy at this stage */
>> >>> + get_random_bytes_arch(, sizeof(seed));
>> >>
>> >> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
>> >>
>>
>> The arch_* functions will return 0 which will break the loop in
>> get_random_bytes_arch and make it uses extract_entropy (as does
>> get_random_bytes).
>> (cf http://lxr.free-electrons.com/source/drivers/char/random.c#L1335)
>>
>
> oop, sorry, I misread the code.
>
> (and the get_random_bytes_arch() comment "This function will use the
> architecture-specific hardware random number generator if it is
> available" is misleading, so there)

No problem, better double check it. I agree it is misleading.

Thomas


Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:38 PM, Andrew Morton
 wrote:
> On Mon, 25 Apr 2016 14:14:33 -0700 Thomas Garnier  wrote:
>
>> >>> + /* Get best entropy at this stage */
>> >>> + get_random_bytes_arch(, sizeof(seed));
>> >>
>> >> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
>> >>
>>
>> The arch_* functions will return 0 which will break the loop in
>> get_random_bytes_arch and make it uses extract_entropy (as does
>> get_random_bytes).
>> (cf http://lxr.free-electrons.com/source/drivers/char/random.c#L1335)
>>
>
> oop, sorry, I misread the code.
>
> (and the get_random_bytes_arch() comment "This function will use the
> architecture-specific hardware random number generator if it is
> available" is misleading, so there)

No problem, better double check it. I agree it is misleading.

Thomas


Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Andrew Morton
On Mon, 25 Apr 2016 14:14:33 -0700 Thomas Garnier  wrote:

> >>> + /* Get best entropy at this stage */
> >>> + get_random_bytes_arch(, sizeof(seed));
> >>
> >> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
> >>
> 
> The arch_* functions will return 0 which will break the loop in
> get_random_bytes_arch and make it uses extract_entropy (as does
> get_random_bytes).
> (cf http://lxr.free-electrons.com/source/drivers/char/random.c#L1335)
> 

oop, sorry, I misread the code.

(and the get_random_bytes_arch() comment "This function will use the
architecture-specific hardware random number generator if it is
available" is misleading, so there)


Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Andrew Morton
On Mon, 25 Apr 2016 14:14:33 -0700 Thomas Garnier  wrote:

> >>> + /* Get best entropy at this stage */
> >>> + get_random_bytes_arch(, sizeof(seed));
> >>
> >> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
> >>
> 
> The arch_* functions will return 0 which will break the loop in
> get_random_bytes_arch and make it uses extract_entropy (as does
> get_random_bytes).
> (cf http://lxr.free-electrons.com/source/drivers/char/random.c#L1335)
> 

oop, sorry, I misread the code.

(and the get_random_bytes_arch() comment "This function will use the
architecture-specific hardware random number generator if it is
available" is misleading, so there)


Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:13 PM, Thomas Garnier  wrote:
> On Mon, Apr 25, 2016 at 2:10 PM, Andrew Morton
>  wrote:
>> On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  
>> wrote:
>>
>>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>>> SLAB freelist. The list is randomized during initialization of a new set
>>> of pages. The order on different freelist sizes is pre-computed at boot
>>> for performance. Each kmem_cache has its own randomized freelist except
>>> early on boot where global lists are used. This security feature reduces
>>> the predictability of the kernel SLAB allocator against heap overflows
>>> rendering attacks much less stable.
>>>
>>> For example this attack against SLUB (also applicable against SLAB)
>>> would be affected:
>>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>>
>>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>>> a controllable heap is opened to new attacks not yet publicly discussed.
>>> A kernel heap overflow can be transformed to multiple use-after-free.
>>> This feature makes this type of attack harder too.
>>>
>>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>>> entropy is available in the boot stage. In the worse case this function
>>> will fallback to the get_random_bytes sub API. We also generate a shift
>>> random number to shift pre-computed freelist for each new set of pages.
>>>
>>> The config option name is not specific to the SLAB as this approach will
>>> be extended to other allocators like SLUB.
>>>
>>> Performance results highlighted no major changes:
>>>
>>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>>> being the worse case scenario covered by freelist randomization. New
>>> slab pages are constantly being created on the 1 allocations.
>>> Variance should be mainly due to getting new pages every few
>>> allocations.
>>>
>>> ...
>>>
>>> --- a/include/linux/slab_def.h
>>> +++ b/include/linux/slab_def.h
>>> @@ -80,6 +80,10 @@ struct kmem_cache {
>>>   struct kasan_cache kasan_info;
>>>  #endif
>>>
>>> +#ifdef CONFIG_FREELIST_RANDOM
>>
>> CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
>> CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
>> identifier could be used for implementing randomisation in
>> slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>>
>>> + void *random_seq;
>>> +#endif
>>> +
>>>   struct kmem_cache_node *node[MAX_NUMNODES];
>>>  };
>>>
>>> diff --git a/init/Kconfig b/init/Kconfig
>>> index 0c66640..73453d0 100644
>>> --- a/init/Kconfig
>>> +++ b/init/Kconfig
>>> @@ -1742,6 +1742,15 @@ config SLOB
>>>
>>>  endchoice
>>>
>>> +config FREELIST_RANDOM
>>> + default n
>>> + depends on SLAB
>>> + bool "SLAB freelist randomization"
>>> + help
>>> +   Randomizes the freelist order used on creating new SLABs. This
>>> +   security feature reduces the predictability of the kernel slab
>>> +   allocator against heap overflows.
>>> +
>>>  config SLUB_CPU_PARTIAL
>>>   default y
>>>   depends on SLUB && SMP
>>> diff --git a/mm/slab.c b/mm/slab.c
>>> index b82ee6b..89eb617 100644
>>> --- a/mm/slab.c
>>> +++ b/mm/slab.c
>>> @@ -116,6 +116,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  #include 
>>>
>>> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
>>> *cachep, int index)
>>>   }
>>>  }
>>>
>>> +#ifdef CONFIG_FREELIST_RANDOM
>>> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
>>> *list,
>>> + size_t count)
>>> +{
>>> + size_t i;
>>> + unsigned int rand;
>>> +
>>> + for (i = 0; i < count; i++)
>>> + list[i] = i;
>>> +
>>> + /* Fisher-Yates shuffle */
>>> + for (i = count - 1; i > 0; i--) {
>>> + rand = prandom_u32_state(state);
>>> + rand %= (i + 1);
>>> + swap(list[i], list[rand]);
>>> + }
>>> +}
>>> +
>>> +/* Create a random sequence per cache */
>>> +static void cache_random_seq_create(struct kmem_cache *cachep)
>>> +{
>>> + unsigned int seed, count = cachep->num;
>>> + struct rnd_state state;
>>> +
>>> + if (count < 2)
>>> + return;
>>> +
>>> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), 
>>> GFP_KERNEL);
>>> + BUG_ON(cachep->random_seq == NULL);
>
> On your previous email. (trying to stay in one thread). I added a
> comment on this
> version to explain that we need best entropy at this boot stage.
>
>>
>> Yikes, that's a bit rude.  Is there no way of recovering from this?  If
>> the answer to that is really really "no" then I guess we should put a
>> __GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
>> __GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.
>>
>>
>
> We can always use the static. I 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:13 PM, Thomas Garnier  wrote:
> On Mon, Apr 25, 2016 at 2:10 PM, Andrew Morton
>  wrote:
>> On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  
>> wrote:
>>
>>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>>> SLAB freelist. The list is randomized during initialization of a new set
>>> of pages. The order on different freelist sizes is pre-computed at boot
>>> for performance. Each kmem_cache has its own randomized freelist except
>>> early on boot where global lists are used. This security feature reduces
>>> the predictability of the kernel SLAB allocator against heap overflows
>>> rendering attacks much less stable.
>>>
>>> For example this attack against SLUB (also applicable against SLAB)
>>> would be affected:
>>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>>
>>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>>> a controllable heap is opened to new attacks not yet publicly discussed.
>>> A kernel heap overflow can be transformed to multiple use-after-free.
>>> This feature makes this type of attack harder too.
>>>
>>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>>> entropy is available in the boot stage. In the worse case this function
>>> will fallback to the get_random_bytes sub API. We also generate a shift
>>> random number to shift pre-computed freelist for each new set of pages.
>>>
>>> The config option name is not specific to the SLAB as this approach will
>>> be extended to other allocators like SLUB.
>>>
>>> Performance results highlighted no major changes:
>>>
>>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>>> being the worse case scenario covered by freelist randomization. New
>>> slab pages are constantly being created on the 1 allocations.
>>> Variance should be mainly due to getting new pages every few
>>> allocations.
>>>
>>> ...
>>>
>>> --- a/include/linux/slab_def.h
>>> +++ b/include/linux/slab_def.h
>>> @@ -80,6 +80,10 @@ struct kmem_cache {
>>>   struct kasan_cache kasan_info;
>>>  #endif
>>>
>>> +#ifdef CONFIG_FREELIST_RANDOM
>>
>> CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
>> CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
>> identifier could be used for implementing randomisation in
>> slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>>
>>> + void *random_seq;
>>> +#endif
>>> +
>>>   struct kmem_cache_node *node[MAX_NUMNODES];
>>>  };
>>>
>>> diff --git a/init/Kconfig b/init/Kconfig
>>> index 0c66640..73453d0 100644
>>> --- a/init/Kconfig
>>> +++ b/init/Kconfig
>>> @@ -1742,6 +1742,15 @@ config SLOB
>>>
>>>  endchoice
>>>
>>> +config FREELIST_RANDOM
>>> + default n
>>> + depends on SLAB
>>> + bool "SLAB freelist randomization"
>>> + help
>>> +   Randomizes the freelist order used on creating new SLABs. This
>>> +   security feature reduces the predictability of the kernel slab
>>> +   allocator against heap overflows.
>>> +
>>>  config SLUB_CPU_PARTIAL
>>>   default y
>>>   depends on SLUB && SMP
>>> diff --git a/mm/slab.c b/mm/slab.c
>>> index b82ee6b..89eb617 100644
>>> --- a/mm/slab.c
>>> +++ b/mm/slab.c
>>> @@ -116,6 +116,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  #include 
>>>
>>> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
>>> *cachep, int index)
>>>   }
>>>  }
>>>
>>> +#ifdef CONFIG_FREELIST_RANDOM
>>> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
>>> *list,
>>> + size_t count)
>>> +{
>>> + size_t i;
>>> + unsigned int rand;
>>> +
>>> + for (i = 0; i < count; i++)
>>> + list[i] = i;
>>> +
>>> + /* Fisher-Yates shuffle */
>>> + for (i = count - 1; i > 0; i--) {
>>> + rand = prandom_u32_state(state);
>>> + rand %= (i + 1);
>>> + swap(list[i], list[rand]);
>>> + }
>>> +}
>>> +
>>> +/* Create a random sequence per cache */
>>> +static void cache_random_seq_create(struct kmem_cache *cachep)
>>> +{
>>> + unsigned int seed, count = cachep->num;
>>> + struct rnd_state state;
>>> +
>>> + if (count < 2)
>>> + return;
>>> +
>>> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), 
>>> GFP_KERNEL);
>>> + BUG_ON(cachep->random_seq == NULL);
>
> On your previous email. (trying to stay in one thread). I added a
> comment on this
> version to explain that we need best entropy at this boot stage.
>
>>
>> Yikes, that's a bit rude.  Is there no way of recovering from this?  If
>> the answer to that is really really "no" then I guess we should put a
>> __GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
>> __GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.
>>
>>
>
> We can always use the static. I will update on next iteration to remove the
> BUG_ON.
>
>>> + /* 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:10 PM, Andrew Morton
 wrote:
> On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  wrote:
>
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. Each kmem_cache has its own randomized freelist except
>> early on boot where global lists are used. This security feature reduces
>> the predictability of the kernel SLAB allocator against heap overflows
>> rendering attacks much less stable.
>>
>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available in the boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API. We also generate a shift
>> random number to shift pre-computed freelist for each new set of pages.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>>
>> Performance results highlighted no major changes:
>>
>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>> being the worse case scenario covered by freelist randomization. New
>> slab pages are constantly being created on the 1 allocations.
>> Variance should be mainly due to getting new pages every few
>> allocations.
>>
>> ...
>>
>> --- a/include/linux/slab_def.h
>> +++ b/include/linux/slab_def.h
>> @@ -80,6 +80,10 @@ struct kmem_cache {
>>   struct kasan_cache kasan_info;
>>  #endif
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>
> CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> identifier could be used for implementing randomisation in
> slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>
>> + void *random_seq;
>> +#endif
>> +
>>   struct kmem_cache_node *node[MAX_NUMNODES];
>>  };
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 0c66640..73453d0 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1742,6 +1742,15 @@ config SLOB
>>
>>  endchoice
>>
>> +config FREELIST_RANDOM
>> + default n
>> + depends on SLAB
>> + bool "SLAB freelist randomization"
>> + help
>> +   Randomizes the freelist order used on creating new SLABs. This
>> +   security feature reduces the predictability of the kernel slab
>> +   allocator against heap overflows.
>> +
>>  config SLUB_CPU_PARTIAL
>>   default y
>>   depends on SLUB && SMP
>> diff --git a/mm/slab.c b/mm/slab.c
>> index b82ee6b..89eb617 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -116,6 +116,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #include 
>>
>> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
>> *cachep, int index)
>>   }
>>  }
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
>> *list,
>> + size_t count)
>> +{
>> + size_t i;
>> + unsigned int rand;
>> +
>> + for (i = 0; i < count; i++)
>> + list[i] = i;
>> +
>> + /* Fisher-Yates shuffle */
>> + for (i = count - 1; i > 0; i--) {
>> + rand = prandom_u32_state(state);
>> + rand %= (i + 1);
>> + swap(list[i], list[rand]);
>> + }
>> +}
>> +
>> +/* Create a random sequence per cache */
>> +static void cache_random_seq_create(struct kmem_cache *cachep)
>> +{
>> + unsigned int seed, count = cachep->num;
>> + struct rnd_state state;
>> +
>> + if (count < 2)
>> + return;
>> +
>> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), 
>> GFP_KERNEL);
>> + BUG_ON(cachep->random_seq == NULL);

On your previous email. (trying to stay in one thread). I added a
comment on this
version to explain that we need best entropy at this boot stage.

>
> Yikes, that's a bit rude.  Is there no way of recovering from this?  If
> the answer to that is really really "no" then I guess we should put a
> __GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
> __GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.
>
>

We can always use the static. I will update on next iteration to remove the
BUG_ON.

>> + /* Get best entropy at this stage */
>> + get_random_bytes_arch(, sizeof(seed));
>
> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
>



Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Thomas Garnier
On Mon, Apr 25, 2016 at 2:10 PM, Andrew Morton
 wrote:
> On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  wrote:
>
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. Each kmem_cache has its own randomized freelist except
>> early on boot where global lists are used. This security feature reduces
>> the predictability of the kernel SLAB allocator against heap overflows
>> rendering attacks much less stable.
>>
>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available in the boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API. We also generate a shift
>> random number to shift pre-computed freelist for each new set of pages.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>>
>> Performance results highlighted no major changes:
>>
>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>> being the worse case scenario covered by freelist randomization. New
>> slab pages are constantly being created on the 1 allocations.
>> Variance should be mainly due to getting new pages every few
>> allocations.
>>
>> ...
>>
>> --- a/include/linux/slab_def.h
>> +++ b/include/linux/slab_def.h
>> @@ -80,6 +80,10 @@ struct kmem_cache {
>>   struct kasan_cache kasan_info;
>>  #endif
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>
> CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> identifier could be used for implementing randomisation in
> slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>
>> + void *random_seq;
>> +#endif
>> +
>>   struct kmem_cache_node *node[MAX_NUMNODES];
>>  };
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 0c66640..73453d0 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1742,6 +1742,15 @@ config SLOB
>>
>>  endchoice
>>
>> +config FREELIST_RANDOM
>> + default n
>> + depends on SLAB
>> + bool "SLAB freelist randomization"
>> + help
>> +   Randomizes the freelist order used on creating new SLABs. This
>> +   security feature reduces the predictability of the kernel slab
>> +   allocator against heap overflows.
>> +
>>  config SLUB_CPU_PARTIAL
>>   default y
>>   depends on SLUB && SMP
>> diff --git a/mm/slab.c b/mm/slab.c
>> index b82ee6b..89eb617 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -116,6 +116,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #include 
>>
>> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
>> *cachep, int index)
>>   }
>>  }
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
>> *list,
>> + size_t count)
>> +{
>> + size_t i;
>> + unsigned int rand;
>> +
>> + for (i = 0; i < count; i++)
>> + list[i] = i;
>> +
>> + /* Fisher-Yates shuffle */
>> + for (i = count - 1; i > 0; i--) {
>> + rand = prandom_u32_state(state);
>> + rand %= (i + 1);
>> + swap(list[i], list[rand]);
>> + }
>> +}
>> +
>> +/* Create a random sequence per cache */
>> +static void cache_random_seq_create(struct kmem_cache *cachep)
>> +{
>> + unsigned int seed, count = cachep->num;
>> + struct rnd_state state;
>> +
>> + if (count < 2)
>> + return;
>> +
>> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), 
>> GFP_KERNEL);
>> + BUG_ON(cachep->random_seq == NULL);

On your previous email. (trying to stay in one thread). I added a
comment on this
version to explain that we need best entropy at this boot stage.

>
> Yikes, that's a bit rude.  Is there no way of recovering from this?  If
> the answer to that is really really "no" then I guess we should put a
> __GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
> __GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.
>
>

We can always use the static. I will update on next iteration to remove the
BUG_ON.

>> + /* Get best entropy at this stage */
>> + get_random_bytes_arch(, sizeof(seed));
>
> See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?
>


>
>> + prandom_seed_state(, seed);
>> +
>> + 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Andrew Morton
On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  wrote:

> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. Each kmem_cache has its own randomized freelist except
> early on boot where global lists are used. This security feature reduces
> the predictability of the kernel SLAB allocator against heap overflows
> rendering attacks much less stable.
> 
> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
> 
> Performance results highlighted no major changes:
> 
> slab_test 1 run on boot. Difference only seen on the 2048 size test
> being the worse case scenario covered by freelist randomization. New
> slab pages are constantly being created on the 1 allocations.
> Variance should be mainly due to getting new pages every few
> allocations.
>
> ...
>
> --- a/include/linux/slab_def.h
> +++ b/include/linux/slab_def.h
> @@ -80,6 +80,10 @@ struct kmem_cache {
>   struct kasan_cache kasan_info;
>  #endif
>  
> +#ifdef CONFIG_FREELIST_RANDOM

CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague. 
CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
identifier could be used for implementing randomisation in
slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?

> + void *random_seq;
> +#endif
> +
>   struct kmem_cache_node *node[MAX_NUMNODES];
>  };
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index 0c66640..73453d0 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1742,6 +1742,15 @@ config SLOB
>  
>  endchoice
>  
> +config FREELIST_RANDOM
> + default n
> + depends on SLAB
> + bool "SLAB freelist randomization"
> + help
> +   Randomizes the freelist order used on creating new SLABs. This
> +   security feature reduces the predictability of the kernel slab
> +   allocator against heap overflows.
> +
>  config SLUB_CPU_PARTIAL
>   default y
>   depends on SLUB && SMP
> diff --git a/mm/slab.c b/mm/slab.c
> index b82ee6b..89eb617 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -116,6 +116,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
> *cachep, int index)
>   }
>  }
>  
> +#ifdef CONFIG_FREELIST_RANDOM
> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t *list,
> + size_t count)
> +{
> + size_t i;
> + unsigned int rand;
> +
> + for (i = 0; i < count; i++)
> + list[i] = i;
> +
> + /* Fisher-Yates shuffle */
> + for (i = count - 1; i > 0; i--) {
> + rand = prandom_u32_state(state);
> + rand %= (i + 1);
> + swap(list[i], list[rand]);
> + }
> +}
> +
> +/* Create a random sequence per cache */
> +static void cache_random_seq_create(struct kmem_cache *cachep)
> +{
> + unsigned int seed, count = cachep->num;
> + struct rnd_state state;
> +
> + if (count < 2)
> + return;
> +
> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> + BUG_ON(cachep->random_seq == NULL);

Yikes, that's a bit rude.  Is there no way of recovering from this?  If
the answer to that is really really "no" then I guess we should put a
__GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
__GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.


> + /* Get best entropy at this stage */
> + get_random_bytes_arch(, sizeof(seed));

See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?


> + prandom_seed_state(, seed);
> +
> + freelist_randomize(, cachep->random_seq, count);
> +}
> +



Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-25 Thread Andrew Morton
On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier  wrote:

> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. Each kmem_cache has its own randomized freelist except
> early on boot where global lists are used. This security feature reduces
> the predictability of the kernel SLAB allocator against heap overflows
> rendering attacks much less stable.
> 
> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available in the boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
> 
> Performance results highlighted no major changes:
> 
> slab_test 1 run on boot. Difference only seen on the 2048 size test
> being the worse case scenario covered by freelist randomization. New
> slab pages are constantly being created on the 1 allocations.
> Variance should be mainly due to getting new pages every few
> allocations.
>
> ...
>
> --- a/include/linux/slab_def.h
> +++ b/include/linux/slab_def.h
> @@ -80,6 +80,10 @@ struct kmem_cache {
>   struct kasan_cache kasan_info;
>  #endif
>  
> +#ifdef CONFIG_FREELIST_RANDOM

CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague. 
CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
identifier could be used for implementing randomisation in
slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?

> + void *random_seq;
> +#endif
> +
>   struct kmem_cache_node *node[MAX_NUMNODES];
>  };
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index 0c66640..73453d0 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1742,6 +1742,15 @@ config SLOB
>  
>  endchoice
>  
> +config FREELIST_RANDOM
> + default n
> + depends on SLAB
> + bool "SLAB freelist randomization"
> + help
> +   Randomizes the freelist order used on creating new SLABs. This
> +   security feature reduces the predictability of the kernel slab
> +   allocator against heap overflows.
> +
>  config SLUB_CPU_PARTIAL
>   default y
>   depends on SLUB && SMP
> diff --git a/mm/slab.c b/mm/slab.c
> index b82ee6b..89eb617 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -116,6 +116,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache 
> *cachep, int index)
>   }
>  }
>  
> +#ifdef CONFIG_FREELIST_RANDOM
> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t *list,
> + size_t count)
> +{
> + size_t i;
> + unsigned int rand;
> +
> + for (i = 0; i < count; i++)
> + list[i] = i;
> +
> + /* Fisher-Yates shuffle */
> + for (i = count - 1; i > 0; i--) {
> + rand = prandom_u32_state(state);
> + rand %= (i + 1);
> + swap(list[i], list[rand]);
> + }
> +}
> +
> +/* Create a random sequence per cache */
> +static void cache_random_seq_create(struct kmem_cache *cachep)
> +{
> + unsigned int seed, count = cachep->num;
> + struct rnd_state state;
> +
> + if (count < 2)
> + return;
> +
> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> + BUG_ON(cachep->random_seq == NULL);

Yikes, that's a bit rude.  Is there no way of recovering from this?  If
the answer to that is really really "no" then I guess we should put a
__GFP_NOFAIL in there.  Add a comment explaining why (apologetically -
__GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON.


> + /* Get best entropy at this stage */
> + get_random_bytes_arch(, sizeof(seed));

See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n?


> + prandom_seed_state(, seed);
> +
> + freelist_randomize(, cachep->random_seq, count);
> +}
> +



Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-20 Thread Thomas Garnier
On Wed, Apr 20, 2016 at 1:08 AM, Joonsoo Kim  wrote:
> On Tue, Apr 19, 2016 at 09:44:54AM -0700, Thomas Garnier wrote:
>> On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
>> > On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
>> >> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> >> SLAB freelist. The list is randomized during initialization of a new set
>> >> of pages. The order on different freelist sizes is pre-computed at boot
>> >> for performance. This security feature reduces the predictability of the
>> >> kernel SLAB allocator against heap overflows rendering attacks much less
>> >> stable.
>> >
>> > I'm not familiar on security but it doesn't look much secure than
>> > before. Is there any other way to generate different sequence of freelist
>> > for each new set of pages? Current approach using pre-computed array will
>> > generate same sequence of freelist for all new set of pages having same 
>> > size
>> > class. Is it sufficient?
>> >
>>
>> I think it is sufficient. There is a tradeoff for performance. We could 
>> randomly
>> pick an object from the freelist every time (on slab_get_obj) but I
>> think it will
>> have significant impact (at least 3%).
>>
>> >> For example this attack against SLUB (also applicable against SLAB)
>> >> would be affected:
>> >> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>> >>
>> >> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> >> a controllable heap is opened to new attacks not yet publicly discussed.
>> >> A kernel heap overflow can be transformed to multiple use-after-free.
>> >> This feature makes this type of attack harder too.
>> >>
>> >> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> >> entropy is available at that boot stage. In the worse case this function
>> >> will fallback to the get_random_bytes sub API.
>> >>
>> >> The config option name is not specific to the SLAB as this approach will
>> >> be extended to other allocators like SLUB.
>> >
>> > If this feature will be applied to the SLUB, it's better to put common
>> > code to mm/slab_common.c.
>> >
>>
>> I think it might be moved there once we implement the SLUB counterpart
>> but it is too early to define which part will be common.
>>
>> >>
>> >> Performance results highlighted no major changes:
>> >>
>> >> Netperf average on 10 runs:
>> >>
>> >> threads,base,change
>> >> 16,576943.10,585905.90 (101.55%)
>> >> 32,564082.00,569741.20 (101.00%)
>> >> 48,558334.30,561851.20 (100.63%)
>> >> 64,552025.20,556448.30 (100.80%)
>> >> 80,552294.40,551743.10 (99.90%)
>> >> 96,552435.30,547529.20 (99.11%)
>> >> 112,551320.60,550183.20 (99.79%)
>> >> 128,549138.30,550542.70 (100.26%)
>> >> 144,549344.50,544529.10 (99.12%)
>> >> 160,550360.80,539929.30 (98.10%)
>> >>
>> >> slab_test 1 run on boot. After is faster except for odd result on size
>> >> 2048.
>> >
>> > Hmm... It's odd result. It adds more logic and it should
>> > decrease performance. I guess it would be experimental error but
>> > do you have any analysis about this result?
>> >
>>
>> I don't. I am glad to redo the test. I found that slab_test has very 
>> different
>> result based on the heap state at the time of the test. If I run the
>> test multiple
>> times, I have really various results on with or without the mitigation (on
>> dedicated hardware).
>>
>> >>
>> >> Before:
>> >>
>> >> Single thread testing
>> >> =
>> >> 1. Kmalloc: Repeatedly allocate then free test
>> >> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
>> >> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
>> >> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
>> >> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
>> >> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
>> >> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
>> >> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
>> >> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
>> >> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
>> >> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
>> >> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
>> >> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
>> >> 2. Kmalloc: alloc/free test
>> >> 1 times kmalloc(8)/kfree -> 118 cycles
>> >> 1 times kmalloc(16)/kfree -> 118 cycles
>> >> 1 times kmalloc(32)/kfree -> 118 cycles
>> >> 1 times kmalloc(64)/kfree -> 121 cycles
>> >> 1 times kmalloc(128)/kfree -> 118 cycles
>> >> 1 times kmalloc(256)/kfree -> 115 cycles
>> >> 1 times kmalloc(512)/kfree -> 115 cycles
>> >> 1 times kmalloc(1024)/kfree -> 115 cycles
>> >> 1 times kmalloc(2048)/kfree -> 115 cycles
>> >> 1 times kmalloc(4096)/kfree -> 115 cycles
>> >> 1 times kmalloc(8192)/kfree -> 115 cycles
>> >> 1 times kmalloc(16384)/kfree 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-20 Thread Thomas Garnier
On Wed, Apr 20, 2016 at 1:08 AM, Joonsoo Kim  wrote:
> On Tue, Apr 19, 2016 at 09:44:54AM -0700, Thomas Garnier wrote:
>> On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
>> > On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
>> >> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> >> SLAB freelist. The list is randomized during initialization of a new set
>> >> of pages. The order on different freelist sizes is pre-computed at boot
>> >> for performance. This security feature reduces the predictability of the
>> >> kernel SLAB allocator against heap overflows rendering attacks much less
>> >> stable.
>> >
>> > I'm not familiar on security but it doesn't look much secure than
>> > before. Is there any other way to generate different sequence of freelist
>> > for each new set of pages? Current approach using pre-computed array will
>> > generate same sequence of freelist for all new set of pages having same 
>> > size
>> > class. Is it sufficient?
>> >
>>
>> I think it is sufficient. There is a tradeoff for performance. We could 
>> randomly
>> pick an object from the freelist every time (on slab_get_obj) but I
>> think it will
>> have significant impact (at least 3%).
>>
>> >> For example this attack against SLUB (also applicable against SLAB)
>> >> would be affected:
>> >> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>> >>
>> >> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> >> a controllable heap is opened to new attacks not yet publicly discussed.
>> >> A kernel heap overflow can be transformed to multiple use-after-free.
>> >> This feature makes this type of attack harder too.
>> >>
>> >> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> >> entropy is available at that boot stage. In the worse case this function
>> >> will fallback to the get_random_bytes sub API.
>> >>
>> >> The config option name is not specific to the SLAB as this approach will
>> >> be extended to other allocators like SLUB.
>> >
>> > If this feature will be applied to the SLUB, it's better to put common
>> > code to mm/slab_common.c.
>> >
>>
>> I think it might be moved there once we implement the SLUB counterpart
>> but it is too early to define which part will be common.
>>
>> >>
>> >> Performance results highlighted no major changes:
>> >>
>> >> Netperf average on 10 runs:
>> >>
>> >> threads,base,change
>> >> 16,576943.10,585905.90 (101.55%)
>> >> 32,564082.00,569741.20 (101.00%)
>> >> 48,558334.30,561851.20 (100.63%)
>> >> 64,552025.20,556448.30 (100.80%)
>> >> 80,552294.40,551743.10 (99.90%)
>> >> 96,552435.30,547529.20 (99.11%)
>> >> 112,551320.60,550183.20 (99.79%)
>> >> 128,549138.30,550542.70 (100.26%)
>> >> 144,549344.50,544529.10 (99.12%)
>> >> 160,550360.80,539929.30 (98.10%)
>> >>
>> >> slab_test 1 run on boot. After is faster except for odd result on size
>> >> 2048.
>> >
>> > Hmm... It's odd result. It adds more logic and it should
>> > decrease performance. I guess it would be experimental error but
>> > do you have any analysis about this result?
>> >
>>
>> I don't. I am glad to redo the test. I found that slab_test has very 
>> different
>> result based on the heap state at the time of the test. If I run the
>> test multiple
>> times, I have really various results on with or without the mitigation (on
>> dedicated hardware).
>>
>> >>
>> >> Before:
>> >>
>> >> Single thread testing
>> >> =
>> >> 1. Kmalloc: Repeatedly allocate then free test
>> >> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
>> >> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
>> >> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
>> >> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
>> >> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
>> >> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
>> >> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
>> >> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
>> >> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
>> >> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
>> >> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
>> >> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
>> >> 2. Kmalloc: alloc/free test
>> >> 1 times kmalloc(8)/kfree -> 118 cycles
>> >> 1 times kmalloc(16)/kfree -> 118 cycles
>> >> 1 times kmalloc(32)/kfree -> 118 cycles
>> >> 1 times kmalloc(64)/kfree -> 121 cycles
>> >> 1 times kmalloc(128)/kfree -> 118 cycles
>> >> 1 times kmalloc(256)/kfree -> 115 cycles
>> >> 1 times kmalloc(512)/kfree -> 115 cycles
>> >> 1 times kmalloc(1024)/kfree -> 115 cycles
>> >> 1 times kmalloc(2048)/kfree -> 115 cycles
>> >> 1 times kmalloc(4096)/kfree -> 115 cycles
>> >> 1 times kmalloc(8192)/kfree -> 115 cycles
>> >> 1 times kmalloc(16384)/kfree -> 115 cycles
>> >>
>> >> After:
>> >>
>> >> 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-20 Thread Joonsoo Kim
On Tue, Apr 19, 2016 at 09:44:54AM -0700, Thomas Garnier wrote:
> On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
> > On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
> >> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> >> SLAB freelist. The list is randomized during initialization of a new set
> >> of pages. The order on different freelist sizes is pre-computed at boot
> >> for performance. This security feature reduces the predictability of the
> >> kernel SLAB allocator against heap overflows rendering attacks much less
> >> stable.
> >
> > I'm not familiar on security but it doesn't look much secure than
> > before. Is there any other way to generate different sequence of freelist
> > for each new set of pages? Current approach using pre-computed array will
> > generate same sequence of freelist for all new set of pages having same size
> > class. Is it sufficient?
> >
> 
> I think it is sufficient. There is a tradeoff for performance. We could 
> randomly
> pick an object from the freelist every time (on slab_get_obj) but I
> think it will
> have significant impact (at least 3%).
> 
> >> For example this attack against SLUB (also applicable against SLAB)
> >> would be affected:
> >> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> >>
> >> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> >> a controllable heap is opened to new attacks not yet publicly discussed.
> >> A kernel heap overflow can be transformed to multiple use-after-free.
> >> This feature makes this type of attack harder too.
> >>
> >> To generate entropy, we use get_random_bytes_arch because 0 bits of
> >> entropy is available at that boot stage. In the worse case this function
> >> will fallback to the get_random_bytes sub API.
> >>
> >> The config option name is not specific to the SLAB as this approach will
> >> be extended to other allocators like SLUB.
> >
> > If this feature will be applied to the SLUB, it's better to put common
> > code to mm/slab_common.c.
> >
> 
> I think it might be moved there once we implement the SLUB counterpart
> but it is too early to define which part will be common.
> 
> >>
> >> Performance results highlighted no major changes:
> >>
> >> Netperf average on 10 runs:
> >>
> >> threads,base,change
> >> 16,576943.10,585905.90 (101.55%)
> >> 32,564082.00,569741.20 (101.00%)
> >> 48,558334.30,561851.20 (100.63%)
> >> 64,552025.20,556448.30 (100.80%)
> >> 80,552294.40,551743.10 (99.90%)
> >> 96,552435.30,547529.20 (99.11%)
> >> 112,551320.60,550183.20 (99.79%)
> >> 128,549138.30,550542.70 (100.26%)
> >> 144,549344.50,544529.10 (99.12%)
> >> 160,550360.80,539929.30 (98.10%)
> >>
> >> slab_test 1 run on boot. After is faster except for odd result on size
> >> 2048.
> >
> > Hmm... It's odd result. It adds more logic and it should
> > decrease performance. I guess it would be experimental error but
> > do you have any analysis about this result?
> >
> 
> I don't. I am glad to redo the test. I found that slab_test has very different
> result based on the heap state at the time of the test. If I run the
> test multiple
> times, I have really various results on with or without the mitigation (on
> dedicated hardware).
> 
> >>
> >> Before:
> >>
> >> Single thread testing
> >> =
> >> 1. Kmalloc: Repeatedly allocate then free test
> >> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
> >> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
> >> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
> >> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
> >> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
> >> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
> >> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
> >> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
> >> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
> >> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
> >> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
> >> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
> >> 2. Kmalloc: alloc/free test
> >> 1 times kmalloc(8)/kfree -> 118 cycles
> >> 1 times kmalloc(16)/kfree -> 118 cycles
> >> 1 times kmalloc(32)/kfree -> 118 cycles
> >> 1 times kmalloc(64)/kfree -> 121 cycles
> >> 1 times kmalloc(128)/kfree -> 118 cycles
> >> 1 times kmalloc(256)/kfree -> 115 cycles
> >> 1 times kmalloc(512)/kfree -> 115 cycles
> >> 1 times kmalloc(1024)/kfree -> 115 cycles
> >> 1 times kmalloc(2048)/kfree -> 115 cycles
> >> 1 times kmalloc(4096)/kfree -> 115 cycles
> >> 1 times kmalloc(8192)/kfree -> 115 cycles
> >> 1 times kmalloc(16384)/kfree -> 115 cycles
> >>
> >> After:
> >>
> >> Single thread testing
> >> =
> >> 1. Kmalloc: Repeatedly allocate then free test
> >> 1 times kmalloc(8) -> 99 cycles kfree 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-20 Thread Joonsoo Kim
On Tue, Apr 19, 2016 at 09:44:54AM -0700, Thomas Garnier wrote:
> On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
> > On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
> >> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> >> SLAB freelist. The list is randomized during initialization of a new set
> >> of pages. The order on different freelist sizes is pre-computed at boot
> >> for performance. This security feature reduces the predictability of the
> >> kernel SLAB allocator against heap overflows rendering attacks much less
> >> stable.
> >
> > I'm not familiar on security but it doesn't look much secure than
> > before. Is there any other way to generate different sequence of freelist
> > for each new set of pages? Current approach using pre-computed array will
> > generate same sequence of freelist for all new set of pages having same size
> > class. Is it sufficient?
> >
> 
> I think it is sufficient. There is a tradeoff for performance. We could 
> randomly
> pick an object from the freelist every time (on slab_get_obj) but I
> think it will
> have significant impact (at least 3%).
> 
> >> For example this attack against SLUB (also applicable against SLAB)
> >> would be affected:
> >> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> >>
> >> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> >> a controllable heap is opened to new attacks not yet publicly discussed.
> >> A kernel heap overflow can be transformed to multiple use-after-free.
> >> This feature makes this type of attack harder too.
> >>
> >> To generate entropy, we use get_random_bytes_arch because 0 bits of
> >> entropy is available at that boot stage. In the worse case this function
> >> will fallback to the get_random_bytes sub API.
> >>
> >> The config option name is not specific to the SLAB as this approach will
> >> be extended to other allocators like SLUB.
> >
> > If this feature will be applied to the SLUB, it's better to put common
> > code to mm/slab_common.c.
> >
> 
> I think it might be moved there once we implement the SLUB counterpart
> but it is too early to define which part will be common.
> 
> >>
> >> Performance results highlighted no major changes:
> >>
> >> Netperf average on 10 runs:
> >>
> >> threads,base,change
> >> 16,576943.10,585905.90 (101.55%)
> >> 32,564082.00,569741.20 (101.00%)
> >> 48,558334.30,561851.20 (100.63%)
> >> 64,552025.20,556448.30 (100.80%)
> >> 80,552294.40,551743.10 (99.90%)
> >> 96,552435.30,547529.20 (99.11%)
> >> 112,551320.60,550183.20 (99.79%)
> >> 128,549138.30,550542.70 (100.26%)
> >> 144,549344.50,544529.10 (99.12%)
> >> 160,550360.80,539929.30 (98.10%)
> >>
> >> slab_test 1 run on boot. After is faster except for odd result on size
> >> 2048.
> >
> > Hmm... It's odd result. It adds more logic and it should
> > decrease performance. I guess it would be experimental error but
> > do you have any analysis about this result?
> >
> 
> I don't. I am glad to redo the test. I found that slab_test has very different
> result based on the heap state at the time of the test. If I run the
> test multiple
> times, I have really various results on with or without the mitigation (on
> dedicated hardware).
> 
> >>
> >> Before:
> >>
> >> Single thread testing
> >> =
> >> 1. Kmalloc: Repeatedly allocate then free test
> >> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
> >> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
> >> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
> >> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
> >> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
> >> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
> >> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
> >> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
> >> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
> >> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
> >> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
> >> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
> >> 2. Kmalloc: alloc/free test
> >> 1 times kmalloc(8)/kfree -> 118 cycles
> >> 1 times kmalloc(16)/kfree -> 118 cycles
> >> 1 times kmalloc(32)/kfree -> 118 cycles
> >> 1 times kmalloc(64)/kfree -> 121 cycles
> >> 1 times kmalloc(128)/kfree -> 118 cycles
> >> 1 times kmalloc(256)/kfree -> 115 cycles
> >> 1 times kmalloc(512)/kfree -> 115 cycles
> >> 1 times kmalloc(1024)/kfree -> 115 cycles
> >> 1 times kmalloc(2048)/kfree -> 115 cycles
> >> 1 times kmalloc(4096)/kfree -> 115 cycles
> >> 1 times kmalloc(8192)/kfree -> 115 cycles
> >> 1 times kmalloc(16384)/kfree -> 115 cycles
> >>
> >> After:
> >>
> >> Single thread testing
> >> =
> >> 1. Kmalloc: Repeatedly allocate then free test
> >> 1 times kmalloc(8) -> 99 cycles kfree -> 84 cycles
> >> 1 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-19 Thread Thomas Garnier
On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
> On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. This security feature reduces the predictability of the
>> kernel SLAB allocator against heap overflows rendering attacks much less
>> stable.
>
> I'm not familiar on security but it doesn't look much secure than
> before. Is there any other way to generate different sequence of freelist
> for each new set of pages? Current approach using pre-computed array will
> generate same sequence of freelist for all new set of pages having same size
> class. Is it sufficient?
>

I think it is sufficient. There is a tradeoff for performance. We could randomly
pick an object from the freelist every time (on slab_get_obj) but I
think it will
have significant impact (at least 3%).

>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available at that boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>
> If this feature will be applied to the SLUB, it's better to put common
> code to mm/slab_common.c.
>

I think it might be moved there once we implement the SLUB counterpart
but it is too early to define which part will be common.

>>
>> Performance results highlighted no major changes:
>>
>> Netperf average on 10 runs:
>>
>> threads,base,change
>> 16,576943.10,585905.90 (101.55%)
>> 32,564082.00,569741.20 (101.00%)
>> 48,558334.30,561851.20 (100.63%)
>> 64,552025.20,556448.30 (100.80%)
>> 80,552294.40,551743.10 (99.90%)
>> 96,552435.30,547529.20 (99.11%)
>> 112,551320.60,550183.20 (99.79%)
>> 128,549138.30,550542.70 (100.26%)
>> 144,549344.50,544529.10 (99.12%)
>> 160,550360.80,539929.30 (98.10%)
>>
>> slab_test 1 run on boot. After is faster except for odd result on size
>> 2048.
>
> Hmm... It's odd result. It adds more logic and it should
> decrease performance. I guess it would be experimental error but
> do you have any analysis about this result?
>

I don't. I am glad to redo the test. I found that slab_test has very different
result based on the heap state at the time of the test. If I run the
test multiple
times, I have really various results on with or without the mitigation (on
dedicated hardware).

>>
>> Before:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
>> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
>> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
>> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
>> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
>> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
>> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
>> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
>> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
>> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
>> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
>> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 118 cycles
>> 1 times kmalloc(16)/kfree -> 118 cycles
>> 1 times kmalloc(32)/kfree -> 118 cycles
>> 1 times kmalloc(64)/kfree -> 121 cycles
>> 1 times kmalloc(128)/kfree -> 118 cycles
>> 1 times kmalloc(256)/kfree -> 115 cycles
>> 1 times kmalloc(512)/kfree -> 115 cycles
>> 1 times kmalloc(1024)/kfree -> 115 cycles
>> 1 times kmalloc(2048)/kfree -> 115 cycles
>> 1 times kmalloc(4096)/kfree -> 115 cycles
>> 1 times kmalloc(8192)/kfree -> 115 cycles
>> 1 times kmalloc(16384)/kfree -> 115 cycles
>>
>> After:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 99 cycles kfree -> 84 cycles
>> 1 times kmalloc(16) -> 88 cycles kfree -> 83 cycles
>> 1 times kmalloc(32) -> 90 cycles kfree -> 81 cycles
>> 1 times kmalloc(64) -> 107 cycles kfree -> 97 cycles
>> 1 times kmalloc(128) -> 134 cycles kfree -> 89 cycles
>> 1 times kmalloc(256) -> 145 cycles 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-19 Thread Thomas Garnier
On Tue, Apr 19, 2016 at 12:15 AM, Joonsoo Kim  wrote:
> On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. This security feature reduces the predictability of the
>> kernel SLAB allocator against heap overflows rendering attacks much less
>> stable.
>
> I'm not familiar on security but it doesn't look much secure than
> before. Is there any other way to generate different sequence of freelist
> for each new set of pages? Current approach using pre-computed array will
> generate same sequence of freelist for all new set of pages having same size
> class. Is it sufficient?
>

I think it is sufficient. There is a tradeoff for performance. We could randomly
pick an object from the freelist every time (on slab_get_obj) but I
think it will
have significant impact (at least 3%).

>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy is available at that boot stage. In the worse case this function
>> will fallback to the get_random_bytes sub API.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>
> If this feature will be applied to the SLUB, it's better to put common
> code to mm/slab_common.c.
>

I think it might be moved there once we implement the SLUB counterpart
but it is too early to define which part will be common.

>>
>> Performance results highlighted no major changes:
>>
>> Netperf average on 10 runs:
>>
>> threads,base,change
>> 16,576943.10,585905.90 (101.55%)
>> 32,564082.00,569741.20 (101.00%)
>> 48,558334.30,561851.20 (100.63%)
>> 64,552025.20,556448.30 (100.80%)
>> 80,552294.40,551743.10 (99.90%)
>> 96,552435.30,547529.20 (99.11%)
>> 112,551320.60,550183.20 (99.79%)
>> 128,549138.30,550542.70 (100.26%)
>> 144,549344.50,544529.10 (99.12%)
>> 160,550360.80,539929.30 (98.10%)
>>
>> slab_test 1 run on boot. After is faster except for odd result on size
>> 2048.
>
> Hmm... It's odd result. It adds more logic and it should
> decrease performance. I guess it would be experimental error but
> do you have any analysis about this result?
>

I don't. I am glad to redo the test. I found that slab_test has very different
result based on the heap state at the time of the test. If I run the
test multiple
times, I have really various results on with or without the mitigation (on
dedicated hardware).

>>
>> Before:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
>> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
>> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
>> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
>> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
>> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
>> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
>> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
>> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
>> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
>> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
>> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
>> 2. Kmalloc: alloc/free test
>> 1 times kmalloc(8)/kfree -> 118 cycles
>> 1 times kmalloc(16)/kfree -> 118 cycles
>> 1 times kmalloc(32)/kfree -> 118 cycles
>> 1 times kmalloc(64)/kfree -> 121 cycles
>> 1 times kmalloc(128)/kfree -> 118 cycles
>> 1 times kmalloc(256)/kfree -> 115 cycles
>> 1 times kmalloc(512)/kfree -> 115 cycles
>> 1 times kmalloc(1024)/kfree -> 115 cycles
>> 1 times kmalloc(2048)/kfree -> 115 cycles
>> 1 times kmalloc(4096)/kfree -> 115 cycles
>> 1 times kmalloc(8192)/kfree -> 115 cycles
>> 1 times kmalloc(16384)/kfree -> 115 cycles
>>
>> After:
>>
>> Single thread testing
>> =
>> 1. Kmalloc: Repeatedly allocate then free test
>> 1 times kmalloc(8) -> 99 cycles kfree -> 84 cycles
>> 1 times kmalloc(16) -> 88 cycles kfree -> 83 cycles
>> 1 times kmalloc(32) -> 90 cycles kfree -> 81 cycles
>> 1 times kmalloc(64) -> 107 cycles kfree -> 97 cycles
>> 1 times kmalloc(128) -> 134 cycles kfree -> 89 cycles
>> 1 times kmalloc(256) -> 145 cycles kfree -> 97 cycles
>> 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-19 Thread Joonsoo Kim
On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. This security feature reduces the predictability of the
> kernel SLAB allocator against heap overflows rendering attacks much less
> stable.

I'm not familiar on security but it doesn't look much secure than
before. Is there any other way to generate different sequence of freelist
for each new set of pages? Current approach using pre-computed array will
generate same sequence of freelist for all new set of pages having same size
class. Is it sufficient?

> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available at that boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.

If this feature will be applied to the SLUB, it's better to put common
code to mm/slab_common.c.

> 
> Performance results highlighted no major changes:
> 
> Netperf average on 10 runs:
> 
> threads,base,change
> 16,576943.10,585905.90 (101.55%)
> 32,564082.00,569741.20 (101.00%)
> 48,558334.30,561851.20 (100.63%)
> 64,552025.20,556448.30 (100.80%)
> 80,552294.40,551743.10 (99.90%)
> 96,552435.30,547529.20 (99.11%)
> 112,551320.60,550183.20 (99.79%)
> 128,549138.30,550542.70 (100.26%)
> 144,549344.50,544529.10 (99.12%)
> 160,550360.80,539929.30 (98.10%)
> 
> slab_test 1 run on boot. After is faster except for odd result on size
> 2048.

Hmm... It's odd result. It adds more logic and it should
decrease performance. I guess it would be experimental error but
do you have any analysis about this result?

> 
> Before:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 118 cycles
> 1 times kmalloc(16)/kfree -> 118 cycles
> 1 times kmalloc(32)/kfree -> 118 cycles
> 1 times kmalloc(64)/kfree -> 121 cycles
> 1 times kmalloc(128)/kfree -> 118 cycles
> 1 times kmalloc(256)/kfree -> 115 cycles
> 1 times kmalloc(512)/kfree -> 115 cycles
> 1 times kmalloc(1024)/kfree -> 115 cycles
> 1 times kmalloc(2048)/kfree -> 115 cycles
> 1 times kmalloc(4096)/kfree -> 115 cycles
> 1 times kmalloc(8192)/kfree -> 115 cycles
> 1 times kmalloc(16384)/kfree -> 115 cycles
> 
> After:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 99 cycles kfree -> 84 cycles
> 1 times kmalloc(16) -> 88 cycles kfree -> 83 cycles
> 1 times kmalloc(32) -> 90 cycles kfree -> 81 cycles
> 1 times kmalloc(64) -> 107 cycles kfree -> 97 cycles
> 1 times kmalloc(128) -> 134 cycles kfree -> 89 cycles
> 1 times kmalloc(256) -> 145 cycles kfree -> 97 cycles
> 1 times kmalloc(512) -> 177 cycles kfree -> 116 cycles
> 1 times kmalloc(1024) -> 223 cycles kfree -> 151 cycles
> 1 times kmalloc(2048) -> 1429 cycles kfree -> 221 cycles
> 1 times kmalloc(4096) -> 720 cycles kfree -> 348 cycles
> 1 times kmalloc(8192) -> 788 cycles kfree -> 393 cycles
> 1 times kmalloc(16384) -> 867 cycles kfree -> 433 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 115 cycles
> 1 times kmalloc(16)/kfree -> 115 cycles
> 1 times kmalloc(32)/kfree -> 115 cycles
> 1 times kmalloc(64)/kfree -> 120 cycles
> 1 times kmalloc(128)/kfree -> 127 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 112 cycles
> 1 times 

Re: [PATCH v2] mm: SLAB freelist randomization

2016-04-19 Thread Joonsoo Kim
On Mon, Apr 18, 2016 at 10:14:39AM -0700, Thomas Garnier wrote:
> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. This security feature reduces the predictability of the
> kernel SLAB allocator against heap overflows rendering attacks much less
> stable.

I'm not familiar on security but it doesn't look much secure than
before. Is there any other way to generate different sequence of freelist
for each new set of pages? Current approach using pre-computed array will
generate same sequence of freelist for all new set of pages having same size
class. Is it sufficient?

> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy is available at that boot stage. In the worse case this function
> will fallback to the get_random_bytes sub API.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.

If this feature will be applied to the SLUB, it's better to put common
code to mm/slab_common.c.

> 
> Performance results highlighted no major changes:
> 
> Netperf average on 10 runs:
> 
> threads,base,change
> 16,576943.10,585905.90 (101.55%)
> 32,564082.00,569741.20 (101.00%)
> 48,558334.30,561851.20 (100.63%)
> 64,552025.20,556448.30 (100.80%)
> 80,552294.40,551743.10 (99.90%)
> 96,552435.30,547529.20 (99.11%)
> 112,551320.60,550183.20 (99.79%)
> 128,549138.30,550542.70 (100.26%)
> 144,549344.50,544529.10 (99.12%)
> 160,550360.80,539929.30 (98.10%)
> 
> slab_test 1 run on boot. After is faster except for odd result on size
> 2048.

Hmm... It's odd result. It adds more logic and it should
decrease performance. I guess it would be experimental error but
do you have any analysis about this result?

> 
> Before:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 137 cycles kfree -> 126 cycles
> 1 times kmalloc(16) -> 118 cycles kfree -> 119 cycles
> 1 times kmalloc(32) -> 112 cycles kfree -> 119 cycles
> 1 times kmalloc(64) -> 126 cycles kfree -> 123 cycles
> 1 times kmalloc(128) -> 135 cycles kfree -> 131 cycles
> 1 times kmalloc(256) -> 165 cycles kfree -> 104 cycles
> 1 times kmalloc(512) -> 174 cycles kfree -> 126 cycles
> 1 times kmalloc(1024) -> 242 cycles kfree -> 160 cycles
> 1 times kmalloc(2048) -> 478 cycles kfree -> 239 cycles
> 1 times kmalloc(4096) -> 747 cycles kfree -> 364 cycles
> 1 times kmalloc(8192) -> 774 cycles kfree -> 404 cycles
> 1 times kmalloc(16384) -> 849 cycles kfree -> 430 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 118 cycles
> 1 times kmalloc(16)/kfree -> 118 cycles
> 1 times kmalloc(32)/kfree -> 118 cycles
> 1 times kmalloc(64)/kfree -> 121 cycles
> 1 times kmalloc(128)/kfree -> 118 cycles
> 1 times kmalloc(256)/kfree -> 115 cycles
> 1 times kmalloc(512)/kfree -> 115 cycles
> 1 times kmalloc(1024)/kfree -> 115 cycles
> 1 times kmalloc(2048)/kfree -> 115 cycles
> 1 times kmalloc(4096)/kfree -> 115 cycles
> 1 times kmalloc(8192)/kfree -> 115 cycles
> 1 times kmalloc(16384)/kfree -> 115 cycles
> 
> After:
> 
> Single thread testing
> =
> 1. Kmalloc: Repeatedly allocate then free test
> 1 times kmalloc(8) -> 99 cycles kfree -> 84 cycles
> 1 times kmalloc(16) -> 88 cycles kfree -> 83 cycles
> 1 times kmalloc(32) -> 90 cycles kfree -> 81 cycles
> 1 times kmalloc(64) -> 107 cycles kfree -> 97 cycles
> 1 times kmalloc(128) -> 134 cycles kfree -> 89 cycles
> 1 times kmalloc(256) -> 145 cycles kfree -> 97 cycles
> 1 times kmalloc(512) -> 177 cycles kfree -> 116 cycles
> 1 times kmalloc(1024) -> 223 cycles kfree -> 151 cycles
> 1 times kmalloc(2048) -> 1429 cycles kfree -> 221 cycles
> 1 times kmalloc(4096) -> 720 cycles kfree -> 348 cycles
> 1 times kmalloc(8192) -> 788 cycles kfree -> 393 cycles
> 1 times kmalloc(16384) -> 867 cycles kfree -> 433 cycles
> 2. Kmalloc: alloc/free test
> 1 times kmalloc(8)/kfree -> 115 cycles
> 1 times kmalloc(16)/kfree -> 115 cycles
> 1 times kmalloc(32)/kfree -> 115 cycles
> 1 times kmalloc(64)/kfree -> 120 cycles
> 1 times kmalloc(128)/kfree -> 127 cycles
> 1 times kmalloc(256)/kfree -> 119 cycles
> 1 times kmalloc(512)/kfree -> 112 cycles
> 1 times