Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-29 Thread Joonsoo Kim
On Wed, Apr 27, 2016 at 10:39:29AM -0500, Christoph Lameter wrote:
> On Tue, 26 Apr 2016, Andrew Morton wrote:
> 
> > : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> > : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> > : identifier could be used for implementing randomisation in
> > : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
> >
> > but this pearl appeared to pass unnoticed.
> 
> OK, let's add SLAB here and then use this option for the other allocators
> as well.
> 
> > > + /* If it fails, we will just use the global lists */
> > > + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> > > + if (!cachep->random_seq)
> > > + return -ENOMEM;
> >
> > OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> > instead ;)
> >
> > Questions for slab maintainers:
> >
> > What's going on with the gfp_flags in there?  kmem_cache_init_late()
> > passes GFP_NOWAIT into enable_cpucache().
> >
> > a) why the heck does it do that?  It's __init code!
> 
> enable_cpucache() was called when a slab cache was reconfigured by writing
> to /proc/slabinfo. That was changed a while back when the memcg changes were
> made to slab, so now it's OK for it to be init code.
> 
> > Finally, all callers of enable_cpucache() (and hence of
> > cache_random_seq_create()) are __init, so we're unnecessarily bloating
> > up vmlinux.  Could someone please take a look at this as a separate
> > thing?
> 
> Hmmm. Well if that is the case then lots of stuff could be straightened
> out. Joonsoo?
> 

As I mentioned in another thread, enable_cpucache() can be called whenever
a kmem_cache is created, so it should not be __init.
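Roughly, the non-__init path looks like this (a sketch of the mm/slab.c call
chain around v4.6, names approximate, not verbatim code):

/*
 * kmem_cache_create()          <- can run at any time, e.g. on module load
 *   __kmem_cache_create()
 *     setup_cpu_cache()
 *       enable_cpucache()      <- so it cannot be __init
 */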

Thanks.


Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-29 Thread Joonsoo Kim
On Wed, Apr 27, 2016 at 10:39:29AM -0500, Christoph Lameter wrote:
> On Tue, 26 Apr 2016, Andrew Morton wrote:
> 
> > : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> > : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> > : identifier could be used for implementing randomisation in
> > : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
> >
> > but this pearl appeared to pass unnoticed.
> 
> Ok. lets add SLAB here and then use this option for the other allocators
> as well.
> 
> > > + /* If it fails, we will just use the global lists */
> > > + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> > > + if (!cachep->random_seq)
> > > + return -ENOMEM;
> >
> > OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> > instead ;)
> >
> > Questions for slab maintainers:
> >
> > What's going on with the gfp_flags in there?  kmem_cache_init_late()
> > passes GFP_NOWAIT into enable_cpucache().
> >
> > a) why the heck does it do that?  It's __init code!
> 
> enable_cpucache() was called when a slab cache was reconfigured by writing to 
> /proc/slabinfo.
> That was changed awhile back when the memcg changes were made ot slab. So
> now its ok to be made init code.
> 
> > Finally, all callers of enable_cpucache() (and hence of
> > cache_random_seq_create()) are __init, so we're unnecessarily bloating
> > up vmlinux.  Could someone please take a look at this as a separate
> > thing?
> 
> Hmmm. Well if that is the case then lots of stuff could be straightened
> out. Joonsoo?
> 

As I mentioned in other thread, enable_cpucache() can be called
whenever kmem_cache is created. It should not be __init.

Thanks.


Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-27 Thread Christoph Lameter
On Tue, 26 Apr 2016, Andrew Morton wrote:

> : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> : identifier could be used for implementing randomisation in
> : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>
> but this pearl appeared to pass unnoticed.

OK, let's add SLAB here and then use this option for the other allocators
as well.

> > +   /* If it fails, we will just use the global lists */
> > +   cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> > +   if (!cachep->random_seq)
> > +   return -ENOMEM;
>
> OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> instead ;)
>
> Questions for slab maintainers:
>
> What's going on with the gfp_flags in there?  kmem_cache_init_late()
> passes GFP_NOWAIT into enable_cpucache().
>
> a) why the heck does it do that?  It's __init code!

enable_cpucache() was called when a slab cache was reconfigured by writing
to /proc/slabinfo. That was changed a while back when the memcg changes were
made to slab, so now it's OK for it to be init code.

> Finally, all callers of enable_cpucache() (and hence of
> cache_random_seq_create()) are __init, so we're unnecessarily bloating
> up vmlinux.  Could someone please take a look at this as a separate
> thing?

Hmmm. Well if that is the case then lots of stuff could be straightened
out. Joonsoo?



Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-27 Thread Christoph Lameter
On Tue, 26 Apr 2016, Thomas Garnier wrote:

> It was discussed a bit before. The intent is to have a similar feature
> for other kernel heaps (I know it is possible for SLUB). That's why I
> think it makes sense to have a similar config name used for all
> allocators.

Please use CONFIG_SLAB_FREELIST_RANDOM to signify that it is for all slab
allocators, not SLAB specific.



Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-26 Thread Joonsoo Kim
On Tue, Apr 26, 2016 at 04:17:43PM -0700, Andrew Morton wrote:
> On Tue, 26 Apr 2016 09:21:10 -0700 Thomas Garnier  wrote:
> 
> > Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> > SLAB freelist. The list is randomized during initialization of a new set
> > of pages. The order on different freelist sizes is pre-computed at boot
> > for performance. Each kmem_cache has its own randomized freelist. Before
> > pre-computed lists are available freelists are generated
> > dynamically. This security feature reduces the predictability of the
> > kernel SLAB allocator against heap overflows rendering attacks much less
> > stable.
> > 
> > For example this attack against SLUB (also applicable against SLAB)
> > would be affected:
> > https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> > 
> > Also, since v4.6 the freelist was moved to the end of the SLAB. It means
> > a controllable heap is opened to new attacks not yet publicly discussed.
> > A kernel heap overflow can be transformed to multiple use-after-free.
> > This feature makes this type of attack harder too.
> > 
> > To generate entropy, we use get_random_bytes_arch because 0 bits of
> > entropy are available in the boot stage. In the worst case this function
> > will fall back to the get_random_bytes sub API. We also generate a shift
> > random number to shift pre-computed freelist for each new set of pages.
> > 
> > The config option name is not specific to the SLAB as this approach will
> > be extended to other allocators like SLUB.
> > 
> > Performance results highlighted no major changes:
> > 
> > Hackbench (running 90 10 times):
> > 
> > Before average: 0.0698
> > After average: 0.0663 (-5.01%)
> > 
> > slab_test 1 run on boot. Difference only seen on the 2048 size test
> > being the worst-case scenario covered by freelist randomization. New
> > slab pages are constantly being created on the 1 allocations.
> > Variance should be mainly due to getting new pages every few
> > allocations.
> > 
> > ...
> >
> > --- a/include/linux/slab_def.h
> > +++ b/include/linux/slab_def.h
> > @@ -80,6 +80,10 @@ struct kmem_cache {
> > struct kasan_cache kasan_info;
> >  #endif
> >  
> > +#ifdef CONFIG_FREELIST_RANDOM
> > +   void *random_seq;
> > +#endif
> > +
> > struct kmem_cache_node *node[MAX_NUMNODES];
> >  };
> >  
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 0c66640..73453d0 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1742,6 +1742,15 @@ config SLOB
> >  
> >  endchoice
> >  
> > +config FREELIST_RANDOM
> > +   default n
> > +   depends on SLAB
> > +   bool "SLAB freelist randomization"
> > +   help
> > + Randomizes the freelist order used on creating new SLABs. This
> > + security feature reduces the predictability of the kernel slab
> > + allocator against heap overflows.
> 
> Against the v2 patch I didst observe:
> 
> : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> : identifier could be used for implementing randomisation in
> : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
> 
> but this pearl appeared to pass unnoticed.
> 
> >  config SLUB_CPU_PARTIAL
> > default y
> > depends on SLUB && SMP
> > diff --git a/mm/slab.c b/mm/slab.c
> > index b82ee6b..0ed728a 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -1230,6 +1230,61 @@ static void __init set_up_node(struct kmem_cache 
> > *cachep, int index)
> > }
> >  }
> >  
> > +#ifdef CONFIG_FREELIST_RANDOM
> > +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
> > *list,
> > +   size_t count)
> > +{
> > +   size_t i;
> > +   unsigned int rand;
> > +
> > +   for (i = 0; i < count; i++)
> > +   list[i] = i;
> > +
> > +   /* Fisher-Yates shuffle */
> > +   for (i = count - 1; i > 0; i--) {
> > +   rand = prandom_u32_state(state);
> > +   rand %= (i + 1);
> > +   swap(list[i], list[rand]);
> > +   }
> > +}
> > +
> > +/* Create a random sequence per cache */
> > +static int cache_random_seq_create(struct kmem_cache *cachep)
> > +{
> > +   unsigned int seed, count = cachep->num;
> > +   struct rnd_state state;
> > +
> > +   if (count < 2)
> > +   return 0;
> > +
> > +   /* If it fails, we will just use the global lists */
> > +   cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> > +   if (!cachep->random_seq)
> > +   return -ENOMEM;
> 
> OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> instead ;)
> 
> Questions for slab maintainers:
> 
> What's going on with the gfp_flags in there?  kmem_cache_init_late()
> passes GFP_NOWAIT into enable_cpucache().
> 
> a) why the heck does it do that?  It's __init code!

Until some point during boot-up, we should not enable interrupts.
In the slab subsystem, if we use __GFP_DIRECT_RECLAIM, interrupts will be
enabled when allocating a new slab page. GFP_NOWAIT does not include
__GFP_DIRECT_RECLAIM, so it avoids that.
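A simplified sketch of the pattern described above (the function name and
details here are illustrative, not the exact mm/slab.c code):

#include <linux/gfp.h>      /* gfpflags_allow_blocking(), alloc_pages() */
#include <linux/irqflags.h> /* local_irq_enable(), local_irq_disable() */

/*
 * If the gfp mask allows direct reclaim, SLAB re-enables local interrupts
 * around the page allocation so that it may sleep.  Early in boot that is
 * not safe yet, which is why GFP_NOWAIT (no __GFP_DIRECT_RECLAIM) is used.
 */
static struct page *grow_slab_page(struct kmem_cache *cachep, gfp_t flags)
{
        struct page *page;

        if (gfpflags_allow_blocking(flags))
                local_irq_enable();

        page = alloc_pages(flags, cachep->gfporder);

        if (gfpflags_allow_blocking(flags))
                local_irq_disable();

        return page;
}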

Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-26 Thread Thomas Garnier
On Tue, Apr 26, 2016 at 4:17 PM, Andrew Morton wrote:
> On Tue, 26 Apr 2016 09:21:10 -0700 Thomas Garnier  wrote:
>
>> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
>> SLAB freelist. The list is randomized during initialization of a new set
>> of pages. The order on different freelist sizes is pre-computed at boot
>> for performance. Each kmem_cache has its own randomized freelist. Before
>> pre-computed lists are available freelists are generated
>> dynamically. This security feature reduces the predictability of the
>> kernel SLAB allocator against heap overflows rendering attacks much less
>> stable.
>>
>> For example this attack against SLUB (also applicable against SLAB)
>> would be affected:
>> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
>>
>> Also, since v4.6 the freelist was moved to the end of the SLAB. It means
>> a controllable heap is opened to new attacks not yet publicly discussed.
>> A kernel heap overflow can be transformed to multiple use-after-free.
>> This feature makes this type of attack harder too.
>>
>> To generate entropy, we use get_random_bytes_arch because 0 bits of
>> entropy are available in the boot stage. In the worst case this function
>> will fall back to the get_random_bytes sub API. We also generate a shift
>> random number to shift pre-computed freelist for each new set of pages.
>>
>> The config option name is not specific to the SLAB as this approach will
>> be extended to other allocators like SLUB.
>>
>> Performance results highlighted no major changes:
>>
>> Hackbench (running 90 10 times):
>>
>> Before average: 0.0698
>> After average: 0.0663 (-5.01%)
>>
>> slab_test 1 run on boot. Difference only seen on the 2048 size test
>> being the worst-case scenario covered by freelist randomization. New
>> slab pages are constantly being created on the 1 allocations.
>> Variance should be mainly due to getting new pages every few
>> allocations.
>>
>> ...
>>
>> --- a/include/linux/slab_def.h
>> +++ b/include/linux/slab_def.h
>> @@ -80,6 +80,10 @@ struct kmem_cache {
>>   struct kasan_cache kasan_info;
>>  #endif
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>> + void *random_seq;
>> +#endif
>> +
>>   struct kmem_cache_node *node[MAX_NUMNODES];
>>  };
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 0c66640..73453d0 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1742,6 +1742,15 @@ config SLOB
>>
>>  endchoice
>>
>> +config FREELIST_RANDOM
>> + default n
>> + depends on SLAB
>> + bool "SLAB freelist randomization"
>> + help
>> +   Randomizes the freelist order used on creating new SLABs. This
>> +   security feature reduces the predictability of the kernel slab
>> +   allocator against heap overflows.
>
> Against the v2 patch I didst observe:
>
> : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> : identifier could be used for implementing randomisation in
> : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
>
> but this pearl appeared to pass unnoticed.
>

It was discussed a bit before. The intent is to have a similar feature
for other kernel heaps (I know it is possible for SLUB). That's why I
think it makes sense to have a similar config name used for all
allocators.

>>  config SLUB_CPU_PARTIAL
>>   default y
>>   depends on SLUB && SMP
>> diff --git a/mm/slab.c b/mm/slab.c
>> index b82ee6b..0ed728a 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -1230,6 +1230,61 @@ static void __init set_up_node(struct kmem_cache 
>> *cachep, int index)
>>   }
>>  }
>>
>> +#ifdef CONFIG_FREELIST_RANDOM
>> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t 
>> *list,
>> + size_t count)
>> +{
>> + size_t i;
>> + unsigned int rand;
>> +
>> + for (i = 0; i < count; i++)
>> + list[i] = i;
>> +
>> + /* Fisher-Yates shuffle */
>> + for (i = count - 1; i > 0; i--) {
>> + rand = prandom_u32_state(state);
>> + rand %= (i + 1);
>> + swap(list[i], list[rand]);
>> + }
>> +}
>> +
>> +/* Create a random sequence per cache */
>> +static int cache_random_seq_create(struct kmem_cache *cachep)
>> +{
>> + unsigned int seed, count = cachep->num;
>> + struct rnd_state state;
>> +
>> + if (count < 2)
>> + return 0;
>> +
>> + /* If it fails, we will just use the global lists */
>> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), 
>> GFP_KERNEL);
>> + if (!cachep->random_seq)
>> + return -ENOMEM;
>
> OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> instead ;)
>

Yes, as Christoph asked.

> Questions for slab maintainers:
>
> What's going on with the gfp_flags in there?  kmem_cache_init_late()
> passes GFP_NOWAIT into enable_cpucache().
>
> a) why the heck does it do that?  It's __init code!
>
> b) if there's a 

Re: [PATCH v4] mm: SLAB freelist randomization

2016-04-26 Thread Andrew Morton
On Tue, 26 Apr 2016 09:21:10 -0700 Thomas Garnier  wrote:

> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> SLAB freelist. The list is randomized during initialization of a new set
> of pages. The order on different freelist sizes is pre-computed at boot
> for performance. Each kmem_cache has its own randomized freelist. Before
> pre-computed lists are available freelists are generated
> dynamically. This security feature reduces the predictability of the
> kernel SLAB allocator against heap overflows rendering attacks much less
> stable.
> 
> For example this attack against SLUB (also applicable against SLAB)
> would be affected:
> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> 
> Also, since v4.6 the freelist was moved to the end of the SLAB. It means
> a controllable heap is opened to new attacks not yet publicly discussed.
> A kernel heap overflow can be transformed to multiple use-after-free.
> This feature makes this type of attack harder too.
> 
> To generate entropy, we use get_random_bytes_arch because 0 bits of
> entropy are available in the boot stage. In the worst case this function
> will fall back to the get_random_bytes sub API. We also generate a shift
> random number to shift pre-computed freelist for each new set of pages.
> 
> The config option name is not specific to the SLAB as this approach will
> be extended to other allocators like SLUB.
> 
> Performance results highlighted no major changes:
> 
> Hackbench (running 90 10 times):
> 
> Before average: 0.0698
> After average: 0.0663 (-5.01%)
> 
> slab_test 1 run on boot. Difference only seen on the 2048 size test
> being the worst-case scenario covered by freelist randomization. New
> slab pages are constantly being created on the 1 allocations.
> Variance should be mainly due to getting new pages every few
> allocations.
> 
> ...
>
> --- a/include/linux/slab_def.h
> +++ b/include/linux/slab_def.h
> @@ -80,6 +80,10 @@ struct kmem_cache {
>   struct kasan_cache kasan_info;
>  #endif
>  
> +#ifdef CONFIG_FREELIST_RANDOM
> + void *random_seq;
> +#endif
> +
>   struct kmem_cache_node *node[MAX_NUMNODES];
>  };
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index 0c66640..73453d0 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1742,6 +1742,15 @@ config SLOB
>  
>  endchoice
>  
> +config FREELIST_RANDOM
> + default n
> + depends on SLAB
> + bool "SLAB freelist randomization"
> + help
> +   Randomizes the freelist order used on creating new SLABs. This
> +   security feature reduces the predictability of the kernel slab
> +   allocator against heap overflows.

Against the v2 patch I didst observe:

: CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
: CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
: identifier could be used for implementing randomisation in
: slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?

but this pearl appeared to pass unnoticed.

>  config SLUB_CPU_PARTIAL
>   default y
>   depends on SLUB && SMP
> diff --git a/mm/slab.c b/mm/slab.c
> index b82ee6b..0ed728a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1230,6 +1230,61 @@ static void __init set_up_node(struct kmem_cache 
> *cachep, int index)
>   }
>  }
>  
> +#ifdef CONFIG_FREELIST_RANDOM
> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t *list,
> + size_t count)
> +{
> + size_t i;
> + unsigned int rand;
> +
> + for (i = 0; i < count; i++)
> + list[i] = i;
> +
> + /* Fisher-Yates shuffle */
> + for (i = count - 1; i > 0; i--) {
> + rand = prandom_u32_state(state);
> + rand %= (i + 1);
> + swap(list[i], list[rand]);
> + }
> +}
> +
> +/* Create a random sequence per cache */
> +static int cache_random_seq_create(struct kmem_cache *cachep)
> +{
> + unsigned int seed, count = cachep->num;
> + struct rnd_state state;
> +
> + if (count < 2)
> + return 0;
> +
> + /* If it fails, we will just use the global lists */
> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> + if (!cachep->random_seq)
> + return -ENOMEM;

OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
instead ;)

Questions for slab maintainers:

What's going on with the gfp_flags in there?  kmem_cache_init_late()
passes GFP_NOWAIT into enable_cpucache().

a) why the heck does it do that?  It's __init code!

b) if there's a legit reason then your new cache_random_seq_create()
should be getting its gfp_t from its caller, rather than blindly
assuming GFP_KERNEL.

c) kmem_cache_init_late() goes BUG on ENOMEM.  Generally that's OK in
__init code: we assume infinite memory during bootup.  But it's really
quite weird to use GFP_NOWAIT and then to go BUG if GFP_NOWAIT had its
predictable outcome (ie: failure).

Finally, all callers of enable_cpucache() (and hence of
cache_random_seq_create()) are __init, so we're unnecessarily bloating
up vmlinux.  Could someone please take a look at this as a separate
thing?
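For point (b), a minimal sketch of what plumbing the caller's gfp through
could look like (reusing the helper names from the v4 patch; an illustration,
not the actual fix):

/* Create a random sequence per cache, honouring the caller's gfp mask */
static int cache_random_seq_create(struct kmem_cache *cachep, gfp_t gfp)
{
        unsigned int count = cachep->num;

        if (count < 2)
                return 0;

        /* If it fails, we will just use the global lists */
        cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), gfp);
        if (!cachep->random_seq)
                return -ENOMEM;

        /* ... seed and shuffle as in the patch ... */
        return 0;
}

enable_cpucache() would then pass along whatever gfp it received, so the
GFP_NOWAIT from kmem_cache_init_late() is respected instead of being silently
upgraded to GFP_KERNEL.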