Re: [RFC v2 2/2] mm: SLUB Freelist randomization
On Wed, May 25, 2016 at 6:49 PM, Joonsoo Kim wrote:
> 2016-05-25 6:15 GMT+09:00 Thomas Garnier:
>> Implements freelist randomization for the SLUB allocator. It was
>> previously implemented for the SLAB allocator. Both use the same
>> configuration option (CONFIG_SLAB_FREELIST_RANDOM).
>>
>> The list is randomized during initialization of a new set of pages. The
>> order on different freelist sizes is pre-computed at boot for
>> performance. Each kmem_cache has its own randomized freelist. This
>> security feature reduces the predictability of the kernel SLUB allocator
>> against heap overflows, rendering attacks much less stable.
>>
>> For example, these attacks exploit the predictability of the heap:
>> - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
>> - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)
>>
>> Performance results:
>>
>> slab_test impact is between 3% and 4% on average:
>>
>> Before:
>>
>> Single thread testing
>> =====================
>> 1. Kmalloc: Repeatedly allocate then free test
>> 10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
>> 10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
>> 10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
>> 10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
>> 10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
>> 10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
>> 10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
>> 10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
>> 10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
>> 10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
>> 2. Kmalloc: alloc/free test
>> 10 times kmalloc(8)/kfree -> 70 cycles
>> 10 times kmalloc(16)/kfree -> 70 cycles
>> 10 times kmalloc(32)/kfree -> 70 cycles
>> 10 times kmalloc(64)/kfree -> 70 cycles
>> 10 times kmalloc(128)/kfree -> 70 cycles
>> 10 times kmalloc(256)/kfree -> 69 cycles
>> 10 times kmalloc(512)/kfree -> 70 cycles
>> 10 times kmalloc(1024)/kfree -> 73 cycles
>> 10 times kmalloc(2048)/kfree -> 72 cycles
>> 10 times kmalloc(4096)/kfree -> 71 cycles
>>
>> After:
>>
>> Single thread testing
>> =====================
>> 1. Kmalloc: Repeatedly allocate then free test
>> 10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
>> 10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
>> 10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
>> 10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
>> 10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
>> 10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
>> 10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
>> 10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
>> 10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
>> 10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
>> 2. Kmalloc: alloc/free test
>> 10 times kmalloc(8)/kfree -> 66 cycles
>> 10 times kmalloc(16)/kfree -> 66 cycles
>> 10 times kmalloc(32)/kfree -> 66 cycles
>> 10 times kmalloc(64)/kfree -> 66 cycles
>> 10 times kmalloc(128)/kfree -> 65 cycles
>> 10 times kmalloc(256)/kfree -> 67 cycles
>> 10 times kmalloc(512)/kfree -> 67 cycles
>> 10 times kmalloc(1024)/kfree -> 64 cycles
>> 10 times kmalloc(2048)/kfree -> 67 cycles
>> 10 times kmalloc(4096)/kfree -> 67 cycles
>>
>> Kernbench, before:
>>
>> Average Optimal load -j 12 Run (std deviation):
>> Elapsed Time 101.873 (1.16069)
>> User Time 1045.22 (1.60447)
>> System Time 88.969 (0.559195)
>> Percent CPU 1112.9 (13.8279)
>> Context Switches 189140 (2282.15)
>> Sleeps 99008.6 (768.091)
>>
>> After:
>>
>> Average Optimal load -j 12 Run (std deviation):
>> Elapsed Time 102.47 (0.562732)
>> User Time 1045.3 (1.34263)
>> System Time 88.311 (0.342554)
>> Percent CPU 1105.8 (6.49444)
>> Context Switches 189081 (2355.78)
>> Sleeps 99231.5 (800.358)
>>
>> Signed-off-by: Thomas Garnier
>> ---
>> Based on 0e01df100b6bf22a1de61b66657502a6454153c5
>> ---
>>  include/linux/slub_def.h |   8 +++
>>  init/Kconfig             |   4 +-
>>  mm/slub.c                | 133 ++++++++++++++++++++++++++++++++++++++++----
>>  3 files changed, 136 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
>> index 665cd0c..22d487e 100644
>> --- a/include/linux/slub_def.h
>> +++ b/include/linux/slub_def.h
>> @@ -56,6 +56,9 @@ struct kmem_cache_order_objects {
>>         unsigned long x;
>>  };
>>
>> +/* Index used for freelist randomization */
>> +typedef unsigned int freelist_idx_t;
>> +
>>  /*
>>   * Slab cache management.
>>   */
>> @@ -99,6 +102,11 @@ struct kmem_cache {
>>          */
>>         int remote_node_defrag_ratio;
>>  #endif
>> +
>> +#ifdef CONFIG_SLAB_FREELIST_RANDOM
>> +       freelist_idx_t *random_seq;
>> +#endif
>> +
>>         struct kmem_cache_node *node[MAX_NUMNODES];
>>  };
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index
Re: [RFC v2 2/2] mm: SLUB Freelist randomization
On Wed, May 25, 2016 at 3:25 PM, Kees Cook wrote:
> On Tue, May 24, 2016 at 2:15 PM, Thomas Garnier wrote:
>> Implements freelist randomization for the SLUB allocator. It was
>> previously implemented for the SLAB allocator. Both use the same
>> configuration option (CONFIG_SLAB_FREELIST_RANDOM).
>>
>> The list is randomized during initialization of a new set of pages. The
>> order on different freelist sizes is pre-computed at boot for
>> performance. Each kmem_cache has its own randomized freelist. This
>> security feature reduces the predictability of the kernel SLUB allocator
>> against heap overflows, rendering attacks much less stable.
>>
>> For example, these attacks exploit the predictability of the heap:
>> - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
>> - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)
>>
>> Performance results:
>>
>> slab_test impact is between 3% and 4% on average:
>
> Seems like slab_test is pretty intensive (so the impact appears
> higher). On a more "regular" load like kernbench, the impact seems to
> be almost 0. Is that accurate?

Yes, because the slab_test run is more intensive on a single thread, so
it shows a higher perf impact than global testing would. The overall
impact on the system is of course much smaller. I will detail that in
the performance notes.

> Regardless, please consider both patches:
>
> Reviewed-by: Kees Cook
>
> -Kees

>> Before:
>>
>> Single thread testing
>> =====================
>> 1. Kmalloc: Repeatedly allocate then free test
>> 10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
>> 10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
>> 10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
>> 10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
>> 10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
>> 10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
>> 10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
>> 10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
>> 10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
>> 10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
>> 2. Kmalloc: alloc/free test
>> 10 times kmalloc(8)/kfree -> 70 cycles
>> 10 times kmalloc(16)/kfree -> 70 cycles
>> 10 times kmalloc(32)/kfree -> 70 cycles
>> 10 times kmalloc(64)/kfree -> 70 cycles
>> 10 times kmalloc(128)/kfree -> 70 cycles
>> 10 times kmalloc(256)/kfree -> 69 cycles
>> 10 times kmalloc(512)/kfree -> 70 cycles
>> 10 times kmalloc(1024)/kfree -> 73 cycles
>> 10 times kmalloc(2048)/kfree -> 72 cycles
>> 10 times kmalloc(4096)/kfree -> 71 cycles
>>
>> After:
>>
>> Single thread testing
>> =====================
>> 1. Kmalloc: Repeatedly allocate then free test
>> 10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
>> 10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
>> 10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
>> 10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
>> 10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
>> 10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
>> 10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
>> 10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
>> 10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
>> 10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
>> 2. Kmalloc: alloc/free test
>> 10 times kmalloc(8)/kfree -> 66 cycles
>> 10 times kmalloc(16)/kfree -> 66 cycles
>> 10 times kmalloc(32)/kfree -> 66 cycles
>> 10 times kmalloc(64)/kfree -> 66 cycles
>> 10 times kmalloc(128)/kfree -> 65 cycles
>> 10 times kmalloc(256)/kfree -> 67 cycles
>> 10 times kmalloc(512)/kfree -> 67 cycles
>> 10 times kmalloc(1024)/kfree -> 64 cycles
>> 10 times kmalloc(2048)/kfree -> 67 cycles
>> 10 times kmalloc(4096)/kfree -> 67 cycles
>>
>> Kernbench, before:
>>
>> Average Optimal load -j 12 Run (std deviation):
>> Elapsed Time 101.873 (1.16069)
>> User Time 1045.22 (1.60447)
>> System Time 88.969 (0.559195)
>> Percent CPU 1112.9 (13.8279)
>> Context Switches 189140 (2282.15)
>> Sleeps 99008.6 (768.091)
>>
>> After:
>>
>> Average Optimal load -j 12 Run (std deviation):
>> Elapsed Time 102.47 (0.562732)
>> User Time 1045.3 (1.34263)
>> System Time 88.311 (0.342554)
>> Percent CPU 1105.8 (6.49444)
>> Context Switches 189081 (2355.78)
>> Sleeps 99231.5 (800.358)
>>
>> Signed-off-by: Thomas Garnier
>> ---
>> Based on 0e01df100b6bf22a1de61b66657502a6454153c5
>> ---
>>  include/linux/slub_def.h |   8 +++
>>  init/Kconfig             |   4 +-
>>  mm/slub.c                | 133 ++++++++++++++++++++++++++++++++++++++++----
>>  3 files changed, 136 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
>> index 665cd0c..22d487e 100644
>> --- a/include/linux/slub_def.h
>> +++ b/include/linux/slub_def.h
>> @@ -56,6 +56,9 @@ struct
Re: [RFC v2 2/2] mm: SLUB Freelist randomization
2016-05-25 6:15 GMT+09:00 Thomas Garnier:
> Implements freelist randomization for the SLUB allocator. It was
> previously implemented for the SLAB allocator. Both use the same
> configuration option (CONFIG_SLAB_FREELIST_RANDOM).
>
> The list is randomized during initialization of a new set of pages. The
> order on different freelist sizes is pre-computed at boot for
> performance. Each kmem_cache has its own randomized freelist. This
> security feature reduces the predictability of the kernel SLUB allocator
> against heap overflows, rendering attacks much less stable.
>
> For example, these attacks exploit the predictability of the heap:
> - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
> - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)
>
> Performance results:
>
> slab_test impact is between 3% and 4% on average:
>
> Before:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
> 10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
> 10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
> 10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
> 10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
> 10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
> 10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
> 10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
> 10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
> 10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 70 cycles
> 10 times kmalloc(16)/kfree -> 70 cycles
> 10 times kmalloc(32)/kfree -> 70 cycles
> 10 times kmalloc(64)/kfree -> 70 cycles
> 10 times kmalloc(128)/kfree -> 70 cycles
> 10 times kmalloc(256)/kfree -> 69 cycles
> 10 times kmalloc(512)/kfree -> 70 cycles
> 10 times kmalloc(1024)/kfree -> 73 cycles
> 10 times kmalloc(2048)/kfree -> 72 cycles
> 10 times kmalloc(4096)/kfree -> 71 cycles
>
> After:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
> 10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
> 10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
> 10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
> 10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
> 10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
> 10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
> 10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
> 10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
> 10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 66 cycles
> 10 times kmalloc(16)/kfree -> 66 cycles
> 10 times kmalloc(32)/kfree -> 66 cycles
> 10 times kmalloc(64)/kfree -> 66 cycles
> 10 times kmalloc(128)/kfree -> 65 cycles
> 10 times kmalloc(256)/kfree -> 67 cycles
> 10 times kmalloc(512)/kfree -> 67 cycles
> 10 times kmalloc(1024)/kfree -> 64 cycles
> 10 times kmalloc(2048)/kfree -> 67 cycles
> 10 times kmalloc(4096)/kfree -> 67 cycles
>
> Kernbench, before:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 101.873 (1.16069)
> User Time 1045.22 (1.60447)
> System Time 88.969 (0.559195)
> Percent CPU 1112.9 (13.8279)
> Context Switches 189140 (2282.15)
> Sleeps 99008.6 (768.091)
>
> After:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 102.47 (0.562732)
> User Time 1045.3 (1.34263)
> System Time 88.311 (0.342554)
> Percent CPU 1105.8 (6.49444)
> Context Switches 189081 (2355.78)
> Sleeps 99231.5 (800.358)
>
> Signed-off-by: Thomas Garnier
> ---
> Based on 0e01df100b6bf22a1de61b66657502a6454153c5
> ---
>  include/linux/slub_def.h |   8 +++
>  init/Kconfig             |   4 +-
>  mm/slub.c                | 133 ++++++++++++++++++++++++++++++++++++++++----
>  3 files changed, 136 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 665cd0c..22d487e 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -56,6 +56,9 @@ struct kmem_cache_order_objects {
>         unsigned long x;
>  };
>
> +/* Index used for freelist randomization */
> +typedef unsigned int freelist_idx_t;
> +
>  /*
>   * Slab cache management.
>   */
> @@ -99,6 +102,11 @@ struct kmem_cache {
>          */
>         int remote_node_defrag_ratio;
>  #endif
> +
> +#ifdef CONFIG_SLAB_FREELIST_RANDOM
> +       freelist_idx_t *random_seq;
> +#endif
> +
>         struct kmem_cache_node *node[MAX_NUMNODES];
>  };
>
> diff --git a/init/Kconfig b/init/Kconfig
> index a9c4aefd..fbb6678 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1771,10 +1771,10 @@ endchoice
>
>  config SLAB_FREELIST_RANDOM
>         default n
> -       depends on SLAB
> +
Re: [RFC v2 2/2] mm: SLUB Freelist randomization
On Tue, May 24, 2016 at 2:15 PM, Thomas Garnier wrote:
> Implements freelist randomization for the SLUB allocator. It was
> previously implemented for the SLAB allocator. Both use the same
> configuration option (CONFIG_SLAB_FREELIST_RANDOM).
>
> The list is randomized during initialization of a new set of pages. The
> order on different freelist sizes is pre-computed at boot for
> performance. Each kmem_cache has its own randomized freelist. This
> security feature reduces the predictability of the kernel SLUB allocator
> against heap overflows, rendering attacks much less stable.
>
> For example, these attacks exploit the predictability of the heap:
> - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
> - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)
>
> Performance results:
>
> slab_test impact is between 3% and 4% on average:

Seems like slab_test is pretty intensive (so the impact appears
higher). On a more "regular" load like kernbench, the impact seems to
be almost 0. Is that accurate?

Regardless, please consider both patches:

Reviewed-by: Kees Cook

-Kees

> Before:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
> 10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
> 10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
> 10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
> 10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
> 10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
> 10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
> 10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
> 10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
> 10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 70 cycles
> 10 times kmalloc(16)/kfree -> 70 cycles
> 10 times kmalloc(32)/kfree -> 70 cycles
> 10 times kmalloc(64)/kfree -> 70 cycles
> 10 times kmalloc(128)/kfree -> 70 cycles
> 10 times kmalloc(256)/kfree -> 69 cycles
> 10 times kmalloc(512)/kfree -> 70 cycles
> 10 times kmalloc(1024)/kfree -> 73 cycles
> 10 times kmalloc(2048)/kfree -> 72 cycles
> 10 times kmalloc(4096)/kfree -> 71 cycles
>
> After:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
> 10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
> 10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
> 10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
> 10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
> 10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
> 10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
> 10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
> 10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
> 10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 66 cycles
> 10 times kmalloc(16)/kfree -> 66 cycles
> 10 times kmalloc(32)/kfree -> 66 cycles
> 10 times kmalloc(64)/kfree -> 66 cycles
> 10 times kmalloc(128)/kfree -> 65 cycles
> 10 times kmalloc(256)/kfree -> 67 cycles
> 10 times kmalloc(512)/kfree -> 67 cycles
> 10 times kmalloc(1024)/kfree -> 64 cycles
> 10 times kmalloc(2048)/kfree -> 67 cycles
> 10 times kmalloc(4096)/kfree -> 67 cycles
>
> Kernbench, before:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 101.873 (1.16069)
> User Time 1045.22 (1.60447)
> System Time 88.969 (0.559195)
> Percent CPU 1112.9 (13.8279)
> Context Switches 189140 (2282.15)
> Sleeps 99008.6 (768.091)
>
> After:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 102.47 (0.562732)
> User Time 1045.3 (1.34263)
> System Time 88.311 (0.342554)
> Percent CPU 1105.8 (6.49444)
> Context Switches 189081 (2355.78)
> Sleeps 99231.5 (800.358)
>
> Signed-off-by: Thomas Garnier
> ---
> Based on 0e01df100b6bf22a1de61b66657502a6454153c5
> ---
>  include/linux/slub_def.h |   8 +++
>  init/Kconfig             |   4 +-
>  mm/slub.c                | 133 ++++++++++++++++++++++++++++++++++++++++----
>  3 files changed, 136 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 665cd0c..22d487e 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -56,6 +56,9 @@ struct kmem_cache_order_objects {
>         unsigned long x;
>  };
>
> +/* Index used for freelist randomization */
> +typedef unsigned int freelist_idx_t;
> +
>  /*
>   * Slab cache management.
>   */
> @@ -99,6 +102,11 @@ struct kmem_cache {
>          */
>         int remote_node_defrag_ratio;
>  #endif
> +
> +#ifdef CONFIG_SLAB_FREELIST_RANDOM
> +       freelist_idx_t
Re: [RFC v2 2/2] mm: SLUB Freelist randomization
On Tue, May 24, 2016 at 2:15 PM, Thomas Garnier wrote:
> Implements Freelist randomization for the SLUB allocator. It was
> previously implemented for the SLAB allocator. Both use the same
> configuration option (CONFIG_SLAB_FREELIST_RANDOM).
>
> The list is randomized during initialization of a new set of pages. The
> order on different freelist sizes is pre-computed at boot for
> performance. Each kmem_cache has its own randomized freelist. This
> security feature reduces the predictability of the kernel SLUB allocator
> against heap overflows, rendering attacks much less stable.
>
> For example these attacks exploit the predictability of the heap:
> - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
> - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)
>
> Performance results:
>
> slab_test impact is between 3% to 4% on average:

Seems like slab_test is pretty intensive (so the impact appears
higher). On a more "regular" load like kernbench, the impact seems to
be almost 0. Is that accurate?

Regardless, please consider both patches:

Reviewed-by: Kees Cook

-Kees

>
> Before:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
> 10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
> 10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
> 10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
> 10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
> 10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
> 10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
> 10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
> 10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
> 10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 70 cycles
> 10 times kmalloc(16)/kfree -> 70 cycles
> 10 times kmalloc(32)/kfree -> 70 cycles
> 10 times kmalloc(64)/kfree -> 70 cycles
> 10 times kmalloc(128)/kfree -> 70 cycles
> 10 times kmalloc(256)/kfree -> 69 cycles
> 10 times kmalloc(512)/kfree -> 70 cycles
> 10 times kmalloc(1024)/kfree -> 73 cycles
> 10 times kmalloc(2048)/kfree -> 72 cycles
> 10 times kmalloc(4096)/kfree -> 71 cycles
>
> After:
>
> Single thread testing
> =====================
> 1. Kmalloc: Repeatedly allocate then free test
> 10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
> 10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
> 10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
> 10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
> 10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
> 10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
> 10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
> 10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
> 10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
> 10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
> 2. Kmalloc: alloc/free test
> 10 times kmalloc(8)/kfree -> 66 cycles
> 10 times kmalloc(16)/kfree -> 66 cycles
> 10 times kmalloc(32)/kfree -> 66 cycles
> 10 times kmalloc(64)/kfree -> 66 cycles
> 10 times kmalloc(128)/kfree -> 65 cycles
> 10 times kmalloc(256)/kfree -> 67 cycles
> 10 times kmalloc(512)/kfree -> 67 cycles
> 10 times kmalloc(1024)/kfree -> 64 cycles
> 10 times kmalloc(2048)/kfree -> 67 cycles
> 10 times kmalloc(4096)/kfree -> 67 cycles
>
> Kernbench, before:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 101.873 (1.16069)
> User Time 1045.22 (1.60447)
> System Time 88.969 (0.559195)
> Percent CPU 1112.9 (13.8279)
> Context Switches 189140 (2282.15)
> Sleeps 99008.6 (768.091)
>
> After:
>
> Average Optimal load -j 12 Run (std deviation):
> Elapsed Time 102.47 (0.562732)
> User Time 1045.3 (1.34263)
> System Time 88.311 (0.342554)
> Percent CPU 1105.8 (6.49444)
> Context Switches 189081 (2355.78)
> Sleeps 99231.5 (800.358)
>
> Signed-off-by: Thomas Garnier
> ---
> Based on 0e01df100b6bf22a1de61b66657502a6454153c5
> ---
>  include/linux/slub_def.h |   8 +++
>  init/Kconfig             |   4 +-
>  mm/slub.c                | 133 ---
>  3 files changed, 136 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 665cd0c..22d487e 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -56,6 +56,9 @@ struct kmem_cache_order_objects {
>  	unsigned long x;
>  };
>
> +/* Index used for freelist randomization */
> +typedef unsigned int freelist_idx_t;
> +
>  /*
>   * Slab cache management.
>   */
> @@ -99,6 +102,11 @@ struct kmem_cache {
>  	 */
>  	int remote_node_defrag_ratio;
>  #endif
> +
> +#ifdef CONFIG_SLAB_FREELIST_RANDOM
> +	freelist_idx_t *random_seq;
> +#endif
> +
>  	struct kmem_cache_node
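[Editor's note] The "pre-computed at boot" random sequence described in the quoted commit message can be illustrated with a short user-space C sketch: a Fisher-Yates shuffle producing a random permutation of freelist slot indices. This is only a sketch of the idea; the function names and the use of rand() are assumptions for illustration, not the kernel's actual code (the kernel draws on its own PRNG).

```c
#include <assert.h>
#include <stdlib.h>

/* Same index type as in the quoted patch. */
typedef unsigned int freelist_idx_t;

/* Fill seq[0..count-1] with a random permutation of 0..count-1,
 * the way a per-cache freelist order could be pre-computed once
 * and then reused for every new page of objects. */
static void freelist_randomize(freelist_idx_t *seq, unsigned int count)
{
	unsigned int i, rand_pos;
	freelist_idx_t tmp;

	/* Start from the identity sequence. */
	for (i = 0; i < count; i++)
		seq[i] = i;

	if (count < 2)
		return;

	/* Fisher-Yates shuffle: swap each slot with a random
	 * earlier (or same) slot. */
	for (i = count - 1; i > 0; i--) {
		rand_pos = (unsigned int)rand() % (i + 1);
		tmp = seq[i];
		seq[i] = seq[rand_pos];
		seq[rand_pos] = tmp;
	}
}

/* Sanity helper: verify seq is a permutation of 0..count-1. */
static int is_permutation(const freelist_idx_t *seq, unsigned int count)
{
	char *seen = calloc(count, 1);
	unsigned int i;
	int ok = 1;

	for (i = 0; i < count; i++) {
		if (seq[i] >= count || seen[seq[i]])
			ok = 0;
		else
			seen[seq[i]] = 1;
	}
	free(seen);
	return ok;
}
```

Because the shuffle runs once per cache size rather than once per page, the per-page cost is only the table lookup, which is consistent with the small slab_test deltas and near-zero kernbench impact reported above.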
[RFC v2 2/2] mm: SLUB Freelist randomization
Implements Freelist randomization for the SLUB allocator. It was
previously implemented for the SLAB allocator. Both use the same
configuration option (CONFIG_SLAB_FREELIST_RANDOM).

The list is randomized during initialization of a new set of pages. The
order on different freelist sizes is pre-computed at boot for
performance. Each kmem_cache has its own randomized freelist. This
security feature reduces the predictability of the kernel SLUB allocator
against heap overflows, rendering attacks much less stable.

For example these attacks exploit the predictability of the heap:
- Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
- Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)

Performance results:

slab_test impact is between 3% to 4% on average:

Before:

Single thread testing
=====================
1. Kmalloc: Repeatedly allocate then free test
10 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
10 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
10 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
10 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
10 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
10 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
10 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
10 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
10 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
10 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
2. Kmalloc: alloc/free test
10 times kmalloc(8)/kfree -> 70 cycles
10 times kmalloc(16)/kfree -> 70 cycles
10 times kmalloc(32)/kfree -> 70 cycles
10 times kmalloc(64)/kfree -> 70 cycles
10 times kmalloc(128)/kfree -> 70 cycles
10 times kmalloc(256)/kfree -> 69 cycles
10 times kmalloc(512)/kfree -> 70 cycles
10 times kmalloc(1024)/kfree -> 73 cycles
10 times kmalloc(2048)/kfree -> 72 cycles
10 times kmalloc(4096)/kfree -> 71 cycles

After:

Single thread testing
=====================
1. Kmalloc: Repeatedly allocate then free test
10 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
10 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
10 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
10 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
10 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
10 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
10 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
10 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
10 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
10 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
2. Kmalloc: alloc/free test
10 times kmalloc(8)/kfree -> 66 cycles
10 times kmalloc(16)/kfree -> 66 cycles
10 times kmalloc(32)/kfree -> 66 cycles
10 times kmalloc(64)/kfree -> 66 cycles
10 times kmalloc(128)/kfree -> 65 cycles
10 times kmalloc(256)/kfree -> 67 cycles
10 times kmalloc(512)/kfree -> 67 cycles
10 times kmalloc(1024)/kfree -> 64 cycles
10 times kmalloc(2048)/kfree -> 67 cycles
10 times kmalloc(4096)/kfree -> 67 cycles

Kernbench, before:

Average Optimal load -j 12 Run (std deviation):
Elapsed Time 101.873 (1.16069)
User Time 1045.22 (1.60447)
System Time 88.969 (0.559195)
Percent CPU 1112.9 (13.8279)
Context Switches 189140 (2282.15)
Sleeps 99008.6 (768.091)

After:

Average Optimal load -j 12 Run (std deviation):
Elapsed Time 102.47 (0.562732)
User Time 1045.3 (1.34263)
System Time 88.311 (0.342554)
Percent CPU 1105.8 (6.49444)
Context Switches 189081 (2355.78)
Sleeps 99231.5 (800.358)

Signed-off-by: Thomas Garnier
---
Based on 0e01df100b6bf22a1de61b66657502a6454153c5
---
 include/linux/slub_def.h |   8 +++
 init/Kconfig             |   4 +-
 mm/slub.c                | 133 ---
 3 files changed, 136 insertions(+), 9 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 665cd0c..22d487e 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -56,6 +56,9 @@ struct kmem_cache_order_objects {
 	unsigned long x;
 };

+/* Index used for freelist randomization */
+typedef unsigned int freelist_idx_t;
+
 /*
  * Slab cache management.
  */
@@ -99,6 +102,11 @@ struct kmem_cache {
 	 */
 	int remote_node_defrag_ratio;
 #endif
+
+#ifdef CONFIG_SLAB_FREELIST_RANDOM
+	freelist_idx_t *random_seq;
+#endif
+
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };

diff --git a/init/Kconfig b/init/Kconfig
index a9c4aefd..fbb6678 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1771,10 +1771,10 @@ endchoice

 config SLAB_FREELIST_RANDOM
 	default n
-	depends on SLAB
+	depends on SLAB || SLUB
 	bool "SLAB freelist randomization"
 	help
-	  Randomizes the freelist order used on creating new SLABs. This
+	  Randomizes the freelist order used on creating new pages. This
 	  security feature reduces the
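[Editor's note] To make the "randomized during initialization of a new set of pages" step concrete, here is a small user-space sketch of how freshly allocated objects could be threaded into a freelist following a pre-computed permutation such as the patch's random_seq. Structure names, sizes, and the helper are illustrative assumptions, not the kernel's internals (SLUB hides the next pointer inside the free object itself, which this sketch mimics in simplified form).

```c
#include <assert.h>
#include <stddef.h>

/* A free object is reused to hold the freelist link. */
struct object {
	struct object *next;
};

#define OBJ_SIZE 64	/* illustrative object size */
#define NR_OBJS  8	/* illustrative objects per page */

/* page_base points at NR_OBJS objects of OBJ_SIZE bytes; seq is a
 * permutation of 0..NR_OBJS-1.  Returns the freelist head.  Walking
 * the sequence backwards and pushing each object means allocations
 * later pop objects in seq[] order, i.e. in randomized order. */
static struct object *init_freelist(char *page_base,
				    const unsigned int *seq)
{
	struct object *head = NULL, *obj;
	int i;

	for (i = NR_OBJS - 1; i >= 0; i--) {
		obj = (struct object *)(page_base +
					(size_t)seq[i] * OBJ_SIZE);
		obj->next = head;	/* push onto the list */
		head = obj;
	}
	return head;
}
```

With the identity sequence this degenerates to the usual sequential freelist; with a shuffled sequence, consecutive allocations land at unpredictable offsets within the page, which is what makes overflow targets unstable for an attacker.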