On Sun, Nov 29, 2020 at 1:20 PM Topi Miettinen <toiwo...@gmail.com> wrote:
>
> Writing a new value of 3 to /proc/sys/kernel/randomize_va_space
> enables full randomization of memory mappings created with mmap(NULL,
> ...). With 2, the base of the VMA used for such mappings is random,
> but the mappings are created in predictable places within the VMA and
> in sequential order. With 3, new VMAs are created to fully randomize
> the mappings.
>
> Also mremap(..., MREMAP_MAYMOVE) will move the mappings even if not
> necessary and the location of stack and vdso are also randomized.
>
> The method is to randomize the new address without considering
> VMAs. If the address fails checks because of overlap with the stack
> area (or in case of mremap(), overlap with the old mapping), the
> operation is retried a few times before falling back to old method.
>
> On 32 bit systems this may cause problems due to increased VM
> fragmentation if the address space gets crowded.
>
> On all systems, it will reduce performance and increase memory usage
> due to less efficient use of page tables and inability to merge
> adjacent VMAs with compatible attributes. In the worst case,
> additional page table entries of up to 4 pages are created for each
> mapping, so with small mappings there's considerable penalty.
>
> In this example with sysctl.kernel.randomize_va_space = 2, dynamic
> loader, libc, anonymous memory reserved with mmap() and locale-archive
> are located close to each other:
>
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 5acea452d000-5acea452f000 r--p 00000000 fe:0c 1868624                    /usr/bin/cat
> 74f438f90000-74f4394f2000 r--p 00000000 fe:0c 2473999                    /usr/lib/locale/locale-archive
> 74f4394f2000-74f4395f2000 rw-p 00000000 00:00 0
> 74f4395f2000-74f439617000 r--p 00000000 fe:0c 2402332                    /usr/lib/x86_64-linux-gnu/libc-2.31.so
> 74f4397b3000-74f4397b9000 rw-p 00000000 00:00 0
> 74f4397e5000-74f4397e6000 r--p 00000000 fe:0c 2400754                    /usr/lib/x86_64-linux-gnu/ld-2.31.so
> 74f439811000-74f439812000 rw-p 00000000 00:00 0
> 7fffdca0d000-7fffdca2e000 rw-p 00000000 00:00 0                          [stack]
> 7fffdcb49000-7fffdcb4d000 r--p 00000000 00:00 0                          [vvar]
> 7fffdcb4d000-7fffdcb4f000 r-xp 00000000 00:00 0                          [vdso]
>
> With sysctl.kernel.randomize_va_space = 3, they are located at
> unrelated addresses and the order is random:
>
> $ echo 3 > /proc/sys/kernel/randomize_va_space
> $ cat /proc/self/maps (only first line for each object shown for brevity)
> 3850520000-3850620000 rw-p 00000000 00:00 0
> 28cfb4c8000-28cfb4cc000 r--p 00000000 00:00 0                            [vvar]
> 28cfb4cc000-28cfb4ce000 r-xp 00000000 00:00 0                            [vdso]
> 9e74c385000-9e74c387000 rw-p 00000000 00:00 0
> a42e0233000-a42e0234000 r--p 00000000 fe:0c 2400754                      /usr/lib/x86_64-linux-gnu/ld-2.31.so
> a42e025f000-a42e0260000 rw-p 00000000 00:00 0
> bea40427000-bea4044c000 r--p 00000000 fe:0c 2402332                      /usr/lib/x86_64-linux-gnu/libc-2.31.so
> bea405e8000-bea405ec000 rw-p 00000000 00:00 0
> f6d446fa000-f6d44c5c000 r--p 00000000 fe:0c 2473999                      /usr/lib/locale/locale-archive
> fcfbf684000-fcfbf6a5000 rw-p 00000000 00:00 0                            [stack]
> 619aba62d000-619aba62f000 r--p 00000000 fe:0c 1868624                    /usr/bin/cat
>
> CC: Andrew Morton <a...@linux-foundation.org>
> CC: Jann Horn <ja...@google.com>
> CC: Kees Cook <keesc...@chromium.org>
> CC: Matthew Wilcox <wi...@infradead.org>
> CC: Mike Rapoport <r...@kernel.org>
> CC: Linux API <linux-...@vger.kernel.org>
> Signed-off-by: Topi Miettinen <toiwo...@gmail.com>
> ---
> v2: also randomize mremap(..., MREMAP_MAYMOVE)
> v3: avoid stack area and retry in case of bad random address (Jann
> Horn), improve description in kernel.rst (Matthew Wilcox)
> v4:
> - use /proc/$pid/maps in the example (Mike Rapoport)
> - CCs (Andrew Morton)
> - only check randomize_va_space == 3
> v5: randomize also vdso and stack
> ---
>  Documentation/admin-guide/hw-vuln/spectre.rst |  6 ++--
>  Documentation/admin-guide/sysctl/kernel.rst   | 20 +++++++++++++
>  arch/x86/entry/vdso/vma.c                     | 26 +++++++++++++++-
>  include/linux/mm.h                            |  8 +++++
>  init/Kconfig                                  |  2 +-
>  mm/mmap.c                                     | 30 +++++++++++++------
>  mm/mremap.c                                   | 27 +++++++++++++++++
>  mm/util.c                                     |  6 ++++
>  8 files changed, 111 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
> index e05e581af5cf..9ea250522077 100644
> --- a/Documentation/admin-guide/hw-vuln/spectre.rst
> +++ b/Documentation/admin-guide/hw-vuln/spectre.rst
> @@ -254,7 +254,7 @@ Spectre variant 2
>     left by the previous process will also be cleared.
>
>     User programs should use address space randomization to make attacks
> -   more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
> +   more difficult (Set /proc/sys/kernel/randomize_va_space = 1, 2 or 3).
>
>  3. A virtualized guest attacking the host
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> @@ -499,8 +499,8 @@ Spectre variant 2
>     more overhead and run slower.
>
>     User programs should use address space randomization
> -   (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
> -   difficult.
> +   (/proc/sys/kernel/randomize_va_space = 1, 2 or 3) to make attacks
> +   more difficult.
>
>  3. VM mitigation
>  ^^^^^^^^^^^^^^^^
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index d4b32cc32bb7..806e3b29d2b5 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1060,6 +1060,26 @@ that support this feature.
>      Systems with ancient and/or broken binaries should be configured
>      with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process
>      address space randomization.
> +
> +3   Additionally enable full randomization of memory mappings created
> +    with mmap(NULL, ...). With 2, the base of the VMA used for such
> +    mappings is random, but the mappings are created in predictable
> +    places within the VMA and in sequential order. With 3, new VMAs
> +    are created to fully randomize the mappings.
> +
> +    Also mremap(..., MREMAP_MAYMOVE) will move the mappings even if
> +    not necessary and the location of stack and vdso are also
> +    randomized.
> +
> +    On 32 bit systems this may cause problems due to increased VM
> +    fragmentation if the address space gets crowded.
> +
> +    On all systems, it will reduce performance and increase memory
> +    usage due to less efficient use of page tables and inability to
> +    merge adjacent VMAs with compatible attributes. In the worst case,
> +    additional page table entries of up to 4 pages are created for
> +    each mapping, so with small mappings there's considerable penalty.
> +
>  ==  ===========================================================================
>
>
> diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> index 9185cb1d13b9..03ea884822e3 100644
> --- a/arch/x86/entry/vdso/vma.c
> +++ b/arch/x86/entry/vdso/vma.c
> @@ -12,6 +12,7 @@
>  #include <linux/init.h>
>  #include <linux/random.h>
>  #include <linux/elf.h>
> +#include <linux/elf-randomize.h>
>  #include <linux/cpu.h>
>  #include <linux/ptrace.h>
>  #include <linux/time_namespace.h>
> @@ -32,6 +33,8 @@
>         const size_t name ## _offset = offset;
>  #include <asm/vvar.h>
>
> +#define MAX_RANDOM_VDSO_RETRIES                        5
> +
>  struct vdso_data *arch_get_vdso_data(void *vvar_page)
>  {
>         return (struct vdso_data *)(vvar_page + _vdso_data_offset);
> @@ -361,7 +364,28 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
>
>  static int map_vdso_randomized(const struct vdso_image *image)
>  {
> -       unsigned long addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start);
> +       unsigned long addr;
> +
> +       if (randomize_va_space == 3) {
> +               /*
> +                * Randomize vdso address.
> +                */
> +               int i = MAX_RANDOM_VDSO_RETRIES;
> +
> +               do {
> +                       int ret;
> +
> +                       /* Try a few times to find a free area */
> +                       addr = arch_mmap_rnd();
> +
> +                       ret = map_vdso(image, addr);
> +                       if (!IS_ERR_VALUE(ret))
> +                               return ret;
> +               } while (--i >= 0);
> +
> +               /* Give up and try the less random way */
> +       }
> +       addr = vdso_addr(current->mm->start_stack, image->size-image->sym_vvar_start);

This is IMO rather ugly.  You're picking random numbers and throwing
them at map_vdso(), which throws them at get_unmapped_area(), which
will validate them.  And you duplicate the same ugly loop later on.

How about instead pushing this logic into get_unmapped_area()?
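
To illustrate the shape only, here is a rough userspace sketch, not kernel code. It assumes a single helper that owns both the random retries and the fallback, so callers never loop themselves. Every name in it (pick_unmapped_area, range_is_free, next_random, the SKETCH_* constants) is hypothetical, the occupied-range array merely stands in for the real VMA lookup, and the xorshift generator stands in for whatever entropy source the kernel would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is a userspace model, not kernel API. */
#define SKETCH_PAGE_SIZE   UINT64_C(4096)
#define SKETCH_ADDR_MASK   (((UINT64_C(1) << 47) - 1) & ~(SKETCH_PAGE_SIZE - 1))
#define MAX_RANDOM_RETRIES 5

struct range { uint64_t start, end; };  /* occupied half-open range [start, end) */

/* Stand-in for the VMA overlap check: true if [addr, addr + len) hits nothing. */
static bool range_is_free(const struct range *used, size_t n,
                          uint64_t addr, uint64_t len)
{
    for (size_t i = 0; i < n; i++)
        if (addr < used[i].end && used[i].start < addr + len)
            return false;
    return true;
}

/* Stand-in entropy source: xorshift64, deterministic for a given seed. */
static uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/*
 * One place that tries a few page-aligned random addresses and then
 * falls back to the address the non-random search would have produced.
 */
static uint64_t pick_unmapped_area(const struct range *used, size_t n,
                                   uint64_t len, uint64_t *rng_state,
                                   uint64_t fallback)
{
    assert(len > 0 && *rng_state != 0);

    for (int i = 0; i < MAX_RANDOM_RETRIES; i++) {
        uint64_t addr = next_random(rng_state) & SKETCH_ADDR_MASK;

        if (range_is_free(used, n, addr, len))
            return addr;
    }
    return fallback;  /* give up and use the less random result */
}
```

With something along these lines, the vdso, mmap() and mremap() paths would each make one call and would not need their own retry loops.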

--Andy
