On Wed, 25 Jun 2014, Konstantin Khlebnikov wrote:

> A shared anonymous mapping created without MAP_NORESERVE holds a memory
> reservation for the whole range of its shmem segment. Usually there is no
> way to change its size, but /proc/<pid>/map_files/...
> (available if CONFIG_CHECKPOINT_RESTORE=y) makes that possible.
> 
> This patch adjusts the memory reservation in shmem_setattr().
> 
> Signed-off-by: Konstantin Khlebnikov <koc...@gmail.com>

Acked-by: Hugh Dickins <hu...@google.com>

Thank you, I knew nothing about this backdoor to shmem objects.  Scary.
Was this really the only problem map_files access leads to?  If you
did not do so already, please try to think through other possibilities.

I haven't begun, but perhaps it's not so bad.  I guess the interaction
with mremap extension is benign - it's annoyed people in the past that
the underlying shmem object is not extended, but now here's a way that
it can be. 

(I'll leave it to others to comment on 3/3 if they wish.)

> 
> ---
> 
> exploit:
> 
> #include <sys/mman.h>
> #include <unistd.h>
> #include <stdio.h>
> 
> int main(int argc, char **argv)
> {
>       unsigned long addr;
>       char path[100];
> 
>       /* charge 4KiB */
>       addr = (unsigned long)mmap(NULL, 4096, PROT_READ|PROT_WRITE,
>                       MAP_SHARED|MAP_ANONYMOUS, -1, 0);
>       sprintf(path, "/proc/self/map_files/%lx-%lx", addr, addr + 4096);
>       truncate(path, 1 << 30);
>       /* on exit: uncharge 1GiB, although only 4KiB was charged */
> }
> ---
>  mm/shmem.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 0aabcbd..a3c49d6 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -149,6 +149,19 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size)
>               vm_unacct_memory(VM_ACCT(size));
>  }
>  
> +static inline int shmem_reacct_size(unsigned long flags,
> +             loff_t oldsize, loff_t newsize)
> +{
> +     if (!(flags & VM_NORESERVE)) {
> +             if (VM_ACCT(newsize) > VM_ACCT(oldsize))
> +                     return security_vm_enough_memory_mm(current->mm,
> +                                     VM_ACCT(newsize) - VM_ACCT(oldsize));
> +             else if (VM_ACCT(newsize) < VM_ACCT(oldsize))
> +                     vm_unacct_memory(VM_ACCT(oldsize) - VM_ACCT(newsize));
> +     }
> +     return 0;
> +}
> +
>  /*
>   * ... whereas tmpfs objects are accounted incrementally as
>   * pages are allocated, in order to allow huge sparse files.
> @@ -543,6 +556,10 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
>               loff_t newsize = attr->ia_size;
>  
>               if (newsize != oldsize) {
> +                     error = shmem_reacct_size(SHMEM_I(inode)->flags,
> +                                     oldsize, newsize);
> +                     if (error)
> +                             return error;
>                       i_size_write(inode, newsize);
>                       inode->i_ctime = inode->i_mtime = CURRENT_TIME;
>               }
> 
> 
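
For readers who want to check the numbers, the page arithmetic behind
shmem_reacct_size() in the quoted patch can be modelled in userspace. This
is only a rough sketch, not kernel code: the sketch_* names are
hypothetical, the page size is assumed to be 4 KiB as in the exploit, and
sketch_acct_pages() merely stands in for VM_ACCT() by rounding a byte count
up to whole pages.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, matching the 4096-byte mapping in the exploit. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for VM_ACCT(): bytes rounded up to whole pages. */
static uint64_t sketch_acct_pages(uint64_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Pages that a resize would have to charge (positive result) or
 * uncharge (negative result). A mapping flagged as "noreserve" is
 * not accounted up front, so its delta is always zero.
 */
static int64_t sketch_reacct_delta(int noreserve,
				   uint64_t oldsize, uint64_t newsize)
{
	if (noreserve)
		return 0;
	return (int64_t)sketch_acct_pages(newsize) -
	       (int64_t)sketch_acct_pages(oldsize);
}
```

With these assumptions, growing the 4096-byte object to 1 << 30 bytes gives
a delta of 262143 pages. That is the amount that goes uncharged at truncate
time without the patch, yet is still released when the object is torn down.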