[ 
https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240440#comment-16240440
 ] 

Jason Lowe commented on YARN-7197:
----------------------------------

bq. Explode might be exaggeration.

Yes, by "explode" I mean the OS will fail the container creation, and that's 
where the problem lies in some cases.  Here's an example scenario:

# Admin has configured {{/etc}} in the whitelist and {{/etc/shadow}} in the 
blacklist.
# User runs a container that bind-mounts {{/etc}} but not for malicious reasons
# YARN needs to bind-mount over {{/etc/shadow}} to prevent the visibility of 
that file on the host from within the container
# Attempting to bind-mount a directory over a file fails, and the container 
does not launch

The current patch doesn't handle this case properly because it misses the need 
to bind-mount to a blacklisted path when a parent of a blacklisted path is 
mounted.  Even if it did catch that case, the blacklisted path on the host 
needs to be a directory or bind-mounting the empty directory onto the host file 
is going to fail, preventing the container from launching.  That's what I meant 
by stating we need to support bind-mounting an empty file or we're effectively 
only supporting blacklisted paths that are directories.

bq. File creation is easy, and it is easy to get it wrong too. Someone on the 
host OS might add additional files into files in /tmp or get erased by accident.

This isn't an issue if we create the file in the YARN container's working 
directory, and that also conveniently solves the cleanup issue when the 
container completes.

bq. The default umask might not be the same as OS default, therefore extra 
ownership system calls are required to secure the newly created files. Those 
operation can slow down the system.

This is adding two system calls to the container executor, a creat followed by 
a chmod system call.  The slowdown will not be perceptible unless there are I/O 
issues with the container's working directory, and if that's the case then 
there are going to be many more issues than this as the container tries to run.

bq. I think %s is extra characters counted toward required length for NULL.

The %s characters are not accounted for either.  Here's the relevant code again:
{code}
        tmp_buffer_2 = (char *) alloc_and_clear_memory(strlen(empty_dir) + 
strlen(mount_target) + strlen(bind_mount), sizeof(char));
        sprintf(tmp_buffer_2, "type=bind,source=%s,target=%s,readonly", 
empty_dir, mount_target);
        quote_and_append_arg(&tmp_buffer, &tmp_buffer_size, "--mount ", 
tmp_buffer_2);
{code}

The size of tmp_buffer_2 is computed as the lengths of three strings sans their 
NUL terminators.  Those same three strings are copied into tmp_buffer_2 along 
with the constant portion of the format string.  sprintf is not allocating a 
new destination buffer here.  tmp_buffer_2 needs to be big enough to hold not 
only the contents of {{empty_dir}}, {{mount_target}}, and {{bind_mount}} but 
also {{type=bind,source=}} and {{,target=}} and {{,readonly}} along with a 
terminating NUL.  All of those constant characters in the format string are not 
part of the size calculation for tmp_buffer_2 which means the sprintf is going 
to blow past the end of the allocated buffer and corrupt the heap.  In addition 
the call to quote_and_append_arg could blow past the end of tmp_buffer since 
tmp_buffer is only a 1K buffer and the paths involved could be quite long.


> Add support for a volume blacklist for docker containers
> --------------------------------------------------------
>
>                 Key: YARN-7197
>                 URL: https://issues.apache.org/jira/browse/YARN-7197
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Shane Kumpf
>            Assignee: Eric Yang
>         Attachments: YARN-7197.001.patch, YARN-7197.002.patch, 
> YARN-7197.003.patch, YARN-7197.004.patch, YARN-7197.005.patch
>
>
> Docker supports bind mounting host directories into containers. Work is 
> underway to allow admins to configure a whilelist of volume mounts. While 
> this is a much needed and useful feature, it opens the door for 
> misconfiguration that may lead to users being able to compromise or crash the 
> system. 
> One example would be allowing users to mount /run from a host running 
> systemd, and then running systemd in that container, rendering the host 
> mostly unusable.
> This issue is to add support for a default blacklist. The default blacklist 
> would be where we put files and directories that if mounted into a container, 
> are likely to have negative consequences. Users are encouraged not to remove 
> items from the default blacklist, but may do so if necessary.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to