[ 
https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310726#comment-16310726
 ] 

Miklos Szegedi commented on YARN-7516:
--------------------------------------

[~eyang], [~ebadger], [~vinodkv], thank you for the patch and the reviews.
[~ebadger], to answer your question, I have the following opinion about this 
jira:
1. This jira is a simple measure to block untrusted registries, as [~vinodkv] 
suggested above.
2. If I wanted to secure not just the registry but also the individual images 
in Docker on Hadoop, I would probably rely on the content signing feature of 
Docker. Ref: 
https://docs.docker.com/engine/security/trust/content_trust/#content-trust-operations-and-keys
 It signs the actual content, so even if someone gets access to the registry 
or the channel but not the signing key, they cannot compromise the system. Its 
advantage is that it does not require any change in Hadoop, AFAIK, so Hadoop 
remains simple. Setting it up is non-trivial, though.
3. I would never allow privileged containers in my own cluster. Native 
(non-Docker) Hadoop did not allow that; it was not part of the design. If 
something really needs root privileges, I would probably create a SUID script 
or executable that is as simple as possible, just like container executor. I 
would make sure it does only what it needs to do and run it from a YARN 
container without a Docker container. This reduces the attack surface.
4. If the question is how to allow images that are able to run as privileged 
and provide a simple and secure interface, I would probably list the allowed 
Docker images with their SHA256 digest values (image@6bff...) in 
container-executor.cfg. Hadoop remains simple. This protects the node-local 
HDFS data by disallowing root escalation, even if the yarn user is compromised.
5. If we were at the initial design step and we needed a slightly more 
complex but secure and scalable solution, I would just stick a single public 
key into container-executor.cfg, maybe together with a flag for whether 
privileged containers are allowed at all (see 3. above). The client would send 
the whole Docker JSON including the image, volumes, privileged flag etc. in the 
launch context. The RM would verify whether the user has permission to run the 
Docker command, so that, for example, not all users could run privileged. It 
would then sign the JSON with the corresponding private key into a delegation 
token or JWT and pass the signature with the JSON to the node in the launch 
context. The node manager would then forward the JSON and the signature to the 
container executor or another executable as suggested in YARN-7506. The 
container executor would verify the JWT or delegation token signature with the 
public key in container-executor.cfg and forward the JSON to the Docker 
command, if the verification is successful. The signature means that the RM 
allowed the request. Hadoop remains simple, security is centralized in the RM. 
RM can even have a REST API to dynamically adjust privileges. The privilege 
check might even be programmed as patterns in the JSON, minimizing the changes 
to Hadoop. I admit this suggestion probably comes too late.
6. As a side note on how much we need to worry about buffer overflows: I am 
more concerned about actual security design problems, which affect both C and 
Java. Most of the 2.5 million lines of Hadoop are Java. Ref: 
https://www.openhub.net/p/Hadoop.
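
To illustrate point 4, here is a minimal sketch of a digest allowlist check. It 
is written in Python for brevity; the real check would live in the C container 
executor. The key name docker.privileged.allowed-images and both function names 
are hypothetical, not existing container-executor.cfg settings. It assumes 
images are pinned with Docker's name@sha256:<64 hex digits> form.

```python
import re

# Docker pins an image by content with "name@sha256:<64 hex digits>".
_DIGEST_REF = re.compile(r"^(?P<name>[^@\s]+)@sha256:(?P<hex>[0-9a-f]{64})$")


def parse_allowed_images(cfg_text, key="docker.privileged.allowed-images"):
    """Return the set of digest-pinned image references listed under `key`.

    `key` is a hypothetical setting name used only for this sketch.
    """
    allowed = set()
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() != key:
            continue
        for ref in value.split(","):
            ref = ref.strip()
            # Entries that are not pinned by digest (e.g. tags) are ignored.
            if _DIGEST_REF.match(ref):
                allowed.add(ref)
    return allowed


def may_run_privileged(image_ref, allowed):
    """Allow privileged mode only for an exact, digest-pinned match."""
    return bool(_DIGEST_REF.match(image_ref)) and image_ref in allowed
```

Because a tag such as centos:latest can be re-pointed at different content, the 
sketch refuses anything that is not pinned by digest.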

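The flow in point 5 (the RM checks permission and signs the Docker JSON, then 
the container executor verifies it before calling Docker) might be sketched as 
follows. This is an illustration only, not Hadoop code: all function and 
parameter names are hypothetical, and HMAC-SHA256 with a shared key stands in 
for the public-key signature, because Python's standard library has no 
asymmetric signing. In the design described above the node would hold only the 
public key.

```python
import hashlib
import hmac
import json


def canonical(request):
    """Serialize deterministically so both sides sign the same bytes."""
    return json.dumps(request, sort_keys=True, separators=(",", ":")).encode("utf-8")


def rm_authorize(request, user, privileged_users, key):
    """RM side: check the user's permission, then sign the request.

    Returns (payload, signature) or raises PermissionError.
    """
    if request.get("privileged", False) and user not in privileged_users:
        raise PermissionError("%s may not run privileged containers" % user)
    payload = canonical(request)
    # Stand-in for signing with the RM's private key.
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


def executor_verify(payload, signature, key, allow_privileged):
    """Container executor side: verify before handing the JSON to Docker."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch: request was not approved by the RM")
    request = json.loads(payload.decode("utf-8"))
    # Node-level switch from the config, as in point 3.
    if request.get("privileged", False) and not allow_privileged:
        raise ValueError("privileged containers are disabled on this node")
    return request
```

The node manager only forwards the payload and signature unchanged, so any 
modification on the way makes the verification fail.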



> Security check for untrusted docker image
> -----------------------------------------
>
>                 Key: YARN-7516
>                 URL: https://issues.apache.org/jira/browse/YARN-7516
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>         Attachments: YARN-7516.001.patch, YARN-7516.002.patch, 
> YARN-7516.003.patch, YARN-7516.004.patch
>
>
> Hadoop YARN Services can use a docker image from a private docker registry 
> or from docker hub.  In the current implementation, Hadoop security is 
> enforced through username and group membership, and by enforcing uid:gid 
> consistency between the docker container and the distributed file system.  
> There is a cloud use case for running untrusted docker images on the same 
> cluster for testing.  
> The basic requirement for an untrusted container is to ensure that all kernel 
> and root privileges are dropped, and that there is no interaction with the 
> distributed file system, to avoid contamination.  We can probably enforce 
> detection of untrusted docker images by checking the following:
> # If the docker image is from a public docker hub repository, the container 
> is automatically flagged as insecure, disk volume mounts are disabled 
> automatically, and all kernel capabilities are dropped.
> # If the docker image is from a private repository in docker hub, and a 
> white list allows the private repository, disk volume mounts are allowed and 
> kernel capabilities follow the allowed list.
> # If the docker image is from a private trusted registry with an image name 
> like "private.registry.local:5000/centos", and the white list allows this 
> private trusted repository, disk volume mounts are allowed and kernel 
> capabilities follow the allowed list.
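
The three rules in the description above can be sketched as a small policy 
function. This is a hedged illustration in Python, not the logic of the 
attached patch: the function names, the trusted-prefix list, and the returned 
dictionary are all hypothetical. It relies on Docker's naming convention that 
the first path component is a registry host only if it contains "." or ":" or 
is "localhost"; otherwise the image is taken to come from Docker Hub 
(docker.io).

```python
def split_image(image):
    """Return (registry, repository path) following Docker's naming convention."""
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return "docker.io", image


def is_trusted(image, trusted_prefixes):
    """True if the registry, or registry/namespace, is on the white list."""
    registry, path = split_image(image)
    candidates = {registry}
    if "/" in path:
        candidates.add(registry + "/" + path.split("/", 1)[0])
    return bool(candidates & set(trusted_prefixes))


def launch_policy(image, trusted_prefixes, allowed_capabilities):
    """Decide mounts and capabilities for a container launch."""
    if is_trusted(image, trusted_prefixes):
        return {"mounts_allowed": True, "capabilities": list(allowed_capabilities)}
    # Untrusted image: no volume mounts and all capabilities dropped.
    return {"mounts_allowed": False, "capabilities": []}
```

For example, with a white list of "private.registry.local:5000" and 
"docker.io/mycompany", the image "private.registry.local:5000/centos" keeps its 
mounts, while a plain "centos:7" from the public hub loses mounts and 
capabilities.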



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)