[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310726#comment-16310726 ]
Miklos Szegedi commented on YARN-7516:
--------------------------------------

[~eyang], [~ebadger], [~vinodkv], thank you for the patch and the reviews.
[~ebadger], to answer your question, I have the following opinion about this jira:

1. This jira is a simple measure to block untrusted registries, as [~vinodkv] suggested above.

2. If I wanted to secure my cluster and not just the registry but the individual images in Docker on Hadoop, I would probably rely on the content signing feature of Docker. Ref: https://docs.docker.com/engine/security/trust/content_trust/#content-trust-operations-and-keys
It signs the actual content. Even if someone gets access to the registry or the channel but not the signing key, they cannot compromise the system. Its advantage is that it does not require any change in Hadoop, AFAIK. Hadoop remains simple. Setting that up is non-trivial though.

3. I would never allow privileged containers in my own cluster. Native (non-Docker) Hadoop did not allow that, it was not part of the design. If something really needs root privileges, I would probably create a SUID script or executable as simple as possible, just like container executor. I would make sure it does only what it needs to do and run it from a YARN container without a Docker container. This reduces the attack surface.

4. If the question is how to allow images that are able to run as privileged and provide a simple and secure interface, I would probably list the allowed Docker images with digest SHA256 hash values (image@6bff...) in container-executor.cfg. Hadoop remains simple. It protects the node local HDFS data by disallowing root escalation, even if the yarn user is compromised.

5. If we were at the initial design step and we needed a little bit more complex but secure and scalable solution, I would just stick a single public key into container-executor.cfg maybe together with a flag whether privileged containers are allowed at all (ref. 3. above).
The client would send the whole Docker JSON, including the image, volumes, privileged flag etc., in the launch context. The RM would verify whether the user has permission to run the Docker command, so that, for example, not all users would be allowed to run privileged containers. It would then sign the JSON with the corresponding private key into a delegation token or JWT and pass the signature with the JSON to the node in the launch context. The node manager would then forward the JSON and the signature to the container executor or another executable as suggested in YARN-7506. The container executor would verify the JWT or delegation token signature with the public key in container-executor.cfg and forward the JSON to the Docker command, if the verification is successful. The signature means that the RM allowed the request. Hadoop remains simple, security is centralized in the RM. RM can even have a REST API to dynamically adjust privileges. The privilege check might even be programmed as patterns in the JSON, minimizing the changes to Hadoop. I admit, this is probably a suggestion too late.

6. Just a side note on how much we need to worry about buffer overflows: I am more concerned about actual security design problems affecting both C and Java. Most of the 2.5 million lines of Hadoop are Java. Ref: https://www.openhub.net/p/Hadoop.


> Security check for untrusted docker image
> -----------------------------------------
>
>                 Key: YARN-7516
>                 URL: https://issues.apache.org/jira/browse/YARN-7516
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>         Attachments: YARN-7516.001.patch, YARN-7516.002.patch,
> YARN-7516.003.patch, YARN-7516.004.patch
>
>
> Hadoop YARN Services can support using private docker registry image or
> docker image from docker hub. In current implementation, Hadoop security is
> enforced through username and group membership, and enforce uid:gid
> consistency in docker container and distributed file system.
> There is cloud use case for having ability to run untrusted docker image on
> the same cluster for testing.
> The basic requirement for untrusted container is to ensure all kernel and
> root privileges are dropped, and there is no interaction with distributed
> file system to avoid contamination. We can probably enforce detection of
> untrusted docker image by checking the following:
> # If docker image is from public docker hub repository, the container is
> automatically flagged as insecure, and disk volume mount are disabled
> automatically, and drop all kernel capabilities.
> # If docker image is from private repository in docker hub, and there is a
> white list to allow the private repository, disk volume mount is allowed,
> kernel capabilities follows the allowed list.
> # If docker image is from private trusted registry with image name like
> "private.registry.local:5000/centos", and white list allows this private
> trusted repository. Disk volume mount is allowed, kernel capabilities
> follows the allowed list.
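
To make item 4 of the comment above more concrete, here is a minimal Python sketch of a digest allow-list check. It is only an illustration under stated assumptions: the names ALLOWED_DIGESTS, parse_digest and may_run_privileged are hypothetical, the in-memory set stands in for entries that the comment proposes keeping in container-executor.cfg, and the digest value is a made-up placeholder. The real container executor is written in C, and nothing in this thread defines the actual config key or file syntax.

```python
import re

# Hypothetical allow list. In the proposal these digests would be listed in
# container-executor.cfg; the value below is a placeholder, not a real image.
ALLOWED_DIGESTS = {
    "sha256:" + "ab" * 32,
}

# Matches references pinned by digest, e.g. "registry:5000/name@sha256:<64 hex>".
_DIGEST_REF = re.compile(r"[^@\s]+@(sha256:[0-9a-f]{64})")


def parse_digest(image_ref):
    """Return the sha256 digest of a digest-pinned image reference, else None."""
    match = _DIGEST_REF.fullmatch(image_ref)
    return match.group(1) if match else None


def may_run_privileged(image_ref, allowed_digests, privileged_enabled=True):
    """Allow privileged mode only for images pinned to an allow-listed digest.

    privileged_enabled models the cluster-wide on/off flag mentioned in
    item 5 of the comment (an assumption about how it would be wired in).
    """
    if not privileged_enabled:
        return False
    digest = parse_digest(image_ref)
    # References by tag (":latest") have no digest and are always rejected,
    # because a tag can be re-pointed to different content in the registry.
    return digest is not None and digest in allowed_digests
```

A caller would pass the image reference from the launch request together with the configured set, for example may_run_privileged("private.registry.local:5000/centos@sha256:...", ALLOWED_DIGESTS). Matching on the digest rather than the registry or tag is what ties the decision to the image content itself.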