[ 
https://issues.apache.org/jira/browse/YARN-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257375#comment-16257375
 ] 

Miklos Szegedi commented on YARN-7506:
--------------------------------------

Thank you for the comments.
[~ebadger], the main reason a Java root process is more secure than 
container-executor is that it protects against exploitable buffer overflows. 
That is why I made the suggestion. I was not sure why this approach had not 
been followed before, so I raised this jira. It is also easier to use for 
most Hadoop developers, as you mentioned.
[~vinodkv], this jira already builds on the experience of YARN-6623. I would 
rather consider it a subtask of YARN-5673, if it is considered at all. Now that 
you mention it, a possible solution for YARN-5673 that takes this (YARN-7506) 
suggestion into account would be a root Java-based container executor framework 
that loads Java or native C modules. However, Docker has its own unique design, 
with the CLI and the socket and no native system call dependencies, so it could 
be handled separately.
bq. Side note: One more important consideration in the container-executor 
design was to not have long running root processes as it may increase the 
attack scope. Assuming that is still intact.
Suggestion 1 above does not require any long-running root process. Suggestion 2 
does; however, the only attack surface would be the proxied Docker socket and 
the config file, which is protected with file system permissions just like the 
container-executor executable.
[~eyang]
bq. Both docker and hadoop use "trusted" users...
I have to remind everyone of the rule of defense in depth. Under defense in 
depth, there is no trusted user. Every input is evil, and each component 
(container-executor in this case) has to do its own proper error checking.
bq. YARN user tap directly into docker.sock goes against our original 
philosophy of having both "trusted" user and root to perform validation.
Indeed. I agree.
bq. Root power may be used for validation logic when trusted user can not 
validate, such as symlink to local file system access that YARN-6623 solved.
Indeed, and I would also mention volume whitelists and blacklists, which the 
yarn user cannot validate because of the defense in depth rule.
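
To illustrate the kind of check that has to run as root, here is a minimal 
sketch of a volume whitelist validation that resolves symlinks before 
comparing paths. The class name VolumeWhitelist and the method isAllowed are 
hypothetical and are not taken from YARN-6623 or any existing patch.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: validates a requested mount source.
final class VolumeWhitelist {

    /**
     * Returns true only if the requested path, after symlinks are resolved,
     * lies under one of the whitelisted directories.
     */
    static boolean isAllowed(Path requested, List<Path> whitelist) {
        final Path real;
        try {
            // toRealPath() resolves symlinks and fails if the path is missing.
            real = requested.toRealPath();
        } catch (IOException e) {
            return false; // missing or unreadable path: reject
        }
        for (Path allowed : whitelist) {
            try {
                // Path.startsWith compares whole path elements, not characters.
                if (real.startsWith(allowed.toRealPath())) {
                    return true;
                }
            } catch (IOException e) {
                // A whitelist entry that cannot be resolved matches nothing.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(isAllowed(tmp, Arrays.asList(tmp)));
    }
}
```

The check is only meaningful in a process that the yarn user cannot influence, 
because the resolution has to be done with the privileges of the process that 
will perform the mount.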
bq. We can consider to keep most of logic in Java as long as root privileges is 
not required.
I disagree here. Most of the functionality that YARN-6623 implemented requires 
that root does the validation, so if done in Java, it should be in a Java root 
process.
bq. The performance gain from tapping into docker socket is saving the cost of 
one fork but we would lose a lot of validations done by docker CLI.
The validations are important indeed, but validating command line options is 
much more difficult than validating easily parseable JSON, as the recent 
issues showed.
bq. If it can be helped, calling root cli is preferred than calling root owned 
network socket.
There is a solution for that. The Java node manager running as yarn could still 
use the CLI, pointed at a unix socket that is writable by yarn. A root Java 
process running in the background would proxy and security-filter that socket 
and forward to the original Docker socket. (See attached diagram)
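
As a rough illustration of the filtering step in such a proxy, the sketch 
below checks the HTTP request line of a Docker REST API call against an 
allowlist before it would be forwarded. The class DockerRequestFilter and the 
set of permitted endpoints are assumptions made for this example only. The 
socket handling itself is omitted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical filter for a root proxy in front of the Docker socket.
final class DockerRequestFilter {

    // One allowed combination of HTTP method and path pattern.
    private static final class Rule {
        final String method;
        final Pattern path;

        Rule(String method, String regex) {
            this.method = method;
            this.path = Pattern.compile(regex);
        }
    }

    // Optional API version prefix such as /v1.41, then a container id or name.
    private static final String V = "(/v[0-9.]+)?";
    private static final String ID = "[A-Za-z0-9][A-Za-z0-9_.-]*";

    // Assumed allowlist; a real deployment would read it from a root-owned config.
    private static final List<Rule> RULES = Arrays.asList(
        new Rule("GET", V + "/containers/json"),
        new Rule("GET", V + "/containers/" + ID + "/json"),
        new Rule("POST", V + "/containers/create"),
        new Rule("POST", V + "/containers/" + ID + "/(start|stop|kill)"));

    /** Returns true if a request line like "GET /containers/json HTTP/1.1" is permitted. */
    static boolean isAllowed(String requestLine) {
        if (requestLine == null) {
            return false;
        }
        String[] parts = requestLine.trim().split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            return false; // malformed request line
        }
        String path = parts[1];
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query); // match on the path only
        }
        for (Rule rule : RULES) {
            if (rule.method.equals(parts[0]) && rule.path.matcher(path).matches()) {
                return true;
            }
        }
        return false; // default deny
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("GET /containers/json HTTP/1.1"));
        System.out.println(isAllowed("DELETE /containers/abc123 HTTP/1.1"));
    }
}
```

Filtering the request line is not sufficient on its own. The JSON body of a 
create request would still need to be validated, for example for volumes and 
privileged mode.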
bq. I don't fully agree with YARN-5673 modules API design. The description is 
another plug-in architecture to enable more functionality with root power. I 
think this is a slippy slope to enable more risks in container-executor.
I agree, I also raised my concerns there.
bq. It is best to avoid running java as root. Java runtime includes a lot of 
third party code, which can be unpredictable with root power.
That is a risk. I would minimize the number of non-JDK dependencies if a Java 
root process is chosen. I still think it may be more favorable in this case.

I summarized the options in the attached diagram. It shows which one is the 
simplest.

> Overhaul the design of the Linux container-executor regarding Docker and 
> future runtimes
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-7506
>                 URL: https://issues.apache.org/jira/browse/YARN-7506
>             Project: Hadoop YARN
>          Issue Type: Wish
>          Components: nodemanager
>            Reporter: Miklos Szegedi
>              Labels: Docker, container-executor
>         Attachments: YARN-Docker control options.pdf
>
>
> I raise this topic to discuss a potential improvement of the container 
> executor tool in node manager.
> container-executor has two main purposes. It executes Linux *system calls not 
> available from Java*, and it executes tasks *available to root that are not 
> available to the yarn user*. Historically container-executor did both by 
> doing impersonation. The yarn user is separated from root because it runs 
> network services, so *the yarn user should be restricted* by design. Because 
> of this it has its own config file container-executor.cfg writable by root 
> only that specifies what actions are allowed for the yarn user. However, the 
> requirements have changed with Docker and that raises the following questions:
> 1. The Docker feature of YARN requires root permissions to *access the Docker 
> socket* but it does not run any system calls, so could the Docker related 
> code in container-executor be *refactored into a separate Java process run as 
> root*? Java would make the development much faster and more secure. 
> 2. The Docker feature only needs the Docker unix socket. It is not a good 
> idea to let the yarn user directly access the socket, since that would 
> elevate its privileges to root. However, the Java tool running as root 
> mentioned in the previous question could act as a *proxy on the Docker 
> socket* operating directly on the Docker REST API *eliminating the need to 
> use the Docker CLI*. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

