[ https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257268#comment-16257268 ]
Miklos Szegedi commented on YARN-5673:
--------------------------------------

[~eyang], Thank you for your comment. I have a few questions.
Could you elaborate on why cleaning up the container (step 4) needs to be written in C? Similarly, I do not think step 5 mentioned above requires it either. I agree that only the steps that need root privileges or system calls need to be here; everything else could go to the node manager or to a Java process run as root.
I even have some concerns about the docker code needing to reside in container executor. See YARN-7506 for reference and discussion.
bq. We should try to contain root power by minimize the workflow and type that we support.
I absolutely agree. However, the tool already does this, doesn't it?

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-5673
>                 URL: https://issues.apache.org/jira/browse/YARN-5673
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges (such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are a few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters (specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
> - The container-executor mixes platform dependent APIs with platform 
> independent APIs, making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure.
> One option is to improve the existing container-executor; however, it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.
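
To make the "No input validation" item in the issue description more concrete, here is a minimal sketch of the kind of check it asks for: a path supplied at invocation is accepted only if it resolves to a location under an approved directory. This is an illustration only. It is written in Python for brevity, whereas container-executor itself is written in C, and the function name `is_path_allowed` and the directory names used with it are hypothetical, not taken from the YARN code base or the design document.

```python
import os


def is_path_allowed(candidate, allowed_roots):
    """Return True if `candidate` resolves inside one of `allowed_roots`.

    Hypothetical helper for illustration; not part of container-executor.
    """
    # Resolve symlinks and ".." components so the check is made against
    # the real target rather than the string the caller supplied.
    real = os.path.realpath(candidate)
    for root in allowed_roots:
        real_root = os.path.realpath(root)
        try:
            # commonpath compares whole path components, so a sibling such
            # as "/data/nm-evil" does not match the root "/data/nm".
            if os.path.commonpath([real, real_root]) == real_root:
                return True
        except ValueError:
            # Raised when the paths cannot be compared (for example,
            # different drives on Windows); treat that as "not allowed".
            continue
    return False
```

A caller would reject the request before doing any privileged work if the function returns False, for example when a supplied path is something like `<root>/../etc`. A real implementation in a setuid binary would also have to consider races between the check and the later use of the path, which this sketch does not address.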