[ https://issues.apache.org/jira/browse/YARN-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942361#comment-16942361 ]
shanyu zhao commented on YARN-9834:
-----------------------------------

[~eyang] Please see my responses inline:

{quote}The current patch does not work for recovery.{quote}
Yes, this is listed as a limitation in the attached design doc. It does not work with Yarn node manager recovery, which is turned off by default. If both local pool users and Yarn node manager recovery are enabled, Yarn node manager will exit with an error message.

{quote}Past, present, and future usage of the same pool user may have conflicts. Detached user process can monitor data generated by future user.{quote}
As I already explained in our meeting, there is no concrete example of the "conflict". The design restricts pool users from creating folders outside of the Yarn node manager usercache folder, and it guarantees that all local files are deleted before a local pool user can be reused.

{quote}YARN container does not have full isolation of working directories, path traversal to user home directory is possible. Docker and runc container can provide better isolation to prevent accidental leakage of file owned by pool users.{quote}
As I explained above, there will be no user directory (/home/<user>), and one local pool user cannot access another local pool user's usercache folder. We have e2e test cases to cover all kinds of scenarios like this.

{quote}The patch is vulnerable to user job to play tricks with config flag to trigger code path designed for pool user.{quote}
How could a user job play tricks to modify Yarn node manager's configuration? If it could, it could just steal the node manager's keytab file and talk to the Yarn resource manager directly.

{quote}Group membership association. Whether the file should be written with primary group, or the current directory group owner. This should be handled with care by application.{quote}
What files are you talking about? Local files do not have any domain group membership association.
This is expected because we are not sharing files for the same domain user across applications. Please see the limitations section in the design doc. We do not support PRIVATE visibility resources and will treat them as APPLICATION visibility.

As for your recommendations:

{quote}Implement the new incompatible security model in a separate container executor to prevent adding security holes to Linux container executor.{quote}
I already mentioned that the whole idea of this work is based on extending LinuxContainerExecutor. This feature is protected by a configuration and turned off by default, just like the existing configuration "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users". I do not see how this can add security holes to Linux container executor.

{quote}Use Docker container/runc or chroot to provide isolation and remove path traversal.{quote}
We cannot run a Docker container inside a Docker container. We rely on running Yarn container processes as different local users for isolation, which is based on the existing Yarn Secure Container implementation in LinuxContainerExecutor.

{quote}Wangda Tan recommended to implement Auxiliary service to reactively change the execution model rather than direct modification to LinuxContainerExecutor to prevent security bugs leaks into LinuxContainerExecutor through a series of if else statements.{quote}
We can look into Aux services, but as I said, my understanding is that an Aux service can only provide services to the Yarn node manager, e.g. creating users. In the end, we still need to modify LinuxContainerExecutor to launch the Yarn container process, so that we can use Yarn Secure Container in a lightweight setup.

Finally, I want to stress that, for anyone interested in running Yarn on Kubernetes with a secure setup, this JIRA provides a secure alternative way to enable Yarn Secure Container without the need for container domain join and a winbind/SSSD service inside Docker containers.

> Allow using a pool of local users to run Yarn Secure Container in secure mode
> -----------------------------------------------------------------------------
>
> Key: YARN-9834
> URL: https://issues.apache.org/jira/browse/YARN-9834
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.1.2
> Reporter: shanyu zhao
> Assignee: shanyu zhao
> Priority: Major
> Attachments: YarnSecureContainerWithPoolOfLocalUsers.pdf
>
>
> Yarn Secure Container allows separation of different user's local files and
> container processes running on the same node manager. This depends on an out
> of band service such as SSSD/Winbind to sync all domain users to local
> machine that runs Yarn node manager. *Hadoop code only works with local
> users*.
> Winbind/SSSD user sync has lots of overhead, especially for large
> corporations. Also if running Yarn node manager inside Kubernetes cluster
> (meaning node managers running inside Docker container), it doesn't make
> sense for each Docker container to domain join with Active Directory and sync
> a whole copy of domain users to the Docker container.
> We need an optional light-weighted approach to enable Yarn Secure Container
> in secure mode, as an alternative to AD domain join and SSSD/Winbind based
> user-sync service.
> Today, class LinuxContainerExecutor already supports running Yarn container
> process as one designated local user in non-secure mode.
> *We can add new configurations to Yarn, such that with LinuxContainerExecutor
> we can pre-create a pool of local users on each Yarn node manager.
> At runtime, Yarn node manager allocates a local user to run the container
> process, for the domain user that submits the application*. When all
> containers of that user are finished and all files belonging to that user are
> deleted, we can release the allocation and allow other users to use the same
> local user to run their Yarn containers.
> Please look at attached design doc for more details.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
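
For readers following the thread, the allocation rule in the issue description (one pre-created local user per active domain user, released only after all of that user's containers have finished and all of its local files are deleted) can be sketched roughly as below. This is only an illustration, not the code in the attached patch: the class LocalUserPool, its method names, and the pool user names are all hypothetical, and the real change lives inside LinuxContainerExecutor.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a local pool user allocator; not the YARN-9834 patch. */
final class LocalUserPool {
    /** Tracks which local user a domain user holds and how many containers run as it. */
    private static final class Lease {
        final String localUser;
        int runningContainers;
        Lease(String localUser) { this.localUser = localUser; }
    }

    private final Deque<String> freeUsers = new ArrayDeque<>();
    private final Map<String, Lease> leases = new HashMap<>();

    /** The local users are assumed to be pre-created on the node (names are made up). */
    LocalUserPool(List<String> preCreatedUsers) {
        freeUsers.addAll(preCreatedUsers);
    }

    /** Called at container launch; returns the local user the container should run as. */
    synchronized String acquire(String domainUser) {
        Lease lease = leases.get(domainUser);
        if (lease == null) {
            String localUser = freeUsers.poll();
            if (localUser == null) {
                throw new IllegalStateException("no free local pool user for " + domainUser);
            }
            lease = new Lease(localUser);
            leases.put(domainUser, lease);
        }
        lease.runningContainers++;
        return lease.localUser;
    }

    /** Called when one container of the domain user finishes. */
    synchronized void containerFinished(String domainUser) {
        Lease lease = leases.get(domainUser);
        if (lease != null && lease.runningContainers > 0) {
            lease.runningContainers--;
        }
    }

    /**
     * Returns the local user to the pool only if no containers are running
     * and the caller confirms that all local files of that user were deleted.
     */
    synchronized boolean releaseIfIdle(String domainUser, boolean localFilesDeleted) {
        Lease lease = leases.get(domainUser);
        if (lease == null || lease.runningContainers > 0 || !localFilesDeleted) {
            return false;
        }
        leases.remove(domainUser);
        freeUsers.add(lease.localUser);
        return true;
    }
}
```

The point of the two-step release is that a finished container alone is not enough: the lease is kept until file deletion is confirmed, so a later domain user never inherits files left behind by an earlier one.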