[ https://issues.apache.org/jira/browse/YARN-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942349#comment-16942349 ]
Eric Yang edited comment on YARN-9834 at 10/1/19 10:22 PM:
-----------------------------------------------------------

A community meeting was called to discuss this issue. The people present were [~wangda], [~shanyu], [~eyang], [~ashvin], and two others (apologies to those whose names I cannot find in JIRA).

The items discussed were:
# The current patch does not work for recovery.
# Past, present, and future uses of the same pool user may conflict. A detached process left by an earlier user can monitor data generated by a later user.
# YARN containers do not have full isolation of working directories, so path traversal to a user home directory is possible. Docker and runc containers provide better isolation and can prevent accidental leakage of files owned by pool users.
# The patch is vulnerable to a user job manipulating a config flag to trigger the code path designed for pool users.
# Group membership association: whether a file should be written with the user's primary group or with the group owner of the current directory. The application should handle this with care.

Recommendations:
# Implement the new incompatible security model in a separate container executor, to avoid adding security holes to the Linux container executor.
# Use Docker/runc containers or chroot to provide isolation and remove path traversal.
# [~wangda] recommended implementing an auxiliary service that reactively changes the execution model, rather than modifying LinuxContainerExecutor directly, to keep security bugs from leaking into LinuxContainerExecutor through a series of if/else statements.

Pull request #1446 takes the shortest route to implementing the pool user concept, but the current implementation has too many loopholes. Until those concerns are addressed, -1 on this patch.

was (Author: eyang):

A community meeting was called to discuss this issue. The people present were [~wangda], [~shanyu], [~eyang], [~ashvin], and two others (apologies to those whose names I cannot find in JIRA).

The items discussed were:
# The current patch does not work for recovery.
# Past, present, and future uses of the same pool user may conflict. A detached process left by an earlier user can monitor data generated by a later user.
# YARN containers do not have full isolation of working directories, so path traversal to a user home directory is possible. Docker and runc containers provide better isolation and can prevent accidental leakage of files owned by pool users.
# The patch is vulnerable to a user job manipulating a config flag to trigger the code path designed for pool users.
# Group membership association: whether a file should be written with the user's primary group or with the group owner of the current directory. The application should handle this with care.

Recommendations:
# Implement the new working model in a separate container executor, to avoid adding security holes to the Linux container executor.
# Use Docker/runc containers or chroot to provide isolation and remove path traversal.
# [~wangda] recommended implementing an auxiliary service that reactively changes the execution model, rather than modifying LinuxContainerExecutor directly, to keep security bugs from leaking into LinuxContainerExecutor through a series of if/else statements.

Pull request #1446 takes the shortest route to implementing the pool user concept, but the current implementation has too many loopholes. Until those concerns are addressed, -1 on this patch.

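To make the path traversal item concrete, here is a minimal sketch of a lexical containment check, assuming a hypothetical helper named WorkDirCheck and made-up directory names. It is not code from YARN or from pull request #1446, and it is not a substitute for the Docker/runc/chroot isolation recommended above: it does not follow symlinks, and it cannot constrain a process that already runs as the pool user.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of YARN or of pull request #1446. */
final class WorkDirCheck {
    private WorkDirCheck() {}

    /**
     * Returns true if 'requested', resolved against 'workDir', still lies
     * under 'workDir' after "." and ".." segments are collapsed.
     * The check is purely lexical: symlinks are NOT resolved.
     */
    static boolean staysInside(String workDir, String requested) {
        Path base = Paths.get(workDir).toAbsolutePath().normalize();
        // resolve() returns 'requested' unchanged when it is absolute,
        // so absolute paths outside 'base' are rejected as well.
        Path resolved = base.resolve(requested).normalize();
        return resolved.startsWith(base);
    }
}
```

With a working directory of /data/nm/container_01 (an assumed path), a relative name such as out/part-0 passes, while ../../../home/alice/.ssh/id_rsa and /etc/passwd are rejected.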
> Allow using a pool of local users to run Yarn Secure Container in secure mode
> -----------------------------------------------------------------------------
>
>                 Key: YARN-9834
>                 URL: https://issues.apache.org/jira/browse/YARN-9834
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.1.2
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>            Priority: Major
>         Attachments: YarnSecureContainerWithPoolOfLocalUsers.pdf
>
>
> Yarn Secure Container allows separation of different users' local files and
> container processes running on the same node manager. This depends on an
> out-of-band service such as SSSD/Winbind to sync all domain users to the
> local machine that runs the Yarn node manager. *Hadoop code only works with
> local users*.
> Winbind/SSSD user sync has a lot of overhead, especially for large
> corporations. Also, if the Yarn node manager runs inside a Kubernetes cluster
> (meaning node managers run inside Docker containers), it doesn't make sense
> for each Docker container to domain join with Active Directory and sync a
> whole copy of the domain users to the Docker container.
> We need an optional lightweight approach to enable Yarn Secure Container in
> secure mode, as an alternative to AD domain join and SSSD/Winbind based
> user-sync service.
> Today, class LinuxContainerExecutor already supports running Yarn container
> processes as one designated local user in non-secure mode.
> *We can add new configurations to Yarn, such that with LinuxContainerExecutor
> we can pre-create a pool of local users on each Yarn node manager. At
> runtime, the Yarn node manager allocates a local user to run the container
> process, for the domain user that submits the application*. When all
> containers of that user are finished and all files belonging to that user are
> deleted, we can release the allocation and allow other users to use the same
> local user to run their Yarn containers.
> Please look at the attached design doc for more details.
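
The allocation scheme in the description can be sketched as follows. This is only an illustration under stated assumptions: the class PoolUserAllocator and its methods are hypothetical names, not the implementation in pull request #1446 or any existing YARN API. The caller is assumed to know when all files owned by the local user have been deleted, and the mapping is held in memory only, so it would not survive a node manager restart.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a pool of pre-created local users; not YARN code. */
final class PoolUserAllocator {
    private final Deque<String> freeLocalUsers = new ArrayDeque<>();
    private final Map<String, String> localUserByDomainUser = new HashMap<>();
    private final Map<String, Integer> runningContainers = new HashMap<>();

    PoolUserAllocator(List<String> preCreatedLocalUsers) {
        freeLocalUsers.addAll(preCreatedLocalUsers);
    }

    /** Container start: reuse the domain user's local user or take a free one. */
    synchronized Optional<String> acquire(String domainUser) {
        String localUser = localUserByDomainUser.get(domainUser);
        if (localUser == null) {
            localUser = freeLocalUsers.poll();
            if (localUser == null) {
                return Optional.empty(); // pool exhausted
            }
            localUserByDomainUser.put(domainUser, localUser);
        }
        runningContainers.merge(domainUser, 1, Integer::sum);
        return Optional.of(localUser);
    }

    /** Container end: decrement the running-container count for the domain user. */
    synchronized void containerFinished(String domainUser) {
        int count = runningContainers.getOrDefault(domainUser, 0);
        if (count == 0) {
            throw new IllegalStateException("no running container for " + domainUser);
        }
        runningContainers.put(domainUser, count - 1);
    }

    /**
     * Returns the local user to the pool only when no container of the domain
     * user is running and the caller reports that all of its files are deleted.
     */
    synchronized boolean tryRelease(String domainUser, boolean allFilesDeleted) {
        String localUser = localUserByDomainUser.get(domainUser);
        if (localUser == null
                || runningContainers.getOrDefault(domainUser, 0) > 0
                || !allFilesDeleted) {
            return false;
        }
        localUserByDomainUser.remove(domainUser);
        runningContainers.remove(domainUser);
        freeLocalUsers.add(localUser);
        return true;
    }
}
```

In this sketch a second container from the same domain user reuses the same local user, and the local user becomes available to other domain users only after tryRelease succeeds. Whether any process still runs as that local user is not checked here, which is the kind of gap the review comment raises about reuse of a pool user.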