[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353115#comment-16353115 ]
Vrushali C commented on YARN-3895:
----------------------------------

Here is the design after several rounds of discussion in the community. Thanks [~jlowe], [~jrottinghuis] and [~lohit] for discussing with us (me, [~rohithsharma] and [~varun_saxena]).

- We will go with the domain concept as in ATSv1. Entities will be written with a TimelineDomain, and there will be putDomain calls, just like in ATSv1.
- The domain information will be persisted to the backend in a domain table.
- The domain information will also be retained in the TimelineCollector. This makes the TimelineCollector stateful.
- If a timeline collector goes down (for whatever reason) and comes back up, it knows which app ids it had in memory. In this specific case, the collector will "refresh" its ACL state by reading the domain ids for those app ids back from HBase.
- Each time the collector receives an entity, it looks up the app id + domain id in its memory and appends the TimelineDomain to the entity.
- The entity, when written to HBase, carries not only the domain id but also the TimelineDomain information.
- Thus, each row in HBase will have the ACL info, which can be used for filtering at read time.
- When a read request comes in, the user and the user's group will be sent to the HBase cluster in the scan/get request, and a check will be performed on the region server to determine whether this user is allowed to read that entity, based on the user and group membership.
- Since we want to evaluate group-of-group memberships, this check will be a UserGroupInformation check, just as in any other YARN ACL evaluation. This implies that the YARN cluster AND the HBase cluster must have the same username and group LDAP mappings for the evaluation to work as expected.
- I believe this would be done within a coprocessor, but I will check whether there is any other way to run Java code as part of a scan column value filter operation.
- If the querying user is a YARN admin, then no checks are necessary.
- In case the ACLs for a domain id need to be updated, that will mean scanning through the set of entities for that application id and updating the domain information for those.
- The domain table will have the domain id as row key and the other fields in the TimelineDomain object as columns. Perhaps only one column family is fine.

Details per table in HBase:

- Domain table schema
  Rowkey: domain id
  ColumnFamily: i (stands for info)
  Columns (listing a few here, there can be others):
  - application_id
  - created time
  - description
  - modified time
  - owner
  - readers
  - writers (not used but can be stored for completeness)
  We can consider setting compression for this table at a high level, since we do not anticipate reading frequently from this table.
- Entity table, SubApplication table, Application table: we can store the domain id as a column and the fields in the domain object as separate columns.
- FlowRun table: we can start with a union of the ACLs for all applications within a flow run.
- FlowActivity table: we can start with a union of the ACLs for all runs in a flow in that time frame. This may turn out to be a bit more involved. Let's discuss on the jira we file for this.

thanks
Vrushali

> Support ACLs in ATSv2
> ---------------------
>
>                 Key: YARN-3895
>                 URL: https://issues.apache.org/jira/browse/YARN-3895
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Vrushali C
>            Priority: Major
>              Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for
> both readers and collectors.
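
To make the read-time check and the FlowRun union from the design comment a bit more concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions, not the proposed implementation: the class name DomainAclSketch and its methods are hypothetical, the readers string is assumed to use the "user1,user2 group1,group2" layout (with "*" meaning everyone), and the caller is assumed to pass in groups that have already been resolved. In the actual design the evaluation would go through UserGroupInformation on the region server; treating each application's owner as a reader in the union is also an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, dependency-free stand-in for the read-time ACL check.
// None of these names come from the YARN or HBase code base.
class DomainAclSketch {
  final String owner;
  boolean allowAll; // true when the readers string is "*"
  final Set<String> readerUsers = new HashSet<>();
  final Set<String> readerGroups = new HashSet<>();

  // Assumed readers format: "user1,user2 group1,group2", or "*" for everyone.
  // A leading space (" group1") means "no users, only groups".
  DomainAclSketch(String owner, String readers) {
    this.owner = owner;
    String r = readers == null ? "" : readers;
    this.allowAll = r.trim().equals("*");
    if (!allowAll) {
      String[] parts = r.split(" ", 2);
      addTokens(readerUsers, parts[0]);
      if (parts.length > 1) {
        addTokens(readerGroups, parts[1]);
      }
    }
  }

  // Adds every non-empty, comma-separated token to the target set.
  private static void addTokens(Set<String> target, String csv) {
    for (String token : csv.split(",")) {
      if (!token.trim().isEmpty()) {
        target.add(token.trim());
      }
    }
  }

  // Read check: YARN admins bypass the ACL; otherwise the user must be the
  // owner, a listed reader, or a member of a listed reader group.
  boolean canRead(String user, Collection<String> userGroups, boolean isYarnAdmin) {
    if (isYarnAdmin || allowAll) {
      return true;
    }
    if (user.equals(owner) || readerUsers.contains(user)) {
      return true;
    }
    for (String group : userGroups) {
      if (readerGroups.contains(group)) {
        return true;
      }
    }
    return false;
  }

  // FlowRun-style union: anyone who can read one application in the flow run
  // can read the merged ACL. Adding each app owner as a reader is an assumption.
  static DomainAclSketch union(String owner, Collection<DomainAclSketch> appAcls) {
    DomainAclSketch merged = new DomainAclSketch(owner, "");
    for (DomainAclSketch acl : appAcls) {
      merged.allowAll |= acl.allowAll;
      merged.readerUsers.add(acl.owner);
      merged.readerUsers.addAll(acl.readerUsers);
      merged.readerGroups.addAll(acl.readerGroups);
    }
    return merged;
  }

  public static void main(String[] args) {
    DomainAclSketch acl = new DomainAclSketch("alice", "bob hadoop,etl");
    System.out.println(acl.canRead("carol", Arrays.asList("etl"), false));
    System.out.println(acl.canRead("dave", Arrays.asList("other"), false));
  }
}
```

The sketch keeps the admin bypass ahead of any ACL lookup, matching the point that no checks are necessary for a YARN admin. The union is deliberately permissive, which is the "start with a union" idea for the FlowRun table; a FlowActivity variant would apply the same merge across the runs in the time frame.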