[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479454#comment-16479454 ]

Vrushali C commented on YARN-3895:
----------------------------------

To recap some of the discussion in the weekly call today between [~rohithsharma], [~haibochen] and me:

- For application-level data, application ACLs are to be used for read authorization.
- For system entities such as container events, application ACLs are to be used for read authorization.
- For user entities, timeline domain information is to be used for read authorization.

> Support ACLs in ATSv2
> ---------------------
>
>                 Key: YARN-3895
>                 URL: https://issues.apache.org/jira/browse/YARN-3895
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Vrushali C
>            Priority: Major
>              Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for
> both readers and collectors.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
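The recap in the comment above amounts to a small dispatch rule on the reader side. The following is only a minimal Python sketch of that rule; `EntityKind` and `acl_source_for` are hypothetical names used for illustration and are not taken from the ATSv2 code base.

```python
from enum import Enum


class EntityKind(Enum):
    # Hypothetical categories mirroring the three cases in the recap.
    APPLICATION = "application"  # application-level data
    SYSTEM = "system"            # system entities, e.g. container events
    USER = "user"                # entities published by user code


def acl_source_for(kind: EntityKind) -> str:
    """Return which ACL source governs read authorization for an entity."""
    if kind in (EntityKind.APPLICATION, EntityKind.SYSTEM):
        # Application data and system entities both use application ACLs.
        return "application_acls"
    # User entities are authorized through their timeline domain.
    return "timeline_domain"
```

The point of the sketch is only that two of the three categories share one ACL source, so the reader needs at most two authorization paths.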
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355850#comment-16355850 ]

Vrushali C commented on YARN-3895:
----------------------------------

Thanks [~jlowe]!

{quote}I am a bit confused about the application_id column for a domain table entry. What if the domain doesn't apply to the entire application (i.e.: just to one DAG within a multi-DAG app){quote}

Ah yes, good catch. The app id should be excluded from the domain table. I had added it thinking it would make it easier to know which app ids / entities need to be updated when a particular domain is updated. But I think we do not *absolutely need* the app ids there; we could do a table scan to update the info in the entities.

I will think a bit more about the case where we have to update the domain info, but I think we can move ahead for now on the assumption that updates to a domain are infrequent and do not require a very fast response time.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353974#comment-16353974 ]

Jason Lowe commented on YARN-3895:
----------------------------------

Thanks for the detailed writeup, Vrushali!

I am a bit confused about the application_id column for a domain table entry. What if the domain doesn't apply to the entire application (i.e.: just to one DAG within a multi-DAG app) or what if a domain applies to more than one application? Tez does not use a domain across applications, but I'm curious if this design will preclude a domain crossing applications (e.g.: a domain is set up for an entire Oozie flow, and all applications in that flow use that one domain for ACLs).
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353115#comment-16353115 ]

Vrushali C commented on YARN-3895:
----------------------------------

Here is the design after several rounds of discussions in the community. Thanks [~jlowe], [~jrottinghuis] and [~lohit] for discussing with us (me, [~rohithsharma] and [~varun_saxena]).

- We will go with the domain concept as in ATSv1. Entities will be written with a TimelineDomain (as in ATSv1) and there will be putDomain calls just like in ATSv1.
- The domain information will be persisted to the backend in a domain table.
- The domain information will also be retained in the TimelineCollector. This makes the TimelineCollector stateful.
- If a timeline collector goes down (for whatever reason) and comes back up, it knows which app ids it had in memory. In this specific case the collector will “refresh” its ACL state by reading the domain ids for those app ids back from HBase.
- Each time an entity is received by the collector, it looks up the app id + domain id in its memory and appends the TimelineDomain to the entity.
- The entity, when written to HBase, carries not only the domain id but also the TimelineDomain information.
- Thus, each row in HBase will have the ACL info, which can be used for filtering at read time.
- When a read request comes in, the user and the user's groups will be sent to the HBase cluster in the scan/get request, and a check will be performed on the region server to determine whether this user is allowed to read that entity, based on the user and group membership.
- Since we want to evaluate group of group memberships, this check will be a UserGroupInformation check, just as in any other YARN ACL evaluation. This implies that the YARN cluster AND the HBase cluster must have the same username and group LDAP mappings so that the evaluation works as expected.
- I believe this would be done within a coprocessor, but I will check whether there is any other way to run Java code as part of a scan column value filter operation.
- If the querying user is a YARN admin, then no checks are necessary.
- If the ACLs for a domain id need to be updated, that will mean scanning through the set of entities for that application id and updating the domain information for those.
- The domain table will have the domain id as row key and the other fields of the TimelineDomain object as columns. One column family is probably enough.

Details per table in HBase:

- Domain table schema
  Rowkey: domain id
  ColumnFamily: i (stands for info)
  Columns (listing a few here, there can be others):
  - application_id
  - created time
  - description
  - modified time
  - owner
  - readers
  - writers (not used but can be stored for completeness)
  We can consider setting compression for this table at a high level, since we do not anticipate reading frequently from this table.
- Entity table, SubApplication table, Application table: these can store the domain id as a column and the fields of the domain object as separate columns.
- FlowRun table: we can start with a union of the ACLs of all applications within a flow run.
- FlowActivity table: we can start with a union of the ACLs of all runs in a flow in that time frame. This may turn out to be a bit more involved. Let's discuss it on the jira we file for this.

thanks
Vrushali
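The per-row read check described in the design above (admin bypass, then user and group membership against the ACL info stored with the entity) could look roughly like the following. This is a minimal Python sketch under stated assumptions: `DomainAcl`, `can_read` and the field names are hypothetical, the real check is proposed to run on the region server via UserGroupInformation, and treating `*` as "everyone" follows the wildcard handling discussed elsewhere in this thread.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set

WILDCARD = "*"  # assumption: '*' grants read access to all users and groups


@dataclass(frozen=True)
class DomainAcl:
    """Hypothetical view of the domain columns stored with each entity row."""
    owner: str
    reader_users: FrozenSet[str] = field(default_factory=frozenset)
    reader_groups: FrozenSet[str] = field(default_factory=frozenset)


def can_read(acl: DomainAcl, user: str, groups: Set[str], admins: Set[str]) -> bool:
    """Decide whether `user` (with already-resolved `groups`) may read a row."""
    if user in admins:
        return True  # admins skip all checks
    if user == acl.owner:
        return True
    if WILDCARD in acl.reader_users or WILDCARD in acl.reader_groups:
        return True
    if user in acl.reader_users:
        return True
    # Group check: any overlap between the user's groups and the reader groups.
    return not acl.reader_groups.isdisjoint(groups)
```

Group resolution is deliberately left outside the function: the design requires the YARN and HBase clusters to share the same user-to-group mappings, so `groups` is assumed to be resolved identically on both sides.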
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341427#comment-16341427 ]

Vrushali C commented on YARN-3895:
----------------------------------

Hi [~jlowe] [~jeagles]

I discussed with [~lohit] once again this morning. Based on the scale of domain ids, I wanted to revise the storage design.

We now propose to have a domain table with the domain id as row key and two columns, one for users and another for groups, plus columns for the created time and the other fields of the TimelineDomain object.

So at read time, just as ATSv1 does, we first get all the entities satisfying the query criteria and then look at their domain ids. For each domain id in the response, we check the domain table to see whether the user/group has permissions. For a wildcard of ‘*’, no check is necessary, since it means all users and groups have permissions? Similarly, if the querying user is an admin, no check is done. None of this is executed in non-secure mode.

This will be functionally correct, but it is going to be a bit slow depending on the number of domain ids found in the entity response set. If there is only one domain id, then there is only one more get request to HBase. With each additional domain id, the query response time will increase slightly. We can batch the gets to the domain table, but even so it will take a few seconds tending to minutes depending on the number of calls needed, since multiple calls to HBase translate to multiple HDFS calls.

I have been scratching my head over this read performance. The only other option I see is that the collector keeps the domain id and user/groups info in memory and writes it out with each entity. That way we end up with a denormalized dataset, and read queries will be as fast as they can get with HBase. The domain table will still exist, and the collector can read from it if it happens to go down and come back up.

Which way do you think might end up working better for applications like Tez?

Storage scalability wise, I think either of the two options would be fine with HBase, and expiration / TTL can be set in either case as well. To optimize read / write performance, we can pre-split the domain table and try to balance the row keys so that they go to different region servers and we don't end up hot-spotting a single RS with the reads and writes of currently running applications.

thanks
Vrushali

> Support ACLs in ATSv2
> ---------------------
>
>                 Key: YARN-3895
>                 URL: https://issues.apache.org/jira/browse/YARN-3895
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>            Priority: Major
>              Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for
> both readers and collectors.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
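The first option in the comment above (fetch the entities, then check each distinct domain id against the domain table, with the gets batched) can be sketched as follows. This is an illustrative Python sketch only: `filter_by_domain`, the `fetch_domains` callback and the dictionary shapes are assumptions standing in for the real HBase reads, and denying access when a domain row is missing is a choice made here, not something the thread specifies.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical shape of a domain row: {"users": set of names, "groups": set of names}
DomainRow = Dict[str, Set[str]]


def filter_by_domain(
    entities: Iterable[dict],
    user: str,
    groups: Set[str],
    fetch_domains: Callable[[Set[str]], Dict[str, DomainRow]],
    is_admin: bool = False,
    security_enabled: bool = True,
) -> List[dict]:
    """Post-filter an entity result set using one batched domain lookup."""
    entities = list(entities)
    if not security_enabled or is_admin:
        return entities  # no checks in non-secure mode or for admins

    # Collect the distinct domain ids so the lookup is one batched call
    # rather than one get per entity.
    domain_ids = {e["domain_id"] for e in entities}
    domains = fetch_domains(domain_ids)

    def allowed(domain_id: str) -> bool:
        row = domains.get(domain_id)
        if row is None:
            return False  # assumption: an unknown domain denies access
        users, grps = row["users"], row["groups"]
        return "*" in users or "*" in grps or user in users or bool(grps & groups)

    return [e for e in entities if allowed(e["domain_id"])]
```

The cost concern raised in the comment shows up in `fetch_domains`: its latency grows with the number of distinct domain ids in the result set, which is what motivates the denormalized alternative.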
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341310#comment-16341310 ]

Vrushali C commented on YARN-3895:
----------------------------------

I see, thanks [~jlowe] and [~jeagles].

I was earlier under the belief that the list of domain ids might not be that big. But if it's approximately one per DAG or one per app, then that is a lot of domain ids, and putting so many in one HBase cell value is not going to work well. Let me rethink the backend storage layout for domain ids in this case.

One question: in a common scenario, is the user who wrote the entity (the doAs DAG user) the one who queries it? Or is it more common for another user in that group to query for the entity? Put another way, is the writer user frequently the same as the reader user? If so, perhaps querying the groups_domain table for the domain id can be deferred until after we get the entities.

It looks like getting the domain id from the entity and checking that domain id against the user / group is perhaps a better way to query the data.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341258#comment-16341258 ]

Jason Lowe commented on YARN-3895:
----------------------------------

After chatting about this with [~jeagles] offline, we think the proposal could work well but only if the number of domains remains small. The only way we see that happening is if domains are de-duplicated when they reference an equivalent set of ACLs, so the total number of domains remains small. It's not clear yet how this de-duplication would occur, especially if the write path can never do reads and if domains are allowed to be updated asynchronously (e.g.: admin wants to add another user to an existing domain).

Wildcard ACLs could be solved by treating every user as being in the '*' group as well and always adding every de-duplicated domain ID to the '*' group when created.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341217#comment-16341217 ]

Jason Lowe commented on YARN-3895:
----------------------------------

bq. How many domains would there be?

I would expect application frameworks to use domains just as they use ATS v1 domains today. That means one domain per application (or sub-DAG if they are switching ACLs per DAG like the server-user-on-behalf-of-multiple-users case). So there are going to be a lot of them. I suspect frameworks are just going to create a new domain for their specific ACLs rather than searching for an existing domain that matches their ACL needs. That also avoids the problem of someone later updating the reused domain thinking they were just updating the original app ACLs and inadvertently changing the ACLs of newer apps that reused it. That may or may not be desired. A 1-to-1 mapping of domain per app (or sub-DAG) is a natural fit to the granularity of ACL control on the YARN side.

bq. Gets back a list of domain ids this group has permissions for. This may be pretty big?

Yeah, this result is going to be huge in practice. Also how would wildcard ACLs in a domain be supported, or are they not allowed?
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340325#comment-16340325 ]

Vrushali C commented on YARN-3895:
----------------------------------

Hi [~jlowe]

We discussed this between [~rohithsharma], [~lohit], [~varun_saxena] and me. It basically comes down to whether we want to take a performance hit at read time or at write time. Given that writing out extra details at write time seems like the worse option when running at scale, we thought of taking the approach that may be a slight hit on the read path but has some optimizations. Here is our proposal.

Extremely short summary:
We will go with the domain concept that comes with ATSv1. So each entity is written with a domain id. At read time, a check is made to ensure the querying user has permission to read the data based on the domain id.

Design Details:

Domain ID storage:
- Domains are published by the AM, just as they are in ATSv1.
- Subsequent entity writes include the domain id per write, same as ATSv1.
- Domain ids are written to two tables in HBase.
- One table is the user_domain table and the other is the groups_domain table.
- The user_domain table has a rowkey of cluster id + username and a column whose value is the list of domain ids for that user.
- Similarly, the groups_domain table has a rowkey of cluster id + group name and a column whose value stores the list of domain ids for that group.

So, for each user or group in the timeline domain object who is a reader or the owner, the domain id is added to that user's or group's row in the user_domain or groups_domain table. The domain id is first written to the cell with tags. A coprocessor then checks whether the domain id already exists in the value of the domain column. If it does, this is a no-op. If the domain id does not already exist, meaning it is a new one, it is appended to the value list.

- Expiration / removal of domain ids.
If this list of domain ids has the potential to grow very big, we can consider storing a TTL for each domain id. We can store the TTLs per domain id in the user_domain and groups_domain tables and have the coprocessor do the cleanup at the time of major compaction. If the list of domain ids is small enough, expiration / TTL need not be implemented. What do you think? How many domains would there be?

Read Query time:
We propose that reader API authorization work in the following fashion.
- A read query for an entity comes in from a user.
- The timeline reader will create 3 threads and issue three parallel requests to HBase.
- One request is a Get from the user_domain table for the querying user. It gets back a list of domain ids this user has permissions for.
- Another request is a Get from the groups_domain table for the group that the querying user belongs to. It gets back a list of domain ids this group has permissions for. This may be pretty big?
- The third request gets the entities that are being asked for.

Now, given the domain ids in the entity response, a check is made whether each domain id exists in the user_domain response or the groups_domain response. The resulting dataset is returned as the query response.

I believe ATSv1 gets all entities and then queries the domain table to see whether each domain id relates to the querying user. This model may not work efficiently in HBase when there are multiple domain ids; doing too many gets will make the timeline reader response slow. But, as an additional API option, if the domain id is passed into the query, we can check for the existence of that domain id directly in the user_domain or groups_domain table and proceed accordingly.

Also, if the user who is querying is an admin user, we can skip all the checks and just get the entities. And of course, if security is not enabled, no additional gets from the user_domain and groups_domain tables are required.

What do you think of this approach?
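Two pieces of the proposal above lend themselves to a short illustration: the append-if-absent rule the coprocessor would apply to the domain id list, and the three parallel requests at read time. The following Python sketch assumes hypothetical callbacks (`get_user_domains`, `get_group_domains`, `get_entities`) in place of the real HBase Gets and scan; none of these names come from the ATSv2 code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def append_domain_id(existing: List[str], domain_id: str) -> List[str]:
    """Append-if-absent: a no-op when the domain id is already in the list."""
    return existing if domain_id in existing else existing + [domain_id]


def read_authorized(
    user: str,
    group: str,
    get_user_domains: Callable[[str], Iterable[str]],
    get_group_domains: Callable[[str], Iterable[str]],
    get_entities: Callable[[], List[dict]],
) -> List[dict]:
    """Issue the three lookups in parallel, then filter entities by domain id."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        f_user = pool.submit(get_user_domains, user)     # user_domain Get
        f_group = pool.submit(get_group_domains, group)  # groups_domain Get
        f_entities = pool.submit(get_entities)           # the actual entity query
        allowed = set(f_user.result()) | set(f_group.result())
        entities = f_entities.result()
    # Keep only entities whose domain id appears in either permission list.
    return [e for e in entities if e["domain_id"] in allowed]
```

The sketch makes the scaling concern visible: `allowed` holds every domain id the user or group can read, so its size grows with the total number of domains rather than with the size of the query result.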
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336260#comment-16336260 ]

Vrushali C commented on YARN-3895:
----------------------------------

Thanks [~jlowe] for the discussion. Let me summarize some points from our discussions so far.

* Goal of jira: Design a way to authorize reads of timeline entities.
* Design objectives:
  * Store data in denormalized fashion, since HBase reads work well with that. Avoid joins across tables.
  * Write out ACLs as few times as possible. Ideally once per DAG (once per application).

Background:
* ATSv1 / 1.5 does read authorization via domain ids. A domain id is published once per DAG or once per application, and all entities written with that domain id are authorized at read time accordingly.

Current design proposal summary:
* ATSv2 uses HBase, and if we were to follow a design similar to ATSv1/1.5, that would mean doing a join across two tables (the domain/ACLs table and the entity table). This is not ideal in terms of read performance. Correctness would not be an issue here; response latencies would be the concern.
* To counteract the read latencies, one idea is to do reads from the collector at write time. A few things might be a concern here. The collector would now open connections to more region servers to read from other tables. When running at scale, we would like the write path to be along the lines of “fire-and-forget”. Doing reads from the collector would likely cause high latencies during writes as well as increased network connections, for the YARN cluster as well as the HBase cluster. Also, doing a read and then a write does not lower the size of the data being sent from collector to region server.
* Another thought is to cache the ACLs in the collector and attach them to each entity while writing it out. The ACLs would also be stored in an ACLs table. If the collector goes down and comes back up, it can do a read from the ACLs table for the applications it is collecting data from. This read is a one-off that happens only when the collector restarts. The ACLs are still stored in a denormalized way with the entity, and reads do not query this ACLs table.
* This option still does not reduce the size of the data being sent with each entity.
* Also, for updating ACLs for entities, we plan to provide an API or an admin call which would go over the tables and write out the ACLs again.

I will think this over a bit more, discuss with others and get back soon.
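The caching option in the summary above (ACLs held in collector memory, attached to every entity on write, and re-read from the ACLs table only after a collector restart) can be sketched as follows. This is a minimal Python illustration; `CollectorAclCache` and the `load_acls_for_app` callback are hypothetical stand-ins for the collector state and the one-off ACLs table read.

```python
from typing import Callable, Dict, Iterable


class CollectorAclCache:
    """Hypothetical in-memory ACL cache for a timeline collector."""

    def __init__(self, load_acls_for_app: Callable[[str], dict]):
        # Assumption: this callback reads one application's ACLs from the ACLs table.
        self._load = load_acls_for_app
        self._acls: Dict[str, dict] = {}

    def put_acls(self, app_id: str, acls: dict) -> None:
        """Record ACLs published for an application (normal write path, no reads)."""
        self._acls[app_id] = acls

    def recover(self, app_ids: Iterable[str]) -> None:
        """One-off refresh after a restart: reload ACLs for the known app ids."""
        for app_id in app_ids:
            self._acls[app_id] = self._load(app_id)

    def attach(self, app_id: str, entity: dict) -> dict:
        """Return a copy of the entity with the cached ACLs denormalized into it.

        Raises KeyError if no ACLs are cached for the application.
        """
        out = dict(entity)
        out["acls"] = self._acls[app_id]
        return out
```

Note that `attach` never touches the backend, which keeps the write path close to fire-and-forget; the only backend read is in `recover`. As the summary points out, this does not shrink the payload sent with each entity.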
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336059#comment-16336059 ]

Rohith Sharma K S commented on YARN-3895:
-----------------------------------------

bq. If desired this could be changed from a read-time lookup to a write-time lookup

I vaguely remember that a decision was made, to improve write performance, not to do additional lookups to the backend from collectors.

bq. The collector could then cache these ACL IDs so very few writes would require a lookup.

This is one of the options we were discussing, but currently there is no fault tolerance for collectors. In any case, an NM restart will lose the cached ACLs. To recover them, collectors would need to read from the backend, which increases collector complexity. Currently collectors are a write-only module. Perhaps just the ACL details could be stored in the local FS and recovered from there. However, if the NM node is lost, a new AM will be launched and a new set of ACLs will be written by that AM.

bq. what's the plan to update ACLs after the application completed?

We discussed this a bit and thought of introducing a new REST endpoint in TimelineReader for updating the ACLs of completed applications. This could be performed only by a TimelineReader admin. As far as the ACLs story is concerned, we kept this as low priority.

bq. Isn't this essentially sending the ACLs on most posts? If we need to avoid HBase double lookups on reads then the ACL has to be in the entity row data, correct?

It's true that most of the time new entities such as vertices and vertex attempts are published. Keeping ACLs in the row key of existing HBase tables such as entity_table or sub_application_table increases the complexity of building the row key at write and read time. Currently these tables have a combination of 5-7 keys. I would rather accept a double lookup at read time than keep ACL details in the row key.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335849#comment-16335849 ]

Jason Lowe commented on YARN-3895:
----------------------------------

bq. Yes, doing a lookup in two tables at read time (regular entity table and 'domain' or 'ACLs' table) would be very slow in HBase.

If desired this could be changed from a read-time lookup to a write-time lookup. In other words, the collector could be responsible for translating/expanding the ACL identifier into the actual ACLs when writing the row. The collector could then cache these ACL IDs so very few writes would require a lookup. It is _very_ likely that the ACL ID isn't changing between entity posts. This would mean that ACLs could not be easily updated once specified, as all existing rows would need to be updated, but that's going to be true even if we don't have a domain/ACL ID for indirection on writes given the proposal to replicate it on each entity row.

bq. How big would the ACLs be?

ACLs aren't going to be hundreds of kilobytes, but it could get larger than what is typical if it is an explicit list of many users and/or groups. That's one of the reasons ATS v1 made this indirect via domains, so ACLs are only sent once per DAG and a very small bit of info for each post ties the entity to its corresponding ACL.

Also, as alluded to above, what's the plan to update ACLs after the application completed? I assume this would have to be a full rewrite of every ACL column on every entity posted by the application. I don't expect that to be a common occurrence, but will it be supported or only via HBase admin intervention to doctor the database?

bq. The ACL details need to be sent only once per entity id. The ACLs object will contain only reader details, similar to the TimelineDomain#reader field. An update to an entity id need not send the ACL details again.

Isn't this essentially sending the ACLs on most posts? If we need to avoid HBase double lookups on reads then the ACL has to be in the entity row data, correct? For Tez I believe a large chunk of the posting is going to be new entities and not updates to existing ones. An application like Tez will end up sending full ACLs on about 50% of its posts. (I think most entities have just a start event and a stop event.)
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335409#comment-16335409 ]

Rohith Sharma K S commented on YARN-3895:
-----------------------------------------

{quote}I could see cases where the ACLs are not that small, potentially larger than an average entity.{quote}

I am wondering when this could happen. How big would the ACLs be? This can be limited to some number of bytes per entity. In a similar situation in the MR/Native service AM, while publishing AM configurations, we restrict each entity object to a configurable number of bytes. I think clients should use this method while publishing entities.

{quote}Just as that would be cumbersome to store per cell, it would be cumbersome to build and parse per entity.{quote}

The ACL details need to be sent only once per entity id. The ACLs object will contain only reader details, similar to the TimelineDomain#reader field. An update to an entity id need not send the ACL details again. Further, an update to the ACLs can be sent later as well, in which case the older value will either be overwritten or appended to; we can decide which. Does sending the ACL info once per entity id make a big impact? On the HBase side, storing it in a column shouldn't be much of an issue. The benefit is that we can apply a substring filter in HBase when ACLs are enabled.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
    [ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335102#comment-16335102 ]

Vrushali C commented on YARN-3895:
----------------------------------

Thanks for your response [~jlowe].

{quote}Having a tidy ID to reference a set of ACLs would eliminate this concern, but it would add some necessary indirection lookups on the reader side.{quote}

Yes, doing a lookup in two tables at read time (regular entity table and 'domain' or 'ACLs' table) would be very slow in HBase. Hence we wanted to denormalize it and store per entity. But I understand the point about it being too cumbersome while creating each entity.

I will think over this a bit more and also discuss with [~rohithsharma] and [~varun_saxena] and get back on this.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334907#comment-16334907 ] Jason Lowe commented on YARN-3895: -- {quote} So, we store these kind of doAs query related entities in a table called subApp table. The rowkey in this table contains both the subAppUserId as well as the AM user ID. Although we do not check if the AM is allowed to write as some user, the entity for this pair of {AM user, subAppUser ID } will be in it’s own row. The row key also has the cluster id, entity type, entity id and entity id prefix. {quote} I think that's reasonable. One user shouldn't be able to doctor the data of another, and it sounds like this will prevent that. bq. With every timeline entity, we propose to have a TimelineEntityACLs object inside it. My concern here would be the size of the TimelineEntityACLs object. If that object is itself fully definitive of the ACLs without referencing some other authoritative object (i.e.: like the domain IDs did in ATS 1), then I could see cases where the ACLs are not that small, potentially larger than an average entity. Just as that would be cumbersome to store per cell, it would be cumbersome to build and parse per entity. Having a tidy ID to reference a set of ACLs would eliminate this concern, but it would add some necessary indirection lookups on the reader side.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334386#comment-16334386 ] Rohith Sharma K S commented on YARN-3895: - Hi [~jlowe], does this approach look reasonable?
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331223#comment-16331223 ] Vrushali C commented on YARN-3895: -- Hi [~jlowe] We ([~varun_saxena] [~rohithsharma] and I) had a discussion around these points and wanted to share our thoughts. {quote} bq. How does the collector authenticate that the AM is allowed to proxy as that user, or can any AM forge data as other users simply by stating the data is from so-and-so? {quote} So, we store these kinds of doAs query related entities in a table called the subApp table. The row key in this table contains both the subAppUserId as well as the AM user ID. Although we do not check whether the AM is allowed to write as some user, the entity for this pair of {AM user, subAppUser ID} will be in its own row. The row key also has the cluster id, entity type, entity id and entity id prefix. Sub App Row key format: {code:java} {subAppUserId!clusterId!entityType!entityPrefix!entityId!userId}{code} Therefore, although a rogue AM could write a lot of data as other doAs users, it would still go to its own rows. {quote}bq. It's less clear to me how this is going to work for the case of an AM running as one user but working on behalf of multiple other users across multiple sub-apps. The YARN application only has one set of ACLs, set when it is submitted by the service user. {quote} So in the case of an AM running as one user and executing doAs queries, we are now thinking of the following enhancement to the earlier proposal: - With every timeline entity, we propose to have a TimelineEntityACLs object inside it. (This TimelineEntityACLs does not exist yet in the current code.) - The AM can populate this TimelineEntityACLs object with the ApplicationACLs and it will be part of the TimelineEntity it is writing. When there exist additional DAG ACLs, as in the case of doAs queries, they can also be added to the TimelineEntityACLs of that entity. 
- In this way, each timeline entity can have its allowed users/groups. At the backend, we think we can store these allowed users and allowed groups as column values in the tables per entity, and at query time we can confirm whether the user making the query is in the allowed users list or in a group that is in the allowed groups list. We started thinking of storing it in columns rather than cell tags since the ACLs would be too much info to store for each cell. Since each timeline entity is in its own row, having columns for allowed users and allowed groups in each row should work per entity. Would appreciate your feedback on this updated approach.
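The proposal above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class EntityReadCheck and its method names are hypothetical, the row key is shown as a simple '!'-joined string (the actual storage layer may encode components and separators differently), and the allowed users and groups are assumed to have been read already from the per-row columns.

```java
import java.util.Set;

/** Hypothetical model of the per-entity read check; all names are illustrative. */
final class EntityReadCheck {
  private EntityReadCheck() {}

  /** Joins the sub-app row key components with '!' in the order given in the comment. */
  static String subAppRowKey(String subAppUserId, String clusterId, String entityType,
      String entityPrefix, String entityId, String userId) {
    return String.join("!", subAppUserId, clusterId, entityType,
        entityPrefix, entityId, userId);
  }

  /**
   * Returns true if the querying user is in the allowed users list, or belongs
   * to at least one group that is in the allowed groups list.
   */
  static boolean canRead(String user, Set<String> userGroups,
      Set<String> allowedUsers, Set<String> allowedGroups) {
    if (allowedUsers.contains(user)) {
      return true;
    }
    for (String group : userGroups) {
      if (allowedGroups.contains(group)) {
        return true;
      }
    }
    return false;
  }
}
```

Because both the doAs user and the AM user appear in the key, data written by a rogue AM under someone else's name still lands in rows that carry the AM's own user ID.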
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324450#comment-16324450 ] Vrushali C commented on YARN-3895: -- Thanks [~jlowe], I will think this over and get back.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324140#comment-16324140 ] Jason Lowe commented on YARN-3895: -- I think Application ACLs could work fine for the straightforward case of a user running their own app. As you mentioned, it already reflects how YARN handles the ACLs for the AHS and log server today. It's less clear to me how this is going to work for the case of an AM running as one user but working on behalf of multiple other users across multiple sub-apps. The YARN application only has one set of ACLs, set when it is submitted by the service user. Those permissions are going to be restricted to just the service user, most likely. Then the service user runs a sub-app (e.g.: a DAG) on behalf of another user. In that case the ACLs may need to change (e.g.: be permissive to more groups, etc.). The YARN app ACL isn't changing at this point, it was set at time of submit, so how does the AM inform the collector of the ACL change? Similarly, even if the AM wrapped some of its execution in a doAs for the other user, how does the collector know the user has changed? Did the AM somehow disconnect and reconnect to the collector? How does the collector authenticate that the AM is allowed to proxy as that user, or can any AM forge data as other users simply by stating the data is from so-and-so? I'm not that familiar with HBase, but it looks like the ACLs are per cell and then it seems pretty straightforward how ACLs could change across sub-apps and implement the proper restrictions on the read path. It's the write path in the multiple-sub-apps-for-multiple-users-by-one-service-user case that I'm not seeing how the security works. If we're basing it on the YARN app ACL, that isn't changing across sub-apps but in many cases will need to do so. 
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323324#comment-16323324 ] Vrushali C commented on YARN-3895: -- We had a discussion today and wanted to summarize some points (most might be repeated from conversations above): - We will use Application ACLs to get the user & group information while writing the entities. - This will be stored in HBase within each cell as part of its cell tags. - Each time a query for reading this data comes in, we will use the user ACLs at the HBase region server in a coprocessor to determine whether the user is allowed to read this data. - Admin users are always allowed to read all data. - This would imply coprocessors on each table. [~jlowe] what do you think about this approach for read side authorization? This does not make use of any domain concept (as in v1.5). This is along the lines of security in YARN via ACLs. This should also work in the case of an AM running as one user but executing DAGs as other users. The callerUGI during the entity write in such situations will have both users (AM user and doAs user) and we will store both. So, at read time, a query by the AM user as well as by the doAs user will be allowed for this data. Also, any other user who is part of that group should be able to read it. At the backend side, there is the matter of storing this info per cell in HBase. It is a lot of repeated information. IIUC, HBase security and visibility labels work with the same logic, but in that case HBase admin commands are used to grant permissions to specific HBase users/labels. I will think over whether we can optimize how many times this is stored per Column Family. 
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321429#comment-16321429 ] Vrushali C commented on YARN-3895: -- Hello [~rohithsharma] [~varun_saxena] [~haibo.chen] I was thinking a little bit about ACLs and read side authorization. I have some thoughts and wanted to share them. Not everything is fully hashed out, but I think this might work. When the data is written, we can use HBase cell tags to store the allowed users as well as groups. Just as we store things right now for flow run, we will do the same for entities, applications and sub-apps. While querying, we can pass in the querying user/group info via “Attributes” in the Get/Scan. This can be accessed in the coprocessor via “getAttributes” of the Get/Scan. Then the coprocessor checks whether the querying user is equal to an allowed user or whether the user's group is in the allowed groups list in the cell tags. We can default to allowing reads for all if no tags are present. Also, we could indicate that the querying user is a yarn_admin user, in which case all reads are allowed. This should work for all our regular tables like entity, application as well as sub-application. For the sub-app table, we store the AM user as well as the doAs user (and their groups) in the cell tags. So at query time, we can see whether the querying user is either the AM user or the doAs user. That way we protect the data from other users even if they run with the same AM user. For the flow run table, we can perhaps do a union or something across all entries. I am still thinking it over. Here is an old thread in the hbase-users mailing list in which James Taylor from Phoenix has also mentioned that Phoenix is (or at least was) doing the same thing https://mail-archives.apache.org/mod_mbox/hbase-user/201302.mbox/browser We can later check with the HBase folks whether this much extra data in the cell tags could be a concern, but my gut feeling is that it’s not. 
Cell tags are used by hbase security as well as Phoenix for passing around information and making decisions at server side.
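The decision the coprocessor would make can be modelled in plain Java as below. This is a sketch of the rules described in the comment, not HBase code: CellTagAuthorizer and AclTag are hypothetical names, AclTag stands in for whatever is decoded from the cell tags, and the querying user, groups and admin flag are assumed to have been extracted from the Get/Scan attributes beforehand.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical model of the server-side read decision; not actual HBase API. */
final class CellTagAuthorizer {
  private CellTagAuthorizer() {}

  /** Stand-in for the allowed users and groups decoded from a cell's tags. */
  static final class AclTag {
    final Set<String> allowedUsers;
    final Set<String> allowedGroups;

    AclTag(Set<String> allowedUsers, Set<String> allowedGroups) {
      this.allowedUsers = allowedUsers;
      this.allowedGroups = allowedGroups;
    }
  }

  /**
   * Applies the rules from the comment in order: admins may read everything,
   * cells without ACL tags are readable by all, otherwise the user must match
   * an allowed user or share at least one group with the allowed groups.
   */
  static boolean isReadAllowed(AclTag tag, String user, Set<String> groups,
      boolean isYarnAdmin) {
    if (isYarnAdmin) {
      return true;
    }
    if (tag == null) {
      return true; // no tags present: default to read allowed
    }
    if (tag.allowedUsers.contains(user)) {
      return true;
    }
    return !Collections.disjoint(tag.allowedGroups, groups);
  }
}
```

For the sub-app table, the allowed users set would hold both the AM user and the doAs user, so either of them passes the check while other users of the same service do not.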
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299641#comment-16299641 ] Vrushali C commented on YARN-3895: -- We can consider application ACLs in the submission context. These ACLs will be at the application level (not applicable for offline collectors). We can allow all writes, since only authorized users can write, but only allowed readers will be able to read. Let us try to target 3.1 for this.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167358#comment-16167358 ] Varun Saxena commented on YARN-3895: Yeah, we can think about doing this by 3.1. Let's finalize the design in a week or two.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167342#comment-16167342 ] Rohith Sharma K S commented on YARN-3895: - I just got an update from Vinod that the 3.1 release is planned for the end of December or mid-January. See the discussion thread [3.1 ReleaseDiscussion|https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27705.html] and the [3.1 ReleaseWiki|https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3+release+status+updates]. If we target the 3.1 release, we have 2.5 months until code freeze. It's a good time to start on it!
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167326#comment-16167326 ] Rohith Sharma K S commented on YARN-3895: - Thanks [~varun_saxena] for summarizing the discussions. I was not able to attend the call :-( bq. Most importantly, considering GA would be released on Nov 1, we would need to get this in by 15th October. Do we have enough time? This is more like an additional feature. Or delay it till 3.1? This detailed discussion helped us estimate the minimum effort required to complete this feature. We might not be able to make it for GA, as it is very near and we have hardly 15 days. But as you all know, the 3.1 release is planned for somewhere around mid-March/April 2018. We need to at least target that. If we start brainstorming on it now and put up documentation, we get sufficient time and can end up with a feasible solution. I request that we keep up this momentum and discuss it on the weekly calls. [~varun_saxena], shall we create a shared document to capture the brainstormed details?
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166816#comment-16166816 ] Varun Saxena commented on YARN-3895: Points to consider, including what we discussed on the call. # Other than read ACLs, I think we need ACLs that restrict modifying an entity as well, i.e. on the write side. Otherwise we may allow some other client to modify an entity it does not own and change its read ACLs. # We can use the application ACLs passed in the AM launch context (available in the NM) and store them in the app collector. We will have to pass this info when the collector runs outside the NM. If ACLs are not provided during entity publish, we can automatically use these app ACLs as the ACLs for entities. So for MR kind of use cases, application ACLs might suffice, while for Tez DAGs in the Hive LLAP use case, entities within an application may have different ACLs, which can be specified during entity publish. # If we store ACLs with each entity, storage size would increase because of the repetition of ACLs. Should we store some, possibly short, ID instead? Who will generate a unique ID in the cluster? # The suggestion above is along the lines of TimelineDomain in ATSv1. Refer to [link|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/yarn/api/records/timeline/TimelineDomain.html]. A domain encapsulates a set of reader and writer ACLs and follows the same format as other YARN ACLs, i.e. if a user belongs to a group and that group is within the domain to which the entity belongs, the user will have access to the entity. A domain is like a group of user groups and users. # We can have a domain or ACL table in HBase with the ID as row key. Domains should be created beforehand, i.e. before publishing entities. # The domain ID can be used as the ACL ID, but as I said above, it will be the responsibility of the client to generate a unique ID and then use it consistently while publishing entities. 
# We should consider caching these ACLs; otherwise querying the domain table every time might be suboptimal. # How do we decide flow run/flow level ACLs? A union of app ACLs, maybe. This needs some thought. Typically all apps within a flow should have the same ACL, though. # A point brought up by Joep: for the federation use case, is some special handling required when containers run across clusters? Not sure. Most importantly, considering GA would be released on Nov 1, we would need to get this in by 15th October. Do we have enough time? This is more like an additional feature. Or delay it till 3.1?
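The caching point in the list above could look roughly like the following. This is a minimal sketch under stated assumptions: DomainAclCache is a hypothetical class, the loader function stands in for a read of the proposed domain table keyed by domain ID, and eviction is plain least-recently-used via java.util.LinkedHashMap in access order. Invalidation when a domain is updated is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache in front of a domain table lookup; illustrative only. */
final class DomainAclCache {
  private final Map<String, String> cache;
  private final Function<String, String> loader;
  private int loads;

  /**
   * @param capacity maximum number of cached domains
   * @param loader   stand-in for the domain table read, maps domain ID to readers
   */
  DomainAclCache(final int capacity, Function<String, String> loader) {
    this.loader = loader;
    // accessOrder=true makes iteration order least-recently-used first.
    this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity;
      }
    };
  }

  /** Returns the readers for a domain, consulting the loader only on a miss. */
  synchronized String readersFor(String domainId) {
    String readers = cache.get(domainId);
    if (readers == null) {
      readers = loader.apply(domainId);
      loads++;
      cache.put(domainId, readers);
    }
    return readers;
  }

  /** Number of times the loader was invoked, useful for checking hit rates. */
  synchronized int loadCount() {
    return loads;
  }
}
```

With such a cache on the reader side, only the first query that touches a given domain pays for the extra table lookup, which is the cost raised against the short-ID approach.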
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166151#comment-16166151 ] Varun Saxena commented on YARN-3895: Actually, we were thinking of including this in 3.1.0. If we do need it in 3.1, let's brainstorm about the approach ASAP. We would need something like ACL groups for sure. That would make it easy to specify ACLs. Entities of a given kind would typically have the same kind of user access, in addition to the default access given to the application owner, query executor, etc.
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166135#comment-16166135 ] Rohith Sharma K S commented on YARN-3895: - I think we should target this for GA! Though YARN-6820 provides basic whitelisting of users for read access, it is not a full solution. I request folks to put up their approaches for discussion. I can primarily think of a few approaches whose complexities need to be discussed in detail: # The user can submit ACLs only during application submission, which is currently supported for applications. The same ACLs can be stored in the application table and referred to while reading entities. These ACLs apply to all entities of the application. This approach works well for the flow model but not for a Tez kind of model. # How about accepting ACLs via the TimelineEntity itself? Each entity carries ACLs that say who may read it. Note that these ACLs are for reading data only. # Lastly, ATSv2 can also have a group concept in which each group of entities has its own ACLs. To do it this way, we could introduce a new API that accepts ACLs per group and stores them at the back end. The concern is how we are going to store them at the back end. What should the row key for the new table be? cc: [~jlowe] [~vrushalic] [~varun_saxena] [~jianhe] [~vinodkv] [~jrottinghuis] [~haibo.chen]
[jira] [Commented] (YARN-3895) Support ACLs in ATSv2
[ https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086424#comment-16086424 ] Vrushali C commented on YARN-3895: -- Filed YARN-6820 for adding a basic read side restriction.