[ 
https://issues.apache.org/jira/browse/YARN-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336260#comment-16336260
 ] 

Vrushali C commented on YARN-3895:
----------------------------------

Thanks Jason Lowe for the discussion. Let me summarize some points from our 
discussions so far.
 * Goal of this JIRA: design a way to authorize reads of timeline entities.
 * Design objectives:
 ** Store data in a denormalized fashion, since HBase reads work well with 
that. Avoid joins across tables.
 ** Write out ACLs as few times as possible, ideally once per DAG (once per 
application).

Background:
 * ATSv1 / 1.5 does read authorization via domain ids. A domain id is published 
once per DAG or once per application, and all entities written with that domain 
id are authorized at read time accordingly.
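
To make the domain-id model concrete, here is a minimal in-memory sketch of it. 
This is not the ATSv1 API: the Domain and Entity classes, the two dicts standing 
in for the stores, and can_read are all hypothetical names chosen for 
illustration. The point it shows is that a read consults two places, the entity 
and the domain it references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical domain record, published once per DAG or application."""
    domain_id: str
    readers: frozenset


@dataclass(frozen=True)
class Entity:
    """Hypothetical timeline entity that only carries a reference to its domain."""
    entity_id: str
    domain_id: str


# Two separate in-memory "tables" standing in for the domain store and the
# entity store (illustrative data only).
domains = {"dag_1": Domain("dag_1", frozenset({"alice", "bob"}))}
entities = {"task_7": Entity("task_7", "dag_1")}


def can_read(user, entity_id):
    """Authorize a read by resolving the entity's domain id."""
    entity = entities[entity_id]             # lookup 1: the entity itself
    domain = domains.get(entity.domain_id)   # lookup 2: the domain it points to
    return domain is not None and user in domain.readers
```

The second lookup is the step that would become a cross-table read if the same 
layout were carried over to a store like HBase.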

Current design proposal summary:
 * ATSv2 uses HBase. If we were to follow a design similar to ATSv1/1.5, reads 
would require a join across two tables (the domain/ACLs table and the entity 
table). This is not ideal for read performance: correctness would not be an 
issue, but response latencies would be a concern.
 * To counteract the read latencies, one idea is to do the reads from the 
collector at write time. There are a few concerns here. The collector would now 
open connections to more region servers in order to read from other tables. 
When running at scale, we would like the write path to be along the lines of 
“fire-and-forget”. Doing reads from the collector would likely cause high write 
latencies as well as more network connections at scale, for both the YARN 
cluster and the HBase cluster. Also, doing a read followed by a write does not 
reduce the size of the data sent from the collector to the region server.
 * Another thought is to cache the ACLs in the collector and attach them to 
each entity while writing it out. The ACLs would also be stored in an ACLs 
table. If the collector goes down and comes back up, it can read the ACLs table 
for the applications it is collecting data from. This read is a one-off that 
happens only on collector restart. The ACLs are still stored in a denormalized 
way with the entity, and reads do not query the ACLs table.
 * This approach still does not reduce the size of the data sent with each 
entity.
 * Also, for updating ACLs for entities, we plan to provide an API or an admin 
call which would go over the tables and write out the ACLs again.
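
The caching proposal above can be sketched roughly as follows. This is only an 
illustration under assumptions, not a proposed implementation: plain dicts stand 
in for the HBase ACLs table and entity table, and Collector, register_app, 
recover, write_entity, can_read and rewrite_acls are hypothetical names.

```python
class Collector:
    """Hypothetical collector that caches ACLs and stamps them on every entity."""

    def __init__(self, acl_table, entity_table):
        self.acl_table = acl_table        # stand-in for the ACLs table
        self.entity_table = entity_table  # stand-in for the entity table
        self.acl_cache = {}               # app_id -> frozenset of readers

    def register_app(self, app_id, readers):
        """Store the ACLs once per application and cache them locally."""
        acls = frozenset(readers)
        self.acl_table[app_id] = acls
        self.acl_cache[app_id] = acls

    def recover(self, app_ids):
        """One-off read of the ACLs table after a collector restart."""
        for app_id in app_ids:
            self.acl_cache[app_id] = self.acl_table[app_id]

    def write_entity(self, app_id, entity_id, payload):
        """Write path: no table reads, ACLs come from the local cache."""
        self.entity_table[entity_id] = {
            "app_id": app_id,
            "payload": payload,
            "readers": self.acl_cache[app_id],  # denormalized copy of the ACLs
        }


def can_read(entity_table, user, entity_id):
    """Read path: authorize from the entity row alone, no ACLs table lookup."""
    return user in entity_table[entity_id]["readers"]


def rewrite_acls(acl_table, entity_table, app_id, readers):
    """Admin-style update: rewrite the ACLs on every entity of one application."""
    acls = frozenset(readers)
    acl_table[app_id] = acls
    for row in entity_table.values():
        if row["app_id"] == app_id:
            row["readers"] = acls
```

Note that every entity row still carries its own copy of the ACLs, which is why 
the per-entity payload size does not shrink. The sketch also leaves out 
refreshing a live collector's cache after rewrite_acls runs.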

I will think over this a bit more and discuss with others and get back soon.

> Support ACLs in ATSv2
> ---------------------
>
>                 Key: YARN-3895
>                 URL: https://issues.apache.org/jira/browse/YARN-3895
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>            Priority: Major
>              Labels: YARN-5355
>
> This JIRA is to keep track of authorization support design discussions for 
> both readers and collectors. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
