[ 
https://issues.apache.org/jira/browse/YARN-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122159#comment-16122159
 ] 

Vrushali C commented on YARN-6861:
----------------------------------

[~rohithsharma], [~varun_saxena] and I discussed this in our weekly call today. 
The discussion was about the naming of these APIs. The consensus was that, for 
now, we will proceed with this patch.

We discussed whether the path should be something other than 
/users/<user>/entities/<entityid>/, to indicate that these entities are 
queried without knowledge of the YARN application id. 

At present, these APIs will return sub-application entities. For example, 
consider a query that a user "userA" runs on a Tez setup. This user is 
different from the user, say "userYARN", who is running the Tez AM. 

Note 1: 
Only entities from such queries will go to two places in the backend: 
- in the entity table, within the context of an application: 
{code}userYARN / cluster / flow / flowrun id / appid / entity{code}
- in the sub application table, outside the context of an application: 
{code}sub app userA / cluster / entity{code}

Note 2: 
In this same example, the Tez AM itself writes some lifecycle events and 
metrics of its containers. These will go only to the entity table, under user 
"userYARN". 
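
The routing in Notes 1 and 2 can be sketched roughly as follows. This is only 
an illustration: the function name, the "entity" and "subapp" prefixes, and 
the slash-joined strings are made up here and do not reflect the actual table 
schema or row key encoding. It also assumes that a write carries an optional 
sub-application user, which is set only for entities such as userA's query.

```python
from typing import List, Optional


def storage_rows(am_user: str, cluster: str, flow: str, flow_run_id: str,
                 app_id: str, entity: str,
                 sub_app_user: Optional[str] = None) -> List[str]:
    """Return illustrative row paths that a single entity write would produce."""
    # Every entity is stored within the context of its application
    # (entity table), keyed by the user running the AM.
    rows = ["/".join(["entity", am_user, cluster, flow,
                      flow_run_id, app_id, entity])]
    # Note 1: a sub-application entity is additionally stored outside the
    # application context, keyed by the sub-application user (assumption:
    # the write tells us who that user is).
    if sub_app_user is not None:
        rows.append("/".join(["subapp", sub_app_user, cluster, entity]))
    return rows


# Note 1: userA's query, written under the Tez AM run by userYARN.
query_rows = storage_rows("userYARN", "cluster1", "flow1", "1",
                          "app_1", "query_1", sub_app_user="userA")

# Note 2: a container lifecycle entity written by the Tez AM itself.
container_rows = storage_rows("userYARN", "cluster1", "flow1", "1",
                              "app_1", "container_1")
```

With these inputs, query_rows holds two paths (one per table) and 
container_rows holds only the entity table path.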

The reader APIs in this patch are going to return data that belongs to the 
context of entities stored outside of an application, that is, from the sub 
application table. 

The reader APIs like 
GET /ws/v2/timeline/clusters/{cluster name}/apps/{app id}/entities/{entity type} 
or 
GET /ws/v2/timeline/apps/{app id}/entities/{entity type} 
will return all entities, that is, entities written in "Note 1" as well as 
those written in "Note 2". 

The reader APIs in this patch will return a subset of entities: those written 
in "Note 1". 
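
To make the difference between the two kinds of reads concrete, here is a 
small in-memory sketch. The Entity class and the read_by_app / read_by_user 
functions are hypothetical stand-ins, not the reader's actual classes, and the 
sketch assumes that each stored entity records its sub-application user only 
when it was written as in Note 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    entity_id: str
    app_id: str
    # Set only for sub-application entities (Note 1); None for entities
    # written by the AM itself (Note 2). This field is an assumption.
    sub_app_user: Optional[str] = None


def read_by_app(store: List[Entity], app_id: str) -> List[str]:
    """Application-scoped read: returns Note 1 and Note 2 entities."""
    return [e.entity_id for e in store if e.app_id == app_id]


def read_by_user(store: List[Entity], user: str) -> List[str]:
    """User-scoped read: returns only that user's sub-application entities."""
    return [e.entity_id for e in store if e.sub_app_user == user]


store = [
    Entity("query_1", "app_1", sub_app_user="userA"),  # Note 1
    Entity("container_1", "app_1"),                    # Note 2
]

all_for_app = read_by_app(store, "app_1")
only_user_a = read_by_user(store, "userA")
```

Here all_for_app contains both entities, while only_user_a contains just the 
query written on behalf of userA, without needing the application id.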

The point we discussed was that when we move on to having user level (and queue 
level) aggregations, we would need reader APIs to return that data. For 
example, an API that returns, say, megabytemillis (or all MR counters) for a 
user within a time range such as the last week. These APIs help understand the 
usage of a user or queue on the cluster. This data is aggregated data, and 
those APIs could likely have a similar format, perhaps /users/<userid>/entities. 
In this case, we could call the API /usersummary/<userid>/entities. 
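
As a rough illustration of what such a user level aggregation would compute, 
the sketch below sums one metric for a user over a time range. The record 
layout (user, timestamp, value), the function name and the sample numbers are 
all invented for this example. No such API or aggregation exists in this 
patch.

```python
from typing import Iterable, Tuple

# One record per reported metric value: (user, timestamp in ms, value).
Record = Tuple[str, int, int]


def user_metric_total(records: Iterable[Record], user: str,
                      start_ts: int, end_ts: int) -> int:
    """Sum a metric for one user over the half-open range [start_ts, end_ts)."""
    return sum(value for rec_user, ts, value in records
               if rec_user == user and start_ts <= ts < end_ts)


# Made-up megabytemillis samples for two users.
samples = [
    ("userA", 1000, 500),
    ("userA", 2000, 700),
    ("userB", 1500, 900),
    ("userA", 9000, 100),  # falls outside the queried range below
]

total_user_a = user_metric_total(samples, "userA", 0, 5000)
```

A real implementation would presumably read pre-aggregated values rather than 
scan raw records, but the result for a given user and time range would have 
this shape.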

As of now, we will proceed with this patch.





> Reader API for sub application entities
> ---------------------------------------
>
>                 Key: YARN-6861
>                 URL: https://issues.apache.org/jira/browse/YARN-6861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-6861-YARN-5355.001.patch, 
> YARN-6861-YARN-5355.002.patch
>
>
> YARN-6733 and YARN-6734 write data into the sub application table. There 
> should be a way to read those entities.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
