[ https://issues.apache.org/jira/browse/YARN-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122159#comment-16122159 ]
Vrushali C edited comment on YARN-6861 at 8/10/17 7:22 PM:
-----------------------------------------------------------

[~rohithsharma] [~varun_saxena] and I discussed this in our weekly call today. The discussion was about the naming of these APIs. The consensus was that, as of now, we will proceed with this patch.

We discussed whether we should call it something other than /users/<user>/entities/<entityid>/ to indicate that these entities are being queried without knowledge of the YARN application id.

At present, these APIs will return sub-application entities. For example, consider a query that a user "userA" runs on a Tez setup. This user is different from the user, say "userYARN", who is running the Tez AM.

Note 1: Entities from only such queries will go to two places in the backend:
- in the entity table, within the context of an application:
{code}
userYARN / cluster / flow / flowrun id / appid / entity
{code}
- in the sub application table, outside the context of an application:
{code}
userA / cluster / entity
{code}

Note 2: In this same example, the Tez AM itself writes some lifecycle events and metrics of its containers. These will go only to the entity table for user "userYARN".

The reader APIs in this patch are going to return data that belongs to the context of entities stored outside of an application, that is, from the sub application table.

Reader APIs like
GET /ws/v2/timeline/clusters/{cluster name}/apps/{app id}/entities/{entity type}
or
GET /ws/v2/timeline/apps/{app id}/entities/{entity type}
will return all entities, that is, entities written in "Note 1" as well as those written in "Note 2". The reader APIs in this patch will return a subset of entities: those written in "Note 1".

The point we discussed was that when we move on to having user level (and queue level) aggregations, we would need reader APIs to return that data. For example, an API that returns, say, megabytemillis (or all MR counters) for a user within a time range, such as the last week.
These APIs help understand the usage of a user or queue on the cluster. This data is aggregated, and those APIs could likely have a similar format, perhaps /users/<userid>/entities. In that case, we could call that API /usersummary/<userid>/entities.

As of now, we will proceed with this patch.

was (Author: vrushalic):
[~rohithsharma] [~varun_saxena] and I discussed this in our weekly call today. The discussion was about the naming of these apis. The consensus was, as of now, we will proceed with this patch.

We discussed if we should call it something other than /users/<user>/entities/<entityid>/ to indicate that these are entities that are being queried for without knowledge of the yarn application id.

At present, these apis will return sub-application entities. For example, a query that an user "userA" runs on a Tez setup. This user is different from the user, say user "userYARN" who is running the Tez AM.

Note 1: Entities from only such queries will go to two places in the backend:
- in the entity table within the context of an application:
{code}
userYARN / cluster/ flow / flowrun id / appid / entity
{code}
- in the sub application table outside the context of an application:
{code}
sub app userA / cluster / entity
{code}

Note 2: In this same example, the Tez AM itself writes some lifecycle events and metrics of it's containers. These will go only to entity table for user "userYARN".

The reader APIs in this patch are going to return data that belongs to the context of entities stored outside of an application, that is, from the sub application table.

The reader APIs like
GET /ws/v2/timeline/clusters/{cluster name}/apps/{app id}/entities/{entity type}
or
GET /ws/v2/timeline/apps/{app id}/entities/{entity type}
will return all entities, that is, entities written in "Note 1" as well as written in "Note 2". The reader APIs in this patch will return a subset of entities, those written in "Note 1".
The point we discussed was that when we move on to having user level (and queue level) aggregations, we would need reader APIs to return that data. For example, an API that returns say megabytemillis (or all MR counters) for a user within a time range, say like last week. These APIs help understand usage of a user or queue on the cluster. This data is aggregated data and those APIs could like have similar API format /users/<userid>/entities perhaps. In this case, we could call the API /usersummary/<userid>/entities.

As of now, we will proceed with this patch.

> Reader API for sub application entities
> ---------------------------------------
>
>                 Key: YARN-6861
>                 URL: https://issues.apache.org/jira/browse/YARN-6861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-6861-YARN-5355.001.patch, YARN-6861-YARN-5355.002.patch
>
>
> YARN-6733 and YARN-6734 write data into the sub application table. There should
> be a way to read those entities.
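
As a rough illustration of the write and read paths described in Note 1 and Note 2 of the comment, here is a minimal in-memory sketch. It is not the timeline service implementation: the class `TimelineStore`, its methods, and the sample ids are all hypothetical, and the two lists only stand in for the entity table and the sub application table.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineStore:
    """Toy stand-in for the two backend tables (hypothetical, not the real schema)."""
    entity_table: list = field(default_factory=list)
    sub_app_table: list = field(default_factory=list)

    def write(self, entity_id, app_user, app_ctx, sub_app_user=None):
        # app_ctx is assumed to be (cluster, flow, flowrun id, appid).
        # Every entity is stored within the application context (Notes 1 and 2).
        self.entity_table.append((app_user, *app_ctx, entity_id))
        # Only entities written on behalf of a different user (Note 1)
        # are also stored outside the application context.
        if sub_app_user is not None and sub_app_user != app_user:
            cluster = app_ctx[0]
            self.sub_app_table.append((sub_app_user, cluster, entity_id))

    def read_by_app(self, app_id):
        # Mirrors the app-scoped readers: returns Note 1 and Note 2 entities.
        return [row[-1] for row in self.entity_table if row[4] == app_id]

    def read_by_user(self, user):
        # Mirrors the user-scoped readers in this patch: Note 1 entities only.
        return [row[-1] for row in self.sub_app_table if row[0] == user]


store = TimelineStore()
ctx = ("cluster1", "flowA", 1, "app_1")
store.write("query_1", "userYARN", ctx, sub_app_user="userA")  # Note 1
store.write("container_metrics_1", "userYARN", ctx)            # Note 2

all_for_app = store.read_by_app("app_1")
for_user_a = store.read_by_user("userA")
```

In this sketch the app-scoped read sees both entities, while the user-scoped read for "userA" sees only the query entity, which matches the subset relationship the comment describes.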