[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514243#comment-15514243 ]
Varun Saxena edited comment on YARN-5585 at 9/22/16 7:25 PM:
-------------------------------------------------------------

Summarizing the solution we decided upon in the call.
* We will now return entities from the entity table in lexicographic order of entity IDs.
* To achieve a different sort order, applications can supply an entity ID prefix, which is set in the TimelineEntity object while publishing the entity.
* This entity ID prefix will be part of the row key in the entity table. As the name suggests, it will be placed just before the entity ID. Applications can omit the prefix if they are happy with the lexicographic sort order. So the row key will now be {{cluster!user!flow!flowrun!app!entitytype!\{entityidprefix\}!\{entityid\}}}
* Entity ID will also be stored under a column qualifier (this is already being done).
* Entity ID prefix can be a number (say, a long), as numbers generally provide a natural sort ordering. However, this needs to be finalized. Should we keep it as a string?
* When querying multiple entities, we will return the top N entities, as decided by limit, in lexicographic order of entity ID prefix + entity ID (i.e. if an entity ID prefix is supplied). The fromID filter can now be something like fromIDPrefix (say), or a similar filter which provides prefix + ID to support pagination.
* While querying a single entity, the prefix can be supplied as a query param. If supplied, the read will be a Get; otherwise we need a Scan with a SingleColumnValueFilter on entity ID (this will be comparatively slower). We can have a separate REST endpoint to distinguish between prefix-based and non-prefix-based queries. We need to distinguish between the case where no prefix was specified for an entity on the write path and the case where the prefix was simply not supplied on the read path (even though it was supplied on the write path). This needs to be finalized.
* Prefix will also be returned as part of the TimelineEntity object in the response.
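
To illustrate the sort-order point in the list above, here is a minimal sketch in plain Python (not HBase code). The function names, the sample IDs, and the choice to encode a numeric prefix as 8 big-endian bytes are all assumptions made for illustration; the actual encoding is still an open question in this comment. It shows that, without a prefix, byte-wise ordering puts entity-10 ahead of entity-2, whereas a numeric prefix placed before the entity ID gives numeric order.

```python
import struct

SEP = b"!"

def row_key(cluster, user, flow, flow_run, app, entity_type, entity_id, id_prefix=None):
    """Build a hypothetical entity-table row key; the prefix slot sits just before the entity ID."""
    parts = [p.encode("utf-8") for p in (cluster, user, flow, flow_run, app, entity_type)]
    # Assumption: a numeric prefix is packed as 8 big-endian bytes, so byte order
    # matches numeric order for non-negative values. No prefix leaves the slot empty.
    prefix = b"" if id_prefix is None else struct.pack(">Q", id_prefix)
    return SEP.join(parts + [prefix, entity_id.encode("utf-8")])

def scan_order(keys_to_ids):
    """Return entity IDs in the order a byte-sorted table scan would visit them."""
    return [keys_to_ids[key] for key in sorted(keys_to_ids)]

# Hypothetical context values and entity IDs, used only for this illustration.
ctx = ("cluster1", "alice", "flow1", "run1", "app1", "TASK")
ids = [f"entity-{i}" for i in range(1, 11)]

no_prefix = {row_key(*ctx, eid): eid for eid in ids}
with_prefix = {row_key(*ctx, eid, id_prefix=int(eid.split("-")[1])): eid for eid in ids}

print(scan_order(no_prefix)[:3])    # lexicographic: entity-1, entity-10, entity-2
print(scan_order(with_prefix)[:3])  # numeric prefix: entity-1, entity-2, entity-3
```

The same sketch hints at why a single-entity read is cheaper when the prefix is known: the reader can rebuild the full row key and look it up directly, while without the prefix it would have to scan and filter on the stored entity ID.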
cc [~jrottinghuis], [~sjlee0], [~vrushalic], [~gtCarrera9]. Hope this covers everything.

This solution was chosen because we thought that in UI use cases a single entity read would typically follow a listing of multiple entities, and hence the prefix would be known. This does not mean, however, that we will not provide a mechanism to fetch an entity if the prefix is not given. We can use a single column value filter in that case. Moreover, this solution had a lower overall write or read penalty than the solutions listed above.

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve the
> applications. Along with those, it would be good to add a new filter, i.e. fromId,
> so that entities can be retrieved after the fromId.
> Current Behavior : Default limit is set to 100. If there are 1000 entities,
> then the REST call gives the first/last 100 entities. How to retrieve the next set of 100
> entities, i.e. 101 to 200 OR 900 to 801?
> Example : If applications are stored in the database as app-1 app-2 ... app-10,
> *getApps?limit=5* gives app-1 to app-5.
> But to retrieve the next 5 apps, there is no way to achieve this.
> So the proposal is to have fromId in the filter, like
> *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to
> app-10.
> Since ATS is targeting storage of a large number of entities, it is a very common
> use case to get the next set of entities using fromId rather than querying all
> the entities. This is very useful for pagination in the web UI.
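
The paging behaviour proposed in the issue description can be sketched as follows. This is a toy in-memory model, not the TimelineReader implementation: the function name get_apps and the list-based store are hypothetical, and fromId is treated as exclusive only because the description says fromId=app-5 returns app-6 to app-10.

```python
def get_apps(stored_ids, limit=100, from_id=None):
    """Return up to `limit` IDs that follow `from_id` in storage order (from_id excluded)."""
    start = 0
    if from_id is not None:
        # A real store would seek straight to the matching row key rather than search a list.
        start = stored_ids.index(from_id) + 1
    return stored_ids[start:start + limit]

# Apps in the storage order assumed by the example in the description.
apps = [f"app-{i}" for i in range(1, 11)]

first_page = get_apps(apps, limit=5)
# The last ID of one page becomes the fromId of the next request.
second_page = get_apps(apps, limit=5, from_id=first_page[-1])

print(first_page)   # app-1 .. app-5
print(second_page)  # app-6 .. app-10
```

A client paginates by feeding the last ID it received back as fromId, so no request has to read more than limit entities.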