[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469802#comment-15469802 ]
Rohith Sharma K S commented on YARN-5585:
-----------------------------------------

bq. Are we selecting entities whose ID is less than start value, or we're filtering them out? According to your description fromId = app-5 should return something like app-6 to 10, right? I think it's very important to clearly define the exact meaning of "fromId"?

*fromId* is a query parameter that users pass in the REST URL, similar to limit. When entities are retrieved from storage, i.e. HBase, only the entities whose ID is less than the fromId value are returned to the HBase client. The HBase client then processes this ResultScanner and returns the entities.

Ex : Assume that *entity-1 entity-2.. entity-10* are stored consecutively in HBase.

Current behavior without fromId :
# When a REST call is made to obtain entities, the output is *entity-10 entity-9... entity-2, entity-1*.
# When a REST call is made with the filter {{limit=5}}, the output is *entity-10, entity-9... entity-6*. Note that limit is not applied at the storage level. Rather, limit is applied to the scanned rows, i.e. the HBase ResultScanner gives *ALL* the rows, entity-1 to entity-10, and {{TimelineEntityReader#readEntities}} limits the number of rows given to the user.

After the patch, i.e. with fromId as a filter :
# When a REST call is made with the filters {{limit=5}} and {{fromId=entity-6}}, *HBase itself returns only the rows which are less than entity-6*, i.e. entity-5 to entity-1. This is much more efficient than processing all the rows at the HBase client, i.e. in {{TimelineEntityReader#readEntities}}.

Basically, to the user, fromId is nothing but the starting point for the next set of entities.

bq. Because we're selecting entities starting from a given ID, can we directly pass in the fromID's key when creating the scan? In this way seems like we saved one filter?
For example, if fromId is not provided, we may want to scan from cluster!user!flow!flowrun!appId!type, but if fromId is provided, we can start from cluster!user!flow!flowrun!appId!type!fromId (or the next available entity)?

This is a good point. But as you said in an earlier comment, entities are not stored in order. They can be stored like entity-9, entity-5, entity-6, entity-2 ... entity-10. So, IIUC, this cannot be achieved.

bq. For pagination on containers, why do we need to care about actual creation time when the entity ids have already been sorted? This said, supporting paginations for generic timeline entities should not be blocked by YARN-5094?

Entities with creationTime set are returned in descending order of entityId. If creationTime is not set, the result is in the reverse order, i.e. ascending order of entityId. This is because of the implementation of {{TimelineEntity#compareTo}}. So, say {{limit=2 and fromId=entity-6}}: the rows retrieved from storage are entity-5 to entity-1, but the REST output given to the user is entity-1 and entity-2 rather than entity-5 and entity-4. This is because of the {{TimelineEntityReader#readEntities}} implementation. YARN-5094 blocks testing of YARN-CONTAINER entities because most of them have a creation time of -1, so the result is always the first N containers when fromId is used. I have tested with a TEZ application, where fromId works correctly.

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelinereader
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Attachments: YARN-5585.v0.patch
>
>
> TimelineReader REST API's provides lot of filters to retrieve the
> applications. Along with those, it would be good to add new filter i.e fromId
> so that entities can be retrieved after the fromId.
> Example : If applications are stored database, app-1 app-2 ... app-10.
> *getApps?limit=5* gives app-1 to app-10. But to retrieve next 5 apps, it is
> difficult.
> So proposal is to have fromId in the filter like
> *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to
> app-10.
> This is very useful for pagination in web UI.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
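
To make the entity-1 to entity-10 example in the comment concrete, here is a minimal sketch of the two behaviors: limit applied after a full scan, versus fromId restricting what storage returns. It is a plain in-memory model, not the HBase or TimelineReader code; the names {{entity_seq}}, {{scan_storage}} and {{read_entities}} are hypothetical and exist only for this illustration, and it assumes IDs of the form entity-N compared by their numeric suffix.

```python
from typing import List, Optional, Tuple


def entity_seq(entity_id: str) -> int:
    """Hypothetical helper: numeric suffix of an ID such as 'entity-6'."""
    return int(entity_id.rsplit("-", 1)[1])


def scan_storage(stored: List[str], from_id: Optional[str] = None) -> List[str]:
    """Model of the storage-level scan, highest ID first.

    Without from_id every row is returned. With from_id only rows whose
    ID is less than from_id are returned (assumed semantics, taken from
    the comment's example).
    """
    rows = sorted(stored, key=entity_seq, reverse=True)
    if from_id is not None:
        bound = entity_seq(from_id)
        rows = [row for row in rows if entity_seq(row) < bound]
    return rows


def read_entities(
    stored: List[str], limit: int, from_id: Optional[str] = None
) -> Tuple[List[str], int]:
    """Apply limit on the reader side, after the scan.

    Returns the page given to the user and the number of rows the
    storage scan handed back, to show how much work fromId saves.
    """
    rows = scan_storage(stored, from_id)
    return rows[:limit], len(rows)


stored = [f"entity-{i}" for i in range(1, 11)]

# limit only: all 10 rows come back from storage, 5 reach the user.
first_page, rows_without_from_id = read_entities(stored, limit=5)

# limit plus fromId: storage returns only the 5 rows below entity-6.
next_page, rows_with_from_id = read_entities(stored, limit=5, from_id="entity-6")
```

In this model the first call yields entity-10 to entity-6 after scanning 10 rows, and the second yields entity-5 to entity-1 after scanning 5, which is the saving the comment describes. The ordering caveat around creationTime and {{TimelineEntity#compareTo}} is deliberately left out of the sketch.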