[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513781#comment-15513781 ]
Sangjin Lee commented on YARN-5585:
-----------------------------------

Thanks for your comments [~varun_saxena]. Yes, we should discuss this during the call and report back here.

Before we go into how to implement this, I think we need to reach a consensus on the requirements first. Querying for entities is a fairly generic thing, and IMO there should be a clear expectation of the order in which they are returned. It affects *which* entities get selected as well as the order in which they are sorted. As I mentioned, I don't think it would be desirable to leave this order completely arbitrary, or things could get confusing really quickly.

My preference for this sorting order is either the entity id (descending) order or the chronological order. I think the entity id order is the simplest and easiest to understand, and for the most part identical to the chronological order. YARN entities are mostly compliant (so are MR entities), and it would not be unreasonable to ask frameworks to maintain entity ids that way. Even if that is not feasible, there would be a very consistent understanding of how entities are returned to the reader. That's the default sorting order in the current YARN RM web UI too. Can Tez adopt a stricter entity id scheme? If not, would it at least be acceptable if entities are consistently returned in that order?

If we go with the chronological order (created time), then I would want it to be consistent. We should do it not only for framework entities but also for YARN entities, and change the row key schema for all of them. And I think that may require the secondary lookup table (yes, I understand this would be only for lookups and not for data).

Another point about sorting within the timeline reader code. If the query is specified with a limit, the limit is passed to the HBase client, and as such it will only return that number of entities (or fewer), right? I don't think HBase will return more than the specified limit, no?
Then I don't understand how you would get a *different* set of Tez entities than what you expected. For example, if there are entities 1 through 10 and your limit is 5, I would still expect HBase to return 6 through 10. The reader code may rearrange them so that 6 is at the top, but I don't expect HBase to return anything other than 6 through 10. [~rohithsharma], could you confirm? Did I understand this right?

Also, apart from fixing the sorting in {{TimelineEntity.compareTo()}}, I am not sure we need to re-sort the entities returned by HBase in the timeline reader code. The result set from HBase should already be in the right order, right? Then I think we should simply return them in the same order without applying any further sorting. In other words, instead of using a sorted set, we should use an insertion-order set. Thoughts? [~varun_saxena]

> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-5585.v0.patch
>
>
> TimelineReader REST APIs provide a lot of filters to retrieve the applications. Along with those, it would be good to add a new filter, i.e. fromId, so that entities can be retrieved after the fromId.
> Current behavior: the default limit is set to 100. If there are 1000 entities, then the REST call gives the first/last 100 entities. How do we retrieve the next set of 100 entities, i.e. 101 to 200 OR 900 to 801?
> Example: if applications are stored in the database as app-1 app-2 ... app-10, *getApps?limit=5* gives app-1 to app-5. But there is no way to retrieve the next 5 apps.
> So the proposal is to have fromId in the filter, like *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to app-10.
> Since ATS targets storage of a large number of entities, it is a very common use case to get the next set of entities using fromId rather than querying all the entities. This is very useful for pagination in the web UI.
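
For illustration only, here is a minimal Java sketch of the limit, fromId and ordering behavior discussed above. It is not the timeline reader code. The class {{FromIdPagingSketch}} and its {{scan}} method are hypothetical, an in-memory {{TreeSet}} stands in for the backing store, and two things are assumed that the thread has not settled: entities come back in descending id order, and fromId is exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the YARN code base.
public class FromIdPagingSketch {

    // Returns at most `limit` ids in descending order.
    // Assumption: a null fromId means "start from the largest id";
    // otherwise the page starts strictly below fromId (exclusive).
    static List<Integer> scan(NavigableSet<Integer> store, Integer fromId, int limit) {
        NavigableSet<Integer> candidates = (fromId == null)
            ? store.descendingSet()
            : store.headSet(fromId, false).descendingSet();
        List<Integer> page = new ArrayList<>();
        for (Integer id : candidates) {
            if (page.size() == limit) {
                break; // the store never hands back more than the limit
            }
            page.add(id);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<Integer> store = new TreeSet<>();
        for (int i = 1; i <= 10; i++) {
            store.add(i);
        }

        // First page: entities 10 down to 6.
        List<Integer> first = scan(store, null, 5);
        System.out.println("first page:      " + first);

        // Next page, continuing below the last id of the first page.
        Integer lastSeen = first.get(first.size() - 1);
        System.out.println("next page:       " + scan(store, lastSeen, 5));

        // An insertion-order set keeps the order the store returned.
        Set<Integer> insertionOrder = new LinkedHashSet<>(first);
        // A sorted set holds the same members but re-sorts them ascending.
        Set<Integer> resorted = new TreeSet<>(first);
        System.out.println("insertion order: " + insertionOrder);
        System.out.println("re-sorted:       " + resorted);
        System.out.println("same members:    " + insertionOrder.equals(resorted));
    }
}
```

In this sketch the sorted set changes only the order of the page, not which entities are in it, which matches the expectation stated in the comment that the store returns 6 through 10 either way.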