[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469802#comment-15469802
 ] 

Rohith Sharma K S commented on YARN-5585:
-----------------------------------------

bq. Are we selecting entities whose ID is less than start value, or we're 
filtering them out? According to your description fromId = app-5 should return 
something like app-6 to 10, right? I think it's very important to clearly 
define the exact meaning of "fromId"?
*fromId* is a query parameter that users pass in the REST URL, similar to 
limit.  When entities are retrieved from storage, i.e. HBase, only entities 
whose ID is less than the start value are given to the HBase client. The HBase 
client then processes this ResultScanner and returns the entities. 
Ex: Assume that *entity-1, entity-2, ..., entity-10* are stored in HBase in a row. 
Current behavior without fromId: 
# When a REST call is made to obtain entities, the output is 
*entity-10, entity-9, ..., entity-2, entity-1*. 
# When a REST call is made with the filter {{limit=5}}, the output is 
*entity-10, entity-9, ..., entity-6*.  Note that limit is not applied at the 
storage level.  Rather, limit is applied to the scanned rows, i.e. the HBase 
ResultScanner gives *ALL* the rows, entity-1 to entity-10, and 
{{TimelineEntityReader#readEntities}} limits the number of rows given to the user. 
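
To make the cost of the current behavior concrete, here is a minimal Python 
sketch. It is a toy model, not the HBase or timeline reader code: integers 1 to 
10 stand in for entity-1 to entity-10, and the function name is hypothetical. 

```python
# Toy model of the current behavior (no fromId). Hypothetical names throughout.

def read_entities_without_from_id(stored_ids, limit=None):
    """Storage hands back every row; limit is applied only afterwards."""
    scanned = sorted(stored_ids, reverse=True)  # all rows reach the reader
    result = scanned if limit is None else scanned[:limit]
    return result, len(scanned)


page, rows_scanned = read_entities_without_from_id(list(range(1, 11)), limit=5)
print(page)          # the five entities returned to the user
print(rows_scanned)  # the number of rows that had to be scanned
```

In this model the user receives 5 entities, but all 10 rows are still scanned. 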

After the patch, i.e. with fromId as a filter: 
# When a REST call is made with the filters {{limit=5}} and 
{{fromId=entity-6}}, *HBase itself gives only the rows that are less than 
entity-6*, i.e. entity-5 to entity-1. This is much more efficient than 
processing all the rows at the HBase client, i.e. in 
{{TimelineEntityReader#readEntities}}. 
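
A minimal sketch of the behavior with fromId, under the same toy assumptions 
(integers stand in for entity IDs; the function name is hypothetical and the 
list comprehension only imitates a storage-side filter): 

```python
# Toy model of the patched behavior. Hypothetical names throughout.

def read_entities_with_from_id(stored_ids, from_id, limit=None):
    """Only rows with an ID below from_id leave storage; limit applies after."""
    scanned = sorted((i for i in stored_ids if i < from_id), reverse=True)
    result = scanned if limit is None else scanned[:limit]
    return result, len(scanned)


page, rows_scanned = read_entities_with_from_id(
    list(range(1, 11)), from_id=6, limit=5)
print(page)          # entity-5 down to entity-1 in this model
print(rows_scanned)  # only the rows below from_id were handed over
```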

Basically, to the user, fromId is nothing but the starting point for the next 
set of entities.
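
As an illustration of how a client could page through results, here is a 
sketch that feeds the last ID of each page back in as fromId. The helper names 
are hypothetical and the store is again modeled as a list of integers. 

```python
# Toy pagination loop driven by fromId. Hypothetical names throughout.

def fetch_page(stored_ids, limit, from_id=None):
    """Return up to `limit` IDs below from_id, newest first."""
    candidates = [i for i in stored_ids if from_id is None or i < from_id]
    return sorted(candidates, reverse=True)[:limit]


def paginate(stored_ids, limit):
    """Collect every page by passing the last ID seen as the next fromId."""
    pages, from_id = [], None
    while True:
        page = fetch_page(stored_ids, limit, from_id)
        if not page:
            return pages
        pages.append(page)
        from_id = page[-1]


for page in paginate(list(range(1, 11)), limit=5):
    print(page)
```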

bq. Because we're selecting entities starting from a given ID, can we directly 
pass in the fromID's key when creating the scan? In this way seems like we 
saved one filter? For example, if fromId is not provided, we may want to scan 
from cluster!user!flow!flowrun!appId!type, but if fromId is provided, we can 
start from cluster!user!flow!flowrun!appId!type!fromId (or the next available 
entity)?
This is a good point. But as you said in an earlier comment, entities are not 
stored in order. They can be stored like 
entity-9, entity-5, entity-6, entity-2, ..., entity-10. So, IIUC, this cannot be 
achieved.
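
The concern can be sketched as follows. This only models the assumption stated 
above, that rows are not laid out in entity-ID order; the stored order below is 
made up for illustration, and the function names are hypothetical. 

```python
# Toy comparison: start the scan at fromId's position vs. filter every row.
# STORED_ORDER is an invented, unordered layout used only for illustration.

STORED_ORDER = [9, 5, 6, 2, 1, 8, 3, 7, 4, 10]


def scan_from_start_row(rows, from_id):
    """Begin reading right after the position where from_id is stored."""
    start = rows.index(from_id)
    return rows[start + 1:]


def scan_with_filter(rows, from_id):
    """Read every row but keep only IDs below from_id."""
    return [r for r in rows if r < from_id]


print(scan_from_start_row(STORED_ORDER, 6))  # misses 5, includes 8, 7 and 10
print(scan_with_filter(STORED_ORDER, 6))     # exactly the IDs below 6
```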

bq. For pagination on containers, why do we need to care about actual creation 
time when the entity ids have already been sorted? This said, supporting 
paginations for generic timeline entities should not be blocked by YARN-5094?
Entities with creationTime set are returned in descending order of entityId. 
If creationTime is not set, the result is in the reverse order, i.e. ascending 
order of entityId. This is because of the implementation of 
{{TimelineEntity#compareTo}}. So, say {{limit=2 and fromId=entity-6}}: the rows 
retrieved from storage are entity-5 to entity-1, and the REST output to the 
user is entity-1 and entity-2 rather than entity-5 and entity-4.  
This is because of the {{TimelineEntityReader#readEntities}} implementation.  
YARN-5094 blocks testing of YARN-CONTAINER entities because most of the events 
have a creation time of -1, so the result is always the first N containers 
when fromId is used. I have tested with a TEZ application, where fromId works 
the right way. 
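
The ordering effect can be illustrated with a small sketch. It assumes the 
comparison sorts by creation time descending and then by ID ascending, which 
is my reading of the behavior described here and not a copy of the Hadoop 
source; the function name and the sample creation times are hypothetical. 

```python
# Toy model of the ordering described above. Each entity is a tuple of
# (entity_id, created_time); -1 stands for "creation time not set".

def order_like_compare_to(entities):
    """Assumed ordering: newest creation time first, then ascending ID."""
    return sorted(entities, key=lambda e: (-e[1], e[0]))


# Rows that storage returns for fromId=entity-6: entity-5 down to entity-1.
with_time = [(i, i * 1000) for i in (5, 4, 3, 2, 1)]
without_time = [(i, -1) for i in (5, 4, 3, 2, 1)]

limit = 2
print([e[0] for e in order_like_compare_to(with_time)[:limit]])
print([e[0] for e in order_like_compare_to(without_time)[:limit]])
```

With creation times set, the first two results in this model are entity-5 and 
entity-4; with every creation time at -1 they are entity-1 and entity-2. 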


> [Atsv2] Add a new filter fromId in REST endpoints
> -------------------------------------------------
>
>                 Key: YARN-5585
>                 URL: https://issues.apache.org/jira/browse/YARN-5585
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: YARN-5585.v0.patch
>
>
> TimelineReader REST API's provides lot of filters to retrieve the 
> applications. Along with those, it would be good to add new filter i.e fromId 
> so that entities can be retrieved after the fromId. 
> Example : If applications are stored database, app-1 app-2 ... app-10.
> *getApps?limit=5* gives app-1 to app-10. But to retrieve next 5 apps, it is 
> difficult.
> So proposal is to have fromId in the filter like 
> *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to 
> app-10. 
> This is very useful for pagination in web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

