[ 
https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15675693#comment-15675693
 ] 

Varun Saxena edited comment on YARN-5739 at 11/18/16 4:29 AM:
--------------------------------------------------------------

[~vrushalic], going by the description of each filter, FirstKeyOnlyFilter 
returns the first KV from each row and KeyOnlyFilter returns only the key. So 
shouldn't KeyOnlyFilter be enough? I removed FirstKeyOnlyFilter and ran the 
tests Li had written, and they passed.
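
To make the two filter descriptions concrete, here is a small plain-Java model 
of what each one does to a single row. It does not use the HBase client API: 
the Cell class and the keyOnly/firstKeyOnly helpers are hypothetical stand-ins 
that only mirror the descriptions quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterSemanticsSketch {
    /** Hypothetical stand-in for an HBase KeyValue: a column key plus a value. */
    static final class Cell {
        final String key;
        final String value;

        Cell(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    /** Models KeyOnlyFilter: every cell of the row is kept, but its value is emptied. */
    static List<Cell> keyOnly(List<Cell> row) {
        List<Cell> out = new ArrayList<Cell>();
        for (Cell cell : row) {
            out.add(new Cell(cell.key, ""));
        }
        return out;
    }

    /** Models FirstKeyOnlyFilter: only the first cell of the row is kept, value intact. */
    static List<Cell> firstKeyOnly(List<Cell> row) {
        List<Cell> out = new ArrayList<Cell>();
        if (!row.isEmpty()) {
            out.add(row.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> row = Arrays.asList(new Cell("c1", "v1"), new Cell("c2", "v2"));
        System.out.println(keyOnly(row));      // prints [c1=, c2=]
        System.out.println(firstKeyOnly(row)); // prints [c1=v1]
    }
}
```

In this model both filters leave the row itself in the result, so a caller 
that reads only the row key sees the same rows either way; the difference is 
how many cells, and how much value data, come back per row.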

bq. This filter is used to limit the number of results to a specific page size. 
So it will terminate the scanning once the number of filter-passed rows is > 
the given page size on that particular Region Server.
That should be fine, I guess. We already apply the limit (which comes in as a 
query param) using this filter elsewhere on the reader side. Here we only need 
one row. Even if we get one row per Region Server, the result is a superset; 
once the result set is created it is sorted so that keys come back in order, 
and we fetch only the first one.
That said, setCaching should also be fine for our use case. I am not sure why 
we use PageFilter rather than setCaching to apply the limit. Do you know the 
pros and cons of one over the other?
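
The superset argument above can be sketched in plain Java. This is only a 
simulation, not HBase client code: each inner list stands for the sorted rows 
held by one Region Server, and the row keys, scanRegion and scanWithLimit are 
made-up names for illustration. The page limit is applied per region, and the 
merged result is then sorted and truncated on the reader side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PageLimitSketch {
    /** Applies the page limit to one region's sorted rows, as a per-server filter would. */
    static List<String> scanRegion(List<String> sortedRows, int pageSize) {
        int end = Math.min(pageSize, sortedRows.size());
        return new ArrayList<String>(sortedRows.subList(0, end));
    }

    /** Merges the per-region pages (a superset), then sorts and truncates on the reader side. */
    static List<String> scanWithLimit(List<List<String>> regions, int limit) {
        List<String> merged = new ArrayList<String>();
        for (List<String> region : regions) {
            merged.addAll(scanRegion(region, limit));
        }
        Collections.sort(merged);
        int end = Math.min(limit, merged.size());
        return new ArrayList<String>(merged.subList(0, end));
    }

    public static void main(String[] args) {
        // Hypothetical row keys spread over three regions.
        List<List<String>> regions = Arrays.asList(
            Arrays.asList("app1!TYPE_A", "app1!TYPE_B"),
            Arrays.asList("app1!TYPE_C"),
            Arrays.asList("app1!TYPE_D", "app1!TYPE_E"));
        // Each region contributes up to one row; the reader keeps the smallest key.
        System.out.println(scanWithLimit(regions, 1)); // prints [app1!TYPE_A]
    }
}
```

With a limit of 1, up to one row per region reaches the reader, but the final 
sort and truncation still yield the single smallest key.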


> Provide timeline reader API to list available timeline entity types for one 
> application
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-5739
>                 URL: https://issues.apache.org/jira/browse/YARN-5739
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-5739-YARN-5355.001.patch, 
> YARN-5739-YARN-5355.002.patch
>
>
> Right now we only show a part of the available timeline entity data in the 
> new YARN UI. However, some data (especially library-specific data) cannot be 
> queried through the web UI. It would be appealing for the UI to provide an 
> "entity browser" for each YARN application. Actually, simply dumping out the 
> available timeline entities (with proper pagination, of course) would be 
> pretty helpful for UI users. 
> On the timeline side, we're not far away from this goal. Right now I believe 
> the only thing missing is a way to list all available entity types within 
> one application. The challenge here is that we're not storing this data for 
> each application, but given that this kind of call is relatively rare 
> (compared to writes and updates), we can perform some scanning at read time. 


