[ https://issues.apache.org/jira/browse/YARN-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883828#comment-13883828 ]
Zhijie Shen commented on YARN-925:
----------------------------------

[~sinchii], thanks for taking care of the filters. I had a quick look at the patch. IMO, it's on the right track.

However, the major task of this issue is to optimize the filtering in the implementation of the application history store, in particular FileSystemApplicationHistoryStore. The current patch still reads each individual history file and loads the full historical information of an application, and only then applies the filtering conditions. That is no different from doing the filtering in ApplicationHistoryManager. Given a million history files, it will be a disaster to read all of them.

By pushing the filters to the implementation of the application history store, I suppose that the implementation knows best how the historical data is stored, and we can optimize there. In the FS implementation, ideally, we should build an index in some way, and only read the history files that match the filters.

> Augment HistoryStorage Reader Interface to Support Filters When Getting Applications
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-925
>                 URL: https://issues.apache.org/jira/browse/YARN-925
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>             Fix For: YARN-321
>
>         Attachments: YARN-925-1.patch, YARN-925-2.patch, YARN-925-3.patch, YARN-925-4.patch, YARN-925-5.patch, YARN-925-6.patch, YARN-925-7.patch, YARN-925-8.patch
>
>
> We need to allow filter parameters for getApplications, pushing filtering to the implementations of the interface. The implementations should know best how to optimize filtering.
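
To make the index idea in the comment concrete, here is a rough, language-neutral sketch in Python. It is not Hadoop code and not what any YARN-925 patch does: `HistoryIndex`, `read_matching`, the choice of `user` and start time as filter fields, and the `read_history_file` callback are all hypothetical names chosen for illustration. The point it tries to show is only that the filters are answered from a small index first, so that the expensive per-file read happens for matching applications alone.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class HistoryIndex:
    """Hypothetical in-memory index over application history files."""

    def __init__(self):
        self._by_user = defaultdict(set)  # user -> set of application ids
        self._entries = []                # (start_time, app_id), sorted lazily
        self._starts = []                 # start times, parallel to _entries
        self._dirty = False

    def add(self, app_id, user, start_time):
        """Record one application without reading its history file."""
        self._by_user[user].add(app_id)
        self._entries.append((start_time, app_id))
        self._dirty = True

    def _refresh(self):
        # Re-sort only when something was added since the last lookup.
        if self._dirty:
            self._entries.sort()
            self._starts = [start for start, _ in self._entries]
            self._dirty = False

    def lookup(self, user=None, started_from=None, started_to=None):
        """Return ids of applications matching every filter that is set.

        The time bounds are inclusive; a filter left as None is ignored.
        """
        self._refresh()
        lo = 0 if started_from is None else bisect_left(self._starts, started_from)
        hi = (len(self._starts) if started_to is None
              else bisect_right(self._starts, started_to))
        ids = {app_id for _, app_id in self._entries[lo:hi]}
        if user is not None:
            ids &= self._by_user.get(user, set())
        return ids


def read_matching(index, read_history_file, **filters):
    """Read only the history files whose ids the index returns.

    read_history_file is an assumed callback that loads the full history
    of one application; it is never called for applications that miss.
    """
    return {app_id: read_history_file(app_id)
            for app_id in sorted(index.lookup(**filters))}
```

With this shape, a query such as `read_matching(index, reader, user="alice", started_from=150)` touches only the files of the applications the index returns. How such an index would be persisted and kept in sync with the files on the file system is left open here, as it is in the comment.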