[ https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527905#comment-13527905 ]
Michael McCandless commented on LUCENE-4598:
--------------------------------------------

+1, looks great! And it looks like it's a bit faster than before:

{noformat}
Task        QPS base  StdDev   QPS comp  StdDev   Pct diff
LowTerm        28.35  (1.4%)      29.42  (0.8%)   3.8% ( 1% - 6%)
HighTerm        2.46  (0.6%)       2.57  (0.5%)   4.8% ( 3% - 5%)
MedTerm        13.09  (1.4%)      13.92  (0.5%)   6.4% ( 4% - 8%)
{noformat}

I think we could speed things up more if this code "owned" the iteration, e.g. with some sort of "bulk accumulate" method, rather than StandardFacetAccumulator going through CategoryListIterator down to PayloadIterator, per hit. This way it could first iterate by segment (on the outer loop), then, inside, iterate on all docs in that segment, etc.

But save that for another day ...

> Change PayloadIterator to not use top-level reader API
> ------------------------------------------------------
>
>          Key: LUCENE-4598
>          URL: https://issues.apache.org/jira/browse/LUCENE-4598
>      Project: Lucene - Core
>   Issue Type: Improvement
>   Components: modules/facet
>     Reporter: Michael McCandless
>  Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in
> the result set.
> I think we should get some speedup if we go segment by segment instead ...
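
To make the "bulk accumulate" idea from the comment concrete (segment on the outer loop, that segment's hits on the inner loop), here is a rough sketch in plain Java. It deliberately uses no Lucene classes: Segment, ordsPerDoc, and bulkAccumulate are hypothetical names invented for illustration, not the facet module's API and not what the attached patches do. It assumes the segments are contiguous and ordered by docBase, and that the hits are global docIDs sorted ascending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SegmentBulkSketch {

    /** Hypothetical stand-in for one index segment (not a Lucene class). */
    static final class Segment {
        final int docBase;        // global docID of this segment's first doc
        final int[][] ordsPerDoc; // ordsPerDoc[localDoc] = facet ordinals of that doc

        Segment(int docBase, int[][] ordsPerDoc) {
            this.docBase = docBase;
            this.ordsPerDoc = ordsPerDoc;
        }

        int maxDoc() {
            return ordsPerDoc.length;
        }
    }

    /**
     * Counts facet ordinals over all hits in a single pass.
     * Outer loop: segments. Inner loop: the hits that fall inside that segment.
     * Assumes contiguous segments in docBase order and ascending hits.
     */
    static int[] bulkAccumulate(List<Segment> segments, int[] sortedHits, int numOrds) {
        int[] counts = new int[numOrds];
        int hit = 0;
        for (Segment seg : segments) {
            int end = seg.docBase + seg.maxDoc(); // exclusive upper bound, global docID
            while (hit < sortedHits.length && sortedHits[hit] < end) {
                int localDoc = sortedHits[hit] - seg.docBase; // rebase once per hit
                for (int ord : seg.ordsPerDoc[localDoc]) {
                    counts[ord]++;
                }
                hit++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(0, new int[][] {{0, 1}, {1}}));  // global docs 0 and 1
        segments.add(new Segment(2, new int[][] {{2}, {0, 2}}));  // global docs 2 and 3
        int[] counts = bulkAccumulate(segments, new int[] {0, 2, 3}, 3);
        System.out.println(Arrays.toString(counts)); // prints [2, 1, 2]
    }
}
```

The point of the shape is that the segment is resolved once per segment rather than once per hit, so the per-hit work shrinks to a subtraction and a local lookup. Whether that translates into a measurable gain in the real code would need the same kind of benchmark as the table in the comment.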