[
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527905#comment-13527905
]
Michael McCandless commented on LUCENE-4598:
--------------------------------------------
+1, looks great!
And it looks like it's a bit faster than before:
{noformat}
Task        QPS base  StdDev  QPS comp  StdDev  Pct diff
LowTerm        28.35  (1.4%)     29.42  (0.8%)  3.8% (1% - 6%)
HighTerm        2.46  (0.6%)      2.57  (0.5%)  4.8% (3% - 5%)
MedTerm        13.09  (1.4%)     13.92  (0.5%)  6.4% (4% - 8%)
{noformat}
I think we could speed things up more if this code "owned" the iteration, e.g.
with some sort of "bulk accumulate" method, rather than having
StandardFacetAccumulator go through CategoryListIterator down to
PayloadIterator for every hit. That way it could iterate by segment in the
outer loop and then, inside, over all docs in that segment, etc. But save
that for another day ...
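A minimal sketch of what such a segment-first "bulk accumulate" loop might
look like. It uses plain Java and does not touch the Lucene API: the Segment
class, bulkAccumulate, and the in-memory ordsByLocalDoc array are hypothetical
stand-ins for a per-segment reader and its facet payloads. It assumes segments
are contiguous, ordered by docBase, and that hits arrive sorted by global docID.

```java
import java.util.Arrays;

class SegmentBulkAccumulate {

    /** Hypothetical stand-in for one segment: its docBase plus facet ords per local doc. */
    static final class Segment {
        final int docBase;            // global docID of this segment's first document
        final int[][] ordsByLocalDoc; // facet ordinals, indexed by segment-local docID

        Segment(int docBase, int[][] ordsByLocalDoc) {
            this.docBase = docBase;
            this.ordsByLocalDoc = ordsByLocalDoc;
        }

        int maxDoc() {
            return ordsByLocalDoc.length;
        }
    }

    /**
     * Counts facet ordinals over all hits, visiting each segment once.
     * Assumes sortedHits holds ascending global docIDs and segments are
     * contiguous and ordered by docBase.
     */
    static int[] bulkAccumulate(Segment[] segments, int[] sortedHits, int numOrds) {
        int[] counts = new int[numOrds];
        int h = 0;
        for (Segment seg : segments) {                 // outer loop: by segment
            int end = seg.docBase + seg.maxDoc();
            while (h < sortedHits.length && sortedHits[h] < end) {
                int localDoc = sortedHits[h] - seg.docBase; // global -> segment-local docID
                for (int ord : seg.ordsByLocalDoc[localDoc]) {
                    counts[ord]++;                     // inner loop: ords of this doc
                }
                h++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Segment[] segments = {
            new Segment(0, new int[][] {{0, 1}, {1}, {2}}),
            new Segment(3, new int[][] {{0}, {}, {1, 2}})
        };
        int[] hits = {0, 1, 3, 5};
        // prints [2, 3, 1]
        System.out.println(Arrays.toString(bulkAccumulate(segments, hits, 3)));
    }
}
```

The point of the shape is that the global-to-local docID mapping is resolved
once per segment instead of once per hit; in the real module the inner loop
would read payloads from that segment's enum rather than from an array.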
> Change PayloadIterator to not use top-level reader API
> ------------------------------------------------------
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in
> the result set.
> I think we should get some speedup if we go segment by segment instead ...