[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

Michael McCandless (JIRA) Mon, 10 Dec 2012 04:27:26 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527905#comment-13527905
 ]


Michael McCandless commented on LUCENE-4598:
--------------------------------------------

+1, looks great!

And it looks like it's a bit faster than before:

{noformat}
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
                 LowTerm       28.35      (1.4%)       29.42      (0.8%)    3.8% (   1% -    6%)
                HighTerm        2.46      (0.6%)        2.57      (0.5%)    4.8% (   3% -    5%)
                 MedTerm       13.09      (1.4%)       13.92      (0.5%)    6.4% (   4% -    8%)
{noformat}

I think we could speed things up more if this code "owned" the iteration, e.g. 
with some sort of "bulk accumulate" method, rather than having 
StandardFacetAccumulator go through CategoryListIterator down to 
PayloadIterator for every hit. That way it could iterate by segment in the 
outer loop and then, inside that loop, over all the docs in that segment. But 
save that for another day ...
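
To make the loop shape concrete, here is a rough sketch of what such a "bulk accumulate" could look like. It is not the facet module's code and it uses no Lucene APIs: Segment, ordsPerDoc and accumulate are hypothetical names, and plain int arrays stand in for the payloads that store the facet ords. It also assumes the hits arrive as ascending top-level docIDs and that the segments are listed in docBase order.

```java
/** Illustration only: hypothetical stand-ins, not Lucene or facet-module classes. */
final class SegmentWiseAccumulator {

    /** One segment: where its docIDs start globally, plus the facet ords of each local doc. */
    static final class Segment {
        final int docBase;         // top-level docID of this segment's first document
        final int[][] ordsPerDoc;  // ordsPerDoc[localDoc] = facet ords for that doc (assumed layout)

        Segment(int docBase, int[][] ordsPerDoc) {
            this.docBase = docBase;
            this.ordsPerDoc = ordsPerDoc;
        }

        int maxDoc() {
            return ordsPerDoc.length;
        }
    }

    /**
     * Counts facet ords over all hits. sortedHits must hold ascending top-level docIDs;
     * segments must be ordered by docBase and cover the docID space contiguously.
     */
    static int[] accumulate(Segment[] segments, int[] sortedHits, int numOrds) {
        int[] counts = new int[numOrds];
        int hit = 0;
        for (Segment seg : segments) {                 // outer loop: one pass per segment
            int end = seg.docBase + seg.maxDoc();      // first top-level docID past this segment
            while (hit < sortedHits.length && sortedHits[hit] < end) {
                int localDoc = sortedHits[hit] - seg.docBase;  // top-level -> segment-local docID
                for (int ord : seg.ordsPerDoc[localDoc]) {
                    counts[ord]++;
                }
                hit++;
            }
        }
        return counts;
    }
}
```

The point of the shape is that the segment is resolved once per segment rather than once per hit, so the inner loop only deals with segment-local docIDs. Whether that pays off in practice would have to be measured.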
                
> Change PayloadIterator to not use top-level reader API
> ------------------------------------------------------
>
>                 Key: LUCENE-4598
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4598
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

