[ 
https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558303#comment-13558303
 ] 

Michael McCandless commented on LUCENE-4600:
--------------------------------------------

I re-ran CountingFacetsCollector (base) vs 
PostCollectionCountingFacetsCollector (comp):
{noformat}
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
                HighTerm       30.15      (1.4%)       30.97      (1.1%)    2.7% (   0% -    5%)
                 LowTerm      153.06      (0.4%)      158.26      (0.7%)    3.4% (   2% -    4%)
                 MedTerm       50.69      (0.9%)       52.29      (0.9%)    3.2% (   1% -    5%)
                PKLookup      238.04      (1.3%)      236.79      (1.8%)   -0.5% (  -3% -    2%)
{noformat}
I think the cutover away from DISI (DocIdSetIterator) made it faster. It's 
surprising that this approach (allocate a bit set, set the bits, then revisit 
the set bits at the end) is faster than count-as-you-go.
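
To make the comparison concrete, here is a minimal sketch of the two strategies. It is not the code of either collector: the class name, the method names, and the int[][] of per-document category ordinals are hypothetical stand-ins, and java.util.BitSet is used in place of whatever bit set the patch uses.

```java
import java.util.BitSet;

public class FacetCountSketch {

    // Count-as-you-go: bump the per-ordinal counters as each hit is collected.
    // ordinalsByDoc is a hypothetical stand-in for the per-document category ordinals.
    static int[] countAsYouGo(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    // Post-collection: only set a bit per hit while collecting, then make a
    // second pass over the set bits at the end and do the aggregation there.
    static int[] countPostCollection(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        BitSet bits = new BitSet(ordinalsByDoc.length);
        for (int doc : hits) {
            bits.set(doc);
        }
        int[] counts = new int[numOrdinals];
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }
}
```

Both methods return the same counts; they differ only in when the aggregation happens and in the transient bit set the second one allocates.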

                
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, 
> LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, LUCENE-4600.patch, 
> LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with 
> a float[] to hold scores as well, if you will aggregate them) during 
> collection, and then at the end when you call getFacetsResults(), it makes a 
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't 
> have to tie up transient RAM (fairly small for the bit set but possibly big 
> for the float[]).

