[jira] [Commented] (LUCENE-5425) Make creation of FixedBitSet in FacetsCollector overridable

Michael McCandless (JIRA) Tue, 04 Feb 2014 09:36:16 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890880#comment-13890880 ]


Michael McCandless commented on LUCENE-5425:
--------------------------------------------

I compared trunk to the 2nd patch here, on wikimediumall (33.3 M docs), Date 
faceting, using DirectDVFormat for facets:

{noformat}
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
                HighTerm       21.28      (2.8%)       18.96      (1.2%)  -10.9% ( -14% -   -7%)
                 LowTerm       83.98      (3.6%)       75.31      (1.3%)  -10.3% ( -14% -   -5%)
                 MedTerm       30.96      (2.6%)       27.86      (1.2%)  -10.0% ( -13% -   -6%)
            OrHighNotLow       12.47      (3.8%)       11.23      (3.6%)  -10.0% ( -16% -   -2%)
               OrHighMed       13.39      (3.6%)       12.07      (3.4%)   -9.9% ( -16% -   -2%)
               OrHighLow        9.94      (4.0%)        8.97      (3.7%)   -9.8% ( -16% -   -2%)
            OrHighNotMed       15.52      (3.4%)       14.13      (3.3%)   -8.9% ( -15% -   -2%)
           OrHighNotHigh        7.16      (4.4%)        6.58      (3.8%)   -8.1% ( -15% -    0%)
              OrHighHigh        4.81      (4.3%)        4.42      (3.9%)   -8.1% ( -15% -    0%)
           OrNotHighHigh        5.83      (4.7%)        5.49      (4.3%)   -5.9% ( -14% -    3%)
              AndHighLow      292.07      (1.5%)      274.97      (2.2%)   -5.9% (  -9% -   -2%)
               MedPhrase      143.35      (4.7%)      135.01      (4.5%)   -5.8% ( -14% -    3%)
            HighSpanNear        6.52      (4.8%)        6.23      (4.2%)   -4.5% ( -12% -    4%)
              HighPhrase        3.57      (5.9%)        3.42      (5.8%)   -4.4% ( -15% -    7%)
             MedSpanNear       26.30      (3.1%)       25.55      (2.7%)   -2.8% (  -8% -    3%)
              AndHighMed       29.54      (1.6%)       28.81      (1.5%)   -2.5% (  -5% -    0%)
             AndHighHigh       23.98      (1.5%)       23.41      (1.4%)   -2.4% (  -5% -    0%)
            OrNotHighMed       18.00      (5.6%)       17.59      (4.5%)   -2.3% ( -11% -    8%)
         LowSloppyPhrase       37.65      (1.9%)       36.84      (1.6%)   -2.2% (  -5% -    1%)
               LowPhrase       11.98      (2.0%)       11.76      (2.3%)   -1.8% (  -6% -    2%)
             LowSpanNear        9.57      (3.0%)        9.39      (2.5%)   -1.8% (  -7% -    3%)
                 Prefix3       75.35      (1.4%)       74.68      (2.3%)   -0.9% (  -4% -    2%)
        HighSloppyPhrase        3.13      (6.9%)        3.11      (6.3%)   -0.9% ( -13% -   13%)
                  IntNRQ        4.12      (2.6%)        4.08      (4.2%)   -0.8% (  -7% -    6%)
         MedSloppyPhrase        3.25      (3.9%)        3.22      (3.7%)   -0.7% (  -8% -    7%)
                Wildcard       17.39      (2.8%)       17.39      (2.9%)    0.0% (  -5% -    5%)
            OrNotHighLow       23.17      (6.9%)       23.35      (5.8%)    0.8% ( -11% -   14%)
                  Fuzzy1       63.79      (1.8%)       64.51      (1.8%)    1.1% (  -2% -    4%)
                  Fuzzy2       44.03      (2.1%)       44.71      (2.1%)    1.5% (  -2% -    5%)
                 Respell       46.73      (2.9%)       47.46      (2.8%)    1.6% (  -3% -    7%)
{noformat}

Looks like there is some penalty for the added abstraction, roughly 8-11% on 
the Term and OrHigh* tasks ... but I agree w/ Rob: we can just have the 
common-case Facets impl (FastTaxonomyFacetCounts) specialize for the normal 
case when it's a FixedBitSet we are iterating over ...
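
A minimal sketch of that specialization idea, not the actual Lucene code. 
DocBits, FixedBits and countMatches are hypothetical names, and 
java.util.BitSet stands in for FixedBitSet so the example runs on its own. 
The loop checks for the common concrete type and iterates it directly, and 
only falls back to the abstract interface otherwise:

```java
import java.util.BitSet;

public class SpecializedCounting {

    /** Hypothetical abstraction over the set of matching doc IDs. */
    public interface DocBits {
        /** Returns the first matching doc ID >= from, or -1 if there is none. */
        int nextDoc(int from);
    }

    /** Hypothetical common-case impl; java.util.BitSet stands in for FixedBitSet. */
    public static final class FixedBits implements DocBits {
        final BitSet bits;

        public FixedBits(BitSet bits) {
            this.bits = bits;
        }

        @Override
        public int nextDoc(int from) {
            return bits.nextSetBit(from);
        }
    }

    /** Counts matching docs, with a fast path for the common concrete type. */
    public static int countMatches(DocBits docs) {
        int count = 0;
        if (docs instanceof FixedBits) {
            // Fast path: iterate the concrete bit set, bypassing the interface call.
            BitSet bits = ((FixedBits) docs).bits;
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                count++;
            }
        } else {
            // General path: any other implementation goes through the abstraction.
            for (int doc = docs.nextDoc(0); doc >= 0; doc = docs.nextDoc(doc + 1)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(16);
        bits.set(1);
        bits.set(5);
        bits.set(9);
        System.out.println(countMatches(new FixedBits(bits)));
    }
}
```

Whether the fast path actually recovers the lost QPS would need to be 
re-measured with the same benchmark.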

> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
>                 Key: LUCENE-5425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5425
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 4.6
>            Reporter: John Wang
>         Attachments: facetscollector.patch, facetscollector.patch, 
> fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query. 
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits 
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining 
> the current behavior.
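
The request in the description can be sketched roughly as below. This is an 
illustration only, not the attached patch: Collector, createBits and 
ReusingCollector are hypothetical names, and java.util.BitSet stands in for 
FixedBitSet. The base class keeps the current behavior (a fresh bit set per 
call), and a subclass overrides the allocation to reuse one instance:

```java
import java.util.BitSet;

public class OverridableAllocation {

    /** Hypothetical collector; java.util.BitSet stands in for FixedBitSet. */
    public static class Collector {

        /** Default behavior: allocate a fresh bit set sized to maxDoc. */
        protected BitSet createBits(int maxDoc) {
            return new BitSet(maxDoc);
        }

        /** Marks the given doc IDs in a bit set obtained from createBits. */
        public BitSet collect(int maxDoc, int... matchingDocs) {
            BitSet bits = createBits(maxDoc);
            for (int doc : matchingDocs) {
                bits.set(doc);
            }
            return bits;
        }
    }

    /** Hypothetical subclass that reuses one bit set to avoid per-query garbage. */
    public static class ReusingCollector extends Collector {
        private BitSet cached;

        @Override
        protected BitSet createBits(int maxDoc) {
            if (cached == null) {
                cached = new BitSet(maxDoc);
            } else {
                cached.clear(); // wipe the previous query's matches
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Collector collector = new ReusingCollector();
        BitSet first = collector.collect(100, 3, 7);
        BitSet second = collector.collect(100, 5);
        System.out.println(first == second); // same instance is reused
    }
}
```

Reuse like this is only safe if the caller is done with the previous result 
and the collector is not shared across threads.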



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
