64, etc.)

Michael McCandless (JIRA) Sun, 05 Aug 2012 16:34:05 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428936#comment-13428936 ]


Michael McCandless commented on LUCENE-3892:
--------------------------------------------

I just committed an optimization to BlockPF's DocsEnum.advance that
inlines the scanning step (D&PEnum and EverythingEnum still need the
same change):

{noformat}
                Task    QPS base StdDev base     QPS for  StdDev for      Pct diff
              IntNRQ       12.46        1.45       11.60        0.04  -16% -    5%
            Wildcard       54.36        2.75       52.72        0.38   -8% -    2%
             Prefix3       85.43        4.97       83.08        0.47   -8% -    3%
              Fuzzy2       63.86        2.13       62.44        1.79   -8% -    4%
             Respell       62.75        1.52       61.42        2.02   -7% -    3%
              Fuzzy1       75.68        1.65       74.69        1.44   -5% -    2%
         LowSpanNear        9.24        0.20        9.13        0.19   -5% -    3%
            PKLookup      192.89        2.91      190.66        2.43   -3% -    1%
        HighSpanNear        1.71        0.05        1.69        0.05   -6% -    4%
         MedSpanNear        4.80        0.11        4.76        0.12   -5% -    4%
           MedPhrase       12.57        0.27       12.56        0.21   -3% -    3%
     MedSloppyPhrase        6.57        0.11        6.56        0.11   -3% -    3%
           LowPhrase       21.55        0.35       21.55        0.28   -2% -    2%
     LowSloppyPhrase        7.25        0.16        7.28        0.12   -3% -    4%
          HighPhrase        1.81        0.11        1.82        0.10  -10% -   13%
    HighSloppyPhrase        1.94        0.10        1.96        0.05   -6% -    9%
             LowTerm      512.53        5.66      518.31        2.30    0% -    2%
             MedTerm      196.09        4.68      198.76        0.30   -1% -    3%
            HighTerm       35.53        0.95       36.11        0.03   -1% -    4%
           OrHighMed       23.34        0.83       23.85        0.70   -4% -    9%
           OrHighLow       26.91        0.98       27.53        0.82   -4% -    9%
          OrHighHigh       11.27        0.41       11.53        0.34   -4% -    9%
         AndHighHigh       21.24        0.05       23.79        0.13   11% -   12%
          AndHighLow      553.19        8.47      621.35        4.01    9% -   14%
          AndHighMed       57.45        0.13       67.78        0.70   16% -   19%
{noformat}
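
For readers who have not looked at the patch, here is a rough sketch of what "inlining the scanning step" in advance can mean. It is not the committed BlockPF code: the class InlinedAdvanceSketch, the tiny BLOCK_SIZE, the plain sorted int[] standing in for decoded postings, and the skipBlocks helper standing in for skip data are all assumptions made for illustration. Both advance variants return the same document; the inlined one simply loops over the buffer directly instead of calling nextDoc() once per candidate.

```java
/** Hypothetical, simplified doc-ID iterator; not Lucene's BlockPF implementation. */
final class InlinedAdvanceSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int BLOCK_SIZE = 4; // tiny block size, for illustration only

    private final int[] docs; // sorted doc IDs, standing in for decoded postings
    private int pos = -1;     // index of the current doc, -1 before the first call

    InlinedAdvanceSketch(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    int nextDoc() {
        if (pos < docs.length) {
            pos++;
        }
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    /** Baseline: skip whole blocks, then scan by calling nextDoc() repeatedly. */
    int advanceViaNextDoc(int target) {
        skipBlocks(target);
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }

    /** Inlined: same result, but the scan reads the buffer directly in a tight loop. */
    int advanceInlined(int target) {
        skipBlocks(target);
        final int[] d = docs;
        final int n = d.length;
        int i = pos + 1;
        while (i < n && d[i] < target) {
            i++;
        }
        pos = i;
        return i < n ? d[i] : NO_MORE_DOCS;
    }

    /** Jump over full blocks whose last doc is below target (stand-in for skip data). */
    private void skipBlocks(int target) {
        int next = pos + 1;
        int blockEnd = (next / BLOCK_SIZE + 1) * BLOCK_SIZE - 1;
        while (blockEnd < docs.length && docs[blockEnd] < target) {
            next = blockEnd + 1;
            blockEnd += BLOCK_SIZE;
        }
        pos = next - 1;
    }

    public static void main(String[] args) {
        int[] postings = {1, 3, 5, 7, 9, 11, 13, 15, 20, 25};
        InlinedAdvanceSketch it = new InlinedAdvanceSketch(postings);
        System.out.println(it.advanceInlined(10));
        System.out.println(it.advanceInlined(21));
    }
}
```

Conjunction queries call advance heavily, which would be consistent with the AndHigh* rows showing the largest gains above, though the sketch itself makes no performance claim.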

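A note on reading the "Pct diff" column: the message does not say how the range is computed. The sketch below assumes it is the worst-case to best-case change using each mean minus or plus one standard deviation. That formula is an inference from the numbers, not something stated by the benchmark tool here; it reproduces the IntNRQ and AndHighMed rows after truncation, and other rows may differ by a point because the printed QPS values are rounded. The class name PctDiffRange is hypothetical.

```java
/** Hypothetical helper; the formula is an assumption inferred from the table above. */
final class PctDiffRange {
    /** Returns {low, high} percent change, using mean -/+ one standard deviation. */
    static double[] range(double qpsBase, double sdBase, double qpsFor, double sdFor) {
        // Pessimistic case: slowest plausible "for" run against fastest plausible "base" run.
        double low = 100.0 * ((qpsFor - sdFor) / (qpsBase + sdBase) - 1.0);
        // Optimistic case: fastest plausible "for" run against slowest plausible "base" run.
        double high = 100.0 * ((qpsFor + sdFor) / (qpsBase - sdBase) - 1.0);
        return new double[] {low, high};
    }

    public static void main(String[] args) {
        double[] intNrq = range(12.46, 1.45, 11.60, 0.04);     // IntNRQ row
        double[] andHighMed = range(57.45, 0.13, 67.78, 0.70); // AndHighMed row
        System.out.printf("IntNRQ:     %.1f%% to %.1f%%%n", intNrq[0], intNrq[1]);
        System.out.printf("AndHighMed: %.1f%% to %.1f%%%n", andHighMed[0], andHighMed[1]);
    }
}
```

Under this reading, a row whose range straddles zero (most of the table) is within noise, while the AndHigh* rows are clear improvements because both ends of the range are positive.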
                
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, 
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch, 
> LUCENE-3892-blockFor&hardcode(base).patch, 
> LUCENE-3892-blockFor&packedecoder(comp).patch, 
> LUCENE-3892-blockFor-with-packedints-decoder.patch, 
> LUCENE-3892-blockFor-with-packedints-decoder.patch, 
> LUCENE-3892-blockFor-with-packedints.patch, 
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch, 
> LUCENE-3892-handle_open_files.patch, 
> LUCENE-3892-pfor-compress-iterate-numbits.patch, 
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch, 
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch, 
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, 
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.
