[ 
https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3892:
---------------------------------------

    Attachment: LUCENE-3892-non-specialized.patch

I created a non-specialized packed int decoder (i.e. a single method
handles all numBits cases) that decodes directly from byte[].  Baseline
is the current BlockPF (FOR w/ the specialized decoder); comp (the "for"
columns) is the same w/ the patch, using the non-specialized decoder:

{noformat}
                Task    QPS base StdDev base     QPS for  StdDev for      Pct diff
          AndHighMed       69.04        0.77       36.41        1.91  -50% -  -43%
          AndHighLow      649.70       17.03      346.71       18.22  -50% -  -42%
         LowSpanNear        9.88        0.25        5.53        0.06  -45% -  -42%
           MedPhrase       13.25        0.26        7.74        0.07  -43% -  -39%
     LowSloppyPhrase        7.59        0.15        4.54        0.13  -43% -  -37%
           LowPhrase       22.29        0.31       13.77        0.08  -39% -  -36%
         AndHighHigh       23.55        0.12       15.22        0.63  -38% -  -32%
     MedSloppyPhrase        6.88        0.12        4.60        0.16  -36% -  -29%
    HighSloppyPhrase        1.98        0.07        1.38        0.05  -35% -  -25%
            HighTerm       36.11        0.01       25.31        0.87  -32% -  -27%
         MedSpanNear        5.02        0.16        3.56        0.03  -31% -  -26%
             MedTerm      198.76        0.34      142.92        4.34  -30% -  -25%
          HighPhrase        1.83        0.08        1.32        0.02  -31% -  -23%
           OrHighLow       27.32        1.10       20.55        0.54  -29% -  -19%
           OrHighMed       23.65        0.93       17.83        0.44  -29% -  -19%
          OrHighHigh       11.42        0.46        8.72        0.20  -28% -  -18%
        HighSpanNear        1.74        0.06        1.38        0.01  -24% -  -17%
              IntNRQ       11.61        0.01        9.26        0.02  -20% -  -20%
             LowTerm      513.60        2.26      411.60        7.65  -21% -  -18%
             Prefix3       82.36        1.05       67.48        1.29  -20% -  -15%
            Wildcard       52.63        0.44       43.45        0.81  -19% -  -15%
              Fuzzy1       74.74        1.02       70.03        0.80   -8% -   -3%
            PKLookup      192.60        3.94      191.87        2.07   -3% -    2%
              Fuzzy2       62.50        1.29       62.74        1.10   -3% -    4%
             Respell       61.69        1.04       62.79        0.84   -1% -    4%
{noformat}

So... it's clear that all of our specializing does help!
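
For anyone unfamiliar with the distinction, here is a minimal sketch of
what a non-specialized decoder can look like: one loop that takes
numBits as a runtime parameter and pulls values straight out of a
byte[].  This is not the code from LUCENE-3892-non-specialized.patch;
the class name GenericPackedDecoder, the encode helper, and the
big-endian bit layout are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the LUCENE-3892 patches.
final class GenericPackedDecoder {

    /** Demo helper: packs each value into numBits bits, most significant bit first. */
    static byte[] encode(int[] values, int numBits) {
        int totalBits = values.length * numBits;
        byte[] out = new byte[(totalBits + 7) / 8];
        int bitPos = 0;
        for (int v : values) {
            for (int b = numBits - 1; b >= 0; b--, bitPos++) {
                if (((v >>> b) & 1) != 0) {
                    out[bitPos >>> 3] |= (byte) (0x80 >>> (bitPos & 7));
                }
            }
        }
        return out;
    }

    /**
     * Single method for every numBits in 1..32: refill a long bit buffer
     * one byte at a time, then shift and mask out the next value.
     */
    static void decode(byte[] in, int numBits, int[] out, int count) {
        final long mask = (1L << numBits) - 1;
        long buffer = 0;       // unread bits sit in the low bitsInBuffer bits
        int bitsInBuffer = 0;
        int inPos = 0;
        for (int i = 0; i < count; i++) {
            while (bitsInBuffer < numBits) {
                buffer = (buffer << 8) | (in[inPos++] & 0xFFL);
                bitsInBuffer += 8;
            }
            bitsInBuffer -= numBits;
            out[i] = (int) ((buffer >>> bitsInBuffer) & mask);
        }
    }

    public static void main(String[] args) {
        int[] values = {5, 0, 7, 3, 1, 6};
        byte[] packed = encode(values, 3);
        int[] decoded = new int[values.length];
        decode(packed, 3, decoded, values.length);
        System.out.println(Arrays.toString(decoded));
    }
}
```

The shift distance and mask here are computed at runtime and the refill
loop branches on numBits for every value.  A specialized decoder
hardcodes those per numBits, which is presumably where the difference
in the table above comes from.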

                
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, 
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch, 
> LUCENE-3892-blockFor&hardcode(base).patch, 
> LUCENE-3892-blockFor&packedecoder(comp).patch, 
> LUCENE-3892-blockFor-with-packedints-decoder.patch, 
> LUCENE-3892-blockFor-with-packedints-decoder.patch, 
> LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-bulkVInt.patch, 
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch, 
> LUCENE-3892-handle_open_files.patch, LUCENE-3892-non-specialized.patch, 
> LUCENE-3892-pfor-compress-iterate-numbits.patch, 
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch, 
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch, 
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, 
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.

