[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Han Jiang updated LUCENE-3892:
------------------------------
Attachment: LUCENE-3892-blockFor&packedecoder(comp).patch
LUCENE-3892-blockFor&hardcode(base).patch
Previous experiments showed a net loss with the PackedInts API; however, there
were slight differences between the two setups, e.g. the all-values-the-same
case was not handled equally. I suppose these two patches should make the
comparison fair enough.
Base: BlockForPF + hardwired decoder
Comp: BlockForPF + PackedInts.Decoder
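
To illustrate the difference being measured, here is a minimal sketch of the two decoding styles. It is not the code from either patch and does not use the real PackedInts.Decoder API: the class name, the method names, and the little-endian bit layout inside each long are all assumptions made for illustration. decode8 stands in for a "hardwired" decoder (one unrolled routine per bit width), decodeGeneric for a single loop that handles any width, and decodeAllSame for the all-values-the-same special case mentioned above.

```java
import java.util.Arrays;

// Hypothetical sketch; names and bit layout are illustrative assumptions,
// not Lucene's actual BlockForPF or PackedInts code.
final class ForDecodeSketch {

    /** Packs non-negative values, bitsPerValue bits each, lowest bits first. */
    static long[] encode(int[] values, int bitsPerValue) {
        long[] packed = new long[(int) (((long) values.length * bitsPerValue + 63) >>> 6)];
        final long mask = (1L << bitsPerValue) - 1;
        long bitPos = 0;
        for (int v : values) {
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long bits = v & mask;
            packed[word] |= bits << shift;
            if (64 - shift < bitsPerValue) {          // value straddles two longs
                packed[word + 1] |= bits >>> (64 - shift);
            }
            bitPos += bitsPerValue;
        }
        return packed;
    }

    /** Generic path: one loop handles any bit width from 1 to 32. */
    static void decodeGeneric(long[] packed, int bitsPerValue, int[] out) {
        final long mask = (1L << bitsPerValue) - 1;
        long bitPos = 0;
        for (int i = 0; i < out.length; i++) {
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = packed[word] >>> shift;
            int available = 64 - shift;
            if (available < bitsPerValue) {           // pull the remaining high bits
                v |= packed[word + 1] << available;
            }
            out[i] = (int) (v & mask);
            bitPos += bitsPerValue;
        }
    }

    /** Hardwired path for exactly 8 bits per value; out.length must be a multiple of 8. */
    static void decode8(long[] packed, int[] out) {
        for (int w = 0, o = 0; o < out.length; w++, o += 8) {
            final long b = packed[w];
            out[o]     = (int) (b & 0xFF);
            out[o + 1] = (int) ((b >>> 8) & 0xFF);
            out[o + 2] = (int) ((b >>> 16) & 0xFF);
            out[o + 3] = (int) ((b >>> 24) & 0xFF);
            out[o + 4] = (int) ((b >>> 32) & 0xFF);
            out[o + 5] = (int) ((b >>> 40) & 0xFF);
            out[o + 6] = (int) ((b >>> 48) & 0xFF);
            out[o + 7] = (int) (b >>> 56);
        }
    }

    /** All-values-the-same case: nothing to unpack, just fill the block. */
    static void decodeAllSame(int value, int[] out) {
        Arrays.fill(out, value);
    }

    public static void main(String[] args) {
        int[] block = new int[128];
        for (int i = 0; i < block.length; i++) {
            block[i] = (i * 37) & 0xFF;
        }
        long[] packed = encode(block, 8);
        int[] viaGeneric = new int[128];
        int[] viaHardwired = new int[128];
        decodeGeneric(packed, 8, viaGeneric);
        decode8(packed, viaHardwired);
        System.out.println(Arrays.equals(viaGeneric, viaHardwired));
    }
}
```

Both paths produce the same block; the benchmark below is about how much the generic path costs per query, and it only gives a fair answer if both sides treat special cases such as decodeAllSame identically.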
{noformat}
Task            QPS base  StdDev base  QPS comp  StdDev comp     Pct diff
AndHighHigh        25.66         0.31     22.61         1.21  -17% -  -6%
AndHighMed         74.17         1.45     59.48         3.62  -26% - -13%
Fuzzy1             95.60         1.51     96.06         2.22   -3% -   4%
Fuzzy2             28.67         0.50     28.51         0.75   -4% -   3%
IntNRQ             33.31         0.60     30.73         1.51  -13% -  -1%
OrHighHigh         17.58         0.59     16.22         1.18  -17% -   2%
OrHighMed          34.42         0.93     32.14         2.33  -15% -   2%
PKLookup          217.08         4.25    213.76         1.37   -4% -   1%
Phrase              6.10         0.12      5.34         0.07  -15% -  -9%
Prefix3            77.27         1.26     70.42         2.87  -13% -  -3%
Respell            92.91         1.34     92.61         1.83   -3% -   3%
SloppyPhrase        5.35         0.16      5.00         0.29  -14% -   1%
SpanNear            6.05         0.15      5.47         0.07  -12% -  -6%
Term               37.62         0.32     33.08         1.70  -17% -  -6%
TermBGroup1M       17.45         0.64     16.40         0.73  -13% -   1%
TermBGroup1M1P     25.20         0.69     23.47         1.24  -14% -   0%
TermGroup1M        18.53         0.65     17.40         0.76  -13% -   1%
Wildcard           44.39         0.49     40.51         1.69  -13% -  -3%
{noformat}
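
For readers wondering how to read the "Pct diff" range: the message does not say how it is computed, but the rows are consistent with a worst case that compares (comp - stddev) against (base + stddev) and a best case that compares (comp + stddev) against (base - stddev), each truncated to a whole percent. The sketch below encodes that assumption; the class and method names are hypothetical.

```java
// Hypothetical sketch: the formula is inferred from the table rows and is an
// assumption, not something stated in the message.
final class PctDiffSketch {

    /** Pessimistic bound: comp at its low end versus base at its high end. */
    static int worst(double qpsBase, double sdBase, double qpsComp, double sdComp) {
        double highBase = qpsBase + sdBase;
        return (int) (100.0 * (qpsComp - sdComp - highBase) / highBase);
    }

    /** Optimistic bound: comp at its high end versus base at its low end. */
    static int best(double qpsBase, double sdBase, double qpsComp, double sdComp) {
        double lowBase = qpsBase - sdBase;
        return (int) (100.0 * (qpsComp + sdComp - lowBase) / lowBase);
    }

    public static void main(String[] args) {
        // AndHighHigh row: base 25.66 +/- 0.31, comp 22.61 +/- 1.21
        System.out.println(worst(25.66, 0.31, 22.61, 1.21) + "% - "
                + best(25.66, 0.31, 22.61, 1.21) + "%");   // prints "-17% - -6%"
    }
}
```

Under this reading, a row whose range stays entirely negative (AndHighHigh, AndHighMed, Phrase, SpanNear, Term, and others) is a slowdown beyond the measured noise, while a row whose range crosses zero (Fuzzy1, Respell, PKLookup) is inconclusive.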
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
> Key: LUCENE-3892
> URL: https://issues.apache.org/jira/browse/LUCENE-3892
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Labels: gsoc2012, lucene-gsoc-12
> Fix For: 4.1
>
> Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch,
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.