Replying to dev@ because Jira keeps being unavailable: seems like we should default BlockPostingsFormat's acceptableOverheadRatio to PackedInts.COMPACT.
Mike McCandless

http://blog.mikemccandless.com

On Fri, Aug 10, 2012 at 8:14 AM, Adrien Grand (JIRA) <[email protected]> wrote:
>
> [ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432716#comment-13432716 ]
>
> Adrien Grand commented on LUCENE-3892:
> --------------------------------------
>
> I ran the comparison between acceptableOverheadRatio=PackedInts.COMPACT (0%)
> and PackedInts.DEFAULT (20%) and it seems to be much faster with
> PackedInts.COMPACT:
>
> {noformat}
> base=COMPACT, challenger=DEFAULT
> Task              QPS base  StdDev base   QPS def  StdDev def  Pct diff
> IntNRQ               81.83         5.43     74.14        2.94  -18% -  0%
> HighTerm            146.55        10.34    133.57        9.02  -20% -  4%
> LowPhrase            93.91         1.63     86.90        1.67  -10% - -4%
> MedTerm             824.58        43.48    766.35       38.78  -16% -  3%
> LowSloppyPhrase      83.29         1.99     77.65        1.18  -10% - -3%
> OrHighMed            94.15         5.28     88.34        4.54  -15% -  4%
> OrHighHigh          100.63         5.42     94.57        4.20  -14% -  3%
> OrHighLow           128.62         7.21    120.92        6.07  -15% -  4%
> HighPhrase           13.05         0.45     12.29        0.39  -11% -  0%
> Prefix3             217.06         6.82    205.05        4.62  -10% -  0%
> MedPhrase            27.50         0.97     26.33        0.79  -10% -  2%
> Wildcard            183.20         4.87    175.58        3.89   -8% -  0%
> LowTerm            1763.31        43.24   1693.31       39.29   -8% -  0%
> HighSloppyPhrase     10.05         0.48      9.67        0.40  -11% -  5%
> AndHighHigh         111.59         1.15    107.45        1.66   -6% - -1%
> LowSpanNear          56.16         1.32     54.25        1.01   -7% -  0%
> AndHighMed          423.44         7.40    409.32        5.10   -6% -  0%
> MedSpanNear          33.14         0.91     32.32        0.74   -7% -  2%
> AndHighLow         2177.50        30.79   2134.05       28.64   -4% -  0%
> Fuzzy1               95.34         2.41     93.66        2.32   -6% -  3%
> HighSpanNear          5.28         0.17      5.21        0.11   -6% -  3%
> MedSloppyPhrase      18.41         0.72     18.19        0.70   -8% -  6%
> Fuzzy2               37.73         1.31     37.31        1.14   -7% -  5%
> Respell             109.71         3.09    108.64        2.76   -6% -  4%
> PKLookup            257.32         6.64    260.00        7.15   -4% -  6%
> {noformat}
>
>> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
>> -------------------------------------------------------------------------------------
>>
>>                 Key: LUCENE-3892
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>>             Project: Lucene - Core
>>          Issue Type: Improvement
>>            Reporter: Michael McCandless
>>              Labels: gsoc2012, lucene-gsoc-12
>>             Fix For: 4.1
>>
>>         Attachments: LUCENE-3892-BlockTermScorer.patch,
>>                      LUCENE-3892-blockFor&hardcode(base).patch,
>>                      LUCENE-3892-blockFor&packedecoder(comp).patch,
>>                      LUCENE-3892-blockFor-with-packedints-decoder.patch,
>>                      LUCENE-3892-blockFor-with-packedints-decoder.patch,
>>                      LUCENE-3892-blockFor-with-packedints.patch,
>>                      LUCENE-3892-blockpfor.patch,
>>                      LUCENE-3892-bulkVInt.patch,
>>                      LUCENE-3892-direct-IntBuffer.patch,
>>                      LUCENE-3892-for&pfor-with-javadoc.patch,
>>                      LUCENE-3892-handle_open_files.patch,
>>                      LUCENE-3892-non-specialized.patch,
>>                      LUCENE-3892-pfor-compress-iterate-numbits.patch,
>>                      LUCENE-3892-pfor-compress-slow-estimate.patch,
>>                      LUCENE-3892_for_byte[].patch,
>>                      LUCENE-3892_for_int[].patch,
>>                      LUCENE-3892_for_unfold_method.patch,
>>                      LUCENE-3892_pfor_unfold_method.patch,
>>                      LUCENE-3892_pulsing_support.patch,
>>                      LUCENE-3892_settings.patch,
>>                      LUCENE-3892_settings.patch
>>
>>
>> On the flex branch we explored a number of possible intblock
>> encodings, but for whatever reason never brought them to completion.
>> There are still a number of issues opened with patches in different
>> states.
>> Initial results (based on prototype) were excellent (see
>> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
>> ).
>> I think this would make a good GSoC project.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
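
For anyone reading the quoted benchmark table: the thread does not say how the "Pct diff" range is computed. The sketch below is an inference from the numbers only. It assumes the low end compares the challenger one standard deviation slow against the base one standard deviation fast, the high end does the reverse, and both percentages are truncated toward zero. The function name `pct_diff_range` is made up for this illustration and is not taken from the benchmark tool.

```python
import math


def pct_diff_range(qps_base, sd_base, qps_cmp, sd_cmp):
    """Return (low, high) percent change of the challenger relative to the base.

    Assumption (inferred from the table, not stated in the thread):
    each side is shifted by one standard deviation in the direction that
    makes the challenger look worst (low) or best (high).
    """
    # Challenger at its slowest versus base at its fastest.
    low = (qps_cmp - sd_cmp) / (qps_base + sd_base) - 1.0
    # Challenger at its fastest versus base at its slowest.
    high = (qps_cmp + sd_cmp) / (qps_base - sd_base) - 1.0
    # Truncate toward zero, which is what the whole-number table values suggest.
    return math.trunc(100 * low), math.trunc(100 * high)


# A few rows copied from the table: (QPS base, StdDev base, QPS def, StdDev def).
rows = {
    "IntNRQ": (81.83, 5.43, 74.14, 2.94),
    "HighTerm": (146.55, 10.34, 133.57, 9.02),
    "LowPhrase": (93.91, 1.63, 86.90, 1.67),
    "PKLookup": (257.32, 6.64, 260.00, 7.15),
}

for task, values in rows.items():
    low, high = pct_diff_range(*values)
    print(f"{task:10s} {low}% - {high}%")
```

Under that assumption these four rows come out matching the table (for example -18% to 0% for IntNRQ and -10% to -4% for LowPhrase). A range that stays entirely below zero, as for LowPhrase, LowSloppyPhrase and AndHighHigh, means DEFAULT is slower than COMPACT even after allowing one standard deviation of noise on each side.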
