[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3892:
---------------------------------------

    Attachment: LUCENE-3892-non-specialized.patch

I created a non-specialized (i.e., a single method handles all numBits cases) packed int decoder that decodes directly from byte[].

Baseline is the current BlockPF (FOR w/ specialized decoder); comp is w/ the patch (using the non-specialized decoder):

{noformat}
Task                QPS base  StdDev base   QPS for  StdDev for      Pct diff
AndHighMed             69.04         0.77     36.41        1.91  -50% -  -43%
AndHighLow            649.70        17.03    346.71       18.22  -50% -  -42%
LowSpanNear             9.88         0.25      5.53        0.06  -45% -  -42%
MedPhrase              13.25         0.26      7.74        0.07  -43% -  -39%
LowSloppyPhrase         7.59         0.15      4.54        0.13  -43% -  -37%
LowPhrase              22.29         0.31     13.77        0.08  -39% -  -36%
AndHighHigh            23.55         0.12     15.22        0.63  -38% -  -32%
MedSloppyPhrase         6.88         0.12      4.60        0.16  -36% -  -29%
HighSloppyPhrase        1.98         0.07      1.38        0.05  -35% -  -25%
HighTerm               36.11         0.01     25.31        0.87  -32% -  -27%
MedSpanNear             5.02         0.16      3.56        0.03  -31% -  -26%
MedTerm               198.76         0.34    142.92        4.34  -30% -  -25%
HighPhrase              1.83         0.08      1.32        0.02  -31% -  -23%
OrHighLow              27.32         1.10     20.55        0.54  -29% -  -19%
OrHighMed              23.65         0.93     17.83        0.44  -29% -  -19%
OrHighHigh             11.42         0.46      8.72        0.20  -28% -  -18%
HighSpanNear            1.74         0.06      1.38        0.01  -24% -  -17%
IntNRQ                 11.61         0.01      9.26        0.02  -20% -  -20%
LowTerm               513.60         2.26    411.60        7.65  -21% -  -18%
Prefix3                82.36         1.05     67.48        1.29  -20% -  -15%
Wildcard               52.63         0.44     43.45        0.81  -19% -  -15%
Fuzzy1                 74.74         1.02     70.03        0.80   -8% -   -3%
PKLookup              192.60         3.94    191.87        2.07   -3% -    2%
Fuzzy2                 62.50         1.29     62.74        1.10   -3% -    4%
Respell                61.69         1.04     62.79        0.84   -1% -    4%
{noformat}

So... it's clear that all of our specializing does help!

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints.patch, LUCENE-3892-bulkVInt.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-handle_open_files.patch, LUCENE-3892-non-specialized.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch,
> LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.
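
For readers unfamiliar with the term, a "non-specialized" packed int decoder as described in the comment above is a single loop that handles every numBits value, instead of one hand-unrolled method per bit width. The Java sketch below only illustrates that general technique; it is not the code from LUCENE-3892-non-specialized.patch, and the class name `GenericPackedDecoder`, the methods `encode` and `decode`, and the MSB-first bit layout are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and bit layout are assumptions,
// not taken from the Lucene patch.
public class GenericPackedDecoder {

    /** Packs each value into numBits bits, most significant bit first. */
    static byte[] encode(int[] values, int numBits) {
        byte[] out = new byte[(values.length * numBits + 7) / 8];
        int bitPos = 0;
        for (int v : values) {
            for (int b = numBits - 1; b >= 0; b--, bitPos++) {
                if (((v >>> b) & 1) != 0) {
                    out[bitPos >>> 3] |= (byte) (0x80 >>> (bitPos & 7));
                }
            }
        }
        return out;
    }

    /** One generic method that decodes count values for any numBits in 1..32. */
    static int[] decode(byte[] packed, int numBits, int count) {
        int[] out = new int[count];
        long mask = (1L << numBits) - 1;
        long buffer = 0;       // low bits hold bytes read but not yet consumed
        int bitsInBuffer = 0;  // number of valid low bits in buffer
        int pos = 0;           // next byte to read from packed
        for (int i = 0; i < count; i++) {
            // Refill one byte at a time until a full value is available.
            while (bitsInBuffer < numBits) {
                buffer = (buffer << 8) | (packed[pos++] & 0xFFL);
                bitsInBuffer += 8;
            }
            bitsInBuffer -= numBits;
            out[i] = (int) ((buffer >>> bitsInBuffer) & mask);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {5, 0, 7, 3, 1};
        byte[] packed = encode(values, 3);
        System.out.println(Arrays.toString(decode(packed, 3, values.length)));
    }
}
```

The inner refill loop and the variable shift amounts are what a specialized decoder avoids: with numBits fixed per method, every shift and mask becomes a constant. That is one plausible reason for the gap in the benchmark above, although the comment itself does not break down where the time goes.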