[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430423#comment-13430423 ]
Han Jiang commented on LUCENE-3892:
-----------------------------------

Thanks Adrien! Your code is really clean!

At first glance, I think we should still support the all-values-the-same case? For some applications (like an index with payloads), that might be helpful.

Also, I'm a little confused about your performance test. Did you use BlockPF before r1370179 as the baseline, and compare it with your latest commit?

Here, I tested these two PFs at the latest revision (r1370345).

{noformat}
Task              QPS base  StdDev base  QPS comp  StdDev comp  Pct diff
AndHighHigh         124.53         9.36    100.46         3.31  -27% -  -9%
AndHighLow         2141.08        63.93   1922.73        36.32  -14% -  -5%
AndHighMed          281.48        36.49    218.68        13.10  -35% -  -5%
Fuzzy1               84.33         2.56     83.94         1.67   -5% -   4%
Fuzzy2               30.49         1.13     30.48         0.71   -5% -   6%
HighPhrase            9.08         0.28      7.56         0.20  -21% - -11%
HighSloppyPhrase      5.46         0.21      4.88         0.23  -17% -  -2%
HighSpanNear         10.12         0.21      9.21         0.30  -13% -  -3%
HighTerm            176.52         6.13    146.13         5.43  -22% - -11%
IntNRQ               59.56         1.98     51.05         1.33  -19% -  -9%
LowPhrase            40.02         1.03     32.75         0.37  -21% - -15%
LowSloppyPhrase      59.59         2.85     51.49         1.33  -19% -  -6%
LowSpanNear          73.86         3.17     61.98         1.45  -21% - -10%
LowTerm            1755.38        15.56   1622.61        26.87   -9% -  -5%
MedPhrase            25.99         0.47     21.01         0.17  -21% - -16%
MedSloppyPhrase      30.52         0.89     24.77         0.55  -22% - -14%
MedSpanNear          22.26         0.43     18.73         0.47  -19% - -12%
MedTerm             651.90        18.97    573.34        19.25  -17% -  -6%
OrHighHigh           26.75         0.33     23.53         0.50  -14% -  -9%
OrHighLow           151.69         2.13    134.17         3.19  -14% -  -8%
OrHighMed           102.48         1.48     90.73         2.01  -14% -  -8%
PKLookup            216.59         5.70    215.99         2.99   -4% -   3%
Prefix3             166.00         0.78    145.25         1.29  -13% - -11%
Respell              82.01         3.01     82.80         1.66   -4% -   6%
Wildcard            151.66         2.22    141.14         1.57   -9% -  -4%
{noformat}

Strange that it isn't working well on my computer. The results are similar when I switch from MMapDirectory to NIOFSDirectory.

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch,
>                      LUCENE-3892-blockFor&hardcode(base).patch,
>                      LUCENE-3892-blockFor&packedecoder(comp).patch,
>                      LUCENE-3892-blockFor-with-packedints-decoder.patch,
>                      LUCENE-3892-blockFor-with-packedints-decoder.patch,
>                      LUCENE-3892-blockFor-with-packedints.patch,
>                      LUCENE-3892-direct-IntBuffer.patch,
>                      LUCENE-3892-for&pfor-with-javadoc.patch,
>                      LUCENE-3892-handle_open_files.patch,
>                      LUCENE-3892-pfor-compress-iterate-numbits.patch,
>                      LUCENE-3892-pfor-compress-slow-estimate.patch,
>                      LUCENE-3892_for_byte[].patch,
>                      LUCENE-3892_for_int[].patch,
>                      LUCENE-3892_for_unfold_method.patch,
>                      LUCENE-3892_pfor_unfold_method.patch,
>                      LUCENE-3892_pulsing_support.patch,
>                      LUCENE-3892_settings.patch,
>                      LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.