[ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Han Jiang updated LUCENE-3892:
------------------------------
    Attachment: LUCENE-3892-blockFor-with-packedints.patch

An initial try with PackedInts in the current trunk version. I replaced all the int[] buffers with long[] buffers so we can use the API directly. I don't quite understand the Writer part yet, so for now we have to save each long value one by one. However, it is the Reader part we are concerned with:

{format}
Task              QPS base  StdDev base  QPS packed  StdDev packed  Pct diff
AndHighHigh          29.60         1.56       23.78           0.51  -25% - -13%
AndHighMed           74.68         3.92       53.15           2.31  -35% - -21%
Fuzzy1               88.23         1.21       87.13           1.41   -4% -   1%
Fuzzy2               30.09         0.45       29.47           0.47   -5% -   1%
IntNRQ               41.96         3.88       38.16           2.48  -22% -   6%
OrHighHigh           17.56         0.34       15.45           0.15  -14% -  -9%
OrHighMed            34.71         0.76       30.77           0.53  -14% -  -7%
PKLookup            111.00         1.90      110.52           1.59   -3% -   2%
Phrase                9.03         0.23        7.62           0.41  -22% -  -8%
Prefix3             123.56         8.42      110.94           5.43  -20% -   1%
Respell             102.37         1.11      101.79           1.38   -2% -   1%
SloppyPhrase          3.97         0.19        3.52           0.07  -17% -  -4%
SpanNear              8.24         0.18        7.22           0.25  -17% -  -7%
Term                 45.16         3.15       37.47           2.32  -27% -  -5%
TermBGroup1M         17.19         1.09       15.86           0.77  -17% -   3%
TermBGroup1M1P       23.47         1.66       20.43           1.16  -23% -  -1%
TermGroup1M          19.20         1.14       17.73           0.84  -16% -   2%
Wildcard             42.75         3.27       36.75           1.96  -24% -  -1%
{format}

Maybe we should try PACKED_SINGLE_BLOCK for some special values of numBits, instead of using PACKED all the time?

Thanks to Adrien, we have a more direct API in LUCENE-4239; I'm trying that now.

> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta,
> Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch,
> LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-for&pfor-with-javadoc.patch,
> LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892-for&pfor.patch,
> LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch,
> LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for.patch,
> LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch,
> LUCENE-3892_for_unfold_method.patch, LUCENE-3892_pfor.patch,
> LUCENE-3892_pfor.patch, LUCENE-3892_pfor.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
> LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html
> ).
> I think this would make a good GSoC project.
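
For readers following the PACKED versus PACKED_SINGLE_BLOCK question in the comment above, here is a rough, self-contained sketch of the two layouts as I understand them: a contiguous layout where a value may straddle two longs, and a single-block layout where no value crosses a 64-bit boundary (cheaper to decode, at the cost of some padding bits). This is not Lucene's PackedInts code and not the attached patch; the class and method names (PackedSketch, packContiguous, packSingleBlock, and so on) are hypothetical, and it assumes non-negative values that fit in bitsPerValue bits, with 1 <= bitsPerValue <= 32.

```java
// Illustrative only: hypothetical names, not the Lucene PackedInts API.
final class PackedSketch {
    private PackedSketch() {}

    private static long mask(int bitsPerValue) {
        return (1L << bitsPerValue) - 1;
    }

    /** Contiguous layout: values are laid end to end, so one value may span two longs. */
    static long[] packContiguous(int[] values, int bitsPerValue) {
        long totalBits = (long) values.length * bitsPerValue;
        long[] blocks = new long[(int) ((totalBits + 63) / 64)];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);   // which long holds the first bit
            int shift = (int) (bitPos & 63);    // bit offset inside that long
            long v = values[i] & mask(bitsPerValue);
            blocks[block] |= v << shift;
            int spill = shift + bitsPerValue - 64; // bits that did not fit
            if (spill > 0) {
                blocks[block + 1] |= v >>> (bitsPerValue - spill);
            }
        }
        return blocks;
    }

    static int getContiguous(long[] blocks, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            // Second memory access: fetch the high bits from the next long.
            v |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return (int) (v & mask(bitsPerValue));
    }

    /** Single-block layout: floor(64 / bitsPerValue) values per long, the rest is padding. */
    static long[] packSingleBlock(int[] values, int bitsPerValue) {
        int perBlock = 64 / bitsPerValue;
        long[] blocks = new long[(values.length + perBlock - 1) / perBlock];
        for (int i = 0; i < values.length; i++) {
            int shift = (i % perBlock) * bitsPerValue;
            blocks[i / perBlock] |= (values[i] & mask(bitsPerValue)) << shift;
        }
        return blocks;
    }

    static int getSingleBlock(long[] blocks, int bitsPerValue, int index) {
        int perBlock = 64 / bitsPerValue;
        int shift = (index % perBlock) * bitsPerValue;
        // Always exactly one long is read, with no boundary check.
        return (int) ((blocks[index / perBlock] >>> shift) & mask(bitsPerValue));
    }

    public static void main(String[] args) {
        int[] values = new int[128];
        for (int i = 0; i < values.length; i++) {
            values[i] = (i * 37) % 128; // arbitrary 7-bit test data
        }
        System.out.println("contiguous longs:   " + packContiguous(values, 7).length);
        System.out.println("single-block longs: " + packSingleBlock(values, 7).length);
    }
}
```

The trade-off is only visible for bit widths that do not divide 64: there the single-block layout wastes a few bits per long but never needs the straddling branch on read, which is why it may be worth trying for selected values of numBits rather than everywhere.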