Hi Michael,

You posted a patch at https://issues.apache.org/jira/browse/LUCENE-2723. I am not familiar with applying patches. Do I need to download LUCENE-2723.patch (there are several attachments with this name; do I need the latest one?) and LUCENE-2723_termscorer.patch, and then apply them (patch -p1 < LUCENE-2723.patch)? I have just checked out the latest source code from http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene
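
On the warm-up point in the reply quoted below: here is a minimal sketch of how I understand "warm up the JRE first, then do the real test". The class name WarmupBench, the method names, and the run counts are hypothetical, and deltaDecode is only a stand-in workload (a plain delta decode), not the PFor or NewPFD decoder from integer-array-compress-kit. The idea is just to call the decoder many times untimed so that Hotspot can compile it, and only then start the clock.

```java
import java.util.Random;

public class WarmupBench {

    // Stand-in workload (an assumption, not the real PFor decoder):
    // turns a block of deltas into absolute values and returns the last one.
    static int deltaDecode(int[] deltas, int[] out) {
        int sum = 0;
        for (int i = 0; i < deltas.length; i++) {
            sum += deltas[i];
            out[i] = sum;
        }
        return sum;
    }

    // Runs the workload warmupRuns times without timing it, then returns
    // the mean nanoseconds per call over timedRuns timed calls.
    static long averageNanos(int[] deltas, int warmupRuns, int timedRuns) {
        int[] out = new int[deltas.length];
        long sink = 0; // accumulate results so the JIT cannot drop the calls

        for (int i = 0; i < warmupRuns; i++) {
            sink += deltaDecode(deltas, out);
        }

        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            sink += deltaDecode(deltas, out);
        }
        long elapsed = System.nanoTime() - start;

        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // never true here; keeps sink live
        }
        return elapsed / timedRuns;
    }

    public static void main(String[] args) {
        // int[256] with values between 0 and 100, as in the original test.
        Random rnd = new Random(42);
        int[] deltas = new int[256];
        for (int i = 0; i < deltas.length; i++) {
            deltas[i] = rnd.nextInt(101);
        }

        System.out.println("cold, single call:  "
                + averageNanos(deltas, 0, 1) + " ns");
        System.out.println("after 20000 warmup: "
                + averageNanos(deltas, 20000, 10000) + " ns/call");
    }
}
```

If this is right, the very first decoder I call pays the compilation cost whichever one it is, which would match the numbers I posted. Running it as `java -Xbatch WarmupBench` should show whether the run-to-run variation gets smaller, as suggested.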
2010/12/14 Michael McCandless <luc...@mikemccandless.com>:
> Likely you are seeing the startup cost of hotspot compiling the PFOR code?
>
> Ie, does your test first "warmup" the JRE and then do the real test?
>
> I've also found that running -Xbatch produces more consistent results
> from run to run, however, those results may not be as fast as running
> w/o -Xbatch.
>
> Also, it's better to test on actual data (ie a Lucene index's
> postings), and in the full context of searching, because then we get a
> sense of what speedups a real app will see... micro-benching is nearly
> impossible in Java since Hotspot acts very differently vs the "real"
> test.
>
> Mike
>
> On Tue, Dec 14, 2010 at 2:50 AM, Li Li <fancye...@gmail.com> wrote:
>> Hi
>> I tried to integrate PForDelta into lucene 2.9 but ran into a problem.
>> I use the implementation in
>> http://code.google.com/p/integer-array-compress-kit/
>> It implements a basic PForDelta algorithm and an improved one (which is
>> called NewPForDelta, but there are many bugs and I have fixed them).
>> But compared with VInt and S9, its speed is very slow when decoding
>> only a small number of integer arrays.
>> e.g. when I decoded int[256] arrays whose values are randomly
>> generated between 0 and 100, if I decode just one array, PFor (or
>> NewPFor) is very slow. When it continuously decodes many arrays, such
>> as 10000, it's faster than S9 and VInt.
>> Another strange phenomenon is that when I call the PFor decoder twice,
>> the 2nd time it's faster. Or if I call PFor first and then NewPFor,
>> NewPFor is faster. If I reverse the call sequence, the 2nd decoder
>> called is faster.
>> e.g.
>> ct.testNewPFDCodes(list);
>> ct.testPFor(list);
>> ct.testVInt(list);
>> ct.testS9(list);
>>
>> NewPFD decode: 3614705
>> PForDelta decode: 17320
>> VINT decode: 16483
>> S9 decode: 19835
>>
>> when I call by the following sequence
>>
>> ct.testPFor(list);
>> ct.testNewPFDCodes(list);
>> ct.testVInt(list);
>> ct.testS9(list);
>>
>> PForDelta decode: 3212140
>> NewPFD decode: 19556
>> VINT decode: 16762
>> S9 decode: 16483
>>
>> My implementation is -- group docIDs and termDocFreqs into block
>> which contains 128 integers. when SegmentTermDocs's next method
>> called (or read readNoTf), it decodes a block and save it to a cache.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org