[ https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1410:
---------------------------------------

    Attachment: LUCENE-1410.patch

Thanks for the new PForDelta impl! Now we are swimming in int encoders... :)

I attached a new patch to get this patch working on the bulkpostings branch (https://svn.apache.org/repos/asf/lucene/dev/branches/bulkpostings). I also swapped in assert / throw exception instead of System.out.println in some places. All Lucene core tests pass w/ this codec.

Some notes/questions:

* The compress looks somewhat costly? You're allocating several new int[], and the search for the best bit size also looks costly (though, so does the search in the other prototype pfor we have, committed on the branch).
* Decompress also allocates several new int[]. I think we have to reuse these int[] buffers and minimize copying to get good perf...
* The compressor was assuming that the compressed ints would always fit inside the input buffer, but in my case anyway this was not true. I'm using a block size of 128 and in one case it wanted 136 ints in the output, which seems odd since it should just leave the ints uncompressed in that case and use only 129 ints (+1 int for the header). I changed the code to just return the compressed buffer instead of trying to copy over the input buffer.
* It's nice that this new pfor impl handles all 0s, but Lucene will never hit this (at least today; if we subtracted 1 from our freqs then we might). So we can save some CPU by not checking if bits==0. We can also avoid null checks, checking for input length 0, etc. -- we can trade this safety for perf. Such checks can be done as asserts instead.
* How come only certain bit sizes are valid? EG 17, 18, 19 all seem to round up to 20?
* How come bits cannot go up to 31? Or maybe you just use a full int if it's over 28? Seems like a good idea...
* It's neat that you separately encode exception positions & values (high bits), leaving the low bits in the slot. Does this give better perf than linking them together?
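If I'm reading the patch right, the "separate exception positions & values" scheme in the last bullet amounts to something like the toy sketch below. This is my own made-up class, not the patch's actual code, just to pin down what I mean (and it assumes 0 < bits < 32, skipping the all-0s and full-int cases discussed above):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of "patched" encoding with separately stored exceptions:
// each slot keeps only the low `bits` bits of its value; values that
// need more bits additionally record their position and high bits in
// two side lists, rather than linking exceptions through the slots.
class PatchedBlockSketch {
  final int bits;                      // bit width of a normal slot (0 < bits < 32)
  final int[] lowBits;                 // low `bits` bits of every value
  final List<Integer> excPositions = new ArrayList<>();
  final List<Integer> excHighBits = new ArrayList<>();

  PatchedBlockSketch(int[] block, int bits) {
    this.bits = bits;
    this.lowBits = new int[block.length];
    final int mask = (1 << bits) - 1;
    for (int i = 0; i < block.length; i++) {
      lowBits[i] = block[i] & mask;
      int high = block[i] >>> bits;
      if (high != 0) {                 // value didn't fit: it's an exception
        excPositions.add(i);
        excHighBits.add(high);
      }
    }
  }

  int[] decode() {
    int[] out = lowBits.clone();
    for (int j = 0; j < excPositions.size(); j++) {
      // patch the high bits back in at the recorded position
      out[excPositions.get(j)] |= excHighBits.get(j) << bits;
    }
    return out;
  }
}
```

The appeal over linked exceptions would be that decode is two tight loops with no data-dependent branching per slot; whether that actually wins in practice is the perf question above.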
(I think pfor1 links them.)

Although all tests pass if I run w/ -Dtests.codec=PatchedFrameOfRef2, if I try to build a big wikipedia index I hit this:

{noformat}
java.lang.ArrayIndexOutOfBoundsException: 386
	at org.apache.lucene.util.pfor2.Simple16.s16Compress(Simple16.java:68)
	at org.apache.lucene.util.pfor2.PForDelta.compressBlockByS16(PForDelta.java:270)
	at org.apache.lucene.util.pfor2.PForDelta.compressOneBlockCore(PForDelta.java:221)
	at org.apache.lucene.util.pfor2.PForDelta.compressOneBlock(PForDelta.java:82)
	at org.apache.lucene.index.codecs.pfordelta2.PForDeltaFixedIntBlockCodec.encodeOneBlockWithPForDelta(PForDeltaFixedIntBlockCodec.java:87)
	at org.apache.lucene.index.codecs.pfordelta2.PForDeltaFixedIntBlockCodec$PForDeltaIntFactory$2.flushBlock(PForDeltaFixedIntBlockCodec.java:151)
	at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexOutput.write(FixedIntBlockIndexOutput.java:129)
	at org.apache.lucene.index.codecs.sep.SepPostingsWriterImpl.startDoc(SepPostingsWriterImpl.java:204)
	at org.apache.lucene.index.codecs.PostingsConsumer.merge(PostingsConsumer.java:79)
	at org.apache.lucene.index.codecs.TermsConsumer.merge(TermsConsumer.java:97)
	at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:49)
	at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:634)
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:147)
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3281)
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2804)
	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339)
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407)
{noformat}

But then I remembered: I think Simple16 cannot represent values >= 2^28? That's a problem here since in general Lucene indexes can have such values (though they should be rarish). Maybe I'm hitting one here and this is what happens?
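For reference, the 2^28 ceiling comes from Simple16's word layout: each 32-bit word spends 4 bits on the selector, leaving 28 data bits, so even the "one value per word" case tops out at 2^28 - 1. A guard like the following sketch (hypothetical helper, not the patch's API) could turn the ArrayIndexOutOfBoundsException above into a clear error or a fallback to some other encoding:

```java
// Simple16 packs a 4-bit selector plus 28 data bits into each 32-bit
// word, so no single value can exceed 2^28 - 1. A block containing a
// larger value has to be encoded some other way.
final class Simple16Limits {
  static final int MAX_VALUE = (1 << 28) - 1; // 268435455

  // true if every value in the block is representable by Simple16
  static boolean fitsInSimple16(int[] block) {
    for (int v : block) {
      if (v < 0 || v > MAX_VALUE) {
        return false;
      }
    }
    return true;
  }
}
```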
Anyway, I think somehow we need to merge pfor1 and pfor2 to get the best of both worlds. I like that pfor2 uses Simple16 for encoding exceptions, and how it encodes exceptions (if indeed this gives better perf), but I like that pfor1 (currently committed on the branch) is fast to decode (it specializes the N-bit cases) and allows pure FOR (ie no exceptions: you encode to the max bit size, which wastes bits but is fast). I'd like to commit this as "pfor2" (on the bulkpostings branch); over time we can iterate to merge the two impls, before landing on trunk.

> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Paul Elschot
>            Priority: Minor
>             Fix For: Bulk Postings branch
>
>         Attachments: autogen.tgz, for-summary.txt, LUCENE-1410-codecs.tar.bz2, LUCENE-1410.patch, LUCENE-1410.patch, LUCENE-1410.patch, LUCENE-1410.patch, LUCENE-1410b.patch, LUCENE-1410c.patch, LUCENE-1410d.patch, LUCENE-1410e.patch, TermQueryTests.tgz, TestPFor2.java, TestPFor2.java, TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.