[ https://issues.apache.org/jira/browse/LUCENE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419283#comment-13419283 ]
Han Jiang edited comment on LUCENE-4239 at 7/20/12 4:49 PM:
------------------------------------------------------------

Thank you Adrien! This Decoder/Encoder interface will make our work easier.

However, this patch doesn't pass ant compile on the latest trunk; it seems that the encoder/decoder methods for Packed64SingleBlockBulkOperation32 are missing? Anyway, we're not currently using docIds of up to 32 bits; I'll test the performance later.

* We're still using IntBuffer just because IndexInput/IndexOutput don't provide a read/writeInts() method :). Since we still have to handle IndexInput/IndexOutput at the upper level, we now prefer to use a plain int[] rather than IntBuffer. Actually, we had a patch making PackedIntsDecompress handle int[] instead; you can have a glance at it (http://pastebin.com/euvtBD8P). Performance tests show little difference between these two versions, and we should choose a clean & simple impl, right?

* As for PFor, we may have to encode another small block of ints in packed format when blockSize<128 and blockSize%32 != 0. The current impl uses numBits=8,16,32 to simplify the decoder. However, we may consider using other numBits values in the near future, and I'm afraid this will become a bottleneck if the decoder is not hardcoded.

So... as a second shot, maybe you can provide us methods like: encode(int[] values, long[] blocks, int iterations), decode(long[] blocks, int[] values, int iterations)?

      was (Author: billy):
    Thank you Adrien! This Decoder/Encoder interface will make our work easier.

However, this patch doesn't pass ant compile on the latest trunk; it seems that the encoder/decoder methods for Packed64SingleBlockBulkOperation32 are missing? Anyway, we're not currently using docIds of up to 32 bits; I'll test the performance later.

Since we have to handle IndexInput/IndexOutput at the upper level, we prefer to use a plain int[] rather than IntBuffer.
Actually, we had a patch making PackedIntsDecompress handle int arrays instead: https://issues.apache.org/jira/secure/attachment/12532888/LUCENE-3892_for_int%5B%5D.patch (the file name was ForDecompressImpl.java). Performance tests show little difference between these two versions, but as int[] is clear and simple, I think that is what we should use.

So... maybe you can provide us methods like: encode(int[] values, long[] blocks, int iterations), decode(long[] blocks, int[] values, int iterations)?

> Provide access to PackedInts' low-level blocks <-> values conversion methods
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-4239
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4239
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/other
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-4239.patch
>
>
> In LUCENE-4161 we started to make the {{PackedInts}} API more flexible so
> that codecs could use it whenever they need to (un)pack integers. There are
> two posting formats in progress (For and PFor, LUCENE-3892) that perform a
> lot of integer (un)packing but the current API still has limits:
> - it only works with long[] arrays, whereas these codecs need to manipulate
> int[] arrays,
> - the packed reader iterators work great for unpacking long sequences of
> integers, but they would probably cause a lot of overhead to decode lots of
> short integer sequences such as the ones that can be generated by For and
> PFor.
> I've been looking at the For/PFor branch and it has a
> {{PackedIntsDecompress}} class
> (http://svn.apache.org/repos/asf/lucene/dev/branches/pforcodec_3892/lucene/core/src/java/org/apache/lucene/codecs/pfor/PackedIntsDecompress.java)
> which is very similar to {{oal.util.packed.BulkOperation}}
> (package-private), so maybe we should find a way to expose this class so that
> the For/PFor branch can directly use it.
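
To make the int[] <-> long[] conversion discussed above concrete, here is a minimal, generic sketch of packing int values into long blocks and unpacking them again. It is not the LUCENE-4239 patch and not Lucene's BulkOperation: the class name IntBlockCodec, the helper blocksNeeded, and the explicit bitsPerValue and count parameters are all assumptions made for illustration (the signatures proposed in the comment take an iterations argument instead), and the most-significant-bit-first layout is just one possible choice. A generic loop like this is also the kind of non-hardcoded decoder the comment worries could be a bottleneck.

```java
import java.util.Arrays;

/** Hypothetical sketch of int[] <-> long[] bit packing; not Lucene's implementation. */
final class IntBlockCodec {
    private IntBlockCodec() {}

    /** Number of 64-bit blocks needed to hold count values of bitsPerValue bits each. */
    static int blocksNeeded(int bitsPerValue, int count) {
        return (int) (((long) count * bitsPerValue + 63) >>> 6);
    }

    /** Packs values[0..count) into blocks, bitsPerValue (1..32) bits each, high bits first. */
    static void encode(int[] values, long[] blocks, int bitsPerValue, int count) {
        // Clear only the blocks that will be written, since bits are OR-ed in below.
        Arrays.fill(blocks, 0, blocksNeeded(bitsPerValue, count), 0L);
        final long mask = (1L << bitsPerValue) - 1;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int endBits = (int) (bitPos & 63) + bitsPerValue;
            long v = values[i] & mask;
            if (endBits <= 64) {
                // The value fits entirely inside the current block.
                blocks[block] |= v << (64 - endBits);
            } else {
                // The value straddles two blocks: high part here, low part in the next one.
                int spill = endBits - 64;
                blocks[block] |= v >>> spill;
                blocks[block + 1] |= v << (64 - spill);
            }
        }
    }

    /** Unpacks count values of bitsPerValue (1..32) bits each from blocks into values. */
    static void decode(long[] blocks, int[] values, int bitsPerValue, int count) {
        final long mask = (1L << bitsPerValue) - 1;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int endBits = (int) (bitPos & 63) + bitsPerValue;
            long v;
            if (endBits <= 64) {
                v = blocks[block] >>> (64 - endBits);
            } else {
                int spill = endBits - 64;
                v = (blocks[block] << spill) | (blocks[block + 1] >>> (64 - spill));
            }
            // The mask drops any neighbouring bits picked up by the shifts.
            values[i] = (int) (v & mask);
        }
    }
}
```

With this sketch, a caller would size the long[] with blocksNeeded(bitsPerValue, count), call encode, and later call decode with the same bitsPerValue and count to get the original int[] back, without going through IntBuffer.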