[ 
https://issues.apache.org/jira/browse/LUCENE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419283#comment-13419283
 ] 

Han Jiang edited comment on LUCENE-4239 at 7/20/12 4:49 PM:
------------------------------------------------------------

Thank you Adrien! This Decoder/Encoder interface will make our work easier.

However, this patch doesn't pass ant compile on the latest trunk; it seems that 
the encoder/decoder methods for Packed64SingleBlockBulkOperation32 are missing? 
Anyway, we're not currently using docIds of up to 32 bits, so I'll test the 
performance later.

* We're still using IntBuffer only because IndexInput/Output don't provide a 
read/writeInts() method :). Since we still have to handle IndexInput/Output at 
the upper level, we'd now prefer to use int[] directly rather than IntBuffer. 
Actually, we had a patch making PackedIntsDecompress handle int[] instead; you 
can have a glance at it: http://pastebin.com/euvtBD8P. Performance tests show 
little difference between the two versions, so we should choose the cleaner and 
simpler impl, right?

* As for PFor, we may have to encode another small block of ints in packed 
format when blockSize<128 and blockSize%32 != 0. The current impl uses 
numBits=8,16,32 to simplify the decoder. However, we may consider using other 
numBits values in the near future, and I'm afraid this will be a bottleneck 
when the decoder is not hardcoded.
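
To illustrate the numBits=8,16,32 restriction for the small trailing block, 
here is a minimal sketch of how a width could be picked. The class and method 
names (TailBlockBits, roundedNumBits) are hypothetical and are not taken from 
the For/PFor branch; the rounding rule is only an assumption about what 
"use numBits=8,16,32" means in practice.

```java
public class TailBlockBits {
    /**
     * Returns the smallest width among 8, 16 and 32 bits that can hold every
     * value in values[offset, offset + length), treating values as unsigned.
     * Hypothetical helper, shown for illustration only.
     */
    static int roundedNumBits(int[] values, int offset, int length) {
        int or = 0;
        for (int i = offset; i < offset + length; i++) {
            or |= values[i]; // the OR of all values has the same highest set bit as the max
        }
        // Number of bits actually required; 0 when every value is 0.
        int needed = 32 - Integer.numberOfLeadingZeros(or);
        if (needed <= 8) {
            return 8;
        }
        if (needed <= 16) {
            return 16;
        }
        return 32;
    }

    public static void main(String[] args) {
        int[] tail = {3, 200, 70000};
        System.out.println(roundedNumBits(tail, 0, 2));
        System.out.println(roundedNumBits(tail, 0, 3));
    }
}
```

With only three possible widths the decoder can keep one specialized loop per 
width; allowing arbitrary numBits would need either many more specialized 
loops or a generic (slower) one, which is the concern raised above.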


So... as a second shot, maybe you can provide us with methods like: 
encode(int[] values, long[] blocks, int iterations) and decode(long[] blocks, 
int[] values, int iterations)? 
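
To make the request concrete, here is a rough sketch of what such a pair of 
methods could look like. Everything in it is an assumption for illustration: 
the class name IntBlockCodec, the bitsPerValue constructor argument, the rule 
that one iteration covers 64 values (so it fills exactly bitsPerValue longs), 
and the low-bits-first layout. It is not the layout or API of 
oal.util.packed.BulkOperation.

```java
import java.util.Arrays;

public class IntBlockCodec {
    // Assumption: one iteration = 64 values, which pack into exactly bitsPerValue longs.
    static final int VALUES_PER_ITERATION = 64;

    private final int bitsPerValue;
    private final long mask;

    IntBlockCodec(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 32]");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
    }

    /** Packs iterations * 64 ints into iterations * bitsPerValue longs. */
    void encode(int[] values, long[] blocks, int iterations) {
        int count = iterations * VALUES_PER_ITERATION;
        Arrays.fill(blocks, 0, iterations * bitsPerValue, 0L);
        long bitPos = 0;
        for (int i = 0; i < count; i++, bitPos += bitsPerValue) {
            long v = values[i] & mask;          // keep only the low bitsPerValue bits
            int block = (int) (bitPos >>> 6);   // index of the long holding the first bit
            int shift = (int) (bitPos & 63);    // bit offset inside that long
            blocks[block] |= v << shift;
            if (shift + bitsPerValue > 64) {    // value straddles two longs
                blocks[block + 1] |= v >>> (64 - shift);
            }
        }
    }

    /** Unpacks iterations * bitsPerValue longs into iterations * 64 ints. */
    void decode(long[] blocks, int[] values, int iterations) {
        int count = iterations * VALUES_PER_ITERATION;
        long bitPos = 0;
        for (int i = 0; i < count; i++, bitPos += bitsPerValue) {
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = blocks[block] >>> shift;
            if (shift + bitsPerValue > 64) {    // pull the high bits from the next long
                v |= blocks[block + 1] << (64 - shift);
            }
            values[i] = (int) (v & mask);
        }
    }
}
```

A caller would size blocks as iterations * bitsPerValue longs and values as 
iterations * 64 ints, then round-trip with encode followed by decode. A generic 
loop like this is the non-hardcoded case; specialized per-width loops would 
avoid the per-value branch.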


                
      was (Author: billy):
    Thank you Adrien! We'll work easier with this Decoder/Encoder interface.

However, This patch isn't passing ant-compile under latest trunk, seems that 
encoder/decoder methods for Packed64SingleBlockBulkOperation32 are missing? 
Anyway, we're not using docId up to 32 bits currently, I'll test the 
performance later.

Since we have to handle IndexInput/Output at upper level, we prefer to use 
direct int[] rather than IntBuffer. Actually, we had a patch making 
PackedIntsDecompress handle int array instead: 
https://issues.apache.org/jira/secure/attachment/12532888/LUCENE-3892_for_int%5B%5D.patch
 (the file name was ForDecompressImpl.java). Performance test shows little 
difference between these two versions, but as int[] is clear and simple, I 
think that should be what we hope to use.

So... maybe you can provide us methods like: encode(int[] values, long[] 
blocks, int iterations), decode(long[] blocks, int[] values, int iterations)? 
                  
> Provide access to PackedInts' low-level blocks <-> values conversion methods
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-4239
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4239
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/other
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-4239.patch
>
>
> In LUCENE-4161 we started to make the {{PackedInts}} API more flexible so 
> that codecs could use it whenever they need to (un)pack integers. There are 
> two posting formats in progress (For and PFor, LUCENE-3892) that perform a 
> lot of integer (un)packing but the current API still has limits :
>  - it only works with long[] arrays, whereas these codecs need to manipulate 
> int[] arrays,
>  - the packed reader iterators work great for unpacking long sequences of 
> integers, but they would probably cause a lot of overhead to decode lots of 
> short integer sequences such as the ones that can be generated by For and 
> PFor.
> I've been looking at the For/PFor branch and it has a 
> {{PackedIntsDecompress}} class 
> (http://svn.apache.org/repos/asf/lucene/dev/branches/pforcodec_3892/lucene/core/src/java/org/apache/lucene/codecs/pfor/PackedIntsDecompress.java)
>  which is very similar to {{oal.util.packed.BulkOperation}} 
> (package-private), so maybe we should find a way to expose this class so that 
> the For/PFor branch can directly use it.
